Comments on P1 from Hans van Halteren


Lou Burnard-7
[ Again, with his permission, I forward the following collection of
  comments on the Draft TEI Guidelines from Hans van Halteren, of
  Nijmegen University. -LB ]

Fragmentary comments on the TEI report

- Status of the guidelines

  In some places in the report (sorry, can't find exact spots right now) I
  had the impression that the guidelines allow several different methods
  of tagging the same thing. Is this part of the discussion and will one
  method be chosen eventually, or will this freedom remain? In the latter case
  it will be harder to create software which can handle all TEI-encoded texts.

- 5.3.5 Glosses

  Looking at the examples, there appear to be several kinds of glosses:
    an added gloss (e.g. eluthemen), which does not function in the sentence
      and may or may not (not determinable in the example) be present in the
      actual text
    a gloss which is in the text and actually functions in the sentence
      (e.g. parser)
  This difference is considered important enough for ordinary words, as there
  are separate tags <term> and <cited.word>. Should the difference not be
  tagged for glosses as well?

- 5.3.8 Lists

  List items may also be marked in different ways (cf. LaTeX). I would
  propose that a list item consists of an <itemmark> and an <itembody>.
  This seems more general than to introduce an exception for the case of
  gloss lists.
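
  As a rough illustration of the proposal, the sketch below encodes a
  numbered list and a gloss list with the same two-part item. Only
  <itemmark> and <itembody> come from the suggestion above; the enclosing
  <list> and <item> names, the ListItem class, the encode_list function and
  the sample definitions are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ListItem:
    mark: str  # content of the proposed <itemmark> (number, bullet, glossed term)
    body: str  # content of the proposed <itembody>


def encode_list(items):
    """Serialise items with the proposed <itemmark>/<itembody> pair.

    The <list> and <item> wrappers are hypothetical names used only here.
    """
    lines = ["<list>"]
    for item in items:
        lines.append(
            "<item><itemmark>" + item.mark + "</itemmark>"
            "<itembody>" + item.body + "</itembody></item>"
        )
    lines.append("</list>")
    return "\n".join(lines)


# A numbered list and a gloss list share one structure; only the mark differs.
numbered = [ListItem("1.", "first point"), ListItem("2.", "second point")]
glosses = [ListItem("parser", "a program that assigns structure to input")]

print(encode_list(numbered))
print(encode_list(glosses))
```

  In this form a gloss list needs no exception of its own: the glossed term
  is simply the item mark.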

- 5.3.4 Foreign words and 5.3.5 Terms

  Similar tags could be created for

    substandard words/expressions (e.g. heavy dialect)

    deliberately ill-formed words (e.g. to simulate a foreigner speaking or
       someone with a speech impediment)

    idiomatic expressions

- 5.3.1 Paragraphs and Their Contents

  I am not sure whether figures and tables should be seen as part of the
  contents of the paragraph. Is it not possible that they function on a
  higher structural level? What do you propose for illuminations, which
  do not actually function in the text at all?

- Tagging vs. Actual Text

  Before reading the report I assumed that all information added to the raw
  text would be placed inside tags. In this case, throwing away all tags (as
  proposed by some) would leave the raw text. In the report (mainly in
  chapter 6) I see that some information is placed between tags instead of
  inside them (e.g. <f.name> SING </f.name>), so that throwing away the tags
  would no longer leave the raw text.
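
  The difference can be made concrete with a minimal sketch of a naive tag
  stripper. The <f.name> element is the one quoted above; the <w> element
  with its feat attribute, the sample word "boys" and the strip_tags
  function are hypothetical and only serve the illustration.

```python
import re

# Matches anything that looks like a tag: "<", any characters except ">", ">".
TAG = re.compile(r"<[^>]*>")


def strip_tags(encoded):
    """Remove every tag and keep whatever lies between tags."""
    return TAG.sub("", encoded)


# Added information held inside the tag (hypothetical <w> element):
inside = '<w feat="SING">boys</w>'

# Added information held between tags, as in the chapter 6 example:
between = "boys<f.name> SING </f.name>"

print(repr(strip_tags(inside)))
print(repr(strip_tags(between)))
```

  In the first case stripping the tags gives back the raw word; in the
  second the annotation SING is left behind in the text.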

- 5.11.2 Special Layout Tags

  "considerable work is needed": yes indeed. In the system I am building I
  want to display the text exactly as it occurred in the original (well,
  as closely as technically possible). Therefore, I need not only the
  structure of the text, but also the layout. For the moment I am using a
  homegrown tagset (appended below [deleted -LB]). Some of these tags are
  mappable to the TEI tagset, some I can't find right away (e.g. tabbing).

  Something I haven't quite worked out (for myself) is the treatment of <extm>
  (figures and such). Floating figures have two separate places in the text:
  the place where they are found in the text and the place where they are
  referred to in the text. The place where they are found may be in the middle
  of a word:

    ...................................... hyphen-

         FIGURE

    <pagebreak>

    ated .......................

  Therefore I use (at least at the moment) <extm> as well as <?extm> to
  represent these two places.

  Note that all layout tags may occur in the middle of a word, which causes
  all kinds of processing problems. However, given my goal, you will understand
  that I want to keep them there rather than just shift them to the end of the
  word.
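
  One possible way to ease the processing problem without losing the exact
  positions is sketched below: the layout tags are lifted out into a list of
  (offset, tag) pairs, word-level processing runs on the plain text, and the
  tags can be put back afterwards. This is only an assumption about how such
  a system could work, not a description of the one I mentioned. The
  function names and the tag pattern are made up, and the end-of-line hyphen
  of the example above is left out for simplicity.

```python
import re

# Assumed shape of a layout tag: <name> or <?name>, lowercase letters only.
LAYOUT_TAG = re.compile(r"<\??[a-z]+>")


def split_layout(encoded):
    """Return (plain text, [(offset into plain text, tag), ...])."""
    pieces = []
    tags = []
    pos = 0      # read position in the encoded string
    length = 0   # number of plain-text characters collected so far
    for match in LAYOUT_TAG.finditer(encoded):
        chunk = encoded[pos:match.start()]
        pieces.append(chunk)
        length += len(chunk)
        tags.append((length, match.group()))
        pos = match.end()
    pieces.append(encoded[pos:])
    return "".join(pieces), tags


def restore_layout(plain, tags):
    """Re-insert the tags at their recorded offsets, in their original order."""
    out = []
    pos = 0
    for offset, tag in tags:
        out.append(plain[pos:offset])
        out.append(tag)
        pos = offset
    out.append(plain[pos:])
    return "".join(out)


encoded = "hyphen<extm><pagebreak>ated text"
plain, tags = split_layout(encoded)
print(plain)
print(tags)
print(restore_layout(plain, tags) == encoded)
```

  The word can then be looked up or parsed as a whole, while the display
  component still knows that the figure and the page break fall inside it.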

- 3.2.3 Entity References

  Would it be a good idea to set up a central administration of special
  character names, i.e. TEI additions to appendix D.4?

==========