[ Again, with his permission, I forward the following collection of
comments on the Draft TEI Guidelines from Hans van Halteren, of
Nijmegen University. -LB ]
Fragmentary comments on the TEI report
- Status of the guidelines
In some places in the report (sorry, can't find exact spots right now) I
had the impression that the guidelines allow several different methods
of tagging the same thing. Is this part of the discussion and will one
method be chosen eventually, or will this freedom remain? In the latter case
it will be harder to create software which can handle all TEI encoded texts.
- 5.3.5 Glosses
Looking at the examples, there appear to be several kinds of glosses:
an added gloss (e.g. eluthemen), which does not function in the sentence
and may or may not (not determinable in the example) be present in the
a gloss which is in the text and actually functions in the sentence
This difference is considered important enough for normal words, as there are
separate tags <term> and <cited.word>. Should the difference not be tagged
for glosses as well?
- 5.3.8 Lists
List items may also be marked in different ways (cf. LaTeX). I would
propose that a list item consists of an <itemmark> and an <itembody>.
This seems more general than to introduce an exception for the case of
- 5.3.4 Foreign words and 5.3.5 Terms
Similar tags could be created for
substandard words/expressions (e.g. heavy dialect)
deliberately ill-formed words (e.g. to simulate a foreigner speaking or
someone with a speech impediment)
- 5.3.1 Paragraphs and Their Contents
I am not sure whether figures and tables should be seen as part of the
contents of the paragraph. Is it not possible that they function on a
higher structural level? What do you propose for illuminations, which
do not actually function in the text at all?
- Tagging vs. Actual Text
Before reading the report I assumed that all information added to the raw
text would be placed inside tags. In this case, throwing away all tags (as
proposed by some) would leave the raw text. In the report (mainly in
chapter 6) I see that some information is placed between tags instead of
inside them (e.g. <f.name> SING </f.name>).
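To make the distinction concrete, here is a minimal sketch in Python. It is
an illustration only, not anything taken from the report: the regular
expression simply treats everything between "<" and ">" as a tag, and the
<w pos="verb"> element is a hypothetical example of information kept inside
a tag.

```python
import re

# Assumption: anything between "<" and ">" counts as a tag.
TAG = re.compile(r"<[^<>]*>")

def strip_tags(encoded: str) -> str:
    """Delete every tag and keep whatever lies between tags."""
    return TAG.sub("", encoded)

# Information held inside the tag (hypothetical attribute) goes away with it.
inside = 'He <w pos="verb">sings</w> well.'

# Information held between tags stays behind when the tags are removed.
between = "He sings<f.name> SING </f.name> well."

print(strip_tags(inside))
print(strip_tags(between))
```

In the first case stripping returns the raw text; in the second the added
feature name SING is left behind in what should have been the raw text.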
- 5.11.2 Special Layout Tags
"considerable work is needed": yes indeed. In the system I am building I
want to display the text exactly as it occurred in the original (well,
as close as technically possible). Therefore, I not only need the
structure of the text, but also the layout. For the moment I am using a
homegrown tagset (appended below [deleted -LB]). Some of these tags are
mappable to the TEI tagset, some I can't find right away (e.g. tabbing).
Something I haven't quite worked out (for myself) is the treatment of <extm>
(figures and such). Floating figures have two separate places in the text:
the place where they are found in the text and the place they are referred
to in the text. The place where they are found may be in the middle of a word:
Therefore I use (at least at the moment) <extm> as well as <?extm> to
represent these two places.
Note that all layout tags may occur in the middle of a word, which causes
all kinds of processing problems. However, given my goal, you understand
I want to keep them there rather than just shift them to the end of the word.
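One possible way to live with mid-word layout tags, sketched in Python under
my own assumptions (it is not part of the TEI proposal, and <newline> is a
hypothetical layout tag name): keep the tags out of the running text but
remember the character offset of each one, so that words can be processed
whole and the tags can be put back exactly where they were.

```python
import re

# Assumption: anything between "<" and ">" counts as a tag.
TAG = re.compile(r"<[^<>]*>")

def split_layout(encoded):
    """Return (plain_text, [(offset, tag), ...]); offsets index plain_text."""
    pieces, tags = [], []
    pos = 0       # current position in the encoded text
    length = 0    # length of the plain text collected so far
    for match in TAG.finditer(encoded):
        chunk = encoded[pos:match.start()]
        pieces.append(chunk)
        length += len(chunk)
        tags.append((length, match.group()))
        pos = match.end()
    pieces.append(encoded[pos:])
    return "".join(pieces), tags

def merge_layout(plain, tags):
    """Reinsert the tags at their recorded offsets."""
    out, pos = [], 0
    for offset, tag in tags:
        out.append(plain[pos:offset])
        out.append(tag)
        pos = offset
    out.append(plain[pos:])
    return "".join(out)

original = "the treat<newline>ment of figures"
text, tags = split_layout(original)
print(text)
print(tags)
print(merge_layout(text, tags) == original)
```

The sketch only works as long as the plain text itself is not changed between
splitting and merging; any edit to the words would require the offsets to be
adjusted as well.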
- 3.2.3 Entity References
Would it be a good idea to set up a central administration of special
character names, i.e. TEI additions to appendix D.4?