[ Again, with his permission, I forward the following collection of
comments on the Draft TEI Guidelines from Hans van Halteren, of Nijmegen University. -LB ]

Fragmentary comments on the TEI report

- Status of the guidelines

In some places in the report (sorry, can't find the exact spots right now) I had the impression that the guidelines allow several different methods of tagging the same thing. Is this still under discussion, with one method to be chosen eventually, or will this freedom remain? In the latter case it will be harder to create software which can handle all TEI-encoded texts.

- 5.3.5 Glosses

Looking at the examples, there appear to be several kinds of glosses:

    an added gloss (e.g. eluthemen), which does not function in the sentence and may or may not (not determinable in the example) be present in the actual text

    a gloss which is in the text and actually functions in the sentence (e.g. parser)

This difference is considered important enough for normal words, as there are separate tags <term> and <cited.word>. Should the difference not be tagged for glosses as well?

- 5.3.8 Lists

List items may also be marked in different ways (cf. LaTeX). I would propose that a list item consist of an <itemmark> and an <itembody>. This seems more general than introducing an exception for the case of gloss lists.

- 5.3.4 Foreign words and 5.3.5 Terms

Similar tags could be created for

    substandard words/expressions (e.g. heavy dialect)
    deliberately ill-formed words (e.g. to simulate a foreigner speaking or someone with a speech impediment)
    idiomatic expressions

- 5.3.1 Paragraphs and Their Contents

I am not sure whether figures and tables should be seen as part of the contents of the paragraph. Is it not possible that they function on a higher structural level? What do you propose for illuminations, which do not actually function in the text at all?

- Tagging vs. Actual Text

Before reading the report I assumed that all information added to the raw text would be placed inside tags.
In this case, throwing away all tags (as proposed by some) would leave the raw text. In the report (mainly in chapter 6) I see that some information is placed between tags instead of inside them (e.g. <f.name> SING </f.name>).

- 5.11.2 Special Layout Tags

"considerable work is needed": yes indeed. In the system I am building I want to display the text exactly as it occurred in the original (well, as close as technically possible). Therefore, I need not only the structure of the text, but also the layout. For the moment I am using a homegrown tagset (appended below [deleted -LB]). Some of these tags are mappable to the TEI tagset, some I can't find right away (e.g. tabbing).

Something I haven't quite worked out (for myself) is the treatment of <extm> (figures and such). Floating figures have two separate places in the text: the place where they are found in the text and the place where they are referred to in the text. The place where they are found may be in the middle of a word:

    ...................................... hyphen-
                    FIGURE
    <pagebreak>
    ated .......................

Therefore I use (at least at the moment) <extm> as well as <?extm> to represent these two places. Note that all layout tags may occur in the middle of a word, which causes all kinds of processing problems. However, given my goal, you will understand that I want to keep them there rather than just shift them to the end of the word.

- 3.2.3 Entity References

Would it be a good idea to set up a central administration of special character names, i.e. TEI additions to appendix D.4?
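
The point made under "Tagging vs. Actual Text" can be illustrated with a small sketch. It is not taken from the report: the <w> element, the attribute-style encoding, and the helper strip_tags are hypothetical, chosen only to contrast information placed inside a tag with information placed between tags (as in <f.name> SING </f.name>).

```python
import re

# Matches anything that looks like a tag: "<" ... ">"
TAG = re.compile(r"<[^>]*>")

def strip_tags(text: str) -> str:
    """Throw away every tag and collapse the whitespace left behind."""
    return " ".join(TAG.sub(" ", text).split())

# Hypothetical encoding 1: the added information lives inside the tag.
inside_tags = 'She <w f.name="SING">sings</w> well.'

# Hypothetical encoding 2: the added information sits between tags,
# in the style of <f.name> SING </f.name>.
between_tags = 'She <w><f.name> SING </f.name>sings</w> well.'

print(strip_tags(inside_tags))   # She sings well.
print(strip_tags(between_tags))  # She SING sings well.
```

With the first encoding, removing the tags gives back the raw text. With the second, the annotation SING is left behind in the text, so a program would have to know which elements carry added information before it could recover the original.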