Lou's problem of translating sporadically recorded information into TEI form will surely, as David Chesnutt has already observed, be a very common and important one.

I'm with Frank Tompa on this one, though: the TEI scheme in its current form already provides methods for (a) marking place names and (b) providing a linguistic analysis or categorization of a word or phrase. So I don't at all see what the problem is: when the source text marks 'Bath' as a place name, tag it as a place name, and when the source text provides word-class information for a word, tag it in the usual way.

Lou suggests that this would be inelegant or misleading, since not all place names are so tagged, and not all words are classified. But consider the alternatives:

1. lose the information
2. leave the information in its non-TEI form
3. complete the tagging (ie tag all the rest of the place names, and give part of speech and sense number for all words), so the tagging is consistent and complete, and then use the existing TEI tags
4. use the existing TEI tags, and note in the header that not all words are classed, not all place names are tagged, ...
5. invent new TEI-style tags, and continue to note in the header that not all words are classed, and not all place names tagged

Of these, 1 is a bad idea.

2 is occasionally tempting, especially for things one doesn't know how to handle in TEI tagging, but it really just means engaging in only a partial conversion to TEI markup. Particularly when the information left unconverted *does* have a TEI form, such texts should not be regarded as TEI conformant.

3 is a pipedream in most cases, and violates the spirit of TEI's role as a format for interchange of texts without incurring information loss or requiring information enrichment.

The only difference I see between 4 (which LB was uncomfortable with, and which prompted his inquiry) and 5 (which he suggests as a solution) is that the one uses the existing tags, and the other doesn't. I don't see that as a big advantage for choice 5, myself.
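To make choice 4 concrete, here is a rough sketch rather than a recommendation of any particular encoding. It assumes a small TEI-style fragment in which 'Bath' is tagged because the source marked it, 'London' is left untagged, and a header note records that the tagging is partial. The element and attribute names (`taggingNote`, `element`, `coverage`, `placeName`) and the sample sentence are hypothetical, chosen only for illustration, and are not taken from the TEI scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style sample. Element and attribute names are
# illustrative assumptions, not part of the TEI scheme itself.
SAMPLE = """<text>
  <header>
    <taggingNote element="placeName" coverage="partial">Only place names
    marked in the source are tagged; others (e.g. London) are not.</taggingNote>
  </header>
  <body>
    <p>She left <placeName>Bath</placeName> for London in the spring.</p>
  </body>
</text>"""


def tagging_coverage(xml_text):
    """Map each element name declared in the header to its stated coverage."""
    root = ET.fromstring(xml_text)
    return {n.get("element"): n.get("coverage") for n in root.iter("taggingNote")}


def tagged_place_names(xml_text):
    """Return the text of every place name the encoder actually tagged."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("placeName")]


coverage = tagging_coverage(SAMPLE)
places = tagged_place_names(SAMPLE)
print(coverage)
print(places)
```

The point of the sketch is that a recipient reads the ordinary place-name tag and consults the header to learn how far to trust its absence; no second tag is involved.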
Why on earth do we want to distinguish between the concepts PLACENAME (for which we have a tag) and PLACENAME-TAGGED-EVEN-THOUGH-OTHER-PLACENAMES-ARE-NOT-TAGGED (for which we don't, yet, though it would be a legal tag name)?

Perhaps a fuller description is needed in the header to allow users to specify how consistently and how thoroughly various tags (particularly for text enrichment) have been used. But not new tags for the same old information.

Wearing no hat at all except a woolen cap to keep out Chicago's wind,
Michael Sperberg-McQueen
Perhaps TEI needs to add, for some tags, corresponding tags indicating that tags of the first sort are not being used consistently: for example, alongside the placename tag, a corresponding tag indicating that placename tags are not used consistently. (This may sound a bit as if I am poking fun, but I am not.)