CMSP-Q, What do you want to do with this message? It has been in the
account for a while. S/Kev.

----------------------------Original message----------------------------

I am interested in whether I can use existing TEI facilities for writing about language. I am currently using a two-part distinction: the <lma> tag marks a lemmatised headword, while the <frm> tag marks a specific written instance. For example, if I were dealing with Early Modern English, <frm>shyppe</frm> and <frm>ship</frm> would both be forms of <lma>Ship</lma>.

The <lem> tag in TEI is available only in apparati critici, and the dictionary tags seem to want more structure (where I want to use the tags for phrasal elements in running prose). Any suggestions?

David

---
David Megginson
Department of English, University of Ottawa, [hidden email]
Ottawa, Ontario, CANADA K1N 6N5 [hidden email]
Phone: (613) 564-6850 (Office) [hidden email]
(613) 564-9175 (FAX)
On Thu, 8 Sep 1994 15:35:57 CDT David Megginson said:

>I am interested in whether I can use existing TEI facilities for
>writing about language. I am currently using a two-part distinction:
>the <lma> tag marks a lemmatised headword, while the <frm> tag marks a
>specific written instance. For example, if I were dealing with Early
>Modern English, <frm>shyppe</frm> and <frm>ship</frm> would both be
>forms of <lma>Ship</lma>.

I think the answer is, yes you can. Both of these usages look (at first glance, anyway) like specializations of the TEI element MENTIONED, which is explicitly intended for metalinguistic discussion. As defined by TEI P3, however, this element has no TYPE attribute, so the very simplest way of handling your distinction, namely to write

     In Early Modern English, <mentioned type='form'>shyppe</> and
     <mentioned type='form'>ship</> are both forms of
     <mentioned type='lemma'>Ship</>.

won't work, because TYPE is not declared as an attribute of MENTIONED. (Perhaps the TYPE attribute ought to be made universal, so as to make such specializations / subclass elements easier to handle in all cases?)

>The <lem> tag in TEI is available only in apparati critici, and the
>dictionary tags seem to want more structure (where I want to use the
>tags for phrasal elements in running prose). Any suggestions?

Indeed -- apart from their etymology, the 'lemma' of a critical text and the 'lemma' of lemmatization have very little in common. (And mathematicians, used to the term 'lemma' as meaning 'an auxiliary proposition used in the proof of a theorem' have reported deep confusion when reading both the text-critical and the dictionary chapters.) So don't use LEM. The dictionary tags are closer to the semantics you are aiming at, I think, but as you say they aren't well suited for running text.

The simplest way to tag your words, I would suggest, might be one of these:

1 select the additional tag set for analysis and interpretation, and use the INTERP element to define what you mean by the distinction --- perhaps something like:

     <interp id=lma resp='David Megginson' type='word form'
             value='dictionary form' >
     <interp id=frm resp='David Megginson' type='word form'
             value='attested (ms) form' >
     <!-- or perhaps value='oblique form' ? -->

  These can go virtually anywhere (but you need the revised, fixed DTD; in the first issue, these elements, like APP and others, are unreachable from anywhere, even with analysis selected). Each use of MENTIONED can now be labeled a lemma or an inflected form:

     In Early Modern English, <mentioned ana='frm'>shyppe</> and
     <mentioned ana='frm'>ship</> are both forms of
     <mentioned ana='lma'>Ship</>.

2 define two new elements, FRM and LMA (or INFLECTED and LEMMA if you prefer clarity to brevity), identifying them as subclasses of MENTIONED by using the TEIForm attribute.

  In one file (call it mytags.ent), put the declaration

     <!ENTITY % x.hqphrase 'inflected | lemma |' >

  In another (call it mytags.dtd) declare the two elements, copying the content model of MENTIONED:

     <!ELEMENT inflected - - (%phrase.seq) >
     <!ATTLIST inflected %a.global;
                         TEIform CDATA 'mentioned' >
     <!ELEMENT lemma - - (%phrase.seq) >
     <!ATTLIST lemma %a.global;
                     TEIform CDATA 'mentioned' >

  In the DTD subset of your document, declare these two files thus:

     <!ENTITY % TEI.extensions.ent SYSTEM 'mytags.ent' >
     <!ENTITY % TEI.extensions.dtd SYSTEM 'mytags.dtd' >

  And you're done.

N.B. since the name FORM is taken, for the dictionary tag set's 'form group', it is not strictly conformant to reuse that name (since it would introduce a name collision if your extensions were to be used in connection with the dictionary tag set).

I hope this helps.

-C. M. Sperberg-McQueen
ACH / ACL / ALLC Text Encoding Initiative
University of Illinois at Chicago
[hidden email] / u35395@uicvm
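
As a rough illustration of how the pieces of option 2 above might fit together, here is a sketch of a document prolog followed by a sentence of running prose. The file names mytags.ent and mytags.dtd and the element names INFLECTED and LEMMA come from the message. The public identifier, the system identifier tei2.dtd, and the choice of the prose base tag set are assumptions based on the usual TEI P3 conventions, and should be checked against the local installation. The TEI header and the surrounding text structure are left out.

```sgml
<!-- Sketch only: the identifiers on the DOCTYPE line and the choice of
     TEI.prose are assumptions, not taken from the message. -->
<!DOCTYPE TEI.2 PUBLIC "-//TEI P3//DTD Main Document Type//EN" "tei2.dtd" [
  <!-- Assumed base tag set for running prose -->
  <!ENTITY % TEI.prose 'INCLUDE' >
  <!-- Extension files as named in the message: mytags.ent adds the new
       names to x.hqphrase, mytags.dtd declares the two elements -->
  <!ENTITY % TEI.extensions.ent SYSTEM 'mytags.ent' >
  <!ENTITY % TEI.extensions.dtd SYSTEM 'mytags.dtd' >
]>

<!-- Later, inside a paragraph of the document body: -->
<p>In Early Modern English, <inflected>shyppe</inflected> and
<inflected>ship</inflected> are both forms of
<lemma>Ship</lemma>.</p>
```

For option 1, no extension files are needed. Under the same assumptions, the DTD subset would instead enable the analysis tag set (in TEI P3 this is normally done with an entity named TEI.analysis set to 'INCLUDE'), and the prose would use MENTIONED with the ANA attribute as shown in the message.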