Comments on Textual Variation in TEI Draft

Lou Burnard-7
[I forward these comments, unedited, from my colleague
Peter Robinson, at his request pending the ListServer's noticing
his request for a subscription to TEI-L
   -- LB  ]

Some thoughts on encoding of textual variation for TEI.

The guidelines propose four systems of encoding textual variation:
1. Parallel segmentation (5.10.3);
2. Single end-point attachment, attaching variants at the end of the
corresponding reading in the base text (5.10.4);
3. Single end-point attachment, attaching variants at the beginning of
the corresponding reading in the base text (5.10.4);
4. Double end-point attachment (5.10.5).

This looks to me to be three systems too many. Double end-point attachment
might be all we need:

1. Parallel segmentation can be treated as a special case of double
end-point attachment, one in which every variant in every text begins
and ends at exactly the same point.

2. Single end-point attachment must be converted to double end-point
attachment before it is useful. This could prove difficult: software
would have to find the other end point by comparing the lemma with the
base text, scanning the text forward or back from the single declared
end point. Where the lemma abbreviates or otherwise alters the base
text (as in the example on p. 114 of the guidelines) this could fail.
Better to begin with double end-point attachment and have done with it.
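
The two points above can be sketched in Python. This is only an
illustration under stated assumptions: the data layout, the function
names segments_to_spans and find_start, and the sample sentence are all
hypothetical and do not come from the guidelines. Texts are modelled as
lists of words and end points as word indices.

```python
def segments_to_spans(segments):
    """Turn parallel segments into double end-point entries.

    Segment i is given the boundary points (i, i + 1), so every reading
    in every witness begins and ends at the same pair of points.
    """
    return [(i, i + 1, readings) for i, readings in enumerate(segments)]


def find_start(base_words, end_index, lemma_words):
    """Recover the missing start point for a single end-point entry.

    Scans back from the declared end point and accepts the start only
    if the lemma matches the base text word for word; otherwise None.
    """
    start = end_index - len(lemma_words) + 1
    if start < 0:
        return None
    if base_words[start:end_index + 1] == lemma_words:
        return start
    return None


# Hypothetical parallel-segmented text with two witnesses, A and C.
segments = [
    {"A": "The", "C": "The"},
    {"A": "quick", "C": "sleek"},
    {"A": "fox", "C": "fox"},
]
spans = segments_to_spans(segments)

# Hypothetical base text; the lemma is declared to end at word 3 ("fox").
base = "the quick brown fox jumps over the lazy dog".split()
exact = find_start(base, 3, "quick brown fox".split())
abbreviated = find_start(base, 3, "quick ... fox".split())
```

With a lemma quoted in full the start point is recovered (exact is 1),
but the abbreviated lemma "quick ... fox" no longer matches the base
text, so abbreviated is None and the span is lost.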

Double end-point attachment allows explicit and orderly treatment of
overlapping lemmata (which parallel segmentation does not). It is
unambiguous (which single end-point attachment is not). Nothing is lost
by concentrating on it, except the deficiencies of the other systems.
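
A minimal sketch of what "overlapping lemmata" means once both end
points are explicit. The Variant record, the witness sigla and the
readings are invented for illustration; end points are word indices
into a hypothetical base text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    start: int    # index of the first base word covered (inclusive)
    end: int      # index of the last base word covered (inclusive)
    witness: str
    reading: str


def overlaps_without_nesting(a, b):
    """True if the two lemmata share words but neither contains the other."""
    shares = a.start <= b.end and b.start <= a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return shares and not (a_in_b or b_in_a)


base = "the quick brown fox".split()
v1 = Variant(1, 2, "A", "swift")         # lemma: "quick brown"
v2 = Variant(2, 3, "B", "red dog")       # lemma: "brown fox"
v3 = Variant(0, 3, "C", "a slow hound")  # lemma: the whole line

crossing = overlaps_without_nesting(v1, v2)   # True
nested = overlaps_without_nesting(v1, v3)     # False
```

Here v1 and v2 cross at "brown": each can be stated with its own pair of
end points, but no single division of the base text into segments has
both "quick brown" and "brown fox" as segments.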

There is also the question of how we indicate the end points, and how
we indicate the link between the lemma (placed between the end points)
and the variant on the lemma (that is, on the text between the end
points). The guidelines use the "anchor" method: identifiers are placed
in the base text before and after each lemma (<anchor id=a1> etc.); at
the beginning of each variant entry in the apparatus the span of that
variant is stated (<app startpoint=a1 endpoint=a2> etc.).
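
The anchor method can be sketched as follows. The tag shape
<anchor id=a1> and the startpoint/endpoint pair are taken from the
description above; the sample sentence, the function names and the
regular expression are assumptions made for illustration, not part of
the guidelines.

```python
import re

# Matches the anchor form quoted above, e.g. <anchor id=a1>.
ANCHOR = re.compile(r"<anchor id=(\w+)>")


def index_anchors(marked_text):
    """Strip anchors; return (plain_text, {anchor id: offset in plain_text})."""
    pieces = []
    offsets = {}
    pos = 0       # read position in marked_text
    length = 0    # length of plain text emitted so far
    for match in ANCHOR.finditer(marked_text):
        chunk = marked_text[pos:match.start()]
        pieces.append(chunk)
        length += len(chunk)
        offsets[match.group(1)] = length
        pos = match.end()
    pieces.append(marked_text[pos:])
    return "".join(pieces), offsets


def lemma_for(plain, offsets, startpoint, endpoint):
    """Return the base text lying between two declared anchors."""
    return plain[offsets[startpoint]:offsets[endpoint]]


# Hypothetical base text with one lemma marked off by two anchors.
marked = "The <anchor id=a1>quick<anchor id=a2> brown fox"
plain, offsets = index_anchors(marked)
lemma = lemma_for(plain, offsets, "a1", "a2")
```

For this sample, plain is "The quick brown fox" and the entry
<app startpoint=a1 endpoint=a2> resolves to the lemma "quick" without
any comparison of strings.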

It seems a little odd to me that we mark explicitly the beginning and
end of the lemma in the base text, but we do not mark it explicitly in
the variant. Of course, when one is looking at the variant only within
the apparatus this does not matter: the whole variant is given, placed
beside the lemma, so the beginning and end of the variant declare
themselves. But one can imagine many circumstances where one is not
looking at the variant within the apparatus. For example, one might be
reading through the variant source itself, rather than just reading
bits of it decomposed through an apparatus. If one marked the beginning
and end of the variant text, as well as the lemma, those markers could
then be read back into the variant source, and could then be used to
"look up" the parallel text in the master, or in some other text.
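
A sketch of the "look up" this would allow, assuming both ends of the
variant are marked in the witness as well as both ends of the lemma in
the base. The Link record, the function name and the sample texts are
hypothetical; spans are word offsets with an exclusive end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    base_span: tuple      # (start, end) in the base text, end exclusive
    witness_span: tuple   # (start, end) in the witness, end exclusive


def parallel_in_base(links, base_words, witness_pos):
    """From a word position in the witness, fetch the parallel base text."""
    for link in links:
        w_start, w_end = link.witness_span
        if w_start <= witness_pos < w_end:
            b_start, b_end = link.base_span
            return base_words[b_start:b_end]
    return None   # no marked variant covers this position


base = "the quick brown fox".split()
witness = "the sleek fox".split()

# "sleek" in the witness stands against "quick brown" in the base.
links = [Link(base_span=(1, 3), witness_span=(1, 2))]

found = parallel_in_base(links, base, 1)     # reader is at "sleek"
nothing = parallel_in_base(links, base, 0)   # reader is at "the"
```

A reader working through the witness and arriving at "sleek" is taken
straight to "quick brown" in the base; with end points marked only in
the base text, the same lookup would have to start from the apparatus.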

It looks to me as if the method outlined in 6.2.5, "explicit alignment
of multiple analyses" (p. 142) would permit something just like this. At
the least, it would be inconsistent to adopt one method of indicating
anchors and links in critical apparatus and another method when dealing
with the very similar matter of alignment of multiple analyses.

Finally: I am suspicious of the system of "nesting" variants given in
the guidelines. For example: on p. 117 the apparatus states that
witnesses A and C both read "The quick". But they don't! C actually
reads "The sleek", as we learn three lines down. This looks a nonsense
to me. Either C reads "The quick" or it reads "The sleek". It cannot
read both, and the apparatus should not try to suggest that it does. I
cannot see any advantages in this, and I can see lots of possibilities
for confusion of both man and machine.
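
The confusion for the machine can be made concrete with a small check.
The entry layout and the function name are assumptions for
illustration, and the two entries are modelled only on the description
of p. 117 given above, not copied from the guidelines.

```python
from itertools import combinations


def doubly_attested(entries):
    """Find witnesses credited with two readings over the same words.

    Each entry is (start, end, reading, witnesses), with start and end
    as word offsets in the base text and end exclusive. A witness named
    in two entries whose spans overlap is reported.
    """
    clashes = set()
    for (s1, e1, _, w1), (s2, e2, _, w2) in combinations(entries, 2):
        if s1 < e2 and s2 < e1:          # the two spans share a word
            clashes |= set(w1) & set(w2)
    return sorted(clashes)


# Outer entry: A and C read "The quick"; nested entry: C reads "sleek".
nested_entries = [
    (0, 2, "The quick", {"A", "C"}),
    (1, 2, "sleek", {"C"}),
]

# The same information stated flat: one reading per witness.
flat_entries = [
    (0, 2, "The quick", {"A"}),
    (0, 2, "The sleek", {"C"}),
]

nested_clashes = doubly_attested(nested_entries)   # ['C']
flat_clashes = doubly_attested(flat_entries)       # []
```

Under nesting, C is reported as reading two different things over the
second word; stated flat, each witness has exactly one reading and
nothing is reported.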