[I forward these comments, unedited, from my colleague
Peter Robinson, at his request pending the ListServer's noticing
his request for a subscription to TEI-L
-- LB ]
Some thoughts on encoding of textual variation for TEI.
The guidelines propose four systems of encoding textual variation:
1. Parallel segmentation (5.10.3);
2. Single end-point attachment, attaching variants at the end of the
corresponding reading in the base text (5.10.4);
3. Single end-point attachment, attaching variants at the beginning of
the corresponding reading in the base text (5.10.4);
4. Double end-point attachment (5.10.5).
This looks to me to be three systems too many. Double end-point attachment
might be all we need:
1. Parallel segmentation can be treated as a special case of double
end-point attachment, one in which every variant in every text begins
and ends at exactly the same point.
2. Single end point attachment must be converted to double end
point attachment before it is useful. This could prove difficult:
software would have to find the other end point by comparing the lemma
with the base text, scanning the text forward or back from the single
declared end point. Where the lemma abbreviates or otherwise alters the
base text (as in the example on p. 114 of the guidelines) this could
fail. Better to begin with double end point attachment and have done.
Double end point attachment allows explicit and orderly treatment of
overlapping lemmata (which parallel segmentation does not). It is
unambiguous (which single end point attachment is not). Nothing is lost
by concentrating on it, except the deficiencies of the other systems.
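To make the conversion problem concrete, here is a minimal sketch in Python of what software would have to do with single end point attachment: given only the declared end point, recover the start by matching the lemma backward against the base text. The function name find_span, the sample sentence and the character offsets are illustrative assumptions, not taken from the guidelines.

```python
from typing import Optional, Tuple


def find_span(base: str, end: int, lemma: str) -> Optional[Tuple[int, int]]:
    """Recover (start, end) for a lemma whose end offset alone is declared.

    Hypothetical helper: it assumes the lemma quotes the base text
    verbatim, which is exactly the assumption that can fail.
    """
    start = end - len(lemma)
    if start >= 0 and base[start:end] == lemma:
        return (start, end)
    return None  # the lemma does not match the text before the end point


# Invented base text for illustration only.
base = "The quick brown fox jumps over the lazy dog"

# Verbatim lemma ending at offset 15 (just after "brown"): recoverable.
verbatim = find_span(base, 15, "quick brown")

# Abbreviated lemma ending at offset 19 (just after "fox"): not recoverable.
abbreviated = find_span(base, 19, "quick ... fox")

print(verbatim, abbreviated)
```

With a verbatim lemma the start point is found; with an abbreviated lemma the search returns nothing, and only an explicitly declared second end point would remove the guesswork.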
There is also the question of how we indicate the end points, and how
we indicate the link between the lemma (placed between the end points)
and the variant on the lemma (that is, on the text between the end
points). The guidelines use the "anchor" method: identifiers are placed
in the base text before and after each lemma (<anchor id=a1> etc); at
the beginning of each variant entry in the apparatus the span of that
variant is stated (<app startpoint=a1 endpoint=a2> etc).
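As a rough illustration of the anchor method, the following Python sketch models a base text with anchors between words and apparatus entries that name a start point and an end point. The class names Anchor and App, the function lemma_for, the sigla and the readings are all assumptions made for this example; only the startpoint and endpoint attribute names follow the guidelines' usage quoted above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Anchor:
    id: str  # corresponds to <anchor id=...> in the base text


@dataclass
class App:
    startpoint: str           # id of the anchor before the lemma
    endpoint: str             # id of the anchor after the lemma
    readings: Dict[str, str]  # witness siglum -> variant reading


def lemma_for(base: List[Union[str, Anchor]], app: App) -> str:
    """Return the words of the base text lying between the two anchors."""
    pos = {item.id: i for i, item in enumerate(base) if isinstance(item, Anchor)}
    start, end = pos[app.startpoint], pos[app.endpoint]
    # Skip any other anchors inside the span, so lemmata may overlap.
    return " ".join(t for t in base[start + 1:end] if isinstance(t, str))


# Invented base text with four anchors.
base = ["The", Anchor("a1"), "quick", Anchor("a2"),
        "brown", Anchor("a3"), "fox", Anchor("a4")]

app1 = App("a1", "a3", {"B": "sleek black"})
app2 = App("a2", "a4", {"C": "red vixen"})

print(lemma_for(base, app1))  # the span a1..a3
print(lemma_for(base, app2))  # the span a2..a4, overlapping the first
```

The two entries overlap on "brown" without any difficulty, because each entry names both of its own end points.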
It seems a little odd to me that we mark explicitly the beginning and
end of the lemma in the base text, but we do not mark it explicitly in
the variant. Of course, when one is looking at the variant only within
the apparatus this does not matter: the whole variant is given, placed
beside the lemma, so the beginning and end of the variant declare
themselves. But one can imagine many circumstances where one is not
looking at the variant within the apparatus. For example, one might be
reading through the variant source itself, rather than just reading
bits of it decomposed through an apparatus. If one marked the beginning
and end of the variant text, as well as the lemma, those markers could
then be read back into the variant source, and could then be used to
"look up" the parallel text in the master, or in some other text.
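A small Python sketch of this idea, under invented assumptions: anchors are written as tokens beginning with "#", both the base and the witness carry their own anchors, and a separate list of links pairs a span in one text with a span in another. The names texts, links, span_text and parallels are hypothetical, as are the sample readings.

```python
from typing import List, Tuple

Span = Tuple[str, str, str]  # (text name, start anchor id, end anchor id)

# Invented texts; tokens beginning with "#" are anchors, the rest are words.
texts = {
    "base": ["The", "#b1", "quick", "brown", "#b2", "fox"],
    "C":    ["The", "#c1", "sleek", "black", "#c2", "fox"],
}

# Each link pairs a span in one text with the parallel span in another.
links: List[Tuple[Span, Span]] = [
    (("base", "b1", "b2"), ("C", "c1", "c2")),
]


def span_text(text_name: str, start_id: str, end_id: str) -> str:
    """Return the words between two anchors of the named text."""
    tokens = texts[text_name]
    start = tokens.index("#" + start_id)
    end = tokens.index("#" + end_id)
    return " ".join(t for t in tokens[start + 1:end] if not t.startswith("#"))


def parallels(text_name: str, start_id: str, end_id: str) -> List[Tuple[str, str]]:
    """Look up, from either side of a link, the parallel text on the other."""
    here = (text_name, start_id, end_id)
    found = []
    for a, b in links:
        if a == here:
            found.append((b[0], span_text(*b)))
        elif b == here:
            found.append((a[0], span_text(*a)))
    return found


# Reading witness C, look up what the base has at the same place.
print(parallels("C", "c1", "c2"))
```

Because both ends of both spans are marked, the lookup works in either direction: from the base to the witness or from the witness back to the base.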
It looks to me as if the method outlined in 6.2.5, "explicit alignment
of multiple analyses" (p. 142) would permit something just like this. At
the least, it would be inconsistent to adopt one method of indicating
anchors and links in critical apparatus and another method when dealing
with the very similar matter of alignment of multiple analyses.
Finally: I am suspicious of the system of "nesting" variants given in
the guidelines. For example: on p. 117 the apparatus states that
witnesses A and C both read "The quick". But they don't! C actually
reads "The sleek", as we learn three lines down. This looks a nonsense
to me. Either C reads "The quick" or it reads "The sleek". It cannot
read both, and the apparatus should not try to suggest that it does. I
cannot see any advantages in this, and I can see lots of possibilities
for confusion of both man and machine.
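The kind of contradiction described here is easy for a machine to detect, as this Python sketch suggests. It is loosely modelled on the case as I have described it, not on the actual markup of p. 117: each entry gives a span of word positions, the reading for that span, and the witnesses credited with it, and readings are assumed to correspond word for word with their span. The names entries and conflicting_witnesses are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (first word position, reading as a list of words, witnesses credited with it)
Entry = Tuple[int, List[str], Set[str]]

# Outer entry: A and C are said to read "The quick".
# Nested entry: C is then said to read "sleek" in place of "quick".
entries: List[Entry] = [
    (0, ["The", "quick"], {"A", "C"}),
    (1, ["sleek"], {"C"}),
]


def conflicting_witnesses(entries: List[Entry]) -> List[str]:
    """Return witnesses credited with two different words at one position."""
    seen: Dict[Tuple[str, int], str] = {}  # (witness, position) -> word
    conflicts: Set[str] = set()
    for start, words, witnesses in entries:
        for witness in witnesses:
            for offset, word in enumerate(words):
                key = (witness, start + offset)
                if key in seen and seen[key] != word:
                    conflicts.add(witness)
                seen.setdefault(key, word)
    return sorted(conflicts)


print(conflicting_witnesses(entries))
```

Witness C is reported because it is credited with both "quick" and "sleek" at the same position; witness A, which appears only once, is not.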