Encoding examples in Lily's grammar

Encoding examples in Lily's grammar

Martin Mueller

A very talented Latin student of mine is transcribing and encoding the 1543 first English version of Lily’s Latin grammar, using the digital facsimile of one of the treasures of the Rare Book Library of the University of Illinois. There are several encoding questions on which I’d appreciate advice.

 

There are lots of inline lists in the text, as in

 

<p>Aduerbes some be of tyme, as<hi rend="Antiqua">Hodie, cras, olim, aliquando</hi>.</p>

 

One could leave the encoding at that level, but if you use <list> you do a better job of maximizing what I like to call the “query potential of the digital surrogate”: you can easily extract text used as an example, which in turn makes it easier to look for traces of these examples in the Early Modern corpus. What do you do with the comma connectors of such lists, not to speak of the final “, and” connector? In principle it should be possible to capture that as a @rend attribute. You could also fudge things, leave the trailing commas with the previous item, and add “, and” to the last item. I don’t like that. Is there a standard practice for this?

 

Then there are bilingual examples. Consider the following paragraph, which could be encoded roughly as

 

<p rend="Fraktur">The accusatyue case foloweth the verbe, and aunswereth to this question, whom, or what, as
<hi rend="Antiqua">Amo magistrum</hi>, I loue the maister
</p>

 

But that obscures the logical structure of the paragraph, which consists of an explanation followed by an example, which is subdivided into a Latin text and its translation.

 

<p>The accusatyue case foloweth the verbe, and aunswereth to this question, whom, or what, as
<q type="example">
<q xml:lang="la">Amo magistrum</q>,
<q xml:lang="en">I loue the maister</q>
</q>
</p>

 

That’s a lot of elements for a short paragraph, but I can’t think of a more economical way of expressing this structure. Is there one?

 

 


Re: Encoding examples in Lily's grammar

Peter Flynn-8
On 01/08/2020 23:21, Martin Mueller wrote:

> A very talented Latin student of mine is transcribing and encoding the
> 1543 first English version of Lily’s Latin grammar, using the digital
> facsimile of one of the treasures of the Rare Book Library of the
> University of Illinois. There are several encoding questions on which
> I’d appreciate advice.
>
> There are lots of inline lists in the text, as in
>
> <p>Aduerbes some be of tyme, as<hi rend="Antiqua">Hodie, cras, olim,
> aliquando</hi>.</p>

(Should be a space before the <hi ?)

> One could leave the encoding at that level, but if you use <list> you do
> a better job of maximizing what I like to call the “query potential of
> the digital surrogate”:

Sounds like a quotable term :-)

> What do you do with the comma connectors of
> such lists, not to speak of the final “, and” connector?

How consistent is the document in the way these lists are printed?

> In principle it should be possible to capture that as a @rend
> attribute.
Provided YOU are consistent in your markup (regardless of how
inconsistent the original is) AND you document what you have done in
your header, you can choose whatever way is most convenient, I think.

> You could also fudge things, leave the trailing commas with the
> previous item, and add “, and” to the last item. I don’t like that. Is
> there a standard practice for this?

Leaving the commas inside each list item is doubleplusungood: it messes
up extracts, because the commas are an artifact of formatting (the
Latin for today is "hodie", not "hodie,").

> Then there are bilingual examples. Consider the following paragraph,
> which could be encoded roughly as
>
> <p rend="Fraktur">The accusatyue case foloweth the verbe, and aunswereth
> to this question, whom, or what, as
> <hi rend="Antiqua">Amo magistrum</hi>, I loue the maister
> </p>
>
> But that obscures the logical structure of the paragraph, which consists
> of an explanation followed by an example, which is subdivided into a
> Latin text and its translation.
>
> <p>The accusatyue case foloweth the verbe, and aunswereth to this
> question, whom, or what, as
> <q type="example">
> <q xml:lang="la">Amo magistrum</q>,
> <q xml:lang="en">I loue the maister</q>
> </q>
> </p>
>
> That’s a lot of elements for a short paragraph, but I can’t think of a
> more economical way of expressing this structure. Is there one?

I haven't looked yet (need more coffee), but I would have done something
similar to what you have done, but only for the Latin, assuming the base
language is already set as Englysshe:

<p rend="Fraktur">The accusatyue case foloweth the verbe, and aunswereth
to this question, whom, or what, as <hi rend="Antiqua" xml:lang="la">Amo
magistrum</hi>, I loue the maister</p>


<rant>
If there is one thing that 30 years of dealing with the CELT texts has
taught me, it's that analytic markup of DH texts has no boundaries in
respect of depth. This is particularly true when a text (eventually)
contains additional markup inserted by generations of scholars for
placenames, personal names (often embedded inside one another), dates,
events, objects, parts of speech, editorial emendations, scribal
oddities, hands, lemmata, etc. I think the deepest any of their
documents currently goes is 19 levels, and you know what? It doesn't
matter: XML software will happily elide or act upon whatever markup you
have. We're building the bedrock upon which future scholarship can base
its activities, so I wouldn't let depth be a concern. If it worries
individuals, there are some excellent training courses in the use of
markup and XML tools :-)
</rant>
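
To make the point about depth concrete, here is a minimal Python sketch using only the standard library. The nested fragment is invented for illustration (it is not taken from the CELT texts or from the Lily transcription), the TEI namespace is left out for brevity, and `depth` and `plain_text` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Invented fragment: the same example sentence wrapped in several layers
# of analytic markup. Element names are TEI-like; the nesting is made up.
SAMPLE = (
    '<p>as <eg><foreign><s><w>Amo</w> <w>magistrum</w></s></foreign></eg>'
    ', I loue the maister</p>'
)


def depth(element):
    """Return the nesting depth of an element, counting the element itself."""
    return 1 + max((depth(child) for child in element), default=0)


def plain_text(element):
    """Return the text content with every layer of markup elided."""
    return "".join(element.itertext())


root = ET.fromstring(SAMPLE)
print(depth(root))
print(plain_text(root))
```

However many layers are added, `plain_text` still returns the running text, while a query that cares about one layer (say, every `w`) can ignore the rest.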

Peter

Re: Encoding examples in Lily's grammar

lou
In reply to this post by Martin Mueller
As usual, the answer to these questions is "it depends on your project goals". You can't safely assume (particularly in a scholarly edition of a single text like this) that no-one is ever going to be interested in the punctuation, so it can't safely be algorithmized out of existence. Did Lily use the Oxford comma? I can see at least one PhD thesis in the making. I think I would mark the list of examples as an example, with individual words within it tagged:

<p>Aduerbes some be of tyme, as <eg rend="antiqua"><term>Hodie</term>, <term>cras</term>,
<term>olim</term>, <term>aliquando</term>.</eg></p>
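
As a hedged illustration of why this shape works for both camps, here is a minimal Python sketch using only the standard library. It assumes the markup above with the TEI namespace omitted for brevity; `extract_terms` and `connectors` are hypothetical helper names, not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# The paragraph as encoded above (namespace declaration omitted).
SAMPLE = (
    '<p>Aduerbes some be of tyme, as <eg rend="antiqua">'
    '<term>Hodie</term>, <term>cras</term>, '
    '<term>olim</term>, <term>aliquando</term>.</eg></p>'
)


def extract_terms(xml_text):
    """Return the example words only, free of commas and full stops."""
    root = ET.fromstring(xml_text)
    return [term.text for term in root.findall(".//eg/term")]


def connectors(xml_text):
    """Return the punctuation that follows each term, exactly as transcribed."""
    root = ET.fromstring(xml_text)
    return [term.tail for term in root.findall(".//eg/term")]


print(extract_terms(SAMPLE))
print(connectors(SAMPLE))
```

Because the commas sit between the `<term>` elements rather than inside them, the word list comes out clean, and the punctuation is still in the transcription for anyone who wants to study it.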

Likewise, even if the variation in type-style is systematic (e.g. Latin is always in italic), you might well choose to signal the two properties separately, e.g. <foreign xml:lang="la" rend="it">, precisely so as to test or demonstrate that systematicity. As regards the examples, I think I would choose something other than <q> (which is a bit underspecified), such as <eg>. To show that the one example sentence is a translation of the other, you have the oft misused @corresp attribute, though this comes with the overhead of requiring an identifier on each item.

<p>The accusatyue case foloweth the verbe, and aunswereth to this question, whom, or what, as
<eg xml:lang="la" xml:id="LA01" corresp="#EN01" rend="antiqua">Amo magistrum</eg>,
<eg xml:lang="en" xml:id="EN01" corresp="#LA01" rend="antiqua">I loue the maister</eg>
</p>
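
The payoff of the identifiers can be sketched in a few lines of Python (standard library only). This assumes each @corresp holds a single `#id` pointer, as in the markup above, although the attribute may in general hold several; the @rend attributes and the TEI namespace are left out for brevity, and `paired_examples` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the built-in xml: prefix to this namespace URI.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

SAMPLE = (
    '<p>The accusatyue case foloweth the verbe, and aunswereth to this '
    'question, whom, or what, as '
    '<eg xml:lang="la" xml:id="LA01" corresp="#EN01">Amo magistrum</eg>, '
    '<eg xml:lang="en" xml:id="EN01" corresp="#LA01">I loue the maister</eg>'
    '</p>'
)


def paired_examples(xml_text, source_lang="la"):
    """Pair each source-language <eg> with the <eg> its @corresp points to."""
    root = ET.fromstring(xml_text)
    by_id = {eg.get(XML_NS + "id"): eg for eg in root.iter("eg")}
    pairs = []
    for eg in root.iter("eg"):
        if eg.get(XML_NS + "lang") != source_lang:
            continue
        # Assumption: a single pointer of the form "#ID".
        target = by_id.get(eg.get("corresp", "").lstrip("#"))
        if target is not None:
            pairs.append((eg.text, target.text))
    return pairs


print(paired_examples(SAMPLE))
```

Run over a whole grammar, the same lookup would yield a Latin-to-English table of every example sentence, at the cost of one identifier per item.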

If using <eg> here worries you, you could always fall back on <seg type="eg"> I suppose.
