to <expan> or not to <expan>

Sebastiaan Verweij-3
Dear all

A brief question: I’m considering not using the <expan> tags while transcribing a large body of seventeenth-century manuscripts. I note that the TEI P5 Guidelines give a range of examples for <ex>, which we will use, and it seems to be optional to surround the entire expanded word with <expan> tags (to mark its boundaries in some way). E.g.,

exa<ex>m</ex>ple
or
<expan>exa<ex>m</ex>ple</expan>

Our rationale is mainly to save time, so I was wondering if you have a view on this in terms of TEI practice. Is there a good reason to retain <expan> if this will not add any functionality to our project? Have you omitted these tags in the past and wished you hadn’t? Thanks so much.

Sebastiaan

Dr Sebastiaan Verweij
Lecturer in Late-Medieval and Early Modern English Literature
University of Bristol 
(+44) (0) 117 92 88090

Re: to <expan> or not to <expan>

James Cummings-4
Hi Sebastiaan,

I would always wrap it in <expan>. However, this is probably
scriptable in XSLT from what you have, so it doesn't necessarily
need to be done by hand by the encoders. Indeed, I think it would
be fairly straightforward to go from
exa<ex>m</ex>ple
to
<choice>
    <abbr>exa<am/>ple</abbr>
    <expan>exa<ex>m</ex>ple</expan>
</choice>
At least for orthographic words that are separated by whitespace or
punctuation and have no other markup in them. ;-)
-james

--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford
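
For anyone wanting to try the scripted route described above, here is a minimal sketch. It is written in Python rather than the XSLT mentioned in the post, it works on plain strings with a regular expression, and the function names (wrap_expan, to_choice) are made up for illustration. It assumes exactly the restricted case named above: words delimited by whitespace or punctuation, containing <ex> and no other markup, and not already wrapped in <expan>.

```python
import re

# A "word" is a run of word characters containing one or more <ex>...</ex>
# elements and no other markup (an assumption, as stated in the lead-in).
PLAIN = r"\w*"
EX = r"<ex>[^<>]*</ex>"
WORD_WITH_EX = re.compile(rf"{PLAIN}(?:{EX}{PLAIN})+")


def wrap_expan(text: str) -> str:
    """Wrap every word that contains <ex> in <expan>...</expan>."""
    return WORD_WITH_EX.sub(lambda m: f"<expan>{m.group(0)}</expan>", text)


def to_choice(text: str) -> str:
    """Build <choice> with <abbr> and <expan> for every word containing <ex>.

    The <abbr> gets an empty placeholder <am/> at the site of each <ex>,
    because the script cannot know what the abbreviation marker was.
    """
    def repl(m: re.Match) -> str:
        word = m.group(0)
        abbr = re.sub(EX, "<am/>", word)
        return f"<choice><abbr>{abbr}</abbr><expan>{word}</expan></choice>"

    return WORD_WITH_EX.sub(repl, text)


print(wrap_expan("an exa<ex>m</ex>ple, here"))
print(to_choice("exa<ex>m</ex>ple"))
```

The placeholder <am/> would still need to be filled in by hand. Running wrap_expan on text that already has <expan> would wrap it a second time, so this is only suitable as a one-off conversion step.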

Re: to <expan> or not to <expan>

Lou Burnard-6

What on earth is that empty <am/> doing in your example James?

If the source actually read exāple, the a-with-a-macron might be inside the <am>, I suppose. Or did you mean to put a floating macron in there?

Either way, I can hear Matthew Driscoll using the word "nutty".

Re: to <expan> or not to <expan>

James Cummings-4
Hi Lou,

You are, of course, correct. However, you can't automagically
determine what the abbreviation marker was by a script. (Well,
given a dictionary of abbreviations you might make a reasonable
_guess_.) So I was just putting it at the site of the expanded
text so that someone could then edit the <am/> to make it
kosher. I somehow think it is vaguely better to indicate in the
abbreviated form that 'this is the place where the word is
abbreviated', which is shown by that being where one expands it. I
was just suggesting that, with a text using only <ex> elements in
words, one could get most of the way to a detailed <choice>.
This falls down even more outside a Western European context, of
course, where the abbreviation marker may be completely separate
from the orthographic word, for all I know.

-James

--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford

Re: to <expan> or not to <expan>

Lou Burnard-6
Well yes indeed, <ex> and <am> really only work under some assumptions
which just don't hold true for many non-Western European scripts, and
even, I am sure, in some Western European cases. And looking again at
the original post, I wonder why it's necessary to mark up these
expansions at all. Why not just transcribe what is there, using the
appropriate Unicode character if there is one (which there is for 90%
of cases) and <g> for the others? Generating a modernized/expanded
version seems to be more of a formatting/rendering issue than an
editorial one in this case. And transcribing what is actually there
surely must be quicker than trying to recode it on the fly.

Re: to <expan> or not to <expan>

bertrand Gaiffe-2
In our project, we prefer to encode the expansion, because some
abbreviations are ambiguous; for instance, ᵱ may be "par" or "per".

Even though it is not kosher, we encode, for instance,
<ex>par</ex>agraph, so that we know there was an abbreviation and we
resolve it.

It would then be much easier to revert <ex>par</ex> and <ex>per</ex> to
ᵱ than it would be to resolve ᵱ as either par or per.


     Bertrand
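
The asymmetry described in this post can be sketched in a few lines of Python. This is only an illustration: the table holds just the two expansions named in the post, and the names ABBREV_SIGN, revert and candidates are hypothetical. Going from the expansion back to the sign is a plain lookup, while going from the sign to an expansion gives more than one candidate, so an editor has to decide.

```python
import re

# Expansion -> abbreviation sign, using only the example from the post.
ABBREV_SIGN = {"par": "ᵱ", "per": "ᵱ"}


def revert(text: str) -> str:
    """Replace <ex>par</ex> / <ex>per</ex> with the sign; leave other <ex> alone."""
    def repl(m: re.Match) -> str:
        return ABBREV_SIGN.get(m.group(1), m.group(0))

    return re.sub(r"<ex>([^<>]*)</ex>", repl, text)


def candidates(sign: str) -> list[str]:
    """All expansions a sign could stand for: not a single answer."""
    return sorted(exp for exp, s in ABBREV_SIGN.items() if s == sign)


print(revert("<ex>par</ex>agraph"))
print(candidates("ᵱ"))
```

The first call needs no judgement at all; the second returns both "par" and "per", which is why encoding the resolved form preserves information that the bare sign does not.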




to <expan> or not to <expan>

MLH
In reply to this post by Sebastiaan Verweij-3
Could anyone explain the rationale for combining <ex> and <expan> in this way? (for someone not as familiar with the transcription module as he probably should be!)
Thanks,
Matthew

Re: to <expan> or not to <expan>

Elli Mylonas
In reply to this post by Sebastiaan Verweij-3
A bit late to the party. Been trying to send this for the last day...

If you aren't using the <abbr> element to surround the part of the word that is visible in the document, then surrounding with <expan> is very useful for demarcating the extent of the abbreviated word. It also serves as a container for the <am> (abbreviation marker) element. 

It's possible to use a script as James has already described to put in the full encoding for the expan/abbr/ex cluster if words are unambiguous, or to fill out the tags and supplement ambiguities by hand.

It's easier for encoders to only use the <ex>, and perhaps should be part of encoding practice, but that shouldn't necessarily determine the resulting encoding. 

 --elli

[Elli Mylonas
 Senior Digital Humanities Librarian
 and
 Center for Digital Scholarship
 University Library
 Brown University
 library.brown.edu/cds]

Re: to <expan> or not to <expan>

Lou Burnard-6
In reply to this post by MLH
<expan> should contain the whole of a word that has been expanded. <expan>Mister</expan> when the source says "Mr."

<ex> should contain the bits of an expansion which are not present in the source text but have been added to it by an editor.
M<ex>iste</ex>r.

The truly obsessive will also wish to mark (using <am>) the full stop which signals that "Mr." is an abbreviation of course.

<expan>M<ex>iste</ex>r</expan><am>.</am>

Of course, it might be that (this is for Bertrand) you think "Mr." is actually an abbreviation for "Monseigneur", in which case you'd have

<expan>M<ex>onseigneu</ex>r</expan><am>.</am>

Or, if unable to decide, a <choice> containing both, and maybe also an <abbr> holding the original form just for fun. Though in that case, where to put the <am> becomes a bit trickier.
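
One practical consequence of the division of labour described above can be sketched in Python with the standard xml.etree.ElementTree module. The function names are hypothetical, and the sketch assumes un-namespaced elements; in a real TEI file the tag would carry the TEI namespace, i.e. "{http://www.tei-c.org/ns/1.0}ex". Given an <expan> that contains <ex>, both the expanded word and the letters actually present in the source can be derived from the same element.

```python
import xml.etree.ElementTree as ET


def expanded(elem: ET.Element) -> str:
    """The full expanded word: all text, including the editor's <ex> additions."""
    return "".join(elem.itertext())


def source_letters(elem: ET.Element) -> str:
    """Only the letters present in the source: everything except <ex> content."""
    if elem.tag == "ex":
        return ""
    out = elem.text or ""
    for child in elem:
        # The tail (text after the child's end tag) belongs to the parent word.
        out += source_letters(child) + (child.tail or "")
    return out


mister = ET.fromstring("<expan>M<ex>iste</ex>r</expan>")
print(expanded(mister), source_letters(mister))
```

The same two functions give "Monseigneur" and "Mr" for the alternative expansion, so the choice of expansion does not affect what is recovered as the source letters.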




to <expan> or not to <expan>

MLH

Thank you Lou for this clear explanation.

I am still missing a couple of things though (sorry):

- Is it recommended that <ex> should always be nested inside <expan>? (If so, this might be stated more explicitly in the Guidelines?)

- If it is optional to nest <ex> inside <expan>, what are the specific benefits one might realise from `<expan>ev<ex>er</ex>y</expan>` as opposed to `ev<ex>er</ex>y`?

I suppose the answer will depend on the project to some extent, but say, for the sake of argument, that it is a scholarly edition where it is sufficient to note the presence of an abbreviation that has been expanded, not the form of the original abbreviation.


Matthew




Re: to <expan> or not to <expan>

James Cummings-4
Hi Matthew,

1) No, it is not explicitly recommended that <ex> should always be
nested inside <expan>. The TEI allows people a lot of flexibility
in how they mark abbreviated words, expanded words, abbreviation
markers, and expanded text, because there are lots of different
needs, traditions, and editorial perspectives.

2) I personally would always recommend enclosing the markup
in <expan>, and indeed putting that inside a <choice> with an <abbr>
marking the abbreviated word, e.g. in its fullest form something
like:
<choice>
     <abbr>ev<am><g ref="#abbr-er">er</g></am>y</abbr>
     <expan>ev<ex>er</ex>y</expan>
</choice>
with the #abbr-er pointing to a <char> or <glyph> in the header.
My reason for wanting this is the future utility of the
(openly released) data for doing other things, and ease of processing.
While you can construct most of this programmatically from
ev<ex>er</ex>y (and I might have people on projects I'm doing
encode it as such initially), it is much easier to offer more to
readers when you have the choice between the <abbr> and <expan>, i.e.
you can toggle the abbreviated forms on and off and have the
abbreviation marking character there instead (depending on font,
etc.). Having at least the orthographic word delimited by
<expan>, to me, makes it so much easier to deal with, regardless
of the form of output, by grabbing the 'expan' in processing. One
can do this in post-processing, but that has its limits.

In your hypothetical case there is an editorial decision being
made: namely that you will show the expanded text but not the
form of the abbreviation. That limits, to some degree, the forms
of output and what can be done with the data at a later date.
That is fine, of course. It is better to have this hypothetical
project complete their materials than chase some perfection.

I would note that if you have: ev<ex>er</ex>y then
programmatically getting to <expan>ev<ex>er</ex>y</expan> is
fairly straightforward, as is getting to:
<choice>
     <abbr>evy</abbr>
     <expan>ev<ex>er</ex>y</expan>
</choice>

It is knowing what to put in the <am> if you wanted to mark that
which is more difficult. (But still solvable for some texts.)
I've been thinking about making a general abbreviation-dictionary
lookup conversion that would make best guesses for that kind of
thing in Latin and Middle English abbreviations.

-James

--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford
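
The toggling between abbreviated and expanded forms mentioned in the post above might look roughly like this in Python. It is a sketch under assumptions, not anyone's actual pipeline: the function name render is hypothetical, the elements are un-namespaced (real TEI tags carry the "{http://www.tei-c.org/ns/1.0}" prefix in ElementTree), and <g> is rendered by its text content rather than by looking up the glyph it points to.

```python
import xml.etree.ElementTree as ET


def render(elem: ET.Element, reading: str) -> str:
    """Flatten an element to text, taking either 'abbr' or 'expan' from each <choice>."""
    out = elem.text or ""
    for child in elem:
        if child.tag == "choice":
            # Keep only the requested alternative and skip its siblings.
            picked = child.find(reading)
            if picked is not None:
                out += render(picked, reading)
        else:
            out += render(child, reading)
        out += child.tail or ""
    return out


p = ET.fromstring(
    "<p>in <choice><abbr>evy</abbr>"
    "<expan>ev<ex>er</ex>y</expan></choice> case</p>"
)
print(render(p, "expan"))
print(render(p, "abbr"))
```

With only ev<ex>er</ex>y in the source, the expanded reading is still available, but the abbreviated one has to be reconstructed first, which is the limit of post-processing noted in the post.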

Re: to <expan> or not to <expan>

Sebastiaan Verweij-3
Dear all

Thanks for your answers and contributions to this very interesting discussion. I’ve always, so far, nested <ex> within <expan> but more recently started to wonder why, as did Matthew. We will go the way of James’s suggestion, of some post-processing to save on mark-up time, which is our biggest issue at the moment (isn’t it always!). We will also feature manuscript images for most, not all, of the texts, so the form of the abbreviation should be visible at least, if not hard-coded into the transcription.

Toggling between diplomatic, semi-diplomatic, and modern views (à la Folger’s EMMO, for instance) isn’t on our list of immediate desirables (once more because of time), though I have a lot of sympathy with James below, on catering for future use as much as possible.

Thanks for your thoughts! 

Sebastiaan
 

Re: to <expan> or not to <expan>

James Cummings-4
Hi Sebastiaan and TEI,

An interesting article that includes some discussion of the
subject by Matthew Driscoll is available at
http://www.driscoll.dk/docs/Abbreviations.pdf and says all sorts
of sensible things -- I generally agree with its point of view.

What hasn't really been mentioned is that there is at least one
important sub-community in the TEI which uses the markup for
abbreviations and expansions in a very different manner, namely
the EpiDoc community with its use of the <abbr> element. While I
disagree whole-heartedly with their recommendation of embedding
<abbr> *inside* <expan>, and think this is a perversion that
even I can't stomach, the great thing about the TEI Guidelines
is that they allow such a wide variety of practice (and
documentation of your particular bizarre kinks in the TEI ODD
customization file). See
http://www.stoa.org/epidoc/gl/latest/trans-abbrevfully.html for
their strange fetish. (Safe for work. ;-) ) And while I disagree
with it, I do support their right to do so in the flexible TEI
framework. Arguments like this do occasionally surface at the
meetings of the TEI Technical Council. (And as an elected
representative, I try to reduce such madness, but am often
out-voted. ;-) )

Best wishes,
-James

On 20/07/17 16:58, Sebastiaan Verweij wrote:

> Dear all
>
> Thanks for your answers and contributions to this v interesting
> discussion. I’ve always, so far, nested <ex> within <expan> but
> more recently started to wonder why, as did Matthew. We will go
> the way of James’s suggestion, of some post-processing to save
> on mark-up time, which is our biggest issue at the moment
> (isn’t it always!). We will also feature manuscript images for
> most, not all, of the texts, so the form of the abbreviation
> should be visible at least, if not hard coded into the
> transcription.
>
> Toggling between diplomatic, semi-diplomatic, modern (a la
> Folger’s EMMO, for instance) isn’t on our list of immediate
> desirables (once more because of time), though I have a lot of
> sympathy with James below, on catering for future use as much
> as possible.
>
> Thanks for your thoughts!
>
> Sebastiaan
>
>
> On 20 July 2017 at 10:51:25, James Cummings
> ([hidden email]
> <mailto:[hidden email]>) wrote:
>
>> Hi Matthew,
>>
>> 1) No, it is not explicitly recommended that <ex> should always be
>> nested inside <expan>. The TEI allows people a lot of flexibility
>> in how they mark abbreviated words, expanded words, abbreviation
>> markers, and expanded text because there are lots of different
>> needs, traditions, and editorial perspectives.
>>
>> 2) I personally would always recommend enclosing the markup
>> in <expan>, and indeed inside a <choice> with an <abbr>
>> marking the abbreviated word, e.g. in its fullest form something
>> like:
>> <choice>
>> <abbr>ev<am><g ref="#abbr-er">er</g></am>y</abbr>
>> <expan>ev<ex>er</ex>y</expan>
>> </choice>
>> with the #abbr-er pointing to a <char> or <glyph> in the header.
>> My reason for wanting this is one of future utility of the
>> (openly released) data to do other things and ease of processing.
>> While you can construct most of this programmatically from
>> ev<ex>er</ex>y (and I might have people on projects I'm doing
>> encode it as such initially), it is much easier to offer more to
>> readers having the choice between the <abbr> and <expan>. i.e.
>> you can toggle on and off the abbreviated forms and have the
>> abbreviation marking character there instead (depending on font,
>> etc.). Having at least the orthographic word delimited by
>> <expan>, to me, makes it so much easier to deal with regardless
>> of form of output by grabbing the 'expan' in processing. One
>> can do this in post-processing, but that has its limits.
>>
>> In your hypothetical case there is an editorial decision being
>> made: namely that you will show the expanded text but not the
>> form of the abbreviation. That limits, to some degree, the forms
>> of output and what can be done with the data at a later date.
>> That is fine, of course. It is better to have this hypothetical
>> project complete their materials than chase some perfection.
>>
>> I would note that if you have: ev<ex>er</ex>y then
>> programmatically getting to <expan>ev<ex>er</ex>y</expan> is
>> fairly straightforward as is getting to:
>> <choice>
>> <abbr>evy</abbr>
>> <expan>ev<ex>er</ex>y</expan>
>> </choice>
>>
>> It is knowing what to put in the <am> if you wanted to mark that
>> which is more difficult. (But still solvable for some texts.)
>> I've been thinking about making a general abbreviation-dictionary
>> lookup conversion that would make best guesses for that kind of
>> thing in Latin and Middle English abbreviations.
>>
>> -James
>>
>> On 20/07/17 09:26, MLH wrote:
>> >
>> > Thank you Lou for this clear explanation.
>> >
>> >
>> > I am still missing a couple of things though (sorry):
>> >
>> >
>> > -is it recommended that <ex> should always be nested inside
>> > <expan>? (If so, this might be stated more explicitly in the
>> > Guidelines?)
>> >
>> > -if it is optional to nest <ex> inside <expan>, what are the
>> > specific benefits one might realise from
>> > `<expan>ev<ex>er</ex>y</expan>` as opposed to `ev<ex>er</ex>y`?
>> >
>> >
>> > I suppose the answer will depend on the project to some extent,
>> > but say for argument that it is a scholarly edition where it is
>> > sufficient to note the presence of an abbreviation that has
>> > been expanded, not the form of the original abbreviation.
>> >
>> >
>> > Matthew
>> >
>> >
>> >
>> >
>> > --------------------------------------------------------------
>> > *From:* Lou Burnard <[hidden email]>
>> > *Sent:* 19 July 2017 18:06
>> > *To:* MLH; [hidden email]
>> > *Subject:* Re: to <expan> or not to <expan>
>> > <expan> should contain the whole of a word that has been
>> > expanded. <expan>Mister</expan> when the source says "Mr."
>> >
>> > <ex> should contain the bits of an expansion which are not
>> > present in the source text but have been added to it by an
>> > editor.
>> > M<ex>iste</ex>r.
>> >
>> > The truly obsessive will also wish to mark (using <am>) the
>> > full stop which signals that "Mr." is an abbreviation of
>> > course.
>> >
>> > <expan>M<ex>iste</ex>r</expan><am>.</am>
>> >
>> > Of course, it might be that (this is for Bertrand) you think
>> > "Mr." is actually an abbreviation for "Monseigneur", in which
>> > case you'd have
>> >
>> > <expan>M<ex>onseigneu</ex>r</expan><am>.</am>
>> >
>> > Or, if unable to decide, a <choice> containing both, and maybe
>> > also an <abbr> holding the original form just for fun. Though
>> > in that case, where to put the <am> becomes a bit trickier.
>> >
>> >
>> >
>> >
>> > On 19/07/17 17:42, MLH wrote:
>> >> Could anyone explain the rationale for combining <ex> and
>> >> <expan> in this way? (for someone not as familiar with the
>> >> transcription module as he probably should be!)
>> >> Thanks,
>> >> Matthew
>> >>
>> >
>>
>> --
>> Dr James Cummings, [hidden email]
>> Academic IT Services, University of Oxford


--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford
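
The post-processing step discussed in this thread (wrapping each word that contains <ex> in <expan>, and optionally building a <choice> with a derived <abbr>) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anyone's project code: the names ABBREVIATED_WORD, wrap_expan and to_choice are made up here, and the sketch assumes, as the thread's own caveat does, orthographic words separated by whitespace or punctuation, with no other markup inside them and no attributes on <ex>. The derived <abbr> simply drops the <ex> content and makes no attempt to supply an <am>.

```python
import re

# One orthographic word that contains at least one <ex>...</ex> and no other
# markup. The two lookbehinds make sure the match starts at a word boundary
# and skip words that are already wrapped in <expan>.
ABBREVIATED_WORD = re.compile(r"(?<!\w)(?<!<expan>)(?:\w*<ex>\w+</ex>)+\w*")


def wrap_expan(text: str) -> str:
    """Wrap every word containing <ex> in <expan>...</expan>."""
    return ABBREVIATED_WORD.sub(lambda m: f"<expan>{m.group(0)}</expan>", text)


def to_choice(text: str) -> str:
    """Replace every word containing <ex> with a <choice> of <abbr> and <expan>."""

    def build(match: re.Match) -> str:
        expanded = match.group(0)
        # The abbreviated form is the word with the editorial additions removed.
        abbreviated = re.sub(r"<ex>\w+</ex>", "", expanded)
        return (
            f"<choice><abbr>{abbreviated}</abbr>"
            f"<expan>{expanded}</expan></choice>"
        )

    return ABBREVIATED_WORD.sub(build, text)


if __name__ == "__main__":
    print(wrap_expan("an exa<ex>m</ex>ple of ev<ex>er</ex>y kind"))
    print(to_choice("ev<ex>er</ex>y"))
```

Running wrap_expan a second time on its own output leaves the text unchanged, so the step can be repeated safely. For real data, an XSLT or parser-based transformation would probably be more robust than regular expressions over serialized XML, particularly once words carry other markup.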