quotation across verse lines


quotation across verse lines

Stephen Powell
Hello All,

We are marking up an epic verse poem and have run across a problem with quoted material in the transcription. We have marked individual lines with <l> and are attempting to mark quotations as <q>, but when we try to mark the quote across multiple lines, for example:

<l><q>ore en penst Damede qui soffri passion,</l>
<l>que que soit del parfaire le mur comencerom</q></l>

the validator rejects the code. Is there a workaround that allows quoted material to extend across multiple verse lines without having to close the <q> tag on every verse line? Any suggestions for efficiently marking quotations would be greatly appreciated.

Thanks,

Stephen

Re: quotation across verse lines

Paul Schaffner
Classic overlapping hierarchies problem.
Your only solutions are, I think:

(1)

<q>
   <l>ore en penst Damede qui soffri passion,</l>
   <l>que que soit del parfaire le mur comencerom</l>
</q>

(and why not?)

(2)

<l><q>ore en penst Damede qui soffri passion,</q></l>
<l><q>que que soit del parfaire le mur comencerom</q></l>

(3)

standoff markup

(4)

milestones or anchors marking beginning and end of quoted
bits.

pfs




--
Paul Schaffner  Digital Content & Collections
University of Michigan Libraries
[hidden email] | http://www.umich.edu/~pfs/

Re: quotation across verse lines

Elisa Beshero-Bondar-2
In reply to this post by Stephen Powell
Dear Stephen,
You have run into a classic case of overlapping hierarchy, and one we like to model for our coding students as a Challenge. For this Wendell Piez created a form of code called LMNL and others philosophize about the problems of hierarchy generally in describing the inevitable interweaving and overlapping of semantic structures. What to do? Paul’s solutions make sense, and I’ve tended to the self-closing “milestone-style” solutions myself: keeping the <l> structure intact for TEI verse lines, but rendering that which signals the start and end of quotations as a self-closing milestone-style element. For examples, look up TEI milestone. You may want to adapt the <q> element to behave in a new way using a TEI ODD if you work with ODD-generated schemas.

Cheers,
Elisa
-- 
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org







Re: quotation across verse lines

Syd Bauman-10
In reply to this post by Paul Schaffner
Stephen --

Paul and Elisa have both pointed out that you have a classic overlap
problem and given some thoughts about it. I'll add 3 things:
 I. Answering Paul's "why not?"
 II. How we do this at WWP
 III. A thought on <q>


I.
--
> (1)
> <q>
>    <l>ore en penst Damede qui soffri passion,</l>
>    <l>que que soit del parfaire le mur comencerom</l>
> </q>
> (and why not?)

This is not a crazy idea at all, but IMHO it falls short for three
reasons:

 a) Because <q> is not a valid child of <lg>;

 b) because it does not handle the similar case in which the
    quotation inconveniently starts in the middle, not the beginning,
    of a metrical line; and

 c) because a poem is not a sequence of quotations or other features
    which happen to be metrical; a poem is a sequence of metrical
    lines.

II.
---
We use Paul's second suggestion, but explicitly indicate which of the
disconnected elements should be considered a single unit using the
TEI's @next and @prev mechanism.[1] E.g.

| <lg n="XIX." type="quatrain">
|   <head>XIX.</head>
|   <l>Were these, <said xml:id="Q5" next="#Q6">dear child of all my tenderest care,</said></l>
|   <l><said xml:id="Q6" next="#Q7" prev="#Q5">Transfer that duteous love to me you pay'd,</said></l>
|   <l><said xml:id="Q7" prev="#Q6">To thy dear sire;—live but for him,</said> and died;—</l>
|   <l>Say blessed spirit, have I disobey'd?</l>
| </lg>

As you can see, one advantage of this solution is that it scales
well: it works when only partial lines are involved, or there is
narrative text between two chunks of direct speech or quotation.

But yes, it has the disadvantage that you end up having a <said> (or
<q> or <quote>) child for each involved metrical line.
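
A minimal sketch of how a processor might reassemble fragments linked this way. It assumes Python's standard xml.etree.ElementTree and a sample without the TEI namespace (a real TEI file would need the namespace added to the tag names); the function name `reconstitute` is hypothetical and not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<lg>
  <l>Were these, <said xml:id="Q5" next="#Q6">dear child of all my tenderest care,</said></l>
  <l><said xml:id="Q6" next="#Q7" prev="#Q5">Transfer that duteous love to me you pay'd,</said></l>
  <l><said xml:id="Q7" prev="#Q6">To thy dear sire;—live but for him,</said> and died;—</l>
  <l>Say blessed spirit, have I disobey'd?</l>
</lg>"""

def reconstitute(root, tag="said"):
    """Return one joined string per chain of @next-linked fragments."""
    by_id = {el.get(XML_ID): el for el in root.iter(tag) if el.get(XML_ID)}
    passages = []
    for el in root.iter(tag):
        if el.get("prev"):              # not the head of a chain, skip
            continue
        parts, seen, cur = [], set(), el
        while cur is not None and id(cur) not in seen:   # guard against cycles
            seen.add(id(cur))
            parts.append("".join(cur.itertext()).strip())
            nxt = cur.get("next")
            cur = by_id.get(nxt.lstrip("#")) if nxt else None
        passages.append(" ".join(parts))
    return passages

passages = reconstitute(ET.fromstring(SAMPLE))
```

Here the three <said> fragments come back as a single passage, while the narrative text outside them ("Were these," and "and died;") is left out.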

III.
----
I'm curious about your choice of <q> over <said> or <quote>. If these
are quotations (as in, short excerpts from a different work), why not
use <quote> to explicitly indicate that? Or, if these are passages of
direct speech, why not use <said> to explicitly indicate that? (I
think of <q> as something like "there are quotation marks here in the
source, but I am not going to tell you why they are here, either
because I can't be bothered to figure it out, I can't afford to
figure it out, or because I am deliberately trying to stay agnostic
about such things".)

Notes
-----
 [1] See http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHVE

Re: quotation across verse lines

Martin Holmes
In reply to this post by Paul Schaffner
To elaborate on this a bit:

> <l><q>ore en penst Damede qui soffri passion,</q></l>
> <l><q>que que soit del parfaire le mur comencerom</q></l>

I would do:

<l><q xml:id="q1" next="#q2">ore en penst Damede qui soffri passion,</q></l>
<l><q xml:id="q2" prev="#q1">que que soit del parfaire le mur
comencerom</q></l>

assuming they're part of the same uninterrupted quotation.

Technically you only actually need one of @next or @prev for any
processing purposes, but having both is clearer.
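
As a rough illustration of why one pointer is enough, the sketch below derives the missing @prev values from @next. It assumes Python's standard xml.etree.ElementTree and a namespace-free sample; `add_prev_from_next` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Only @next is encoded here; @prev is left for the code to fill in.
SAMPLE = """<lg>
  <l><q xml:id="q1" next="#q2">ore en penst Damede qui soffri passion,</q></l>
  <l><q xml:id="q2">que que soit del parfaire le mur comencerom</q></l>
</lg>"""

def add_prev_from_next(root, tag="q"):
    """Add a @prev pointer to every element that some @next points at."""
    by_id = {el.get(XML_ID): el for el in root.iter(tag) if el.get(XML_ID)}
    for el in root.iter(tag):
        nxt = el.get("next")
        target = by_id.get(nxt.lstrip("#")) if nxt else None
        # Only fill in @prev when it is absent and the source has an id.
        if target is not None and el.get(XML_ID) and not target.get("prev"):
            target.set("prev", "#" + el.get(XML_ID))
    return root

root = add_prev_from_next(ET.fromstring(SAMPLE))
q2 = root.findall(".//q")[1]
```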

Cheers,
Martin



Re: quotation across verse lines

John P. McCaskey-2
In reply to this post by Paul Schaffner
On 10/4/2017 5:22 PM, Paul Schaffner wrote:

> Classic overlapping hierarchies problem.
> Your only solutions are, I think:
>
> (1)
>
> <q>
>    <l>ore en penst Damede qui soffri passion,</l>
>    <l>que que soit del parfaire le mur comencerom</l>
> </q>
>
> (and why not?)

Sometimes the quotations begin or end mid-line.

> (2)
>
> <l><q>ore en penst Damede qui soffri passion,</q></l>
> <l><q>que que soit del parfaire le mur comencerom</q></l>

What about (2) but adding @type="continuing" to the follow-on quote tags?

(. . . or some new attribute if that’s a misuse of @type, or use <quote> which does not already have recommended values for @type)

This would be lightweight (particularly if multi-line quotes are not very common), easily distinguishes multi-line quotes from single-line ones, readily handles a begin or end mid-line, and styles much more easily than, say, milestones.

-- John


Re: quotation across verse lines

Elisa Beshero-Bondar-2
Well, I'm with Syd in liking <said> better than <q>, but I still prefer milestone-style self-closing elements with signaling attributes (telling us the start and the end). In my work these "style" just as well as elements containing text nodes, since one processes them on the preceding:: and following:: XPath axes. And they don't require so many tags, just information in attributes. When a quoted passage spans several lines, I am not sure I need a signal of it in each line when I can effectively span to it as far away in the poem as it may be.  Here’s an example of what I’m talking about—noting how a quoted passage might extend (at one point) over a couple of stanzas:

<lg>
  <l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
  <l xml:id="n02">I said, and took him by the arm,</l>
  <l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
  <l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>
<lg>
  <l xml:id="n05">In careless mood he looked at me,</l>
  <l xml:id="n06">While still I held him by the arm,</l>
  <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
  <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>
<lg>
  <l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
  <l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
  <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
  <l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange, said I;</l>
</lg>
<lg>
  <l xml:id="n13">For, here are woods, hills smooth and warm:</l>
  <l xml:id="n14">There surely must some reason be</l>
  <l xml:id="n15">Why you would change sweet Liswyn farm</l>
  <l xml:id="n16">For Kilve by the green sea.<said xml:id="s12" prev="#s11"/></l>
</lg>
<lg>
  <l xml:id="n17">At this, my boy hung down his head,</l>
  <l xml:id="n18">He blushed with shame, nor made reply;</l>
  <l xml:id="n19">And three times to the child I said,</l>
  <l xml:id="n20"><said who="I-speaker" xml:id="s13" next="#s14"/>Why, Edward, tell me why?<said xml:id="s14" prev="#s13"/></l>
</lg>

Increasingly in my markup, I resist disturbances to simple hierarchies: What if I want all of my text nodes to be on the same hierarchical level? For some of my processing that’s a practical matter—I want only to be making unitary chunks out of lines and line-groups and I might want to be storing other kinds of information floating **around** my text, rather than shifting it to a different level—**especially** in cases where I am attempting to respect an overlapping hierarchy.

So, there is defense of the milestone style, incorporating `<said/>` into the mix.

Cheers!
Elisa

Typeset by hand on my iPad


Re: quotation across verse lines

Elisa Beshero-Bondar-2
Ahh! Once more—I missed a couple of tags in that last post, so I’ve corrected them here. What I like about this is it permits me to locate, count, and process in a connected way the words in my text nodes that fall between the <said/> milestones. This is easily XPath-able and smooth sailing without requiring said elements inside every line. I can use these to count the number of speech acts, as well.

<lg>
  <l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
  <l xml:id="n02">I said, and took him by the arm,</l>
  <l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
  <l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>
<lg>
  <l xml:id="n05">In careless mood he looked at me,</l>
  <l xml:id="n06">While still I held him by the arm,</l>
  <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
  <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>
<lg>
  <l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
  <l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
  <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
  <l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange,<said xml:id="s12" prev="#s11" ana="interrupt"/> said I;</l>
</lg>
<lg>
  <l xml:id="n13"><said who="I-speaker" xml:id="s13" next="#s14" ana="resume"/>For, here are woods, hills smooth and warm:</l>
  <l xml:id="n14">There surely must some reason be</l>
  <l xml:id="n15">Why you would change sweet Liswyn farm</l>
  <l xml:id="n16">For Kilve by the green sea.<said xml:id="s14" prev="#s13"/></l>
</lg>
<lg>
  <l xml:id="n17">At this, my boy hung down his head,</l>
  <l xml:id="n18">He blushed with shame, nor made reply;</l>
  <l xml:id="n19">And three times to the child I said,</l>
  <l xml:id="n20"><said who="I-speaker" xml:id="s15" next="#s16"/>Why, Edward, tell me why?<said xml:id="s16" prev="#s15"/></l>
</lg>
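
A rough sketch of the "text between the delimiters" processing, written in Python with the standard xml.etree.ElementTree rather than the XPath axes mentioned above (ElementTree does not support preceding:: or following::). It assumes the convention of this example, in which an opening empty <said/> points at its closing partner with @next, and a namespace-free sample; `walk` and `speeches` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<lg>
  <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
  <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
  <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
</lg>"""

def walk(el):
    """Yield elements and text chunks in document order."""
    yield ("elem", el)
    if el.text:
        yield ("text", el.text)
    for child in el:
        yield from walk(child)
        if child.tail:
            yield ("text", child.tail)

def speeches(root, tag="said"):
    """Collect the text lying between each start delimiter and its @next target."""
    waiting_for, buf, out = None, [], []
    for kind, val in walk(root):
        if kind == "elem" and val.tag == tag:
            if waiting_for is None and val.get("next"):
                waiting_for, buf = val.get("next").lstrip("#"), []
            elif waiting_for is not None and val.get(XML_ID) == waiting_for:
                out.append(" ".join("".join(buf).split()))  # normalize whitespace
                waiting_for = None
        elif kind == "text" and waiting_for is not None:
            buf.append(val)
    return out

found = speeches(ET.fromstring(SAMPLE))
```

The length of `found` gives a count of delimited spans; counting speech acts proper would additionally need the interrupt and resume pairs to be merged.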

Yours in milestones,
Elisa







Re: quotation across verse lines

John P. McCaskey-2
In reply to this post by Elisa Beshero-Bondar-2

Elisa wrote:

> I still prefer milestone-style self-closing elements with signaling attributes (telling us the start and the end). In my work these "style" just as well as elements containing text nodes, since one processes them on the preceding:: and following:: XPath axes.

Styling quotes with typed quote tags can be done with plain CSS.

q                  {quotes: "\201C" "\201D"}
q[type="begin"]    {quotes: "\201C" ""}
q[type="continue"] {quotes: "" ""}
q[type="end"]      {quotes: "" "\201D"}

A fiddle is here: https://jsfiddle.net/mccaskey/52yq418p/

I’d avoid making this project team bring in XPath and XSLT expertise and adding an extra step to their workflow if it’s only to handle these few cases of overlapping hierarchy in verse lines.

Also, the coders could include both typed quote tags and milestone tags. It would be redundant but provide the best of both. The team could style using CSS and down the line someone doing textual analysis could use the milestone tags.

John




Re: quotation across verse lines

Syd Bauman-10
In reply to this post by Elisa Beshero-Bondar-2
This encoding is not conformant TEI (which is not really a problem),
and for those who are used to TEI will be very confusing (which does
strike me as problematic). Generic TEI software (e.g. TAPAS) will
have no idea how to process this.

In 2007 the TEI (very unwisely, IMHO) specifically eschewed the
method of encoding here (which is affectionately called HORSE) in
favor of using the <anchor> element. (Except in certain special cases
where there is a pre-existing *Span element (e.g., <addSpan>) which
would replace the first <anchor>.) See
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHBM

This encoding uses empty <said> elements to mark the boundaries of
what is direct speech, and uses @next (and @prev) to point from the
beginning boundary to the end (and back). That is, it uses attributes
intended for reconstitution of fragmented elements for indicating
start- and end- boundaries of elements. I see these as major
violations of the TEI abstract model, and as constructs that would
severely harm interchangeability.

In TEI the <said> element is intended to *contain* the passage of
direct speech. Using it as a segment boundary delimiter instead would
confuse the BLEEP out of any software written to follow the
Guidelines, *especially* when there is no indication (other than it
being empty) that it is not being used in the standard way. And using
@next to perform the function of @spanTo (i.e., to say "consider the
span of this element to be from here to there" rather than "consider
the span of this element both the content of this element and the
content of the one(s) there") I think will likewise wreak havoc.

But I'd like to reiterate that I'm not suggesting this is a bad
representation of the underlying structure -- given the limits of
XML, I think it is a good representation. But it is a bad use of TEI
for that representation.

It is also worth correcting the nomenclature. This use of <said> is
*not* as milestone elements, it is as empty elements. (Or, as empty
segment-boundary delimiters if you want to be precise.) A milestone,
although also empty, is something else -- the stuff being divided up
by milestones tessellates an ancestor. (E.g., every character (black
mark on the page, not speaker) in a book is on one and only one page
-- thus pages tessellate the book, and <pb> is a milestone that marks
page beginnings. It is not the case that every character in a book is
part of one and only one passage of direct speech.)

Lastly the values of @who and @ana are pointers, so I'm guessing that
you want "#I-speaker" and "#interrupt", etc.
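
A small heuristic check for such values can be sketched as follows, assuming Python's standard xml.etree.ElementTree and a namespace-free sample. It flags any token in @who, @ana, @next or @prev that neither starts with "#" nor contains a colon (as a full URI or a prefixed pointer would); a relative reference such as "people.xml#Edward" would be flagged wrongly, so this is only a rough lint. `bare_pointer_values` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Attributes treated as pointers in this sketch.
POINTER_ATTRS = ("who", "ana", "next", "prev")

SAMPLE = """<l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>"""

def bare_pointer_values(root):
    """List (element, attribute, value) triples that do not look like pointers."""
    problems = []
    for el in root.iter():
        for name in POINTER_ATTRS:
            # Pointer attributes may hold several whitespace-separated values.
            for token in (el.get(name) or "").split():
                if not token.startswith("#") and ":" not in token:
                    problems.append((el.tag, name, token))
    return problems

problems = bare_pointer_values(ET.fromstring(SAMPLE))
```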

P.S. One of the reasons that the mechanism TEI suggests for empty
boundary delimiters (using <anchor>) is so lame is because
<anchor> does not have the attributes needed to properly encode the
information available on the standard element (in this case <said>).
E.g., there's no @who on <anchor>. Another reason is that the TEI
gives no rule for how to indicate the semantics of that which is
being delimited -- the Guidelines just say to use @subtype for this
purpose, but don't say how. The TEI mechanism for empty boundary
delimiters being so impoverished is why I prefer reconstitution of
fragmented elements, i.e. using @next (and maybe @prev) on a set
of normal elements to indicate "these are one".

> Ahh! Once more—I missed a couple of tags in that last post, so I’ve
> corrected them here. What I like about this is it permits me to
> locate, count, and process in a connected way the words in my text
> nodes that fall between the <said/> milestones. This is easily
> XPath-able and smooth sailing without requiring said elements
> inside every line. I can use these to count the number of speech
> acts, as well.
>
> <lg>
>   <l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
>   <l xml:id="n02">I said, and took him by the arm,</l>
>   <l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
>   <l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
> </lg>
> <lg>
>   <l xml:id="n05">In careless mood he looked at me,</l>
>   <l xml:id="n06">While still I held him by the arm,</l>
>   <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
>   <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
> </lg>
> <lg>
>   <l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
>   <l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
>   <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
>   <l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange,<said xml:id="s12” prev="#s11” ana=“interrupt"/> said I;</l>
> </lg>
> <lg>
>   <l xml:id="n13”><said who="I-speaker" xml:id="s13" next="#s14” ana=“resume"/>For, here are woods, hills smooth and warm:</l>
>   <l xml:id="n14">There surely must some reason be</l>
>   <l xml:id="n15">Why you would change sweet Liswyn farm</l>
>   <l xml:id="n16">For Kilve by the green sea.<said xml:id="s14" prev="#s13"/></l>
> </lg>
> <lg>
>   <l xml:id="n17">At this, my boy hung down his head,</l>
>   <l xml:id="n18">He blushed with shame, nor made reply;</l>
>   <l xml:id="n19">And three times to the child I said,</l>
>   <l xml:id="n20"><said who="I-speaker" xml:id="s15" next="#s16"/>Why, Edward, tell me why?<said xml:id="s16" prev="#s15"/></l>
> </lg>

Re: quotation across verse lines

Paul Schaffner
In reply to this post by Syd Bauman-10
On Wed, Oct 4, 2017, at 18:22, Syd Bauman wrote:
>  I. Answering Paul's "why not?"

I was of course being hasty and provocative here and anticipated
Syd's answers. Except in the rare case of a poem incorporating
lines from other poems, we have always avoided this method
ourselves, for exactly the reasons Syd mentions.

But all three of his reasons (and ours) are rather
obviated or at least qualified if you advisedly decide
to privilege the <q> structure over the poetic structure,
just as one normally does in tagging drama. Rather than
breaking up the <q> elements (or <said>, or whatever)
and privileging <lg> / <l>, you are free to break up
the <lg>s and even the <l>s while maintaining the integrity
of the <q>s. I can certainly imagine
doing this in cases where the poem is conversational,
consists mostly of dialogue, or (in other words) approaches the
dramatic.

And in the interests of honesty and minimal encoding,
I should probably mention option 5: tag quotations
(or direct speech, as appropriate) within verse
using an alternative markup scheme altogether, or none
at all, such as punctuation. I.e., leave the
literal quotation marks in place and do not attempt
to interpret them. This is in fact what we mostly have
done.

pfs


>  II. How we do this at WWP
>  III. A thought on <q>
>
>
> I.
> --
> > (1)
> > <q>
> >    <l>ore en penst Damede qui soffri passion,</l>
> >    <l>que que soit del parfaire le mur comencerom</l>
> > </q>
> > (and why not?)
>
> This is not a crazy idea at all, but IMHO it falls short for three
> reasons:
>
>  a) Because <q> is not a valid child of <lg>;
>
>  b) because it does not handle the similar case in which the
>     quotation inconveniently starts in the middle, not the beginning,
>     of a metrical line; and
>
>  c) because a poem is not a sequence of quotations or other features
>     which happen to be metrical; a poem is a sequence of metrical
>     lines.
>
> II.
> ---
> We use Paul's second suggestion, but explicitly indicate which of the
> disconnected elements should be considered a single unit using the
> TEI's @next and @prev mechanism.[1] E.g.
>
> | <lg n="XIX." type="quatrain">
> |   <head>XIX.</head>
> |   <l>Were these, <said xml:id="Q5" next="#Q6">dear child of all my tenderest care,</said></l>
> |   <l><said xml:id="Q6" next="#Q7" prev="#Q5">Transfer that duteous love to me you pay'd,</said></l>
> |   <l><said xml:id="Q7" prev="#Q6">To thy dear sire;—live but for him,</said> and died;—</l>
> |   <l>Say blessed spirit, have I disobey'd?</l>
> | </lg>
>
> As you can see, one advantage of this solution is that it scales
> well: it works when only partial lines are involved, or there is
> narrative text between two chunks of direct speech or quotation.
>
> But yes, it has the disadvantage that you end up having a <said> (or
> <q> or <quote>) child for each involved metrical line.
>
> III.
> ----
> I'm curious about your choice of <q> over <said> or <quote>. If these
> are quotations (as in, short excerpts from a different work), why not
> use <quote> to explicitly indicate that? Or, if these are passages of
> direct speech, why not use <said> to explicitly indicate that? (I
> think of <q> as something like "there are quotation marks here in the
> source, but I am not going to tell you why they are here, either
> because I can't be bothered to figure it out, I can't afford to
> figure it out, or because I am deliberately trying to stay agnostic
> about such things".)
>
> Notes
> -----
>  [1] See http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHVE


--
Paul Schaffner  Digital Content & Collections
University of Michigan Libraries
[hidden email] | http://www.umich.edu/~pfs/

Re: quotation across verse lines

Elisa Beshero-Bondar
In reply to this post by Syd Bauman-10
Well, I understand from this that my preferred model is complicated and nonconformant to the TEI because of the application of `<said/>`--and that what I'm really modelling is **anchor**-style elements pointing with @next and @prev. Okay, I'll concede that.

But we can't get around the problem of adding LOTS of separate <said>....</said> elements inside each of the lines when there is really one long connected <said>...</said> that weaves around the lines. I suppose what I've crafted is something alternative to the TEI's way of handling `<said>` in cases like this: I want to be able to model one point of absolute beginning and one point of absolute ending, and I suppose I should be using `<anchor>` for that, but I want the attributes that go with `<said>` without having to change the element terribly much. 

I think of this visually as a model in which I see the poem in one dimension within its line boundaries, but I also want to see another dimension inconsistent with XML hierarchy: I want to see *one* start point and *one* end point, perhaps with interrupt and resume internal boundaries--but no more than that. Because I want those to be signaled as discrete events, I want **signposts** of them, particles in the stream of text that signal an act of speaking has begun and stopped (or has been interrupted and resumed).

I recognize that this is not *expected* behavior and that the way we process TEI wasn't built to handle this--apologies for that, and maybe it's not optimal for the present question. I don't know that it is not "interchangeable", though, if it can be documented and understood and even *mapped* with XSLT to a more traditional representation when needed.

 It is optimal for me in writing my own processing scripts and addressing my research questions--and for that reason I'm a little concerned that this approach to the simple problem of overlapping hierarchy should be deemed a "violation" of the content model of the TEI itself. Is it a violation of the use of the `<said>` element, of the use of *anchor-style* elements (I agree I shouldn't be using milestone here, since anchor elements are built for range boundaries), or something else entirely?

To be clear, the reason I press this is because it affects multiple projects that I really thought I was coding with TEI, in which being able to signal and track overlapping hierarchies is vital to the research questions driving the projects. 

Elisa


On Thu, Oct 5, 2017 at 10:57 AM, Syd Bauman <[hidden email]> wrote:
This encoding is not conformant TEI (which is not really a problem),
and for those who are used to TEI will be very confusing (which does
strike me as problematic). Generic TEI software (e.g. TAPAS) will
have no idea how to process this.

In 2007 the TEI (very unwisely, IMHO) specifically eschewed the
method of encoding here (which is affectionately called HORSE) in
favor of using the <anchor> element. (Except in certain special cases
where there is a pre-existing *Span element (e.g., <addSpan>) which
would replace the first <anchor>.) See
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHBM

This encoding uses empty <said> elements to mark the boundaries of
what is direct speech, and uses @next (and @prev) to point from the
beginning boundary to the end (and back). That is, it uses attributes
intended for reconstitution of fragmented elements for indicating
start- and end- boundaries of elements. I see these as major
violations of the TEI abstract model, and as constructs that would
severely harm interchangeability.

In TEI the <said> element is intended to *contain* the passage of
direct speech. Using it as a segment boundary delimiter instead would
confuse the BLEEP out of any software written to follow the
Guidelines, *especially* when there is no indication (other than it
being empty) that it is not being used in the standard way. And using
@next to perform the function of @spanTo (i.e., to say "consider the
span of this element to be from here to there" rather than "consider
the span of this element both the content of this element and the
content of the one(s) there") I think will likewise wreak havoc.

But I'd like to reiterate that I'm not suggesting this is a bad
representation of the underlying structure -- given the limits of
XML, I think it is a good representation. But it is a bad use of TEI
for that representation.

It is also worth correcting the nomenclature. This use of <said> is
*not* as milestone elements, it is as empty elements. (Or, as empty
segment-boundary delimiters if you want to be precise.) A milestone,
although also empty, is something else -- the stuff being divided up
by milestones tessellates an ancestor. (E.g., every character (black
mark on the page, not speaker) in a book is on one and only one page
-- thus pages tessellate the book, and <pb> is a milestone that marks
page beginnings. It is not the case that every character in a book is
part of one and only one passage of direct speech.)

Lastly the values of @who and @ana are pointers, so I'm guessing that
you want "#I-speaker" and "#interrupt", etc.

P.S. One of the reasons that the mechanism TEI suggests for empty
boundary delimiters (using <anchor>) is so lame is because
<anchor> does not have the attributes needed to properly encode the
information available on the standard element (in this case <said>).
E.g., there's no @who on <anchor>. Another reason is that the TEI
gives no rule for how to indicate the semantics of that which is
being delimited -- the Guidelines just say to use @subtype for this
purpose, but don't say how. The TEI mechanism for empty boundary
delimiters being so impoverished is why I prefer reconstitution of
fragmented elements, i.e. using @next (and maybe @prev) on a set
of normal elements to indicate "these are one".

--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org

Re: quotation across verse lines

Martin Holmes
> But we can't get around the problem of adding LOTS of separate
> <said>....</said> elements inside each of the lines when there is
> really one long connected <said>...</said> that weaves around the
> lines.

I think that's the very purpose of @next and @prev, isn't it? They
specify that this is a fragmented element that in an alternative
hierarchy would be continuous.
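
A minimal sketch of how a processor might follow such a chain to put the fragments back together, in Python with the standard library's xml.etree.ElementTree. The helper name `reconstitute`, the two-line sample (borrowed from the original question), and the absence of the TEI namespace are simplifying assumptions for illustration, not anything the Guidelines prescribe.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: one speech split into one <said> per verse line.
SAMPLE = """<lg>
  <l><said xml:id="q1" next="#q2">ore en penst Damede qui soffri passion,</said></l>
  <l><said xml:id="q2" prev="#q1">que que soit del parfaire le mur comencerom</said></l>
</lg>"""

def reconstitute(root):
    """Return one joined string per chain of <said> fragments linked by @next."""
    by_id = {el.get(XML_ID): el for el in root.iter("said") if el.get(XML_ID)}
    speeches = []
    for el in root.iter("said"):
        if el.get("prev"):
            continue  # not the head of a chain
        parts, seen = [], set()
        node = el
        while node is not None and id(node) not in seen:  # guard against loops
            seen.add(id(node))
            parts.append("".join(node.itertext()).strip())
            nxt = node.get("next")
            node = by_id.get(nxt.lstrip("#")) if nxt else None
        speeches.append(" ".join(parts))
    return speeches

speeches = reconstitute(ET.fromstring(SAMPLE))
print(len(speeches))
print(speeches[0])
```

Each fragment still needs its own xml:id for this to work; the chain only records that the pieces belong together.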

Cheers,
Martin

On 2017-10-05 08:47 AM, Elisa Beshero-Bondar wrote:

> Well, I understand from this that my preferred model is complicated
> and nonconformant to the TEI because of the application of
> `<said/>`--and that what I'm really modelling is **anchor**-style
> elements pointing with @next and @prev. Okay, I'll concede that.

Re: quotation across verse lines

Elisa Beshero-Bondar
Indeed yes. The issue I have with:
<l><!-- Single speech act starts here --><said xml:id="something" next="#somethingNew">.....</said></l>
<l><said xml:id="somethingNew" prev="#something" next="#somethingNewerStill">....</said></l>
<l><said xml:id="somethingNewerStill" prev="#somethingNew">....</said><!-- Single speech act ends here --></l>

is the sheer proliferation of id values that I have to generate for the speech act, when there is only one start point and one end point. I don't like the idea of having to generate ids for the saids I *have* to place in the middle just because I need to designate them to establish that an overlapping hierarchy is in place.

I like my sign-post "anchor" modeling because I only need to use these when there's a change in the speech event itself, a start, an interruption, a resumption, a close. I don't need to add elements or ids that are semantically not meaningful to the speech act. 

I also kind of don't want to change the hierarchical level of my text node in a case of overlapping hierarchy if I don't have to.
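
To make the sign-post idea concrete, here is a rough sketch of how text between a pair of empty boundary markers could be gathered, in Python with the standard library's xml.etree.ElementTree. The function name `spans_between_markers` is made up for illustration, the sample omits the TEI namespace, and the code assumes each opening marker points with @next at the marker that closes it; it is one possible reading of this model, not a tested project script.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<lg>
  <l>And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
  <l>Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>"""

def spans_between_markers(root):
    """Collect the text between each empty <said next="#x"/> and the marker whose xml:id is x."""
    spans = []
    open_end = None  # xml:id of the marker that will close the current span
    buf = []

    def walk(el):
        nonlocal open_end, buf
        is_marker = el.tag == "said" and len(el) == 0 and not (el.text or "").strip()
        if is_marker:
            if open_end is not None and el.get(XML_ID) == open_end:
                # Closing marker reached: normalise whitespace and store the span.
                spans.append(" ".join("".join(buf).split()))
                open_end, buf = None, []
            elif open_end is None and el.get("next"):
                open_end = el.get("next").lstrip("#")
        elif open_end is not None and el.text:
            buf.append(el.text)
        for child in el:
            walk(child)
            # Text following a child (its tail) belongs to the open span too.
            if open_end is not None and child.tail:
                buf.append(child.tail)

    walk(root)
    return spans

spans = spans_between_markers(ET.fromstring(SAMPLE))
print(spans)
```

An interrupted speech would come out as two spans under this sketch, one per start/end pair, and the words of adjacent lines are separated only by whatever whitespace sits between the `<l>` elements.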

Elisa


On Thu, Oct 5, 2017 at 12:04 PM, Martin Holmes <[hidden email]> wrote:
But we can't get around the problem of adding LOTS of separate
<said>....</said> elements inside each of the lines when there is
really one long connected <said>...</said> that weaves around the
lines.

I think that's the very purpose of @next and @prev, isn't it? They specify that this is a fragmented element that in an alternative hierarchy would be continuous.

Cheers,
Martin


On 2017-10-05 08:47 AM, Elisa Beshero-Bondar wrote:
Well, I understand from this that my preferred model is complicated
and nonconformant to the TEI because of the application of
`<said/>`--and that what I'm really modelling is **anchor**-style
elements pointing with @next and @prev. Okay, I'll concede that.

But we can't get around the problem of adding LOTS of separate <said>....</said> elements inside each of the lines when there is
really one long connected <said>...</said> that weaves around the
lines. I suppose what I've crafted is something alternative to the
TEI's way of handling `<said>` in cases like this: I want to be able
to model one point of absolute beginning and one point of absolute
ending, and I suppose I should be using `<anchor>` for that, but I
want the attributes that go with `<said>` without having to change
the element terribly much.

I think of this visually as a model in which I see the poem in one dimension within its line boundaries, but I also want to see another
 dimension inconsistent with XML hierarchy: I want to see *one* start
 point and *one* end point, perhaps with interrupt and resume
internal boundaries--but no more than that. Because I want those to
be signaled as discreet events, I want **signposts** of them,
particles in the stream of text that signal an act of speaking has
begun and stopped (or has been interrupted and resumed).

I recognize that this is not *expected* behavior and that the way we
 process TEI wasn't built to handle this--apologies for that, and
maybe it's not optimal for the present question. I don't know that is
is not "interchangeable", though, if it can be documented and
understood and even *mapped* with XSLT to a more traditional
representation when needed.

It is optimal for me in writing my own processing scripts and addressing my research questions--and for that reason I'm a little concerned that this approach to the simple problem of overlapping hierarchy should be deemed a "violation" of the content model of the
TEI itself. Is it a violation of the use of the `<said>` element, of
the use of *anchor-style* elements (I agree I shouldn't be using
milestone here, since anchor elements are built for range
boundaries), or something else entirely?

To be clear, the reason I press this is because it affects multiple projects that I really thought I was coding with TEI, in which being
 able to signal and track overlapping hierarchies is vital to the research questions driving the projects.

Elisa


On Thu, Oct 5, 2017 at 10:57 AM, Syd Bauman
<[hidden email] <mailto:[hidden email]>>
wrote:

This encoding is not conformant TEI (which is not really a problem), and for those who are used to TEI will be very confusing (which does strike me as problematic). Generic TEI software (e.g. TAPAS) will have no idea how to process this.

In 2007 the TEI (very unwisely, IMHO) specifically eschewed the method of encoding here (which is affectionately called HORSE) in favor of using the <anchor> element. (Except in certain special
cases where there is a pre-existing *Span element (e.g., <addSpan>)
which would replace the first <anchor>.) See https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.tei-c.org%2Frelease%2Fdoc%2Ftei-p5-doc%2Fen%2Fhtml%2FNH.html%23NHBM&data=01%7C01%7Cebb8%40PITT.EDU%7Cf5b0767f97aa4435200308d50c0ac249%7C9ef9f489e0a04eeb87cc3a526112fd0d%7C1&sdata=MrwEIqpKkaX9FWF9cSS8%2FoOZP9oU%2F8PqoE%2F%2FAARFl2A%3D&reserved=0 <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.tei-c.org%2Frelease%2Fdoc%2Ftei-p5-doc%2Fen%2Fhtml%2FNH.html%23NHBM&data=01%7C01%7Cebb8%40PITT.EDU%7Cf5b0767f97aa4435200308d50c0ac249%7C9ef9f489e0a04eeb87cc3a526112fd0d%7C1&sdata=MrwEIqpKkaX9FWF9cSS8%2FoOZP9oU%2F8PqoE%2F%2FAARFl2A%3D&reserved=0>


This encoding uses empty <said> elements to mark the boundaries of what is direct speech, and uses @next (and @prev) to point from the beginning boundary to the end (and back). That is, it uses
attributes intended for reconstitution of fragmented elements for
indicating start- and end- boundaries of elements. I see these as
major violations of the TEI abstract model, and as constructs that
would severely harm interchangeability.

In TEI the <said> element is intended to *contain* the passage of direct speech. Using it as a segment boundary delimiter instead
would confuse the BLEEP out of any software written to follow the Guidelines, *especially* when there is no indication (other than it being empty) that it is not being used in the standard way. And
using @next to perform the function of @spanTo (i.e., to say
"consider the span of this element to be from here to there" rather
than "consider the span of this element both the content of this
element and the content of the one(s) there") I think will likewise
wreak havoc.

But I'd like to reiterate that I'm not suggesting this is a bad representation of the underlying structure -- given the limits of XML, I think it is a good representation. But it is a bad use of TEI for that representation.

It is also worth correcting the nomenclature. This use of <said> is *not* as milestone elements, it is as empty elements. (Or, as empty segment-boundary delimiters if you want to be precise.) A milestone, although also empty, is something else -- the stuff being divided up by milestones tessellates an ancestor. (E.g., every character (black mark on the page, not speaker) in a book is on one and only one page -- thus pages tessellate the book, and <pb> is a milestone that
marks page beginnings. It is not the case that every character in a
book is part of one and only one passage of direct speech.)

Lastly the values of @who and @ana are pointers, so I'm guessing
that you want "#I-speaker" and "#interrupt", etc.

P.S. One of the reasons that the mechanism TEI suggests for empty boundary delimiters (using <anchor>) is so lame is because <anchor>
does not have the attributes needed to properly encode the information available on the standard element (in this case <said>). E.g., there's no @who on <anchor>. Another reason is that the TEI gives no rule for how to indicate the semantics of that which is being delimited -- the Guidelines just say to use @subtype for this purpose, but don't say how. The TEI mechanism for empty boundary delimiters being so impoverished is why I prefer reconstitution of fragmented elements, i.e. using @next (and maybe @prev) on a set of
normal elements to indicate "these are one".

Ahh! Once more—I missed a couple of tags in that last post, so
I’ve corrected them here. What I like about this is it permits me
to locate, count, and process in a connected way the words in my
text nodes that fall between the <said/> milestones. This is
easily XPath-able and smooth sailing without requiring said
elements inside every line. I can use these to count the number of
speech acts, as well.

<lg>
  <l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
  <l xml:id="n02">I said, and took him by the arm,</l>
  <l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
  <l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>
<lg>
  <l xml:id="n05">In careless mood he looked at me,</l>
  <l xml:id="n06">While still I held him by the arm,</l>
  <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
  <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>
<lg>
  <l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
  <l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
  <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
  <l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange,<said xml:id="s12" prev="#s11" ana="interrupt"/> said I;</l>
</lg>
<lg>
  <l xml:id="n13"><said who="I-speaker" xml:id="s13" next="#s14" ana="resume"/>For, here are woods, hills smooth and warm:</l>
  <l xml:id="n14">There surely must some reason be</l>
  <l xml:id="n15">Why you would change sweet Liswyn farm</l>
  <l xml:id="n16">For Kilve by the green sea.<said xml:id="s14" prev="#s13"/></l>
</lg>
<lg>
  <l xml:id="n17">At this, my boy hung down his head,</l>
  <l xml:id="n18">He blushed with shame, nor made reply;</l>
  <l xml:id="n19">And three times to the child I said,</l>
  <l xml:id="n20"><said who="I-speaker" xml:id="s15" next="#s16"/>Why, Edward, tell me why?<said xml:id="s16" prev="#s15"/></l>
</lg>
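
To illustrate the "locate, count, and process" point, here is a minimal Python sketch (not the XPath I actually run) that walks the tree in document order and stitches together the text between an opening boundary (an empty said carrying @next) and its closing boundary (an empty said carrying @prev). A span whose opening boundary has ana="resume" is treated as continuing the previous speech act. The function name speech_acts and the un-namespaced sample are assumptions for illustration; a real TEI file would need the TEI namespace on the element name.

```python
import xml.etree.ElementTree as ET

# First stanza of the example, without the TEI namespace (an assumption).
SAMPLE = """<lg>
<l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
<l xml:id="n02">I said, and took him by the arm,</l>
<l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
<l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>"""


def speech_acts(root):
    """Return one whitespace-normalised string per speech act."""
    acts, buffer = [], None  # buffer is a list while we are inside a span

    def walk(node):
        nonlocal buffer
        is_marker = node.tag == "said" and len(node) == 0 and not node.text
        if is_marker and node.get("next"):
            # Opening boundary: start collecting text.
            buffer = []
            if node.get("ana") == "resume" and acts:
                # A resumed speech continues the act that was interrupted.
                buffer.append(acts.pop() + " ")
        elif is_marker and node.get("prev") and buffer is not None:
            # Closing boundary: store the collected text.
            acts.append(" ".join("".join(buffer).split()))
            buffer = None
        elif buffer is not None and node.text:
            buffer.append(node.text)
        for child in node:
            walk(child)
            if buffer is not None and child.tail:
                buffer.append(child.tail)

    walk(root)
    return acts


acts = speech_acts(ET.fromstring(SAMPLE))
word_counts = [len(act.split()) for act in acts]
```

Here len(acts) gives the number of speech acts and word_counts the words in each; the narrator's "I said, and took him by the arm," falls outside the boundaries and is skipped.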




--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org

Re: quotation across verse lines

David Farmer

I have been in a similar situation in another markup
language.  The official markup required tags that were
not necessary for human understanding.  (For example,
if an item that could contain multiple paragraphs only
contained one paragraph, then I didn't want to have
to write the "p" tags.  As a human, I could tell that
the intention was to just have one p there.)

So I wrote a script that took my human-readable markup
and converted it to official markup.  The script was
idempotent, meaning that if I applied it to official
markup it did nothing.  It was a lot faster for me to
write in a simplified (but illegal) version of the markup,
and then run my script occasionally.

This approach works as long as you know which official
errors the script can handle.

You could write a script to handle that use of <said/>,
converting it to the verbose form,
with all the id values generated for you.
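
As a rough sketch of what such a script might look like (this is not my actual script, and the shorthand markers <saidStart/> and <saidEnd/> are invented here purely for illustration): it wraps the spoken part of each <l> in its own <said>, generates the xml:id values, and chains the fragments with @prev and @next. It assumes speeches do not nest, that markers only occur inside <l>, and that the generated ids s1, s2, ... do not clash with ids already in the file.

```python
import re

# Hypothetical shorthand: <saidStart .../> opens a speech, <saidEnd/> closes it.
LINE = re.compile(r"(<l\b[^>]*>)(.*?)(</l>)", re.S)
MARK = re.compile(r"<saidStart\b([^>]*?)/>|<saidEnd/>")


def expand_said(text):
    """Rewrite shorthand speech markers as one <said> fragment per verse line."""
    out, frags = [], []  # output pieces; fragments in document order
    speech, attrs, count = None, "", 0

    def emit(chunk):
        # Text outside a speech (or pure whitespace) passes through untouched.
        if speech is None or not chunk.strip():
            out.append(chunk)
        else:
            frag = {"speech": speech, "attrs": attrs, "text": chunk}
            frags.append(frag)
            out.append(frag)

    pos = 0
    for line in LINE.finditer(text):
        out.extend([text[pos:line.start()], line.group(1)])
        body, last = line.group(2), 0
        for mark in MARK.finditer(body):
            emit(body[last:mark.start()])
            if mark.group(1) is not None:  # a <saidStart .../> marker
                count += 1
                speech, attrs = count, mark.group(1).rstrip()
            else:  # a <saidEnd/> marker
                speech, attrs = None, ""
            last = mark.end()
        emit(body[last:])
        out.append(line.group(3))
        pos = line.end()
    out.append(text[pos:])

    # Number the fragments, then chain neighbours that belong to one speech.
    for i, frag in enumerate(frags):
        frag["id"] = f"s{i + 1}"
    for i, frag in enumerate(frags):
        before = frags[i - 1] if i > 0 else None
        after = frags[i + 1] if i + 1 < len(frags) else None
        frag["prev"] = before["id"] if before and before["speech"] == frag["speech"] else None
        frag["next"] = after["id"] if after and after["speech"] == frag["speech"] else None

    def render(piece):
        if isinstance(piece, str):
            return piece
        links = "".join(f' {k}="#{piece[k]}"' for k in ("prev", "next") if piece[k])
        return f'<said xml:id="{piece["id"]}"{piece["attrs"]}{links}>{piece["text"]}</said>'

    return "".join(render(p) for p in out)
```

Text that contains no shorthand markers comes back unchanged, which is what makes it safe to run the script repeatedly over a file that is already in the verbose form.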

Regards,

David

P.S. Not that it matters, but my script was in Python.





On Thu, 5 Oct 2017, Elisa Beshero-Bondar wrote:

> Indeed yes. The issue I have with:
> <l><!--Single speech act starts here --><said id="#something">.....</said></l>
> <l><said id="#somethingNew" next="#somethingNewerStill">....</said></l>
> <l><said prev="#somethingNewerStill">....</said><!--Single speech act ends here --> </l>
>
> is the sheer proliferation of id values that I have to generate for the speech act, when there is only one
> start point and one end point. I don't like the idea of having to generate ids for the saids I *have* to
> place in the middle just because I need to designate them to establish an overlapping hierarchy is in place.
>
> I like my sign-post "anchor" modeling because I only need to use these when there's a change in the speech
> event itself, a start, an interruption, a resumption, a close. I don't need to add elements or ids that are
> semantically not meaningful to the speech act. 
>
> I also kind of don't want to change the hierarchical level of my text node in a case of overlapping
> hierarchy if I don't have to.
>
> Elisa

Re: quotation across verse lines

Lou Burnard-6
In reply to this post by Elisa Beshero-Bondar

No problem inventing a special empty element to mark points in your text that need to be chained together to form a speech act. But please don't call it "<said>". That's tag abuse. <said/> means "nothing being said here".
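
A minimal Python sketch of that renaming, for anyone with files already encoded this way: it changes only the empty boundary markers and leaves any <said> that actually contains speech alone. The replacement name speechBoundary is made up for illustration (a project would define its own element in its customization), and the sketch assumes un-namespaced input as in the snippets in this thread.

```python
import xml.etree.ElementTree as ET


def rename_empty_said(xml_text, new_name="speechBoundary"):
    """Rename empty <said/> boundary markers; keep <said> elements with content."""
    root = ET.fromstring(xml_text)
    for el in root.iter("said"):
        # An element with no children and no text is being used as a marker.
        if len(el) == 0 and not (el.text or "").strip():
            el.tag = new_name
    return ET.tostring(root, encoding="unicode")
```

All attributes (@next, @prev, @who, xml:id) are carried over untouched, so existing pointers keep working.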



On 05/10/17 17:13, Elisa Beshero-Bondar wrote:
> Indeed yes. The issue I have with:
> <l><!--Single speech act starts here --><said id="#something">.....</said></l>
> <l><said id="#somethingNew" next="#somethingNewerStill">....</said></l>
> <l><said prev="#somethingNewerStill">....</said><!--Single speech act ends here --> </l>
>
> is the sheer proliferation of id values that I have to generate for the
> speech act, when there is only one start point and one end point. I don't
> like the idea of having to generate ids for the saids I *have* to place in
> the middle just because I need to designate them to establish an
> overlapping hierarchy is in place.
>
> I like my sign-post "anchor" modeling because I only need to use these when
> there's a change in the speech event itself, a start, an interruption, a
> resumption, a close. I don't need to add elements or ids that are
> semantically not meaningful to the speech act.
>
> I also kind of don't want to change the hierarchical level of my text node
> in a case of overlapping hierarchy if I don't have to.
>
> Elisa


      



Re: quotation across verse lines

David Farmer

For my suggestion, it matters not at all what tag you use,
or if it even is a tag.  It just needs to be strings of characters
which unambiguously mark the start or the end of what you want to
convert into proper markup.  I often use wEIrdSTrinGS.  It just
has to be something that you know is not a real part of the document.

Nobody but the author (or someone who examines their script)
will ever know what string of characters is used, because
those other people will only see the proper XML output by
the script.

If the author finds it less confusing to use <said id="..."/>, then
they might as well do that.  Nobody else will ever know.
The fact that it is illegal as TEI is what makes it usable for
my script.

I'm just clarifying my suggestion for how you can use a script
to save some human effort.  The end result is perfect markup,
with no tag abuse, but the path may go through places that you
don't want other people to see.

David


On Thu, 5 Oct 2017, Lou Burnard wrote:

>
> No problem inventing a special empty element to mark points in your text that need to be chained together to
> form a speech act. But please don't call it "<said>". That's tag abuse. <said/> means "nothing being said
> here".
>
>
>
> On 05/10/17 17:13, Elisa Beshero-Bondar wrote:
>
> Indeed yes. The issue I have with:
> <l><!--Single speech act starts here --><said
> id="#something">.....</said></l>
> <l><said id="#somethingNew" next="#somethingNewerStill">....</said></l>
> <l><said prev="#somethingNewerStill">....</said><!--Single speech act ends
> here --> </l>
>
> is the sheer proliferation of id values that I have to generate for the
> speech act, when there is only one start point and one end point. I don't
> like the idea of having to generate ids for the saids I *have* to place in
> the middle just because I need to designate them to establish an
> overlapping hierarchy is in place.
>
> I like my sign-post "anchor" modeling because I only need to use these when
> there's a change in the speech event itself, a start, an interruption, a
> resumption, a close. I don't need to add elements or ids that are
> semantically not meaningful to the speech act.
>
> I also kind of don't want to change the hierarchical level of my text node
> in a case of overlapping hierarchy if I don't have to.
>
> Elisa
>
>
> On Thu, Oct 5, 2017 at 12:04 PM, Martin Holmes <[hidden email]> wrote:
>
> > But we can't get around the problem of adding LOTS of separate
> > <said>....</said> elements inside each of the lines when there is
> > really one long connected <said>...</said> that weaves around the
> > lines.
>
> I think that's the very purpose of @next and @prev, isn't it? They specify
> that this is a fragmented element that in an alternative hierarchy would be
> continuous.
>
> Cheers,
> Martin
>
>
> On 2017-10-05 08:47 AM, Elisa Beshero-Bondar wrote:
>
> Well, I understand from this that my preferred model is complicated
> and nonconformant to the TEI because of the application of
> `<said/>`--and that what I'm really modelling is **anchor**-style
> elements pointing with @next and @prev. Okay, I'll concede that.
>
> But we can't get around the problem of adding LOTS of separate
> <said>....</said> elements inside each of the lines when there is
> really one long connected <said>...</said> that weaves around the
> lines. I suppose what I've crafted is something alternative to the
> TEI's way of handling `<said>` in cases like this: I want to be able
> to model one point of absolute beginning and one point of absolute
> ending, and I suppose I should be using `<anchor>` for that, but I
> want the attributes that go with `<said>` without having to change
> the element terribly much.
>
> I think of this visually as a model in which I see the poem in one
> dimension within its line boundaries, but I also want to see another
>  dimension inconsistent with XML hierarchy: I want to see *one* start
>  point and *one* end point, perhaps with interrupt and resume
> internal boundaries--but no more than that. Because I want those to
> be signaled as discrete events, I want **signposts** of them,
> particles in the stream of text that signal an act of speaking has
> begun and stopped (or has been interrupted and resumed).
>
> I recognize that this is not *expected* behavior and that the way we
>  process TEI wasn't built to handle this--apologies for that, and
> maybe it's not optimal for the present question. I don't know that it
> is not "interchangeable", though, if it can be documented and
> understood and even *mapped* with XSLT to a more traditional
> representation when needed.
>
> It is optimal for me in writing my own processing scripts and addressing
> my research questions--and for that reason I'm a little concerned that this
> approach to the simple problem of overlapping hierarchy should be deemed a
> "violation" of the content model of the
> TEI itself. Is it a violation of the use of the `<said>` element, of
> the use of *anchor-style* elements (I agree I shouldn't be using
> milestone here, since anchor elements are built for range
> boundaries), or something else entirely?
>
> To be clear, the reason I press this is because it affects multiple
> projects that I really thought I was coding with TEI, in which being
>  able to signal and track overlapping hierarchies is vital to the
> research questions driving the projects.
>
> Elisa
>
>
> On Thu, Oct 5, 2017 at 10:57 AM, Syd Bauman <[hidden email]> wrote:
>
> This encoding is not conformant TEI (which is not really a problem), and
> for those who are used to TEI will be very confusing (which does strike me
> as problematic). Generic TEI software (e.g. TAPAS) will have no idea how to
> process this.
>
> In 2007 the TEI (very unwisely, IMHO) specifically eschewed the method of
> encoding here (which is affectionately called HORSE) in favor of using the
> <anchor> element. (Except in certain special
> cases where there is a pre-existing *Span element (e.g., <addSpan>)
> which would replace the first <anchor>.) See
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHBM
>
>
> This encoding uses empty <said> elements to mark the boundaries of what
> is direct speech, and uses @next (and @prev) to point from the beginning
> boundary to the end (and back). That is, it uses
> attributes intended for reconstitution of fragmented elements for
> indicating start- and end- boundaries of elements. I see these as
> major violations of the TEI abstract model, and as constructs that
> would severely harm interchangeability.
>
> In TEI the <said> element is intended to *contain* the passage of direct
> speech. Using it as a segment boundary delimiter instead
> would confuse the BLEEP out of any software written to follow the
> Guidelines, *especially* when there is no indication (other than it being
> empty) that it is not being used in the standard way. And
> using @next to perform the function of @spanTo (i.e., to say
> "consider the span of this element to be from here to there" rather
> than "consider the span of this element both the content of this
> element and the content of the one(s) there") I think will likewise
> wreak havoc.
>
> But I'd like to reiterate that I'm not suggesting this is a bad
> representation of the underlying structure -- given the limits of XML, I
> think it is a good representation. But it is a bad use of TEI for that
> representation.
>
> It is also worth correcting the nomenclature. This use of <said> is *not*
> as milestone elements, it is as empty elements. (Or, as empty
> segment-boundary delimiters if you want to be precise.) A milestone,
> although also empty, is something else -- the stuff being divided up by
> milestones tessellates an ancestor. (E.g., every character (black mark on
> the page, not speaker) in a book is on one and only one page -- thus pages
> tessellate the book, and <pb> is a milestone that
> marks page beginnings. It is not the case that every character in a
> book is part of one and only one passage of direct speech.)
>
> Lastly the values of @who and @ana are pointers, so I'm guessing
> that you want "#I-speaker" and "#interrupt", etc.
>
> P.S. One of the reasons that the mechanism TEI suggests for empty
> boundary delimiters (using <anchor>) is so lame is because <anchor>
> does not have the attributes needed to properly encode the information
> available on the standard element (in this case <said>). E.g., there's no
> @who on <anchor>. Another reason is that the TEI gives no rule for how to
> indicate the semantics of that which is being delimited -- the Guidelines
> just say to use @subtype for this purpose, but don't say how. The TEI
> mechanism for empty boundary delimiters being so impoverished is why I
> prefer reconstitution of fragmented elements, i.e. using @next (and maybe
> @prev) on a set of
> normal elements to indicate "these are one".
>
> Ahh! Once more: I missed a couple of tags in that last post, so
> I've corrected them here. What I like about this is it permits me
> to locate, count, and process in a connected way the words in my
> text nodes that fall between the <said/> milestones. This is
> easily XPath-able and smooth sailing without requiring <said>
> elements inside every line. I can use these to count the number of
> speech acts, as well.
>
> <lg>
>   <l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
>   <l xml:id="n02">I said, and took him by the arm,</l>
>   <l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
>   <l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
> </lg>
> <lg>
>   <l xml:id="n05">In careless mood he looked at me,</l>
>   <l xml:id="n06">While still I held him by the arm,</l>
>   <l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
>   <l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
> </lg>
> <lg>
>   <l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
>   <l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
>   <l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
>   <l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange,<said xml:id="s12" prev="#s11" ana="interrupt"/> said I;</l>
> </lg>
> <lg>
>   <l xml:id="n13"><said who="I-speaker" xml:id="s13" next="#s14" ana="resume"/>For, here are woods, hills smooth and warm:</l>
>   <l xml:id="n14">There surely must some reason be</l>
>   <l xml:id="n15">Why you would change sweet Liswyn farm</l>
>   <l xml:id="n16">For Kilve by the green sea.<said xml:id="s14" prev="#s13"/></l>
> </lg>
> <lg>
>   <l xml:id="n17">At this, my boy hung down his head,</l>
>   <l xml:id="n18">He blushed with shame, nor made reply;</l>
>   <l xml:id="n19">And three times to the child I said,</l>
>   <l xml:id="n20"><said who="I-speaker" xml:id="s15" next="#s16"/>Why, Edward, tell me why?<said xml:id="s16" prev="#s15"/></l>
> </lg>
>
>
>
>
> --
> Elisa Beshero-Bondar, PhD
> Director, Center for the Digital Text | Associate Professor of English
> University of Pittsburgh at Greensburg | Humanities Division
> 150 Finoli Drive
> Greensburg, PA  15601  USA
> E-mail: [hidden email]
> Development site: http://newtfire.org
>
>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: quotation across verse lines

Elisa Beshero-Bondar
Overlap is always an interesting challenge because it shows us limits and possibilities of what we can do with XML and TEI. Lou and Syd, I think you know that I'm not willfully out to abuse the TEI, but I'm thinking about what we do with markup and what's in an element name and its memberships in models and classes. The element <said> is more meaningful than <anchor> because of the definition we gave <said>: it "indicates passages thought or spoken aloud, whether explicitly indicated in the source or not, whether directly or indirectly reported, whether by real people or fictional characters".

The problem with overlap is that it demands we  "force-fit" our markup in some way: and words like "force" and "abuse" and "violence" come into play because we can't fully model things in TEI as we see them, and we introduce complications of various kinds. Perhaps we try to choose what seems to us the least of a series of evils--but I wish we didn't always think of what we're doing as at best trickery and at worst violence. TEI is flexible and extensible and our documentation gives it range of expression within the guidelines we've set. Because of what I've learned of <said> from the Guidelines, I'd far rather use <said> than <anchor> even in a case of overlap because <said> is meaningful in ways that <anchor> isn't without giving it a lot of new documentation. I can do it, but I didn't exactly see a strong indication from either the Guidelines or my parser that converting <said/> into an anchor-like expression was doing damage to its definition. I'm thinking a lot about that in light of your comments and what I've read.

Lou and Syd, you indicated that my <said/> solution is a violation or an abuse of the TEI, because <said> is expected to have a text node that indicates what is spoken, and if it's empty, it would necessarily mean that nothing is said. My first thought was, "why would I reach for a <said> element to indicate that nothing is spoken?" and I pondered whether there might be use cases for such a thing: Would it convey that we *expect* something to be said on the part of someone but nothing was uttered? Is it a way to make a silence explicit? That's interesting but unexpected, and certainly not what my markup of the Wordsworth poem would convey.

 My next thought was, let's check the Guidelines: how have we defined <said>? 

The element spec for <said> and the Guidelines passages on quotation (in 3.3.3 http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COHQQ ) give no explicit indication that a self-closing <said/> means "nothing at all is being said". I hunted for examples and explanation in the Guidelines to that effect to no avail. The only examples we show give text contents to the element, and we've simply not represented what possible meaning a self-closing version of the element might have.

Because <said> is what I want to flag, and because it doesn't fit the hierarchy, I want the flexibility, the extensibility even, to render it self-closing specifically in the instance of overlap where we have to make choices that have to do with distending or contorting or stretching around our hierarchy. I want to be able to signal this as a range existing in another dimension than the one I'm marking within regular <lg> and <l> elements. If and when I need to convey that the hierarchical structure of poetry is being overlapped (as I very frequently do), I want a mechanism for doing it that doesn't sacrifice the semantic clarity of the element I would normally be treating as holding text content. Because this is overlap it *is* a special case, and we need special tools to handle it expressively and meaningfully, in a way that minimizes what I'd call a "semantic compromise"--where we have to default to a less meaningful element, or make the element we want to use force-fit itself into multiple particulated units with lots of ids. We're making compromises no matter what we do here, but the overlap is often quite meaningful and requires a thoughtful solution.

Best,
Elisa

On Thu, Oct 5, 2017 at 2:55 PM, David Farmer <[hidden email]> wrote:

For my suggestion, it matters not at all what tag you use,
or if it even is a tag.  It just needs to be strings of characters
which unambiguously mark the start or the end of what you want to
convert into proper markup.  I often use wEIrdSTrinGS.  It just
has to be something that you know is not a real part of the document.

Nobody but the author (or someone who examines their script)
will ever know what string of characters is used, because
those other people will only see the proper XML output by
the script.

If the author finds it less confusing to use <said id="..."/>, then
they might as well do that.  Nobody else will ever know.
The fact that it is illegal as TEI is what makes it usable for
my script.

I'm just clarifying my suggestion for how you can use a script
to save some human effort.  The end result is perfect markup,
with no tag abuse, but the path may go through places that you
don't want other people to see.
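
A minimal sketch of such a script, in Python. It assumes the encoder typed the hypothetical marker strings `{{` and `}}` (not part of TEI, and assumed never to occur in the poem itself) at the start and end of each speech. It rewrites every verse line so that the spoken text sits in per-line <said> fragments chained with @next and @prev. The function name, the marker strings and the id scheme (`s1.1`, `s1.2`, ...) are illustrative assumptions, and speaker attribution (@who) is left out.

```python
import re
from xml.sax.saxutils import escape

START, END = "{{", "}}"  # hypothetical marker strings, not TEI


def markers_to_said(lines):
    """Turn marker-delimited speech into per-line <said> fragments."""
    token_re = re.compile("(" + re.escape(START) + "|" + re.escape(END) + ")")
    parsed = []   # per verse line: list of (text, speech number or None)
    counts = {}   # speech number -> how many fragments it has
    speech_no, inside = 0, False
    for line in lines:
        segments = []
        for token in token_re.split(line):
            if token == START:
                speech_no, inside = speech_no + 1, True
            elif token == END:
                inside = False
            elif token:
                segments.append((token, speech_no if inside else None))
                if inside:
                    counts[speech_no] = counts.get(speech_no, 0) + 1
        parsed.append(segments)

    seen = {}     # speech number -> fragments written so far
    out = []
    for segments in parsed:
        parts = []
        for text, speech in segments:
            if speech is None:
                parts.append(escape(text))   # narration stays outside <said>
                continue
            k = seen[speech] = seen.get(speech, 0) + 1
            attrs = f' xml:id="s{speech}.{k}"'
            if k > 1:
                attrs += f' prev="#s{speech}.{k - 1}"'
            if k < counts[speech]:
                attrs += f' next="#s{speech}.{k + 1}"'
            parts.append(f"<said{attrs}>{escape(text)}</said>")
        out.append("<l>" + "".join(parts) + "</l>")
    return out


for xml_line in markers_to_said([
    "And said, {{At Kilve I'd rather be",
    "Than here at Liswyn farm.}}",
]):
    print(xml_line)
```

The encoder only types one start and one end marker per speech; the ids for the in-between fragments are generated, so nobody has to invent them by hand.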

David



On Thu, 5 Oct 2017, Lou Burnard wrote:


No problem inventing a special empty element to mark points in your text that need to be chained together to
form a speech act. But please don't call it "<said>". That's tag abuse. <said/> means "nothing being said
here".



On 05/10/17 17:13, Elisa Beshero-Bondar wrote:

Indeed yes. The issue I have with:
<l><!-- Single speech act starts here --><said xml:id="something" next="#somethingNew">.....</said></l>
<l><said xml:id="somethingNew" prev="#something" next="#somethingNewerStill">....</said></l>
<l><said xml:id="somethingNewerStill" prev="#somethingNew">....</said><!-- Single speech act ends here --></l>

is the sheer proliferation of id values that I have to generate for the
speech act, when there is only one start point and one end point. I don't
like the idea of having to generate ids for the saids I *have* to place in
the middle just because I need to designate them to establish that an
overlapping hierarchy is in place.

I like my sign-post "anchor" modeling because I only need to use these when
there's a change in the speech event itself, a start, an interruption, a
resumption, a close. I don't need to add elements or ids that are
semantically not meaningful to the speech act.
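
As a rough illustration of how the sign-post markers can be read back, the following Python sketch walks the first stanza of the Wordsworth example in document order. It gathers the text that falls between an opening marker (one carrying @next) and its closing marker (one carrying @prev), and treats ana="resume" as a continuation of the previous speech act. It assumes un-namespaced elements and the attribute conventions used in this thread; the helper names are hypothetical, and a real TEI file would need the TEI namespace.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<lg>
<l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
<l xml:id="n02">I said, and took him by the arm,</l>
<l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
<l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>"""


def events(elem):
    """Yield ('said', element) and ('text', string) items in document order."""
    if elem.tag == "said":
        yield ("said", elem)
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from events(child)
        if child.tail:
            yield ("text", child.tail)


def speech_acts(root):
    """Return (speaker, text) for each speech act marked by empty <said/> boundaries."""
    acts = []          # each entry: [speaker, list of text pieces]
    recording = False
    for kind, value in events(root):
        if kind == "text":
            if recording:
                acts[-1][1].append(value)
        elif value.get("next"):                 # opening or resuming boundary
            if value.get("ana") != "resume" or not acts:
                acts.append([value.get("who"), []])
            recording = True
        elif value.get("prev"):                 # interrupting or closing boundary
            recording = False
    # Pieces are joined with a space, which assumes boundaries never fall mid-word.
    return [(who, " ".join(" ".join(pieces).split())) for who, pieces in acts]


for who, text in speech_acts(ET.fromstring(SAMPLE)):
    print(who, len(text.split()), text)
```

The interrupted and resumed halves come back as one speech act with one speaker, while the narrator's "I said, and took him by the arm," is skipped.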

I also kind of don't want to change the hierarchical level of my text node
in a case of overlapping hierarchy if I don't have to.

Elisa


On Thu, Oct 5, 2017 at 12:04 PM, Martin Holmes <[hidden email]> wrote:

But we can't get around the problem of adding LOTS of separate

<said>....</said> elements inside each of the lines when there is
really one long connected <said>...</said> that weaves around the
lines.

I think that's the very purpose of @next and @prev, isn't it? They specify
that this is a fragmented element that in an alternative hierarchy would be
continuous.
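
A small Python sketch of that reading of @next and @prev: it starts from every <said> that has no @prev and follows the @next pointers to rebuild the continuous speech. The sample lines are adapted from the poem quoted in this thread; the ids `a`, `b`, `c` and the function name are made up for illustration, and the elements are assumed to be un-namespaced.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

FRAGMENTED = """<lg>
<l><said xml:id="a" next="#b">Now, little Edward, say why so:</said></l>
<l><said xml:id="b" prev="#a">My little Edward, tell me why.</said> --</l>
<l><said xml:id="c">I cannot tell, I do not know.</said> --</l>
</lg>"""


def reconstitute(root):
    """Join fragmented <said> elements into whole speeches by following @next."""
    by_id = {e.get(XML_ID): e for e in root.iter("said") if e.get(XML_ID)}
    speeches = []
    for said in root.iter("said"):
        if said.get("prev"):          # a continuation, reached via its chain head
            continue
        parts, node, visited = [], said, set()
        while node is not None and id(node) not in visited:
            visited.add(id(node))     # guards against a circular @next chain
            parts.append("".join(node.itertext()))
            target = node.get("next")
            node = by_id.get(target.lstrip("#")) if target else None
        speeches.append(" ".join(parts))
    return speeches


for speech in reconstitute(ET.fromstring(FRAGMENTED)):
    print(speech)
```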

Cheers,
Martin


On 2017-10-05 08:47 AM, Elisa Beshero-Bondar wrote:

Well, I understand from this that my preferred model is complicated
and nonconformant to the TEI because of the application of
`<said/>`--and that what I'm really modelling is **anchor**-style
elements pointing with @next and @prev. Okay, I'll concede that.

But we can't get around the problem of adding LOTS of separate
<said>....</said> elements inside each of the lines when there is
really one long connected <said>...</said> that weaves around the
lines. I suppose what I've crafted is something alternative to the
TEI's way of handling `<said>` in cases like this: I want to be able
to model one point of absolute beginning and one point of absolute
ending, and I suppose I should be using `<anchor>` for that, but I
want the attributes that go with `<said>` without having to change
the element terribly much.

I think of this visually as a model in which I see the poem in one
dimension within its line boundaries, but I also want to see another
 dimension inconsistent with XML hierarchy: I want to see *one* start
 point and *one* end point, perhaps with interrupt and resume
internal boundaries--but no more than that. Because I want those to
be signaled as discrete events, I want **signposts** of them,
particles in the stream of text that signal an act of speaking has
begun and stopped (or has been interrupted and resumed).

I recognize that this is not *expected* behavior and that the way we
 process TEI wasn't built to handle this--apologies for that, and
maybe it's not optimal for the present question. I don't know that it
is not "interchangeable", though, if it can be documented and
understood and even *mapped* with XSLT to a more traditional
representation when needed.

It is optimal for me in writing my own processing scripts and addressing
my research questions--and for that reason I'm a little concerned that this
approach to the simple problem of overlapping hierarchy should be deemed a
"violation" of the content model of the
TEI itself. Is it a violation of the use of the `<said>` element, of
the use of *anchor-style* elements (I agree I shouldn't be using
milestone here, since anchor elements are built for range
boundaries), or something else entirely?

To be clear, the reason I press this is because it affects multiple
projects that I really thought I was coding with TEI, in which being
 able to signal and track overlapping hierarchies is vital to the
research questions driving the projects.

Elisa



--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org
Reply | Threaded
Open this post in threaded view
|

Re: quotation across verse lines

Lou Burnard-6

My point is simply that if you use an existing TEI element then it is good practice to respect the usual interpretation and usage of that element. I think most people -- or, perhaps more significantly, most code people would write -- expect to find some content inside a <said>. If there is none, they would conclude either that the file is broken or that nothing is being said. Your desire to use the element in a different way -- as an anchor point -- is a perfectly reasonable thing to want, but you could make life easier by not using exactly the same element name, in the same namespace. That is what might be called "considerate conformance"! Your attribution to me and Syd of emotionally charged terms such as "violence" or "abuse" seems a little unjust.
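
To make the concern concrete, here is a hedged Python sketch of the kind of code a tool might contain if it takes <said> to be a container. It is run once on fragmented container markup and once on the empty-marker markup adapted from this thread. The function name and the ids `e1`, `e2` are illustrative assumptions, and the elements are un-namespaced for brevity.

```python
import xml.etree.ElementTree as ET


def said_contents(xml_text):
    """Return the text inside each <said>, as a container-minded tool would read it."""
    root = ET.fromstring(xml_text)
    return ["".join(said.itertext()) for said in root.iter("said")]


# <said> used as a container, fragmented per line and chained with @next/@prev.
CONTAINERS = """<lg>
<l>And said, <said xml:id="e1" next="#e2">At Kilve I'd rather be</said></l>
<l><said xml:id="e2" prev="#e1">Than here at Liswyn farm.</said></l>
</lg>"""

# Empty <said/> elements used as boundary markers, adapted from the thread.
MARKERS = """<lg>
<l>And said, <said xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
<l>Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>"""

print(said_contents(CONTAINERS))
print(said_contents(MARKERS))
```

On the marker encoding the function returns two empty strings, which is exactly the "nothing being said here" reading that such a tool would arrive at.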

By the way, I have now remembered that Council had a very similar argument some time ago around the elements <del> and <delSpan> (or <add> and <addSpan>) : *Span elements function just like your anchors in that they are empty, and indicate their corresponding end-point by means of a @spanTo attribute, provided by the att.spanning class. It was suggested that maybe we didn't need (say) delSpan at all, since we could achieve the same effect by adding del to the att.spanning class, and permitting it to be empty, perhaps with a proviso that it shouldn't have both content and a value for @spanTo. This proposal was however defeated, largely I think because of precisely the concerns which motivate me and Syd in this particular case.


On 06/10/17 01:33, Elisa Beshero-Bondar wrote:

Overlap is always an interesting challenge because it shows us limits and
possibilities of what we can do with XML and TEI. Lou and Syd, I think you
know that I'm not willfully out to abuse the TEI, but I'm thinking about
what we do with markup and what's in an element name and its memberships in
models and classes. The element <said> is more meaningful than <anchor>
because of the definition we gave <said>: it "indicates passages thought or
spoken aloud, whether explicitly indicated in the source or not, whether
directly or indirectly reported, whether by real people or fictional
characters".

The problem with overlap is that it demands we  "force-fit" our markup in
some way: and words like "force" and "abuse" and "violence" come into play
because we can't fully model things in TEI as we see them, and we introduce
complications of various kinds. Perhaps we try to choose what seems to us
the least of a series of evils--but I wish we didn't always think of what
we're doing as at best trickery and at worst violence. TEI is flexible and
extensible and our documentation gives it range of expression within the
guidelines we've set. Because of what I've learned of <said> from the
Guidelines, I'd far rather use <said> than <anchor> even in a case of
overlap because <said> is meaningful in ways that <anchor> isn't without
giving it a lot of new documentation. I can do it, but I didn't exactly see
a strong indication from either the Guidelines or my parser that converting
<said/> into an anchor-like expression was doing damage to its definition.
I'm thinking a lot about that in light of your comments and what I've read.

Lou and Syd, you indicated that my <said/> solution is a violation or an
abuse of the TEI, because <said> is expected to have a text node that
indicates what is spoken, and if it's empty, it would necessarily mean that
nothing is said. My first thought was, "why would I reach for a <said>
element to indicate that nothing is spoken?" and I pondered whether there
might be use cases for such a thing: Would it convey that we *expect*
something to be said on the part of someone but nothing was uttered? Is it
a way to make a silence explicit? That's interesting but unexpected, and
certainly not what my markup of the Wordsworth poem would convey.

 My next thought was, let's check the Guidelines: how have we defined
<said>?

The element spec for <said> and the Guidelines passages on quotation (in
3.3.3 http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COHQQ )
give no expressed indication that a self-closing <said/> means "nothing at
all is being said". I hunted for examples and explanation in the Guidelines
to that effect to no avail. The only  examples we show give text contents
to the element, and we've simply not represented the what possible meaning
a self-closing version of the element might have.

Because <said> is what I want to flag, and because it doesn't fit the
hierarchy, I want the flexibility, the extensibility even, to render it
self-closing specifically in the instance of overlap where we have to make
choices that do with distending or contorting or stretching around our
hierarchy. I want to be able to signal this as a range existing in another
dimension than the one I'm marking within regular <lg> and <l> elements. If
and when I need to convey that the hierarchical structure of poetry is
being overlapped (as I very frequently do), I want a mechanism for doing it
that doesn't sacrifice the semantic clarity of the element I would normally
be treating as holding text content. Because this is overlap it *is* a
special case, and we need special tools to handle it expressively and
meaningfully, in a way that minimizes what I'd call a "semantic
compromise"--where we have to default to a less meaningful element, or make
the element we want to use force fit itself into multiple particulated
units with lots of ids. We're making compromises no matter what we do here,
but the overlap is often quite meaningful and requires a thoughtful
solution.

Best,
Elisa

On Thu, Oct 5, 2017 at 2:55 PM, David Farmer [hidden email] wrote:

For my suggestion, it matters not at all what tag you use,
or if it even is a tag.  It just needs to be strings of characters
which unambiguously mark the start or the end of that you want to
convert into proper markup.  I often use wEIrdSTrinGS.  It just
has to be something that you know is not a real part of the document.

Nobody but the author (or someone who examines their script)
will ever know what string of characters is used, because
those other people will only see the proper XML output by
the script.

If the author finds it less confusing to use <said id="..."/>, then
they might as well do that.  Nobody else will ever know.
The fact that it is illegal as TEI is what makes it usable for
my script.

I'm just clarifying my suggestion for how you can use a script
to save some human effort.  The end result is perfect markup,
with no tag abuse, but the path may go through places that you
don't want other people to see.

David



On Thu, 5 Oct 2017, Lou Burnard wrote:


No problem inventing a special empty element to mark points in your text
that need to be chained together to
form a speech act. But please don't call it "<said>". That's tag abuse.
<said/> means "nothing being said
here".



On 05/10/17 17:13, Elisa Beshero-Bondar wrote:

Indeed yes. The issue I have with:
<l><!--Single speech act starts here --><said
id="#something">.....</said></l>
<l><said id="#somethingNew" next="#somethingNewerStill">....</said></l>
<l><said prev="#somethingNewerStill">....</said><!--Single speech act
ends
here --> </l>

is the sheer proliferation of id values that I have to generate for the
speech act, when there is only one start point and one end point. I don't
like the idea of having to generate ids for the saids I *have* to place in
the middle just because I need to designate them to establish that an
overlapping hierarchy is in place.

I like my sign-post "anchor" modeling because I only need to use these when
there's a change in the speech event itself, a start, an interruption, a
resumption, a close. I don't need to add elements or ids that are
semantically not meaningful to the speech act.

I also kind of don't want to change the hierarchical level of my text node
in a case of overlapping hierarchy if I don't have to.

Elisa


On Thu, Oct 5, 2017 at 12:04 PM, Martin Holmes [hidden email] wrote:

> But we can't get around the problem of adding LOTS of separate
> <said>....</said> elements inside each of the lines when there is
> really one long connected <said>...</said> that weaves around the
> lines.

I think that's the very purpose of @next and @prev, isn't it? They specify
that this is a fragmented element that in an alternative hierarchy would be
continuous.
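
To make that reading of @next and @prev concrete, here is a small Python sketch that follows the @next chain and rejoins the fragments into one speech. It is only an illustration under stated assumptions: the sample fragment, its ids (`s5a`, `s5b`) and the function name `reconstitute` are invented for this example, the elements are left outside the TEI namespace to keep the code short, and the chains are assumed not to loop.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: one speech fragmented across two verse lines.
SAMPLE = """<lg>
  <l>And said, <said who="#Edward" xml:id="s5a" next="#s5b">At Kilve I'd rather be</said></l>
  <l><said xml:id="s5b" prev="#s5a">Than here at Liswyn farm.</said></l>
</lg>"""

def reconstitute(root):
    """Join the text of <said> fragments that are chained by @next."""
    by_id = {el.get(XML_ID): el for el in root.iter("said")}
    speeches = []
    for el in root.iter("said"):
        if el.get("prev"):          # a continuation, not the head of a chain
            continue
        parts, cur = [], el
        while cur is not None:
            parts.append("".join(cur.itertext()).strip())
            nxt = cur.get("next")   # pointer of the form "#id"
            cur = by_id.get(nxt[1:]) if nxt else None
        speeches.append((el.get("who"), " ".join(parts)))
    return speeches

if __name__ == "__main__":
    for who, text in reconstitute(ET.fromstring(SAMPLE)):
        print(who, "::", text)
```

Only the head of each chain needs to carry @who; the continuation fragments inherit it by being linked.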

Cheers,
Martin


On 2017-10-05 08:47 AM, Elisa Beshero-Bondar wrote:

Well, I understand from this that my preferred model is complicated
and nonconformant to the TEI because of the application of
`<said/>`--and that what I'm really modelling is **anchor**-style
elements pointing with @next and @prev. Okay, I'll concede that.

But we can't get around the problem of adding LOTS of separate
<said>....</said> elements inside each of the lines when there is
really one long connected <said>...</said> that weaves around the
lines. I suppose what I've crafted is something alternative to the
TEI's way of handling `<said>` in cases like this: I want to be able
to model one point of absolute beginning and one point of absolute
ending, and I suppose I should be using `<anchor>` for that, but I
want the attributes that go with `<said>` without having to change
the element terribly much.

I think of this visually as a model in which I see the poem in one
dimension within its line boundaries, but I also want to see another
 dimension inconsistent with XML hierarchy: I want to see *one* start
 point and *one* end point, perhaps with interrupt and resume
internal boundaries--but no more than that. Because I want those to
be signaled as discrete events, I want **signposts** of them,
particles in the stream of text that signal an act of speaking has
begun and stopped (or has been interrupted and resumed).

I recognize that this is not *expected* behavior and that the way we
 process TEI wasn't built to handle this--apologies for that, and
maybe it's not optimal for the present question. I don't know that it
is not "interchangeable", though, if it can be documented and
understood and even *mapped* with XSLT to a more traditional
representation when needed.

It is optimal for me in writing my own processing scripts and addressing
my research questions--and for that reason I'm a little concerned that this
approach to the simple problem of overlapping hierarchy should be deemed a
"violation" of the content model of the
TEI itself. Is it a violation of the use of the `<said>` element, of
the use of *anchor-style* elements (I agree I shouldn't be using
milestone here, since anchor elements are built for range
boundaries), or something else entirely?

To be clear, the reason I press this is because it affects multiple
projects that I really thought I was coding with TEI, in which being
 able to signal and track overlapping hierarchies is vital to the
research questions driving the projects.

Elisa


On Thu, Oct 5, 2017 at 10:57 AM, Syd Bauman <[hidden email]> wrote:

This encoding is not conformant TEI (which is not really a problem), and
for those who are used to TEI will be very confusing (which does strike me
as problematic). Generic TEI software (e.g. TAPAS) will have no idea how to
process this.

In 2007 the TEI (very unwisely, IMHO) specifically eschewed the method of
encoding here (which is affectionately called HORSE) in favor of using the
<anchor> element. (Except in certain special
cases where there is a pre-existing *Span element (e.g., <addSpan>)
which would replace the first <anchor>.) See
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html#NHBM


This encoding uses empty <said> elements to mark the boundaries of what
is direct speech, and uses @next (and @prev) to point from the beginning
boundary to the end (and back). That is, it uses
attributes intended for reconstitution of fragmented elements for
indicating start- and end- boundaries of elements. I see these as
major violations of the TEI abstract model, and as constructs that
would severely harm interchangeability.

In TEI the <said> element is intended to *contain* the passage of direct
speech. Using it as a segment boundary delimiter instead
would confuse the BLEEP out of any software written to follow the
Guidelines, *especially* when there is no indication (other than it being
empty) that it is not being used in the standard way. And
using @next to perform the function of @spanTo (i.e., to say
"consider the span of this element to be from here to there" rather
than "consider the span of this element to be both the content of this
element and the content of the one(s) there") I think will likewise
wreak havoc.

But I'd like to reiterate that I'm not suggesting this is a bad
representation of the underlying structure -- given the limits of XML, I
think it is a good representation. But it is a bad use of TEI for that
representation.

It is also worth correcting the nomenclature. This use of <said> is *not*
as milestone elements, it is as empty elements. (Or, as empty
segment-boundary delimiters if you want to be precise.) A milestone,
although also empty, is something else -- the stuff being divided up by
milestones tessellates an ancestor. (E.g., every character (black mark on
the page, not speaker) in a book is on one and only one page -- thus pages
tessellate the book, and <pb> is a milestone that
marks page beginnings. It is not the case that every character in a
book is part of one and only one passage of direct speech.)

Lastly the values of @who and @ana are pointers, so I'm guessing
that you want "#I-speaker" and "#interrupt", etc.

P.S. One of the reasons that the mechanism TEI suggests for empty
boundary delimiters (using <anchor>) is so lame is because <anchor>
does not have the attributes needed to properly encode the information
available on the standard element (in this case <said>). E.g., there's no
@who on <anchor>. Another reason is that the TEI gives no rule for how to
indicate the semantics of that which is being delimited -- the Guidelines
just say to use @subtype for this purpose, but don't say how. The TEI
mechanism for empty boundary delimiters being so impoverished is why I
prefer reconstitution of fragmented elements, i.e. using @next (and maybe
@prev) on a set of normal elements to indicate "these are one".

Ahh! Once more -- I missed a couple of tags in that last post, so
I’ve corrected them here. What I like about this is it permits me
to locate, count, and process in a connected way the words in my
text nodes that fall between the <said/> milestones. This is
easily XPath-able and smooth sailing without requiring said
elements inside every line. I can use these to count the number of
speech acts, as well.

<lg>
<l xml:id="n01"><said who="I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="interrupt"/></l>
<l xml:id="n02">I said, and took him by the arm,</l>
<l xml:id="n03"><said xml:id="s3" next="#s4" ana="resume"/>On Kilve's smooth shore, by the green sea,</l>
<l xml:id="n04">Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>
<lg>
<l xml:id="n05">In careless mood he looked at me,</l>
<l xml:id="n06">While still I held him by the arm,</l>
<l xml:id="n07">And said, <said who="Edward" xml:id="s5" next="#s6"/>At Kilve I'd rather be</l>
<l xml:id="n08">Than here at Liswyn farm.<said xml:id="s6" prev="#s5"/></l>
</lg>
<lg>
<l xml:id="n09"><said who="I-speaker" xml:id="s7" next="#s8"/>Now, little Edward, say why so:</l>
<l xml:id="n10">My little Edward, tell me why.<said xml:id="s8" prev="#s7"/> --</l>
<l xml:id="n11"><said who="Edward" xml:id="s9" next="#s10"/>I cannot tell, I do not know.<said xml:id="s10" prev="#s9"/> --</l>
<l xml:id="n12"><said who="I-speaker" xml:id="s11" next="#s12"/>Why, this is strange,<said xml:id="s12" prev="#s11" ana="interrupt"/> said I;</l>
</lg>
<lg>
<l xml:id="n13"><said who="I-speaker" xml:id="s13" next="#s14" ana="resume"/>For, here are woods, hills smooth and warm:</l>
<l xml:id="n14">There surely must some reason be</l>
<l xml:id="n15">Why you would change sweet Liswyn farm</l>
<l xml:id="n16">For Kilve by the green sea.<said xml:id="s14" prev="#s13"/></l>
</lg>
<lg>
<l xml:id="n17">At this, my boy hung down his head,</l>
<l xml:id="n18">He blushed with shame, nor made reply;</l>
<l xml:id="n19">And three times to the child I said,</l>
<l xml:id="n20"><said who="I-speaker" xml:id="s15" next="#s16"/>Why, Edward, tell me why?<said xml:id="s16" prev="#s15"/></l>
</lg>
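
The locating and counting described above can be sketched in Python roughly as follows. This is an illustration, not the processing actually used in the project: the function names `walk` and `spoken_spans` are invented here, the sample is the first stanza only, the elements are left outside the TEI namespace, the @who and @ana values are written as `#` pointers, and a marker carrying @next is assumed to open a span while one carrying @prev closes it.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<lg>
  <l><said who="#I-speaker" xml:id="s1" next="#s2"/>Now tell me, had you rather be,<said xml:id="s2" prev="#s1" ana="#interrupt"/></l>
  <l>I said, and took him by the arm,</l>
  <l><said xml:id="s3" next="#s4" ana="#resume"/>On Kilve's smooth shore, by the green sea,</l>
  <l>Or here at Liswyn farm?<said xml:id="s4" prev="#s3"/></l>
</lg>"""

def walk(el):
    """Yield elements and text chunks in document order."""
    yield el
    if el.text:
        yield el.text
    for child in el:
        yield from walk(child)
        if child.tail:
            yield child.tail

def spoken_spans(root):
    """Collect the text lying between each opening and closing marker."""
    spans, current = [], None
    for item in walk(root):
        if isinstance(item, str):
            if current is not None:       # we are inside a spoken span
                current.append(item)
        elif item.tag == "said":
            if item.get("next"):          # opening marker
                current = []
            elif item.get("prev"):        # closing marker
                spans.append(" ".join("".join(current).split()))
                current = None
    return spans

if __name__ == "__main__":
    spans = spoken_spans(ET.fromstring(SAMPLE))
    for s in spans:
        print(len(s.split()), "words:", s)
```

Spans that belong to one interrupted speech act could then be merged by checking the @ana value on the opening marker; that step is left out to keep the sketch short.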




-- 
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail: [hidden email]
Development site: http://newtfire.org








Re: quotation across verse lines

Elisa Beshero-Bondar-2
Hi Lou— Setting my weirdly-used element in a properly “Elisist” namespace as in <ebb:said/> makes good sense and works as a solution—something like David Farmer’s script, I suppose, but more of an incorporation of a different-kind-of-thing into that which is expected in the TEI. But I must (in all good humor) object—really, you *did* write

> No problem inventing a special empty element to mark points in your text
> that need to be chained together to form a speech act. But please don't
> call it "<said>". That's tag abuse. <said/> means "nothing being said here".

(And of course, nobody wants to be a tag abuser! :-)

I wonder, though, about a whole range of elements that predictably "span around" verse lines, and whether we might want to consider modeling a behavior for them when TEI community members are faced with problems of overlapping hierarchy.

Cheers,
Elisa

-- 
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org






On Oct 6, 2017, at 10:07 AM, Lou Burnard <[hidden email]> wrote:

My point is simply that if you use an existing TEI element then it is good practice to respect the usual interpretation and usage of that element. I think most people -- or, perhaps more significantly, most code that people would write -- expect to find some content inside a <said>. If there is none, they would either conclude the file is broken or that nothing is being said. Your desire to use the element in a different way -- as an anchor point -- is a perfectly reasonable thing to want, but you could make life easier by not using exactly the same element name, in the same namespace. That is what might be called "considerate conformance"! Your attribution to me and Syd of emotionally charged terms such as "violence" or "abuse" seems a little unjust.

By the way, I have now remembered that Council had a very similar argument some time ago around the elements <del> and <delSpan> (or <add> and <addSpan>) : *Span elements function just like your anchors in that they are empty, and indicate their corresponding end-point by means of a @spanTo attribute, provided by the att.spanning class. It was suggested that maybe we didn't need (say) delSpan at all, since we could achieve the same effect by adding del to the att.spanning class, and permitting it to be empty, perhaps with a proviso that it shouldn't have both content and a value for @spanTo. This proposal was however defeated, largely I think because of precisely the concerns which motivate me and Syd in this particular case.
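
For comparison, the @spanTo mechanism can be sketched in Python as below: an empty <delSpan> opens a span, and the element whose xml:id matches its @spanTo pointer (here an <anchor>) closes it. The sample text, the id `end1` and the function names `walk` and `spans` are invented for illustration; the sketch ignores the TEI namespace and assumes spans do not overlap one another.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: a deletion running from the middle of one line
# to the middle of the next.
SAMPLE = """<lg>
  <l>one <delSpan spanTo="#end1"/>two</l>
  <l>three<anchor xml:id="end1"/> four</l>
</lg>"""

def walk(el):
    """Yield elements and text chunks in document order."""
    yield el
    if el.text:
        yield el.text
    for child in el:
        yield from walk(child)
        if child.tail:
            yield child.tail

def spans(root, start_tag="delSpan"):
    """Return (end id, text) for every span opened by start_tag/@spanTo."""
    found, target, buf = [], None, []
    for item in walk(root):
        if isinstance(item, str):
            if target is not None:
                buf.append(item)
        elif item.tag == start_tag and item.get("spanTo"):
            target, buf = item.get("spanTo").lstrip("#"), []
        elif target is not None and item.get(XML_ID) == target:
            found.append((target, " ".join("".join(buf).split())))
            target = None
    return found

if __name__ == "__main__":
    print(spans(ET.fromstring(SAMPLE)))
```

The difference from @next is visible in the code: @spanTo says "everything from here to there", so the text is gathered from the document between two points rather than from the contents of linked elements.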


On 06/10/17 01:33, Elisa Beshero-Bondar wrote:

Overlap is always an interesting challenge because it shows us limits and
possibilities of what we can do with XML and TEI. Lou and Syd, I think you
know that I'm not willfully out to abuse the TEI, but I'm thinking about
what we do with markup and what's in an element name and its memberships in
models and classes. The element <said> is more meaningful than <anchor>
because of the definition we gave <said>: it "indicates passages thought or
spoken aloud, whether explicitly indicated in the source or not, whether
directly or indirectly reported, whether by real people or fictional
characters".

The problem with overlap is that it demands we  "force-fit" our markup in
some way: and words like "force" and "abuse" and "violence" come into play
because we can't fully model things in TEI as we see them, and we introduce
complications of various kinds. Perhaps we try to choose what seems to us
the least of a series of evils--but I wish we didn't always think of what
we're doing as at best trickery and at worst violence. TEI is flexible and
extensible and our documentation gives it range of expression within the
guidelines we've set. Because of what I've learned of <said> from the
Guidelines, I'd far rather use <said> than <anchor> even in a case of
overlap because <said> is meaningful in ways that <anchor> isn't without
giving it a lot of new documentation. I can do it, but I didn't exactly see
a strong indication from either the Guidelines or my parser that converting
<said/> into an anchor-like expression was doing damage to its definition.
I'm thinking a lot about that in light of your comments and what I've read.

Lou and Syd, you indicated that my <said/> solution is a violation or an
abuse of the TEI, because <said> is expected to have a text node that
indicates what is spoken, and if it's empty, it would necessarily mean that
nothing is said. My first thought was, "why would I reach for a <said>
element to indicate that nothing is spoken?" and I pondered whether there
might be use cases for such a thing: Would it convey that we *expect*
something to be said on the part of someone but nothing was uttered? Is it
a way to make a silence explicit? That's interesting but unexpected, and
certainly not what my markup of the Wordsworth poem would convey.

 My next thought was, let's check the Guidelines: how have we defined
<said>?

The element spec for <said> and the Guidelines passages on quotation (in
3.3.3 http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COHQQ )
give no expressed indication that a self-closing <said/> means "nothing at
all is being said". I hunted for examples and explanation in the Guidelines
to that effect to no avail. The only examples we show give text contents
to the element, and we've simply not represented what possible meaning
a self-closing version of the element might have.

Because <said> is what I want to flag, and because it doesn't fit the
hierarchy, I want the flexibility, the extensibility even, to render it
self-closing specifically in the instance of overlap where we have to make
choices that have to do with distending or contorting or stretching around our
hierarchy. I want to be able to signal this as a range existing in another
dimension than the one I'm marking within regular <lg> and <l> elements. If
and when I need to convey that the hierarchical structure of poetry is
being overlapped (as I very frequently do), I want a mechanism for doing it
that doesn't sacrifice the semantic clarity of the element I would normally
be treating as holding text content. Because this is overlap it *is* a
special case, and we need special tools to handle it expressively and
meaningfully, in a way that minimizes what I'd call a "semantic
compromise"--where we have to default to a less meaningful element, or make
the element we want to use force fit itself into multiple particulated
units with lots of ids. We're making compromises no matter what we do here,
but the overlap is often quite meaningful and requires a thoughtful
solution.

Best,
Elisa


