seg and interp

seg and interp

Mike Engle
Hi all

I'm involved in a project in which we're marking up a large body of literature and we want to mark various passages in the texts as significant, in essence labeling them as "this" or "that".  We took a look through the TEI Guidelines and decided to use <seg> and <interp> to mark the various passages, with different attributes to specify what type of passage each one is.

For example, with <interp>:

In the <interp type="placeMain">Country of the Bhargas, on Mount Śuśumāra in a fearsome forest of wild animals</interp> with a <interp type="audienceGen">great saṅgha of about 500 monks, eminent śrāvaka-elders who possessed clairvoyance</interp>.


And with <seg>:

<seg function="modul" type="pastLifeWho">if you wonder whether the brahmin boy
                        Bhadraśuddha was then at that time someone else, or you are of two minds
                        about it, or doubtful, do not see him so. Why? Because the bodhisattva
                        mahāsattva Maitreya himself was then at that time the brahmin boy
                        Bhadraśuddha</seg>


Two questions:

1)  Since this is going to be a very long-term and labor-intensive project, I wanted to check whether the community in general feels this is a reasonable way to mark these passages, and whether there are any suggestions for other ways to do this which might work better.  Can anyone suggest any other elements that might be useful for this kind of thing?

2)  We have a problem of going across elements (breaking the nesting so to speak).  For instance, a passage might start in the middle of one paragraph and finish halfway through another paragraph, and when we mark it the tag begins in one paragraph and closes in the next.  The schema doesn't like this one bit, and I'm wondering what is the best way to handle this.  For example:


<p><seg function="modul" type="qualitiesBuddha">The Tathāgata was handsome and charismatic,
                        controlled in his faculties and in his mind. He had attained excellence in
                        control and calm abiding, and superiority in control and calm abiding. He
                        guarded his faculties, elephant-like in control of his passions, and
                        was radiant, unsullied, and clear like a lake.</p> 

<p>His body was adorned with the
                        thirty-two marks of a great being, and with the eighty minor marks, like the
                        blossoming flower of a royal sal tree, and towering like Mount Meru, the
                        king of mountains. His face was as calm as the sphere of the moon, and       
                        radiantly clear and brilliant like the sphere of
                        the sun. His body was proportioned like a nyagrodha tree, blazing with light
                        and great splendor.</seg></p>


Any help is greatly appreciated.  Forgive the formatting.

Mike





Re: seg and interp

Elisa Beshero-Bondar-2
Hi, Mike— Your project looks fascinating! I’m going to try to address your questions in reverse order:

I believe the problem you’re seeing with the <seg> element is a problem with XML well-formedness: You’ve nested your opening and closing tags for seg inside two different paragraphs to produce what we call “tangled tags”—like so: <p><seg></p><p></seg></p>.  To the XML parser, this disrupts the XML hierarchy: an element set inside a <p> is expected to open and close inside the <p> element. 

Very well, you might say, how about if I change the hierarchy, and set <seg> outside of those <p> elements: <seg><p></p><p></p></seg>? Well, the TEI schema will fire an error here because seg isn’t allowed to contain <p> children. This is a pretty common issue in our community, and there are a variety of ways to deal with it: it’s a problem of how to write good XML markup that accommodates overlapping hierarchies. It’s going to take some planning. I might be tempted in this case, if you’re really liking <seg>, to work with it like so:

<p>…<seg xml:id="a1" next="#a2">…</seg></p>
<p><seg xml:id="a2" prev="#a1">…</seg>…</p>

Here I’m using an @xml:id to set unique identifiers on each seg, and I’m using @next and @prev to point to the members of a series that span multiple paragraphs in the document. That’s one way of approaching the problem, but there will certainly be others.

Now, as for <interp>, your use of this has a certain logic but isn’t consistent with the TEI Guidelines’ explanation and examples, where the element isn’t being used as markup for base text. Instead, <interp> is basically part of a little family of elements (with <spanGrp> and <span> and more) that are for handling what we call “stand-off annotation”: analytical notes with a set vocabulary that you’re appending and attaching, usually to a base text. This is a little difficult to explain, so first take a look at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/AI.html#AISP
Do you see how that’s being used? 

I’d say we don’t want <interp> here, unless you want to come up with a use of <spanGrp> as a means of handling your annotations. That’s worth considering, too!
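
For readers wondering what the <spanGrp> route could look like, here is a minimal sketch. It assumes two empty <anchor> elements mark where the passage starts and ends; the xml:id values, the @type on <spanGrp>, and the wording inside <span> are invented for illustration, and the passage text is abbreviated with [...]. Because the annotation sits outside the paragraphs, nothing has to nest across the <p> boundary.

```xml
<div>
  <!-- anchors are empty, so they never tangle with the <p> hierarchy -->
  <p><anchor xml:id="qb-start"/>The Tathāgata was handsome and charismatic, [...]
     and was radiant, unsullied, and clear like a lake.</p>
  <p>His body was adorned with the thirty-two marks of a great being, [...]
     blazing with light and great splendor.<anchor xml:id="qb-end"/></p>

  <!-- stand-off annotation: the span points at the stretch of text it describes -->
  <spanGrp type="passageType">
    <span from="#qb-start" to="#qb-end">qualities of the Buddha</span>
  </spanGrp>
</div>
```

Whether this is preferable to chained <seg> elements probably depends on what the project's tools find easier to process.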

But I think you might continue simply using the <seg> element in place of how you’re using <interp>. Use your @type to set up a series of set types for seg, when they point out things you care about. Presumably seg elements contain long spans of text that contain information of various kinds that you’re wanting to highlight. What are those various kinds of information? You could come up with @type and @subtype categories to apply to that element.
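
A rough sketch of that suggestion, applied to the first example from the original post. The split of "placeMain" into type "place" plus subtype "main", and of "audienceGen" into "audience" plus "general", is an assumption made for illustration; the project's real categories may divide differently.

```xml
<p>In the <seg type="place" subtype="main">Country of the Bhargas, on Mount Śuśumāra
   in a fearsome forest of wild animals</seg> with a
   <seg type="audience" subtype="general">great saṅgha of about 500 monks, eminent
   śrāvaka-elders who possessed clairvoyance</seg>.</p>
```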

I hope this resolves the problem and gives you some ideas!

Cheers,
Elisa

-- 
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org






On Apr 28, 2017, at 3:41 AM, Mike Engle <[hidden email]> wrote:







Re: seg and interp

Piotr Banski
In reply to this post by Mike Engle
Dear Mike,

It looks like your system assumes a rich taxonomy of objects whose
textual cues you have chosen to identify in your corpus. I would suggest
that you first explore the possibilities that <taxonomy> gives you, as
it will free you of the need to parse your attributes later on, and it
will also force you (or help you) to keep your (so far merely implied)
taxonomy consistent throughout.

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-taxonomy.html

Below is a fragment that uses <seg> alone -- which you may like, because
it's simpler for the encoders.

Remarks:
1. the markup below presupposes several hierarchically organised
<taxonomy> elements in the header.
2. I am not sure what you use @function for, so I just keep it as-is
3. It is a good idea to index at least the added <seg> elements. The
indices I added to them have nearly random content.
4. You don't need @prev on the last one, but it may be useful. It's up
to you and your tools. I assumed a rule whereby if there is no @ana
attribute, it means that the <seg> is a continuation, and @prev points
to the previous one in the chain. But YMMV.
5. Sorry about the line breaks, you may want to paste this fragment into
your test TEI document and reformat.

Hope this helps,

   Piotr

<body>
   <div>
      <p>In the <seg xml:id="d-p3-s3" ana="#place.main">Country of the Bhargas, on Mount
            Śuśumāra in a fearsome forest of wild animals</seg> with a <seg xml:id="d-p3-s4"
            ana="#audience.Gen">great saṅgha of about 500 monks, eminent śrāvaka-elders who
            possessed clairvoyance</seg>.</p>
   </div>
   <div>
      <p><seg xml:id="d-p-s43" function="modul" ana="#past.life.Who">if you wonder whether the
            brahmin boy Bhadraśuddha was then at that time someone else, or you are of two
            minds about it, or doubtful, do not see him so. Why? Because the bodhisattva
            mahāsattva Maitreya himself was then at that time the brahmin boy
            Bhadraśuddha</seg></p>
      <p><seg xml:id="d-p-s44" function="modul" ana="#qualities.Buddha" next="#d-p-s45">The
            Tathāgata was handsome and charismatic, controlled in his faculties and in his
            mind. He had attained excellence in control and calm abiding, and superiority in
            control and calm abiding. He guarded his faculties, elephant-like in control of
            his passions, and was radiant, unsullied, and clear like a lake.</seg></p>

      <p><seg xml:id="d-p-s45" function="modul" prev="#d-p-s44">His body was adorned with the
            thirty-two marks of a great being, and with the eighty minor marks, like the
            blossoming flower of a royal sal tree, and towering like Mount Meru, the king of
            mountains. His face was as calm as the sphere of the moon, and radiantly clear and
            brilliant like the sphere of the sun. His body was proportioned like a nyagrodha
            tree, blazing with light and great splendor.</seg></p>
   </div>
</body>
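
The header side that remark 1 presupposes might look roughly like this. It is a sketch only: the leaf category ids match the pointers used in the fragment above, but the grouping into separate <taxonomy> elements, the parent ids, and all <catDesc> wording are guesses for illustration, not Piotr's or Mike's actual scheme.

```xml
<teiHeader>
  <!-- fileDesc omitted for brevity; a valid header needs one -->
  <encodingDesc>
    <classDecl>
      <taxonomy xml:id="tax.setting">
        <category xml:id="place">
          <catDesc>Places</catDesc>
          <category xml:id="place.main">
            <catDesc>Main setting of the discourse</catDesc>
          </category>
        </category>
        <category xml:id="audience">
          <catDesc>Audiences</catDesc>
          <category xml:id="audience.Gen">
            <catDesc>General audience</catDesc>
          </category>
        </category>
      </taxonomy>
      <taxonomy xml:id="tax.content">
        <category xml:id="past.life.Who">
          <catDesc>Identification of a figure from a past life</catDesc>
        </category>
        <category xml:id="qualities.Buddha">
          <catDesc>Qualities of the Buddha</catDesc>
        </category>
      </taxonomy>
    </classDecl>
  </encodingDesc>
</teiHeader>
```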



On 28/04/17 09:41, Mike Engle wrote:


Re: seg and interp

Roberto Rosselli Del Turco-2
Dear Piotr,
I've been tempted to use <taxonomy> for a very similar purpose, but
wouldn't that be a sort of "tag abuse"? From what I understand from
the relevant section of the Guidelines, this element should be used
for *bibliographical* taxonomies only; it is not a general-purpose
classification tool.

I would then suggest using @ana with <seg>, as you do too, pointing to
a number of <interp>s in an <interpGrp>.
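
A minimal sketch of that arrangement, assuming the <interpGrp> sits in the same <div> as the text it describes. The xml:id values follow the pointer style already used in this thread; the @type on <interpGrp> and the wording inside each <interp> are invented for illustration.

```xml
<div>
  <!-- the controlled vocabulary: one <interp> per kind of passage -->
  <interpGrp type="passageType">
    <interp xml:id="place.main">main setting of the discourse</interp>
    <interp xml:id="audience.Gen">general audience</interp>
  </interpGrp>

  <!-- the base text: each <seg> points at its interpretation via @ana -->
  <p>In the <seg ana="#place.main">Country of the Bhargas, on Mount Śuśumāra in a
     fearsome forest of wild animals</seg> with a <seg ana="#audience.Gen">great
     saṅgha of about 500 monks, eminent śrāvaka-elders who possessed
     clairvoyance</seg>.</p>
</div>
```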

All best,

R

On 28.04.2017 21:24, Piotr Bański wrote:


Re: seg and interp

Mike Engle
Thanks everyone for the feedback!  Very helpful indeed.  I really appreciate it.  I looked a bit at what was suggested and I have a few more remarks/further questions:

Thanks for clarifying how to get around "breaking" the xml rules by having <seg> go across elements.  I'll use @next and @prev to get around that.

Understood about the problem with using <interp> the way we were doing it.

 It seems like Roberto is right and that using @ana with <seg> that points to <interp>s in an <interpGrp> or the like would be appropriate, and that perhaps <taxonomy> isn't exactly the right element for doing this. 

Questions: 

Is it necessary to index the <seg>s?  I understand the need to do so in order to use @next and @prev, but what are the other reasons for doing so?  What is the benefit of including an xml:id?

I'm wondering what the problem is with keeping it simpler by using something like <seg> with @type instead of getting into @ana with <interp>.  For instance, as was proposed in the first email:

<seg function="modul" type="pastLifeWho">if you wonder whether the brahmin boy
                        Bhadraśuddha was then at that time someone else, or you are of two minds
                        about it, or doubtful, do not see him so. Why? Because the bodhisattva
                        mahāsattva Maitreya himself was then at that time the brahmin boy
                        Bhadraśuddha</seg>

 In terms of keeping the markup consistent without using @ana and an <interpGrp>, I was considering limiting the @type values with the schema.  Perhaps it's better to create the <interpGrp> and use @ana to point to it and keep all those interpretations in the document, but since we will be doing many texts I'd prefer not to keep that list in every text in case we need to make changes to the list (which I suppose would then involve going through and changing it every time).  Perhaps all the <interp>s could be kept in a separate document that all the <seg>s would point to instead of listing them all in every document.  
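
Limiting the @type values through the schema is typically done in a TEI ODD customisation. The following is a sketch under assumptions, not a tested customisation: the schemaSpec name and the selection of modules are hypothetical, and the four values are simply the ones that appear in this thread.

```xml
<schemaSpec ident="passageMarkup" start="TEI" xmlns="http://www.tei-c.org/ns/1.0">
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="linking"/>

  <!-- <seg> is defined in the linking module; restrict its @type to a closed list -->
  <elementSpec ident="seg" module="linking" mode="change">
    <attList>
      <attDef ident="type" mode="change">
        <valList type="closed" mode="replace">
          <valItem ident="placeMain"/>
          <valItem ident="audienceGen"/>
          <valItem ident="pastLifeWho"/>
          <valItem ident="qualitiesBuddha"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```

With a closed list, a schema generated from this ODD should reject any @type value on <seg> that is not listed, so changes to the vocabulary happen in one place.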

I guess my main question is: what are the main benefits of doing something more complex like using @ana with <interp> rather than doing it more simply with just @type?
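
One concrete difference that bears on this question: @type holds a single value, whereas @ana holds one or more pointers, and those pointers may lead into another file. The sketch below assumes a shared file named interpretations.xml (hypothetical), uses invented wording inside each <interp>, and attaches two interpretations to one passage purely to illustrate the syntax.

```xml
<!-- File 1: interpretations.xml (hypothetical shared file, maintained once) -->
<interpGrp type="passageType">
  <interp xml:id="pastLifeWho">identification of a figure from a past life</interp>
  <interp xml:id="qualitiesBuddha">qualities of the Buddha</interp>
</interpGrp>

<!-- File 2: any text in the collection; [...] stands for the passage text -->
<p><seg function="modul"
        ana="interpretations.xml#qualitiesBuddha interpretations.xml#pastLifeWho">[...]</seg></p>
```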

Mike

On Sat, Apr 29, 2017 at 9:55 AM, Roberto Rosselli Del Turco <[hidden email]> wrote:

Re: seg and interp

Piotr Banski
In reply to this post by Roberto Rosselli Del Turco-2
Dear Roberto,

The intended nature of <taxonomy> aside for a moment (because the result
can also be achieved with feature structures, if text analysis wouldn't
qualify for being encoded in a <taxonomy>... but then, what a sad device
it would be :-) ), I think the primary solution depends on the assumed
methodology.

If Mike's system presupposes a taxonomy that is already complete or that
could at least potentially be made complete, then it seems that he
should go for a <taxonomy> or an equivalent (it could be an external
ontology, depending on his area and the stage of research into it). If,
however, the approach is more free-style, essentially creating an
open-ended taxonomy (or rather multiple taxonomies, as we've seen), then
not having a central repository is an advantage, because then the
current state can be harvested from, say, <interpGrp>s as you propose
(or essentially by cyclically parsing the @ana attributes as the
annotators move on).

(And whatever Mike's final decision is going to be, using capitalization
inside attribute values can probably bite harder than using a separator
such as a dot or a hyphen, unless the values, contrary to appearances,
are expected to be atomic.)

Best regards,

   Piotr
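
A minimal sketch of the <interpGrp> route Roberto proposes, reusing the passage from Mike's first example. The grouping into "place" and "audience", the lower-case identifiers, and the gloss texts are assumptions made for illustration; they are not taken from the project.

```xml
<div>
  <!-- Hypothetical declarations: each <interp> names one label that @ana can point to -->
  <interpGrp type="place">
    <interp xml:id="place.main">main setting (gloss assumed)</interp>
  </interpGrp>
  <interpGrp type="audience">
    <interp xml:id="audience.gen">general audience (gloss assumed)</interp>
  </interpGrp>

  <!-- The passages themselves carry only a pointer -->
  <p>In the <seg ana="#place.main">Country of the Bhargas, on Mount Śuśumāra in a
    fearsome forest of wild animals</seg> with a <seg ana="#audience.gen">great saṅgha
    of about 500 monks, eminent śrāvaka-elders who possessed clairvoyance</seg>.</p>
</div>
```

With this layout an open-ended label set grows simply by appending another <interp>, which seems to fit the free-style scenario described above.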

On 04/29/17 09:55, Roberto Rosselli Del Turco wrote:

> Dear Piotr,
> I've been tempted to use <taxonomy> for a very similar purpose, but
> wouldn't that be a sort of "tag abuse"? Since from what I understand
> reading the relative section in the Guidelines this element should be
> used for *bibliographical* taxonomies only, it is not a general
> purpose classification tool.
>
> I would then suggest to use @ana with <seg>, as you do too, pointing
> to a number of <interp>s in an <interpGrp>.
>
> All best,
>
> R
>
> On 28.04.2017 21:24, Piotr Bański wrote:
>> Dear Mike,
>>
>> It looks like your system assumes a rich taxonomy of objects the
>> textual cues to which you choose to identify in your corpus. I would
>> suggest that you first explore the possibilities that <taxonomy> gives
>> you, as it will free you of the need to parse your attributes later
>> on, but it will also force you (or help you) to keep your (so far
>> merely implied) taxonomy consistent throughout.
>>
>> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-taxonomy.html
>>
>> Below is a fragment that uses <seg> alone -- which you may like,
>> because it's simpler for the encoders.
>>
>> Remarks:
>> 1. the markup below presupposes several hierarchically organised
>> <taxonomy> elements in the header.
>> 2. I am not sure what you use @function for, so I just keep it as-is
>> 3. It is a good idea to index at least the added <seg> elements. The
>> indices I added to them have nearly random content.
>> 4. You don't need @prev on the last one, but it may be useful. It's up
>> to you and your tools. I assumed a rule whereby if there is no @ana
>> attribute, it means that the <seg> is a continuation, and @prev points
>> to the previous one in the chain. But YMMV.
>> 5. Sorry about the line breaks, you may want to paste this fragment
>> into your test TEI document and reformat.
>>
>> Hope this helps,
>>
>>   Piotr
>>
>> <body>
>>    <div>
>>       <p>In the <seg xml:id="d-p3-s3" ana="#place.main">Country of the Bhargas, on Mount
>>          Śuśumāra in a fearsome forest of wild animals</seg> with a
>>          <seg xml:id="d-p3-s4" ana="#audience.Gen">great saṅgha of about 500 monks,
>>          eminent śrāvaka-elders who possessed clairvoyance</seg>.</p>
>>    </div>
>>    <div>
>>       <p><seg xml:id="d-p-s43" function="modul" ana="#past.life.Who">if you wonder whether
>>          the brahmin boy Bhadraśuddha was then at that time someone else, or you are of
>>          two minds about it, or doubtful, do not see him so. Why? Because the bodhisattva
>>          mahāsattva Maitreya himself was then at that time the brahmin boy
>>          Bhadraśuddha</seg></p>
>>       <p><seg xml:id="d-p-s44" function="modul" ana="#qualities.Buddha" next="#d-p-s45">The
>>          Tathāgata was handsome and charismatic, controlled in his faculties and in his
>>          mind. He had attained excellence in control and calm abiding, and superiority in
>>          control and calm abiding. He guarded his faculties, elephant-like in control of
>>          his passions, and was radiant, unsullied, and clear like a lake.</seg></p>
>>
>>       <p><seg xml:id="d-p-s45" function="modul" prev="#d-p-s44">His body was adorned with
>>          the thirty-two marks of a great being, and with the eighty minor marks, like the
>>          blossoming flower of a royal sal tree, and towering like Mount Meru, the king of
>>          mountains. His face was as calm as the sphere of the moon, and radiantly clear and
>>          brilliant like the sphere of the sun. His body was proportioned like a nyagrodha
>>          tree, blazing with light and great splendor.</seg></p>
>>    </div>
>> </body>
>>
>>
>>

Re: seg and interp

Dominique Meeùs
In reply to this post by Mike Engle
On Sat, 29 Apr 2017 09:55:34 +0200, Roberto Rosselli Del Turco <[hidden email]> wrote:

>Dear Piotr,
>I've been tempted to use <taxonomy> for a very similar purpose, but
>wouldn't that be a sort of "tag abuse"? Since from what I understand
>reading the relative section in the Guidelines this element should be
>used for *bibliographical* taxonomies only, it is not a general purpose
>classification tool.

Reading 2.3.7, I see words bibliosomething used only about the role of the bibl element under taxonomy, not about the contents, the domain, of the taxonomy. Even if the examples are quite bibliographical, I see nothing that forbids the use of taxonomy for something else. Or am I mistaken?

Re: seg and interp

James Cummings-4
On 05/05/17 08:50, Dominique Meeùs wrote:
> Reading 2.3.7, I see words bibliosomething used only about the role of the bibl element under taxonomy, not about the contents, the domain, of the taxonomy. Even if the examples are quite bibliographical, I see nothing that forbids the use of taxonomy for something else. Or am I mistaken?

Given the definition as "defines a typology either implicitly, by
means of a bibliographic citation, or explicitly by a structured
taxonomy" I know I've certainly used it for more ad hoc
typologies and structured taxonomies for all sorts of things.
More of a hierarchical set of keyword categories that are then
referenced at the lowest applicable level for any particular
segment of text (at whatever granularity makes most sense).  I
find that using taxonomy in this way is a very powerful mechanism
because it gives the power of creating the hierarchy by which the
categories are nested to the encoder and thus enables them to
express their understanding of whatever form of analysis they are
undertaking to a level of granularity that makes sense to them.  
I'd always prefer to point to a deeply nested category than an
interp, for example. But I like deeply hierarchical
classifications. ;-)

-James

--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford

Re: seg and interp

Martin Holmes
In reply to this post by Dominique Meeùs
On 2017-05-05 12:50 AM, Dominique Meeùs wrote:

> Reading 2.3.7, I see words bibliosomething used only about the role of the bibl element under taxonomy, not about the contents, the domain, of the taxonomy. Even if the examples are quite bibliographical, I see nothing that forbids the use of taxonomy for something else. Or am I mistaken?

I use taxonomies for all sorts of things (document types, rhyme types,
stanza types, types of place in a gazetteer...). I've always assumed it
was a generic classification structure.

Cheers,
Martin

Re: seg and interp

James Cummings-4
On 05/05/17 13:22, Martin Holmes wrote:
> I use taxonomies for all sorts of things (document types, rhyme
> types, stanza types, types of place in a gazetteer...). I've
> always assumed it was a generic classification structure.

Even if it wasn't originally, enough of us do this -- and indeed
teach this -- that it seems pretty normal to me. ;-)

-James

--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford

Re: seg and interp

Roberto Rosselli Del Turco-2
In reply to this post by James Cummings-4
On 05.05.2017 14:07, James Cummings wrote:

> Given the definition as "defines a typology either implicitly, by
> means of a bibliographic citation, or explicitly by a structured
> taxonomy" I know I've certainly used it for more ad hoc typologies and
> structured taxonomies for all sorts of things. More of a hierarchical
> set of keyword categories that are then referenced at the lowest
> applicable level for any particular segment of text (at whatever
> granularity makes most sense).  I find that using taxonomy in this way
> is a very powerful mechanism because it gives the power of creating
> the hierarchy by which the categories are nested to the encoder and
> thus enables them to express their understanding of whatever form of
> analysis they are undertaking to a level of granularity that makes
> sense to them.  I'd always prefer to point to a deeply nested category
> than an interp, for example. But I like deeply hierarchical
> classifications. ;-)

Guys, it may well be that I'm reading too much into the text of the
Guidelines (not a native English speaker after all), but consider that
<taxonomy> belongs to <classDecl> which "is used to group together
definitions or sources for any descriptive classification schemes used
*by other parts of the header*": this last remark seems to preclude
general use of <taxonomy> wrt the actual <text>. From this, from what is
written immediately after and from the examples proposed I gathered that
it would be an unwarranted extension of its original purpose to make it
a document-wide element: if I am wrong, and I surely believe you if you
tell me so, then may I ask that the language of the Guidelines be made
more clear and that at least one example showing how to use <taxonomy>
from the main <text> be added? would @ana work with it as well?

Thank you,

R

Re: seg and interp

Lou Burnard-6
On 05/05/17 14:19, Roberto Rosselli Del Turco wrote:

> Guys, it may well be that I'm reading too much into the text of the
> Guidelines (not a native English speaker after all), but consider that
> <taxonomy> belongs to <classDecl> which "is used to group together
> definitions or sources for any descriptive classification schemes used
> *by other parts of the header*": this last remark seems to preclude
> general use of <taxonomy> wrt the actual <text>. From this, from what
> is written immediately after and from the examples proposed I gathered
> that it would be an unwarranted extension of its original purpose to
> make it a document-wide element: if I am wrong, and I surely believe
> you if you tell me so, then may I ask that the language of the
> Guidelines be made more clear and that at least one example showing
> how to use <taxonomy> from the main <text> be added? would @ana work
> with it as well?
>
> Thank you,
>
> R

I would say, yes. A taxonomy is a taxonomy, and @ana can point to
elements in it: there's no rule that says the thing pointing to an entry
in a taxonomy must be in the header, despite the wording that you cite.
I agree that the language should be clarified (adding "or elsewhere" at
the end would do it!)
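
A small sketch of what this could look like in a single file, assuming a nested taxonomy in the header and a <seg> in the body that points at its lowest applicable category. The category identifiers and descriptions are invented for illustration, and the omitted header parts are only indicated by comments.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- fileDesc omitted from this sketch -->
    <encodingDesc>
      <classDecl>
        <!-- Hypothetical taxonomy: categories nest to express the hierarchy -->
        <taxonomy xml:id="passages">
          <category xml:id="qualities">
            <catDesc>Descriptions of qualities (wording assumed)</catDesc>
            <category xml:id="qualities.buddha">
              <catDesc>Qualities of the Buddha (wording assumed)</catDesc>
            </category>
          </category>
        </taxonomy>
      </classDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <!-- @ana in the text points into the header taxonomy -->
      <p><seg ana="#qualities.buddha">The Tathāgata was handsome and charismatic,
        controlled in his faculties and in his mind.</seg></p>
    </body>
  </text>
</TEI>
```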

Re: seg and interp

James Cummings-4
On 05/05/17 14:32, Lou Burnard wrote:

> I would say, yes. A taxonomy is a taxonomy, and @ana can point
> to elements in it: there's no rule that says the thing pointing
> to an entry in a taxonomy must be in the header, despite the
> wording that you cite. I agree that the language should be
> clarified (adding "or elsewhere" at the end would do it!)

I would point to the second example on:
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-taxonomy.html
which indicates pointing to a taxonomy from an <lg>.  My
recollection was this example was added by council specifically
to indicate this kind of usage.  However, they appear to have
overlooked <classDecl> needing a slight rephrasing.

-James


--
Dr James Cummings, [hidden email]
Academic IT Services, University of Oxford

Re: seg and interp

Martin Holmes
On 2017-05-05 06:54 AM, James Cummings wrote:

> I would point to the second example on:
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-taxonomy.html
> which indicates pointing to a taxonomy from an <lg>.  My recollection
> was this example was added by council specifically to indicate this kind
> of usage.  However, they appear to have overlooked <classDecl> needing a
> slight rephrasing.

I think that's my example; I remember adding it when we updated the
specs to allow <taxonomy> to nest. I never thought to look at the
wording of <classDecl>.

Cheers,
Martin

Re: seg and interp

Elisa Beshero-Bondar
Hi everyone-- 
I was about to go and open a ticket on this, but I'm pausing for a moment because I don't think it's necessary. I think Roberto was asking whether taxonomy element can be used within the text element of the TEI, and indeed it can't. It's a child either of taxonomy or of classDecl, which means that it's part of the teiHeader. But I think what we're all trying to clarify here is that the referents to a taxonomy ARE marked within the text elements of other documents. 

I'd point out that our explanation of classDecl is pretty clear: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-classDecl.html  
<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.

I think we have this covered? 

Best,
Elisa

--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org

Re: seg and interp

Elisa Beshero-Bondar
PS: I may have mis-read Roberto's last post now that I look at it again. At any rate, the confusion seems to be where the *referents* to a taxonomy can be marked. Given that the Guidelines make clear exactly where a taxonomy element is to be placed in the teiHeader, and that the classDecl in which a taxonomy appears is described as above "defining any classificatory codes used elsewhere in the text", I think the Guidelines are pretty clear and I'm not sure where we ought to make a revision.

* Perhaps it would be helpful, though, to include examples of material that points to a classDecl taxonomy from within a document.
*  For those of us who like putting our classificatory schemes for large projects in their own distinct files, perhaps we could use an example that demonstrates use of a taxonomy in its own separate file, to which other project files refer.

Are these worth adding to the Guidelines?
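
As a rough sketch of the second idea, assuming a project keeps its taxonomy in a separate file: the file name taxonomy.xml, the identifiers, and the description text below are all hypothetical. The block shows two fragments from two different files, not one document.

```xml
<!-- Fragment 1, from taxonomy.xml (hypothetical file name): part of that file's teiHeader -->
<classDecl>
  <taxonomy xml:id="passages">
    <category xml:id="place.main">
      <catDesc>Main setting of the narrative (wording assumed)</catDesc>
    </category>
  </taxonomy>
</classDecl>

<!-- Fragment 2, from any other project file: @ana holds a relative URI plus fragment identifier -->
<p>In the <seg ana="taxonomy.xml#place.main">Country of the Bhargas, on Mount Śuśumāra
  in a fearsome forest of wild animals</seg></p>
```

Because @ana takes pointers, the only change from the single-file case is that the value names the external file before the "#".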

Elisa

--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org

Re: seg and interp

Lou Burnard-6
I think Roberto is referring to (quotes from) the following passage at the start of 2.3.7

<div type="div3" xml:id="HD55"><head>The Classification Declaration</head>
<p>The <gi>classDecl</gi> element is used to group
together definitions or sources for any descriptive classification
schemes used by other parts of the header.  Each such scheme is
represented by a <gi>taxonomy</gi> element...

As I said in my reply, which you may have missed, all that is necessary is to add "or elsewhere in the document" at the end of the first sentence.




Re: seg and interp

Elisa Beshero-Bondar
Ahh, thanks--and yes, a revision there would make the usage clearer and also consistent with our definition of classDecl. I shall open the ticket forthwith. 

Elisa

On Fri, May 5, 2017 at 3:57 PM, Lou Burnard <[hidden email]> wrote:
I think Roberto is referring to (quotes from) the following passage at the start of 2.3.7

<div type="div3" xml:id="HD55"><head>The Classification Declaration</head>
<p>The <gi>classDecl</gi> element is used to group
together definitions or sources for any descriptive classification
schemes used by other parts of the header.  Each such scheme is
represented by a <gi>taxonomy</gi> element...

As I said in my reply, which you may have missed, all that is necessary is to add "or elsewhere in the document" at the end of the first sentence.




On 05/05/17 19:03, Elisa Beshero-Bondar wrote:
PS: I may have mis-read Roberto's last post now that I look at it again. At any rate, the confusion seems to be where the *referents* to a taxonomy can be marked. Given that the Guidelines make clear exactly where a taxonomy element is to be placed in the teiHeader, and that the classDecl in which a taxonomy appears is described as above "defining any classificatory codes used elsewhere in the text", I think the Guidelines are pretty clear and I'm not sure where we ought to make a revision.

* Perhaps it would be helpful, though, to include examples of material that points to a classDecl taxonomy from within a document.
*  For those of us who like putting our classificatory schemes for large projects in their own distinct files, perhaps we could use an example that demonstrates use of a taxonomy in its own separate file, to which other project files refer.

Are these worth adding to the Guidelines?

Elisa

On Fri, May 5, 2017 at 1:56 PM, Elisa Beshero-Bondar <[hidden email]> wrote:
Hi everyone-- 
I was about to go and open a ticket on this, but I'm pausing for a moment because I don't think it's necessary. I think Roberto was asking whether taxonomy element can be used within the text element of the TEI, and indeed it can't. It's a child either of taxonomy or of classDecl, which means that it's part of the teiHeader. But I think what we're all trying to clarify here is that the referents to a taxonomy ARE marked within the text elements of other documents. 

I'd point out that our explanation of classDecl is pretty clear: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-classDecl.html  
<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.

I think we have this covered? 

Best,
Elisa

On Fri, May 5, 2017 at 10:49 AM, Martin Holmes <[hidden email]> wrote:
On 2017-05-05 06:54 AM, James Cummings wrote:

I would point to the second example on:
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-taxonomy.html
which indicates pointing to a taxonomy from an <lg>.  My recollection
was this example was added by council specifically to indicate this kind
of usage.  However, they appear to have overlooked <classDecl> needing a
slight rephrasing.

I think that's my example; I remember adding it when we updated the specs to allow <taxonomy> to nest. I never thought to look at the wording of <classDecl>.

Cheers,
Martin



--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org




Re: seg and interp

Elisa Beshero-Bondar
I've opened a ticket here: https://github.com/TEIC/TEI/issues/1640 and will quickly make the wording change now since it seems uncontroversial. I'll leave the ticket open, though, in case we'd like to work on more examples of common practice. 

Elisa

On Fri, May 5, 2017 at 4:01 PM, Elisa Beshero-Bondar <[hidden email]> wrote:
Ahh, thanks--and yes, a revision there would make the usage clearer and also consistent with our definition of classDecl. I shall open the ticket forthwith. 

Elisa

On Fri, May 5, 2017 at 3:57 PM, Lou Burnard <[hidden email]> wrote:
I think Roberto is referring to (quotes from) the following passage at the start of 2.3.7

<div type="div3" xml:id="HD55"><head>The Classification Declaration</head>
<p>The <gi>classDecl</gi> element is used to group
together definitions or sources for any descriptive classification
schemes used by other parts of the header.  Each such scheme is
represented by a <gi>taxonomy</gi> element...

As I said in my reply, which you may have missed, all that is necessary is to add "or elsewhere in the document" at the end of the first sentence.











Re: seg and interp

Piotr Banski
In reply to this post by Roberto Rosselli Del Turco-2
Dear Roberto,

You (and indirectly some of us as well) may be making a good case for
allowing <taxonomy> elsewhere in the header as well (where it wouldn't
have the semantic implications that you mention).

Best regards,

   Piotr


On 05/05/17 15:19, Roberto Rosselli Del Turco wrote:

> Il 05.05.2017 14:07 James Cummings ha scritto:
>> On 05/05/17 08:50, Dominique Meeùs wrote:
>>> On Sat, 29 Apr 2017 09:55:34 +0200, Roberto Rosselli Del Turco
>>> <[hidden email]> wrote:
>>>
>>>> Dear Piotr,
>>>> I've been tempted to use <taxonomy> for a very similar purpose, but
>>>> wouldn't that be a sort of "tag abuse"? From what I understand
>>>> reading the relevant section in the Guidelines, this element should be
>>>> used for *bibliographical* taxonomies only; it is not a general
>>>> purpose
>>>> classification tool.
>>> Reading 2.3.7, I see biblio-something words used only about the role
>>> of the bibl element under taxonomy, not about the contents, the
>>> domain, of the taxonomy. Even if the examples are quite
>>> bibliographical, I see nothing that forbids the use of taxonomy for
>>> something else. Or am I mistaken?
>>
>> Given the definition as "defines a typology either implicitly, by
>> means of a bibliographic citation, or explicitly by a structured
>> taxonomy" I know I've certainly used it for more ad hoc typologies and
>> structured taxonomies for all sorts of things. More of a hierarchical
>> set of keyword categories that are then referenced at the lowest
>> applicable level for any particular segment of text (at whatever
>> granularity makes most sense).  I find that using taxonomy in this way
>> is a very powerful mechanism because it gives the power of creating
>> the hierarchy by which the categories are nested to the encoder and
>> thus enables them to express their understanding of whatever form of
>> analysis they are undertaking to a level of granularity that makes
>> sense to them.  I'd always prefer to point to a deeply nested category
>> than an interp, for example. But I like deeply hierarchical
>> classifications. ;-)
>
> Guys, it may well be that I'm reading too much into the text of the
> Guidelines (not a native English speaker after all), but consider that
> <taxonomy> belongs to <classDecl> which "is used to group together
> definitions or sources for any descriptive classification schemes used
> *by other parts of the header*": this last remark seems to preclude
> general use of <taxonomy> wrt the actual <text>. From this, from what
> is written immediately after and from the examples proposed I gathered
> that it would be an unwarranted extension of its original purpose to
> make it a document-wide element: if I am wrong, and I surely believe
> you if you tell me so, then may I ask that the language of the
> Guidelines be made clearer and that at least one example showing
> how to use <taxonomy> from the main <text> be added? Would @ana work
> with it as well?
>
> Thank you,
>
> R
>
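
To make the question above concrete, here is a minimal sketch of pointing from the main <text> to a <taxonomy> in the same document's header via @ana. The category identifiers (setting, placeMain, audienceGen) are hypothetical, taken from the type values in the original post, and the sketch assumes a schema in which @ana is available on <seg> (it is supplied by the analysis module, as far as I know). It is an illustration only, not the example in the Guidelines.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample text with a passage taxonomy</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Passage quoted from the original question in this thread.</p></sourceDesc>
    </fileDesc>
    <encodingDesc>
      <classDecl>
        <taxonomy xml:id="passageTypes">
          <!-- Parent category groups the more specific kinds of passage -->
          <category xml:id="setting">
            <catDesc>Setting of the discourse</catDesc>
            <category xml:id="placeMain">
              <catDesc>Main place where the discourse is delivered</catDesc>
            </category>
            <category xml:id="audienceGen">
              <catDesc>General description of the audience</catDesc>
            </category>
          </category>
        </taxonomy>
      </classDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <!-- Each seg points at the lowest applicable category in the header -->
      <p>In the <seg ana="#placeMain">Country of the Bhargas, on Mount Śuśumāra in a
        fearsome forest of wild animals</seg> with a <seg ana="#audienceGen">great
        saṅgha of about 500 monks</seg>.</p>
    </body>
  </text>
</TEI>
```

Because @ana takes one or more pointers, a passage that belongs to several categories could list them separated by spaces, for example ana="#placeMain #audienceGen".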