Question on StackExchange about adding custom attribute to TEI vocabulary:

Radu Coravu
Hi,

I just spotted a question on StackExchange about adding a custom
attribute to the TEI vocabulary:

https://stackoverflow.com/questions/50161914/oxygenxml-add-an-ad-hoc-attribute-to-teicorpus-framework

I do not know how to do that, but if any of you do, perhaps you can
answer there.

Regards,
Radu

Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Elisa Beshero-Bondar
Thanks, Radu! I've been in contact with the poster of the question and we've been discussing the idea for this attribute. She's preparing a longer post for this list about her work on cuneiform and transliteration, so we should probably just wait for that post.

Best,
Elisa

--
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA  15601  USA
E-mail:[hidden email]
Development site: http://newtfire.org

Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Vanessa Juloux
In reply to this post by Radu Coravu
Dear Radu,
Dear TEI List,

Thank you for forwarding my message to the TEI list. I have just added a comment to my post on StackExchange:

> I have looked at other elements and attributes, but they don't work. I am currently working on an argumentation to explain why they are not relevant to the cuneiform script, and why @correspUnic should be preferred.

Later today, I should be able to send my argumentation about adding a new attribute needed for cuneiform script. 

All the best,
Vanessa
---

Vanessa Bigot Juloux | Ph.D. candidate
» Ecole Pratique des Hautes Etudes (EPHE), Paris Sciences et Lettres (PSL)
» Chair Membership and Outreach Sub-committee for Europe (American Schools of Oriental Research)
Mobile + WhatsApp: +33 (0) 6 98 97 02 02


Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Piotr Bański
Dear Vanessa,

> Later today, I should be able to send my argumentation about adding a new attribute needed for cuneiform script.

Thanks, but -- do take your time. You can just add that attribute to your ODD and work with it right away; no special approval process is needed for you to customize your own TEI schema. (On the other hand, we would love to learn about the results, when it's convenient for you.)

A piece of relevant documentation is located here:

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#MDMDAL

What you want to see in your ODD looks a bit like the following:

<elementSpec ident="w" module="analysis" mode="change">
  <attList>
    <attDef ident="myNewAttribute" mode="add">
      <desc>This is a test attribute to hold a normalized value for my words.</desc>
      <datatype>
        <dataRef key="teidata.text"/>
      </datatype>
    </attDef>
  </attList>
</elementSpec>

(No need to bother about the namespace, at this point -- you can play with that later. The datatype is also very permissive.)

The above would allow you to do:

<w pos="verb" myNewAttribute="your-text">etc.</w>

Please note the @pos (part-of-speech) attribute used instead of @type. I also don't know what you want to use the attribute @ana for -- if for further grammatical features, then there exists the @msd attribute ready and willing to do that job, see

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.linguistic.html

If you need @ana for something else, then please bear in mind that its datatype is URI.

Cheers,

  Piotr






Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Vanessa Juloux
Dear Piotr,

Thank you very much for your helpful answer. I’m currently looking into ODD, following your suggestion and with the kind help of Elisa.

As for @type vs @pos, I’m a little worried, since I have just written a chapter for a forthcoming volume (Brill) that I’m co-editing (https://brill.com/view/title/34932?format=HC). I used @type, following the example in the TEI Guidelines: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/examples-w.html.

Regarding @ana, it’s not for grammatical features, but to set up a hermeneutics of action based on taxonomies; see my guidelines:
- taxonomies: https://vbigot-juloux.github.io/hermeneutics-of-action/UserManual/out/webhelp/index.html#categories.html
- transliteration: https://vbigot-juloux.github.io/hermeneutics-of-action/UserManual/out/webhelp/index.html#withinTranscrip.html
In fact, this is the main topic of my chapter.

Of course, your comments/remarks are most welcome.

Once again, thank you.
Best,
Vanessa



Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Piotr Bański

Dear Vanessa,

> As for @type vs @pos, I’m a little worried

Well, your hands are clean -- processing the ticket that produced @pos and friends took many months, and processing the additional documentation took some extra time as well. It looks like the page you're pointing at will only be regenerated with the next official release. It's not that @type is illegal -- it has simply been released to perform other duties (or fancies), while a specialized attribute has taken over for those who need it. If you can smuggle a footnote into your publication, your readers will have a chance to get updated; if not, well, we live in a dynamic world, where some things fortunately move forward, even if that takes time... No one is going to blame you, I'm sure.

Your documentation looks very impressive! Notice that in some of your @ana values, you use pointers to fragments (essentially, to xml:ids):

ana="#Character #ANT #v-ANT-ktu1-3_ii_l5b-6a"

but in others, you use plain labels:

ana="subjectiveVar"

That may eventually bite, if you expect the same behaviour of both.

Also, I am not checking this against the spec right now, but I would be cautious in defining IDs such as "ʿmq" (note this little inverted comma at the beginning) -- this is most probably illegal from the XML point of view (others will correct me if I'm wrong on this point, but I'm afraid I'm not).

Still, these are minor friendly nitpicks of the sort that I am sure you'd expect of this list, but, overall, I'm impressed by the documentation and by the scale of your endeavour -- chapeau bas!

Best regards,

  Piotr
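
To make the pointer-versus-label point about @ana concrete, here is a minimal Python sketch; it is not part of the original exchange. It only separates same-document pointers (tokens starting with "#") from everything else, so a full URI would also land in the "labels" bucket. That simplification, and the helper names `split_ana` and `mixed_style`, are assumptions made for illustration. The two sample values are the ones quoted in the message above.

```python
def split_ana(value: str) -> dict[str, list[str]]:
    """Sort the whitespace-separated tokens of one @ana value into
    fragment pointers ("#id") and bare labels (everything else)."""
    result: dict[str, list[str]] = {"pointers": [], "labels": []}
    for token in value.split():
        kind = "pointers" if token.startswith("#") else "labels"
        result[kind].append(token)
    return result


def mixed_style(values: list[str]) -> bool:
    """Return True when a collection of @ana values uses both
    fragment pointers and bare labels."""
    found = {"pointers": False, "labels": False}
    for value in values:
        parts = split_ana(value)
        found["pointers"] |= bool(parts["pointers"])
        found["labels"] |= bool(parts["labels"])
    return found["pointers"] and found["labels"]


# The two @ana values quoted in the message above.
samples = ["#Character #ANT #v-ANT-ktu1-3_ii_l5b-6a", "subjectiveVar"]
for sample in samples:
    print(sample, "->", split_ana(sample))
print("mixed styles:", mixed_style(samples))
```

Running a check of this kind over all @ana values in a file would flag the documents where labels and references are mixed, which is the situation the message warns about.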







Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Piotr Bański
PS. I hasten to correct one of the points I've made, about what I
misinterpreted as an inverted comma and the nature of NCNames. And that
with many thanks to Michael Sperberg-McQueen, who has kindly tested the
string "ʿmq" and explained to me why and how I was wrong suggesting that
it's not legal. Briefly: I was wrong.

So that point is fine, but you might want to re-scan your @ana
attributes. A plain "subjectiveVar" will qualify as anyURI (because
almost everything, sadly, will), but I suspect that your intentions
could have been different there, and you probably don't want to mix
labels and references in a single attribute.

Cheers,
   Piotr
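
The thread does not reproduce Michael Sperberg-McQueen's explanation, so the following is only a hedged sketch of how one might check the point independently. It tests a string against the NCName rules (the Name grammar of XML 1.0, Fifth Edition, minus the colon). It assumes that the first character of "ʿmq" is U+02BF (MODIFIER LETTER LEFT HALF RING); the function name `is_ncname` is hypothetical.

```python
# NameStartChar ranges from the XML 1.0 (Fifth Edition) grammar,
# without ":" because an NCName may not contain a colon.
NAME_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Additional characters allowed after the first position (NameChar).
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch: str, ranges: list[tuple[int, int]]) -> bool:
    """True if the code point of ch falls in any (low, high) range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


def is_ncname(s: str) -> bool:
    """Check a string against the NCName production (usable as xml:id)."""
    if not s or not _in_ranges(s[0], NAME_START_RANGES):
        return False
    return all(
        _in_ranges(c, NAME_START_RANGES) or _in_ranges(c, NAME_EXTRA_RANGES)
        for c in s[1:]
    )


AYIN = "\u02bf"  # assumption: the character used in the thread's "ʿmq"
print(is_ncname(AYIN + "mq"))    # modifier letter: inside #xF8-#x2FF
print(is_ncname("\u2018mq"))     # a real left single quotation mark
print(is_ncname("'mq"))          # an ASCII apostrophe
```

Under that assumption, the modifier letter falls inside the #xF8-#x2FF range of name-start characters, whereas a typographic or ASCII quotation mark does not, which is consistent with the correction above.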



Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Elisa Beshero-Bondar
Hi Piotr, I hate to open Pandora’s box here, but could you give a concise summary of Michael’s explanation of why the string “ʿmq” is okay? You’ve piqued my curiosity!
Thanks for the thoughtful review. I quite agree about modifying the values of @ana to make them more consistent (and about the vagueness of anyURI in practice, if not in principle).

Best,
Elisa



Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Vanessa Juloux
Dear Elisa and Piotr,

Regarding ‘ʿmq’, it is not a comma but the Ugaritic letter ʿ, “ayin”. So that is why it is permitted. Am I right, Michael Sperberg-McQueen?

As for @ana, of course, my mistake. I had copied and pasted the same line in my DITA… I had checked several times, but you know how it is when you proofread your own text… So thank you very much! I’ll make the change today.

Regarding @type vs @pos, I doubt it will be possible to change it in my chapter; however, I’ll change it in the guidelines and in the XML (and XSL). Thank you very much for your kind advice.

Last, thank you for the compliment, Piotr. I hope my doctoral supervisors will say the same… But above all, my goal was to make the guidelines as useful as possible for those who want to use taxonomies in order to prepare for a hermeneutics of action.
Anyway, these guidelines (always) need improvement, and I think the first step will be to add an ODD. I have decided to add my customized ODD, not only for the custom attribute. Thank you, Elisa, for having convinced me. It made me realize that a customized ODD is very useful (and interesting too).
I just wonder why a comprehensive ODD course is not always given before a TEI encoding course. I truly believe we have to close this gap in ODD training, at least in France; I don’t know about other countries.

Best,
Vanessa

Le 5 mai 2018 à 05:47, Elisa Beshero-Bondar <[hidden email]> a écrit :

Hi Piotr— I hate to open Pandora’s box here, but can you make a concise summary of Michael’s explanation about the string “ʿmq” and why it’s okay? You’ve piqued my curiosity here! 
Thanks for the thoughtful review—I quite agree about modifying the values of @ana to make them more consistent (and about the vagueness of anyURI in practice if not in principle). 

Best,
Elisa


On Fri, May 4, 2018 at 11:39 PM, Piotr Bański <[hidden email]> wrote:
PS. I hasten to correct one of the points I've made, about what I misinterpreted as an inverted comma and the nature of NCNames. And that with many thanks to Michael Sperberg-McQueen, who has kindly tested the string "ʿmq" and explained to me why and how I was wrong suggesting that it's not legal. Briefly: I was wrong.

So that point is fine, but you might want to re-scan your @ana attributes. A plain "subjectiveVar" will qualify as anyURI (because almost everything, sadly, will), but I suspect that your intentions could have been different there, and you probably don't want to mix labels and references in a single attribute.

Cheers,
  Piotr



On 05/05/18 03:28, Piotr Bański wrote:
Dear Vanessa,

 > As for @type vs @pos, I’m a little worried

Well, your hands are clean -- processing the ticket that produced @pos and friends took many months, and processing the additional documentation took some extra time as well. It looks like the page you're pointing at will be recreated after another official release only. It's not that @type is illegal -- it's simply been released to perform other duties (or fancies), while a specialized attribute took over, for those who need it. If you can smuggle in a footnote into your publication, your readers will have a chance to get updated, and if not, well, we live in a dynamic world, where some things fortunately move forward, even if that takes time... No one is going to blame you, I'm sure.

Your documentation looks very impressive! Notice that in some of your @ana, you use pointers at fragments (essentially, at xml:ids):

ana="#Character #ANT #v-ANT-ktu1-3_ii_l5b-6a"

but in others, you use plain labels:

ana="subjectiveVar"

That may eventually bite, if you expect the same behaviour of both.
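
A quick way to find such mixed values is to sort the tokens of each @ana into pointers and bare labels. The following is only a minimal Python sketch, assuming that @ana holds whitespace-separated tokens and that anything meant as a pointer to an xml:id starts with "#"; the function name split_ana is hypothetical:

```python
def split_ana(value):
    """Split an @ana value into (pointers, labels).

    Assumption: tokens are whitespace-separated, and a token that starts
    with '#' points at an xml:id; anything else is treated as a bare label
    (formally still a relative URI, but probably not meant as one).
    """
    pointers, labels = [], []
    for token in value.split():
        (pointers if token.startswith("#") else labels).append(token)
    return pointers, labels


for value in ("#Character #ANT #v-ANT-ktu1-3_ii_l5b-6a", "subjectiveVar"):
    pointers, labels = split_ana(value)
    print(value, "-> pointers:", pointers, "labels:", labels)
```

Any value that ends up with a non-empty list of labels is a candidate for review.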

Also, I am not checking this against the spec right now, but I would be cautious in defining IDs such as "ʿmq" (note this little inverted comma at the beginning) -- this is most probably illegal from the XML point of view (others will correct me if I'm wrong on this point, but I'm afraid I'm not).

Still, these are minor friendly nitpicks of the sort that I am sure you'd expect of this list, but, overall, I'm impressed by the documentation and by the scale of your endeavour -- chapeau bas!

Best regards,

   Piotr






On 05/04/18 22:26, Vanessa Bigot Juloux wrote:
Dear Piotr,

Thank you very much for your useful answer. I’m currently looking into ODD, based on your suggestion and with the kind help of Elisa.

As for @type vs @pos, I’m a little worried since I just wrote a chapter in a forthcoming volume (Brill) that I’m co-editing (https://brill.com/view/title/34932?format=HC). I have used @type following the example in the TEI Guidelines: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/examples-w.html.

Regarding @ana, it’s not for grammatical features, but to set up a hermeneutics of action based on taxonomies, see my guidelines:
- taxonomies: https://vbigot-juloux.github.io/hermeneutics-of-action/UserManual/out/webhelp/index.html#categories.html
- transliteration: https://vbigot-juloux.github.io/hermeneutics-of-action/UserManual/out/webhelp/index.html#withinTranscrip.html
In fact, this is the main topic of my chapter.

Of course, your comments/remarks are most welcome.

Once again, thank you.
Best,
Vanessa


On 4 May 2018, at 16:35, Piotr Bański <[hidden email]> wrote:

Dear Vanessa,

> Later today, I should be able to send my argumentation about adding a new attribute needed for cuneiform script.

Thanks, but -- do take your time. You can just add that attribute to your ODD and work with it right away, there is no special approval process necessary for you to be able to customize your own TEI schema. (On the other hand, we would love to learn about the results, when it's convenient to you.)

A piece of relevant documentation is located here:

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#MDMDAL

What you want to see in your ODD looks a bit like the following:

<elementSpec ident="w" module="analysis" mode="change">
  <attList>
    <attDef ident="myNewAttribute" mode="add">
      <desc>This is a test attribute to hold a normalized value for my words.</desc>
      <datatype>
        <dataRef key="teidata.text"/>
      </datatype>
    </attDef>
  </attList>
</elementSpec>

(No need to bother about the namespace, at this point -- you can play with that later. The datatype is also very permissive.)

The above would allow you to do:

<w pos="verb" myNewAttribute="your-text">etc.</w>

Please note the @pos (part-of-speech) attribute used instead of @type. I also don't know what you want to use the attribute @ana for -- if for further grammatical features, then there exists the @msd attribute ready and willing to do that job, see

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.linguistic.html

If you need @ana for something else, then please bear in mind that its datatype is URI.

Cheers,

  Piotr





On 05/04/18 14:51, Vanessa Bigot Juloux wrote:
Dear Radu,
Dear TEI List,

Thank you for passing my message on to the TEI list. I have just added a comment to my post on StackExchange:

I have looked at other elements and attributes, but they don't work. I am currently working on an argument explaining why they are not suitable for the cuneiform script, and why @correspUnic should be preferred.

Later today, I should be able to send my argument for adding a new attribute needed for cuneiform script.

All the best,
Vanessa
---

Vanessa Bigot Juloux | Ph.D. candidate
» Ecole Pratique des Hautes Etudes <http://www.ephe.sorbonne.fr/>  (EPHE), Paris Sciences et Lettres <https://www.univ-psl.fr/en> (PSL)
» Chair Membership and Outreach Sub-committee for Europe (American Schools of Oriental Research <http://www.asor.org/>)
Mobile + WhatsApp: +33 (0) 6 98 97 02 02
Academia <https://ephe.academia.edu/VanessaJuloux>, vanessajuloux.xyz <http://vanessajuloux.xyz/>, [hidden email] <mailto:[hidden email]>


Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Lou Burnard-6
On 05/05/18 11:23, Vanessa Bigot Juloux wrote:
> e decided to add my customized ODD, not only for the custom
> @attribute. Thank you Elisa for having convinced me. It makes me
> realized that a customized ODD is very useful (and interesting too).
> Just wondering why a a comprehensive ODD course is not always given
> before TEI encod

<cough/> I *always* include a session on ODD in every TEI training
event I do!

Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Vanessa Juloux
Of course, Lou. But I was talking about a ‘comprehensive’ course, not a session, with a customized ODD developed by each participant in order to work on a TEI project.
I attended an intensive TEI training; although we had a session on ODD, it was only an overview, and thus not exhaustive. In my opinion, a comprehensive course would make it possible to show the relevance of a customized ODD to each project.
Vanessa


Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Lou Burnard-6
Emmanuel Chateau and I prepared and gave exactly such a course for the
Ecole des Chartes twice in the last few years, but take-up was a bit
disappointing this year and I don't know if we will be asked to do it
again. The teaching materials are all online though (see
https://github.com/tei-fr/formationEnc2017-02 for example), so I'm
happy to respond to anyone else interested!


Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

C. M. Sperberg-McQueen
In reply to this post by Elisa Beshero-Bondar
> On May 4, 2018, at 9:47 PM, Elisa Beshero-Bondar <[hidden email]> wrote:
>
> Hi Piotr— I hate to open Pandora’s box here, but can you make a concise summary of Michael’s explanation about the string “ʿmq” and why it’s okay? You’ve piqued my curiosity here!

I’ll try to save Piotr the trouble.

The short version is that Unicode character U+02BF MODIFIER LETTER
LEFT HALF RING (the first character in the string "ʿmq") is defined
by the XML specification as a NameStartChar, and thus syntactically
legal at the beginning of an identifier.

Those whose curiosity is now satisfied may stop reading now.

Those interested in more detail (and unafraid of Pandora's Box) may find
the information below suitable reading for a quiet Saturday.

The details of name syntax changed somewhat in the fifth edition of
the specification (issued in 2008), so there are some names which will
be accepted by parsers which conform to XML 1.0 5e but would have been
rejected by older parsers, but the string in question here is not one
of them; it has been a syntactically legal Name in XML since the first
edition of the spec.  The reason for that, as Vanessa Bigot Juloux
rightly surmises, is that the Unicode database classes it as a Letter;
the first edition of the XML 1.0 specification allows as name-start
characters every character then classified by Unicode as a letter or
as an ideograph.

Beginning with the fifth edition, a somewhat broader range of
characters is allowed, in an attempt to ensure that letters added to
Unicode later can be used in XML names without objection by parsers.
This has the advantage that scripts standardized later (which tend for
perhaps obvious reasons to be scripts for minority languages) are not
disadvantaged with respect to being usable for names of XML elements
and attributes; it has the slight disadvantage that when currently
unoccupied code points allowed in XML names are finally defined, there
is no guarantee that they will actually denote letters and not
punctuation, special forms of whitespace, or other characters which do
not logically speaking belong inside identifiers.  (I think the
history of XML 1.0 and 1.1 has amply confirmed that the tradeoff made
in the fifth edition is the correct one: the ability to use minority
scripts in names is more important than restricting name characters to
letters, ideographs, numeric digits, and the like.)

The XML specification can readily be consulted on the W3C web site (at
http://www.w3.org/TR/xml); the relevant grammar productions are those
for Name, NameStartChar, and NameChar (5, 4, and 4a in the current
version of the spec).  


[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z]
    | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
    | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D]
    | [#x2070-#x218F] | [#x2C00-#x2FEF]
    | [#x3001-#xD7FF] | [#xF900-#xFDCF]
    | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
   
[4a] NameChar ::= NameStartChar | "-" | "." | [0-9]
    | #xB7 | [#x0300-#x036F]
    | [#x203F-#x2040]
   
[5] Name ::= NameStartChar (NameChar)*

U+02BF falls under the definition of NameStartChar by virtue of
the last part of the second line of production 4:  the expression
[#xF8-#x2FF] means (this is explained in the spec’s section on
Notation, you are not expected to be born knowing it, and you
are not required to guess) that any character between U+00F8
and U+02FF is a NameStartChar.
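
For those who would rather let a machine do the range lookup, here is a minimal Python sketch of the same check. It is only an illustration: the ranges are transcribed from production 4 as quoted above, the names NAME_START_RANGES and is_name_start_char are hypothetical, and the escape "\u02bf" stands for the first character of "ʿmq".

```python
# Code point ranges transcribed from production [4] (NameStartChar).
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # [A-Z]
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # [a-z]
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def is_name_start_char(ch):
    """Return True if the single character ch falls in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


print(is_name_start_char("\u02bf"))  # first character of the string discussed
print(is_name_start_char("2"))       # a digit, which may not start a Name
```

Each pair in the table is inclusive at both ends, matching the spec's bracket notation.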

The first and other superseded editions are also still available,
for those interested; the quickest way to find one is to consult the
current spec and then click repeatedly on the 'Previous version(s)’
links until reaching the one you want (the first edition is
at https://www.w3.org/TR/1998/REC-xml-19980210).

In order to check a character or string against the spec's definition
of Name, it is of course necessary to know exactly what characters are
in the string; this is made straightforward by the interactive
interfaces offered by some XQuery processors (BaseX, eXist, and
MarkLogic all have some form of this or another; Oxygen also has an
XQuery tool that can be used for the same purpose).  In this case I
copied the string and pasted it into the following query in an XQuery
processor:

    string-to-codepoints("ʿmq")

This produces the result 703 109 113, which tells me that the first
character has the code point 703.  Since Unicode tables use hex
notation, some arithmetic is necessary to translate the decimal 703
into the hexadecimal 2BF, which can be checked against the lists of
character ranges in the spec.
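
The decimal-to-hex arithmetic can also be left to a scripting language. As a hedged sketch (Python rather than XQuery, with the first character written as the escape "\u02bf" so that nothing depends on copy and paste), this lists each code point in decimal and in U+ notation, together with its Unicode name:

```python
import unicodedata

s = "\u02bfmq"  # the string under discussion, first character as an escape

for ch in s:
    cp = ord(ch)
    # Decimal value, U+ notation in hex, and the Unicode character name.
    print(cp, f"U+{cp:04X}", unicodedata.name(ch))
```

The first line of output pairs the decimal 703 with U+02BF, which is the value to look up in the character ranges.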

A simpler approach, for those slightly less habituated to consulting
the text of a specification, would be to check whether "ʿmq" can be
coerced into the type xs:NCName, which can be done by issuing the
query

    xs:NCName("ʿmq")

to see whether it returns the NCName ʿmq or raises an error.  Since I
don't use the type coercion functions all that regularly and felt
uncertain whether I was going to get it right, what I actually typed,
eventually, was a slightly more elaborate sequence of queries, which compared the
results for this string with those for two simple strings for which I
know what results to expect.  The query

  for $s in ("ʿmq", "abc", "23skiddoo" )
  return <string s="{$s}"
    isname="{xs:NCName($s) instance of xs:NCName}"/>

produces the error message

  [FORG0001] Cannot cast xs:string to xs:NCName: "23skiddoo".

whereas commenting out "23skiddoo" allows the query to succeed:

  for $s in ("ʿmq", "abc" (:, "23skiddoo":) )
  return <string s="{$s}"
    isname="{xs:NCName($s) instance of xs:NCName}"/>

which produces the results

  <string s="ʿmq" isname="true"/>
  <string s="abc" isname="true"/>

I include these details in the hope of encouraging TEI users to
familiarize themselves with tools for interactive evaluation of XPath
and XQuery expressions; they can be very useful.


********************************************
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
[hidden email]
http://www.blackmesatech.com
********************************************
Reply | Threaded
Open this post in threaded view
|

Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Elisa Beshero-Bondar

That Pandora’s Box turned out to be a lot of fun to investigate, Michael! Thanks especially for the details on testing for xs:NCName and the little XQuery adventure at the end. A little XPath and XQuery can take us a long way. :-)


Cheers,

Elisa




Re: Question on StackExchange about adding custom attribute to TEI vocabulary:

Conal Tuohy-3
In reply to this post by C. M. Sperberg-McQueen
On 06/05/18 01:06, C. M. Sperberg-McQueen wrote:
> A simpler approach, for those slightly less habituated to consulting
> the text of a specification, would be to check whether "ʿmq" can be
> coerced into the type xs:NCName, which can be done by issuing the
> query
>
>      xs:NCName("ʿmq")
>
> to see whether it returns the NCName ʿmq or raises an error.
There's also the convenient XPath 2.0 expression "castable as", which
returns true or false, e.g.

"ʿmq" castable as xs:NCName