rules about mandatory xml:id

Martin Mueller

Can you add a rule to a schema requiring that some element(s) MUST have an xml:id, especially <TEI>? The question came up in the context of a project that feeds a lot of different texts to eXist's TEI Publisher, which is fussy on that point. 

Re: rules about mandatory xml:id

Elli Mylonas
Wouldn't Schematron be the way to do that? --elli

[Elli Mylonas
 Senior Digital Humanities Librarian
 and
 Center for Digital Scholarship
 University Library
 Brown University
 library.brown.edu/cds]

Re: rules about mandatory xml:id

Martin Holmes
You should be able to just make the attribute required (@usage="req") in
the elementSpec for the TEI element in the ODD file, surely?

Cheers,
Martin

Re: rules about mandatory xml:id

Lou Burnard

This is perfectly easy to do in your ODD. For example, just add something like this:

   <elementSpec ident="TEI" mode="change">
      <attList>
         <attDef ident="xml:id" mode="change" usage="req"/>
      </attList>
   </elementSpec>

to make @xml:id required on <TEI>.
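
For context, a fragment like this normally lives inside the <schemaSpec> of an ODD customization. The following is only a minimal sketch: the schema name (tei_required_id) and the bare-bones module selection are assumptions for illustration, not something stated in this thread.

```xml
<schemaSpec ident="tei_required_id" start="TEI">
   <!-- Assumed minimal module selection; a real project will list more -->
   <moduleRef key="tei"/>
   <moduleRef key="core"/>
   <moduleRef key="header"/>
   <moduleRef key="textstructure"/>

   <!-- Change the inherited global attribute for <TEI> only;
        xml:id stays optional on every other element -->
   <elementSpec ident="TEI" module="textstructure" mode="change">
      <attList>
         <attDef ident="xml:id" mode="change" usage="req"/>
      </attList>
   </elementSpec>
</schemaSpec>
```

A schema generated from an ODD along these lines should reject a <TEI> element that lacks @xml:id, while leaving the attribute optional elsewhere.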


Re: rules about mandatory xml:id

Syd Bauman
I agree w/ Martin & Lou, here. For any given element, the mechanism
Lou demonstrated is the way to go. You get faster validation, and get
validation even if you don't validate against the Schematron (but you
really should!)

But there is still a place for Schematron: switch over to Schematron
when you want to do something more complex than just require @xml:id
on a few particular elements. Examples:

 * Require @xml:id on <div type="chapter">, but not on other <div>s.

 * Require @xml:id on any line group that has 5 or more metrical
   lines, but not others. (Note that this is *not* the same as saying
   "any <lg> that has 5+ child <l> elements". Metrical lines may be
   split among multiple <l> elements to avoid overlap problems, and
   may be nested in smaller <lg>s, e.g. <lg type="couplet">.) :-)

 * Require that an @xml:id be specified on *at least one* <term>, but
   not necessarily all of 'em.

 * Require not only that there *is* an @xml:id, but that the
   values thereof are in some particular order.

 * Provide a "quick fix" suggested correction for the error -- this
   is NOT supported by the TEI stylesheets yet, but hopefully will be
   by this time next year.
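
As a rough illustration of the first and third cases in the list above, constraints along the following lines could be added to an ODD. This is a sketch, not a tested customization: the constraintSpec identifiers are made up, and it assumes the "tei" prefix is declared for Schematron by the toolchain that extracts the rules.

```xml
<!-- Require @xml:id on <div type="chapter">, but not on other <div>s -->
<elementSpec ident="div" module="textstructure" mode="change"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <constraintSpec ident="chapterDivNeedsId" scheme="isoschematron">
      <desc>A chapter-level div must have an @xml:id.</desc>
      <constraint>
         <sch:rule context="tei:div[@type = 'chapter']">
            <sch:assert test="@xml:id">
               A div of type "chapter" must carry an @xml:id attribute.
            </sch:assert>
         </sch:rule>
      </constraint>
   </constraintSpec>
</elementSpec>

<!-- Require @xml:id on at least one <term> somewhere in the text -->
<elementSpec ident="text" module="textstructure" mode="change"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <constraintSpec ident="oneTermNeedsId" scheme="isoschematron">
      <desc>At least one term in the text must have an @xml:id.</desc>
      <constraint>
         <sch:rule context="tei:text">
            <sch:assert test=".//tei:term/@xml:id">
               At least one term element must carry an @xml:id attribute.
            </sch:assert>
         </sch:rule>
      </constraint>
   </constraintSpec>
</elementSpec>
```

Neither condition can be expressed by simply setting usage="req" on an attribute, because each depends on something other than the element name alone.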


Re: rules about mandatory xml:id

Eduard Drenth
Note that requiring xml:id on <TEI> does not guarantee uniqueness of the ID across multiple documents.

Eduard Drenth, Software Architekt

[hidden email]

Doelestrjitte 8
8911 DX  Ljouwert
+31 58 234 30 47
+31 62 094 34 28 (privé)

gpg: https://sks-keyservers.net/pks/lookup?op=get&search=0x065EF82A1E02CC43

Re: rules about mandatory xml:id

Lou Burnard
It can do if your TEI documents are grouped together into a <teiCorpus>, of course.

Re: rules about mandatory xml:id

Martin Holmes
On 2017-06-21 11:51 AM, Lou Burnard wrote:
> It can do if your TEI documents are grouped together into a teiCorpus,
> of course.

And you can create a teiCorpus document which XIncludes all the other
documents just for validation purposes.
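
A wrapper of that kind might look roughly like the sketch below. The filenames and header text are hypothetical placeholders; only the general shape (a <teiCorpus> whose members are pulled in with xi:include) comes from the suggestion above.

```xml
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0"
           xmlns:xi="http://www.w3.org/2001/XInclude">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Validation-only corpus wrapper</title>
         </titleStmt>
         <publicationStmt>
            <p>Not published; used only to check ID uniqueness.</p>
         </publicationStmt>
         <sourceDesc>
            <p>Assembled from the project's individual TEI files.</p>
         </sourceDesc>
      </fileDesc>
   </teiHeader>
   <!-- Hypothetical filenames: one xi:include per document in the collection -->
   <xi:include href="text001.xml"/>
   <xi:include href="text002.xml"/>
</teiCorpus>
```

The XIncludes have to be resolved before validation, for instance with something like `xmllint --xinclude --output expanded.xml corpus.xml`, after which the expanded file is validated. Because all the included <TEI> elements then sit in one XML document, a validator that checks ID uniqueness should report any @xml:id value used more than once.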

Re: rules about mandatory xml:id

James Cummings
Of course you could come up with a sensible ID-creating strategy that includes the ID number that is part of the filename, or similar. I think collection-level identification schemes and protocols should take care of the uniqueness of file IDs; it shouldn't be something individual document validation has to worry about.

What document validation like RNG+Schematron can do is ensure that the ID number in the file matches the appropriate part of the file name, and that other IDs in the file which are based on it match as well. (That is, if you are using this method of assigning IDs and not some external random hashing that ensures uniqueness across the collection.)
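
For the second half of that idea (other IDs in the file being based on the file's own ID), a Schematron constraint could look something like the following. This is a sketch under an invented convention, namely that every other @xml:id starts with the root ID plus a hyphen; the constraint name is hypothetical, and the "tei" prefix is assumed to be declared for Schematron.

```xml
<elementSpec ident="TEI" module="textstructure" mode="change"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <constraintSpec ident="idsStartWithRootId" scheme="isoschematron">
      <desc>Every @xml:id below the root TEI element must begin with the
            root's own @xml:id followed by a hyphen (assumed convention).</desc>
      <constraint>
         <!-- Fires on every descendant of a TEI element that has an @xml:id -->
         <sch:rule context="tei:TEI//*[@xml:id]">
            <sch:let name="rootId" value="string(/tei:TEI/@xml:id)"/>
            <sch:assert test="starts-with(@xml:id, concat($rootId, '-'))">
               The @xml:id "<sch:value-of select="@xml:id"/>" should start
               with "<sch:value-of select="$rootId"/>-".
            </sch:assert>
         </sch:rule>
      </constraint>
   </constraintSpec>
</elementSpec>
```

With a root ID of A00001, for example, an ID such as A00001-001 would pass and an ID such as p12 would be reported.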


James

--
Dr James Cummings, Academic IT Services, University of Oxford

Re: rules about mandatory xml:id

Martin Holmes
This is an ODD fragment which does as James suggests:

<elementSpec ident="TEI" module="textstructure" mode="change">
   <constraintSpec ident="rootIdEqualsFileName" scheme="isoschematron">
      <desc>The root TEI element must have an @xml:id attribute which
            matches its filename.</desc>
      <constraint>
         <sch:rule context="tei:TEI">
            <!-- Filename without the '.xml' extension (needs XPath 2.0) -->
            <sch:let name="reqId"
                     value="substring-before(tokenize(document-uri(/), '/')[last()], '.xml')"/>
            <sch:assert test="@xml:id eq $reqId">
               The @xml:id attribute on the TEI element
               (<sch:value-of select="@xml:id"/>) should match the
               document filename without extension
               (<sch:value-of select="$reqId"/>).
            </sch:assert>
         </sch:rule>
      </constraint>
   </constraintSpec>
</elementSpec>

Cheers,
Martin
