web services for text search, retrieval and other operations


web services for text search, retrieval and other operations

Sigfrid Lundberg
Dear everybody,

The Royal Danish Library, Copenhagen, is about to provide public access to various APIs and web services to our data. These are components that we use internally in applications but which we hope could be useful for students and scholars alike.

We also hope that they could be seen as contributions to the discussion of what kinds of web services and APIs are useful within digital humanities and literary computing.

Here is a description of what we tend to call our snippet server

https://github.com/Det-Kongelige-Bibliotek/access-digital-objects/blob/master/text-corpora.md

I'll later extend this document to cover the related search service.

If anyone has input on this, and opinions on what else we could do to make our text sources useful, we'd love a mail.

Thanks in advance

Sigfrid

All teidata.pointer attributes

John P. McCaskey-2
Running an XSL transform against a TEI document, I want to match all TEI elements that contain at least one teidata.pointer attribute.

I can’t match just on attribute name. In some elements @value is a pointer, in others it’s text. Sometimes @class is a pointer, sometimes not.

Is there a master list of attributes that can be pointers and the elements in which they are?

(Even better if the list is in a form that makes it easy to use in an XSL match. Double extra better if you can show me the very XSLT that will copy the element and add an attribute “haspointer=’true’”!)

Thanks!
John


Correspondence tag-set

Jack Orchard
Hello all, 

I'm working with Electronic Enlightenment (http://www.e-enlightenment.com) on eighteenth-century correspondence. EE has its own tagset for letter features and metadata, and I'm writing up a concordance between EE and TEI. I'm pretty unfamiliar with TEI at the moment, and I was wondering if anyone here might be able to steer me towards current developments in TEI tagsets for correspondence? I've been looking at correspSearch and CMIF for metadata, but I understand that correspondence features are still being standardised?

Thank you for your time.

Jack Orchard


Re: All teidata.pointer attributes

James Cummings-4
In reply to this post by John P. McCaskey-2
If you want to do it programmatically, you could look up their datatypes in p5subset.xml on the fly. If you just want a reference, you can see all the attributes of a particular class in Appendix E, e.g. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-teidata.pointer.html

James


--
Dr James Cummings, Academic IT Services, University of Oxford



Re: Correspondence tag-set

James Cummings-4
In reply to this post by Jack Orchard
Hi Jack, 

correspDesc, correspAction, etc. were added to the Guidelines some time ago. See http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-correspDesc.html

And

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD44CD


We'd be interested if you discover things you need to do that the TEI can't cope with. (Feel free to submit issues on GitHub.)

James 

--
Dr James Cummings, Academic IT Services, University of Oxford




Re: Correspondence tag-set

Jack Orchard

Thanks very much,


I'll feed anything back that I can find!


Jack






Re: All teidata.pointer attributes

Lou Burnard-6
In reply to this post by John P. McCaskey-2

This information is not available from the document instance: you need to look at the ODD from which the schema that validates the document instances was generated.

If you're using a document that is valid against TEI all, this will list all the elements concerned:

<xsl:template xmlns:t="http://www.tei-c.org/ns/1.0" match="/">
   <xsl:for-each select="document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')//t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']">
       <xsl:message><xsl:value-of select='@ident'/></xsl:message>
   </xsl:for-each>
</xsl:template>
 




Fwd: Re: All teidata.pointer attributes

Lou Burnard-6

Sorry, that got garbled. Try this:

<xsl:template match="/" xmlns:t="http://www.tei-c.org/ns/1.0">
    <xsl:for-each select="document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')//t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']">
       <xsl:message><xsl:value-of select='@ident'/></xsl:message>
   </xsl:for-each>
</xsl:template>
   






Re: All teidata.pointer attributes

Conal Tuohy-3
In reply to this post by Lou Burnard-6
Borrowing Lou's suggested XPath, let me suggest a couple of ways to deploy it in XSLT:

If you are using XSLT 2.0 (or later), then you can record the list of "pointer" elements' names in a global variable, and then define a template which matches elements whose names appear in that list.

<xsl:variable name="pointer-element-names" select="
      document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
         //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']
         /@ident
"/>

<xsl:template match="*[local-name() = $pointer-element-names]">
   <!-- process the pointer element -->
   <xsl:copy>
      <xsl:attribute name="haspointer">true</xsl:attribute>
      <xsl:copy-of select="@* | node()"/>
   </xsl:copy>
</xsl:template>

That's likely to be the most efficient approach, because the "//" step is expensive (traversing the entire ODD document), and storing the result in a variable means you only perform this traversal once. 

You can't do this in XSLT 1, though, because references to variables are not allowed in template match expressions. You could, however, do something like this:

<xsl:template match="*">
   <xsl:copy>
      <xsl:if test="local-name() = $pointer-element-names">
         <xsl:attribute name="haspointer">true</xsl:attribute>
      </xsl:if>
      <xsl:copy-of select="@* | node()"/>
   </xsl:copy>
</xsl:template>

Also in XSLT 1 you should be able to do without a variable, by including the "document()" function call in the match expression, but performance might not be very good (because of the "//" step, and also because some older XSLT processors might not cache the result of the "document()" function call).

<xsl:template match="
   *[
      local-name()=document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
         //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']
         /@ident
   ]
">
   <xsl:copy>
      <xsl:attribute name="haspointer">true</xsl:attribute>
      <xsl:copy-of select="@* | node()"/>
   </xsl:copy>
</xsl:template>







--
@conal_tuohy
+61-466-324297

Re: All teidata.pointer attributes

Conal Tuohy-3
Whoops I made a mistake in that message (I blame the Brisbane summer heat!)

The code I had inside the templates was wrong:

   <!-- wrong! -->
   <xsl:copy>
      <xsl:attribute name="haspointer">true</xsl:attribute>
      <xsl:copy-of select="@* | node()"/>
   </xsl:copy>

It should of course have applied templates to the children of the matched elements:

   <xsl:copy>
      <xsl:attribute name="haspointer">true</xsl:attribute>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
   </xsl:copy>

[turning the fan on now]
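
For anyone who wants to run this, here is a minimal sketch of a complete XSLT 2.0 stylesheet built from the fragments in this thread. The identity template and the namespace declarations are additions that the fragments leave implicit, and the sketch assumes the processor is allowed to fetch p5subset.xml over HTTP. It only flags elements whose own elementSpec declares a teidata.pointer attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.tei-c.org/ns/1.0">

   <!-- Names of the elements whose elementSpec directly declares an
        attribute with datatype teidata.pointer. Held in a global
        variable so the ODD is only traversed once. -->
   <xsl:variable name="pointer-element-names" select="
      document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
         //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']
         /@ident
   "/>

   <!-- Identity template (an addition for this sketch): copy every node
        that no more specific template matches. -->
   <xsl:template match="@* | node()">
      <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- Flag the listed elements, then carry on processing their children
        so that nested matches are flagged as well. -->
   <xsl:template match="*[local-name() = $pointer-element-names]">
      <xsl:copy>
         <xsl:attribute name="haspointer">true</xsl:attribute>
         <xsl:copy-of select="@*"/>
         <xsl:apply-templates/>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>
```

The second template carries a predicate, so its default priority is higher than that of the identity template and it takes precedence for the listed elements.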

--
@conal_tuohy
+61-466-324297

Re: All teidata.pointer attributes

Lou Burnard-6
Thanks Conal. I don't have any climatic excuse, but I realised just
after my posting last night that this approach is only going to show the
attributes which are directly declared in an element spec. But in fact
most pointer-valued attributes get that way by being inherited from an
attribute class. So something rather more complex is needed: you need
to look for elements which have a memberOf child that names an attribute
class which supplies a pointer-valued attribute ... or which inherits
from another class which does. Writing the XSLT to do that is left as an
exercise for the reader, since it's Sunday and I am cooking.
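
One possible shape for that exercise, as a hedged XSLT 2.0 sketch rather than a tested solution: it assumes attribute classes are classSpec elements with type="atts", that both elements and classes record membership as classes/memberOf/@key, and that the class hierarchy has no cycles. The function name f:inherits-pointer and its namespace URI are invented for this illustration. Attributes that a customisation deletes or redefines are not taken into account.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    xmlns:f="urn:x-example:functions"
    exclude-result-prefixes="xs t f">

   <xsl:output method="text"/>

   <xsl:variable name="odd"
      select="document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')"/>

   <!-- Attribute classes that themselves declare a pointer-valued attribute. -->
   <xsl:variable name="direct-classes" as="xs:string*" select="
      $odd//t:classSpec[@type='atts']
         [t:attList//t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']
         /@ident/string()"/>

   <!-- True if the spec (element or class) is a member of one of those
        classes, either directly or via a class that inherits from one. -->
   <xsl:function name="f:inherits-pointer" as="xs:boolean">
      <xsl:param name="spec" as="element()"/>
      <xsl:variable name="keys" as="xs:string*"
         select="$spec/t:classes/t:memberOf/@key/string()"/>
      <xsl:sequence select="
         ($keys = $direct-classes)
         or (some $c in $odd//t:classSpec[@type='atts'][@ident = $keys]
             satisfies f:inherits-pointer($c))"/>
   </xsl:function>

   <!-- List every element that has a pointer-valued attribute, whether
        declared on the element itself or supplied by a class. -->
   <xsl:template match="/">
      <xsl:for-each select="$odd//t:elementSpec[
            t:attList//t:attDef/t:datatype/t:dataRef/@key='teidata.pointer'
            or f:inherits-pointer(.)]">
         <xsl:value-of select="@ident"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>

</xsl:stylesheet>
```

Run it against any XML input; the input is ignored and the element names are written one per line.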



Re: All teidata.pointer attributes

John P. McCaskey-2

I’m having a difficult time doing this with ODD.

How does one get the RNG? (I have Oxygen 18. Is it in there somewhere?)


Re: All teidata.pointer attributes

Lou Burnard-6
The RELAX NG schema corresponding to any ODD can be generated by oXygen if you have the TEI framework installed. Just choose the ODD to RELAX NG transformation scenario. Alternatively, for several predefined ODDs, you can grab the RELAX NG schema from the TEI website.
Look in the folder release/xml/tei/custom/schema/relaxng, for example.




Sent from my Samsung Galaxy Tab®|PRO


-------- Original message --------
From: "John P. McCaskey"
Date:2017/02/06 14:09 (GMT+00:00)
To: [hidden email]
Subject: Re: All teidata.pointer attributes

I’m have a difficult time doing this with ODD.

How does one get the RNG? (I have Oxygen 18. Is it in there somewhere?)

--



On 2/5/2017 8:09 AM, Lou Burnard wrote:
Thanks Conal. I don't have any climactic excuse, but I realised just after my posting last night that this approach is only going to show the attributes which are directly declared in an element spec. But in fact most pointer-valued attributes get that way by being inherited from an attribute class. So something rather more complex is needed:  you need to look for elements which have a memberOf child that names an attribute class which supplies an pointer-valued attribute ... or which inherits from another class which does. Writing the xslt to do that is left as an exercise for the reader, since it's Sunday and I am cooking.


On 05/02/17 09:11, Conal Tuohy wrote:
Whoops I made a mistake in that message (I blame the Brisbane summer heat!)

The code I had inside the templates was wrong:

    <!-- wrong! -->
    <xsl:copy>
       <xsl:attribute name="haspointer">true</xsl:attribute>
       <xsl:copy-of select="@* | node()"/>
    </xsl:copy>

It should of course have applied templates to the children of the matched
elements:

    <xsl:copy>
       <xsl:attribute name="haspointer">true</xsl:attribute>
       <xsl:copy-of select="@*">
       <xsl:apply-templates/>
    </xsl:copy>

[turning the fan on now]

On 5 February 2017 at 19:07, Conal Tuohy [hidden email] wrote:

Borrowing Lou's suggested XPath, let me suggest a couple of ways to deploy
it in XSLT:

If you are using XSLT 2.0 (or later), then you can record the list of
"pointer" elements' names in a global variable, and then define a template
which matches elements whose names appear in that list.

<xsl:variable name="pointer-element-names" select="
    document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
       //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@
key='teidata.pointer']
       /@ident
"/>

<xsl:template match="*[local-name() = $pointer-element-names]">
    <!-- process the pointer element -->
    <xsl:copy>
       <xsl:attribute name="haspointer">true</xsl:attribute>
       <xsl:copy-of select="@* | node()"/>
    </xsl:copy>
</xsl:template>

That's likely to be the most efficient approach, because the "//" step is
expensive (traversing the entire ODD document), and storing the result in a
variable means you only perform this traversal once.

You can't do this in XSLT 1, though, because references to variables are
not allowed in template match expressions. You could, however, do something
like this:

<xsl:template match="*">
    <xsl:copy>
       <xsl:if test="local-name() = $pointer-element-names">
          <xsl:attribute name="haspointer">true</xsl:attribute>
       </xsl:if>
       <xsl:copy-of select="@* | node()"/>
    </xsl:copy>
</xsl:template>

Also in XSLT 1 you should be able to do without a variable, by including
the "document()" function call in the match expression, but performance
might not be very good (because of the "//" step, and also because some
older XSLT processors might not cache the result of the "document()"
function call).

<xsl:template match="
    *[
       local-name()=document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
          //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']
          /@ident
    ]
">
    <xsl:copy>
       <xsl:attribute name="haspointer">true</xsl:attribute>
       <xsl:copy-of select="@*"/>
       <xsl:apply-templates select="node()"/>
    </xsl:copy>
</xsl:template>


On 5 February 2017 at 09:17, Lou Burnard [hidden email]
wrote:

This information is not available from the document instance: you need to
look at the ODD from which the schema that validates the document instances
was generated.

If you're using a document that is valid against TEI all, this will list
all the elements concerned:

<xsl:template xmlns:t="http://www.tei-c.org/ns/1.0" match="/">
    <xsl:for-each select="document('http://www.tei-c.org/release/xml/tei/odd/p5subset.xml')
        //t:elementSpec[t:attList/t:attDef/t:datatype/t:dataRef/@key='teidata.pointer']">
        <xsl:message><xsl:value-of select="@ident"/></xsl:message>
    </xsl:for-each>
</xsl:template>


On 04/02/17 22:52, John P. McCaskey wrote:

Running an XSL transform against a TEI document, I want to match all TEI
elements that contain at least one teidata.pointer attribute.

I can’t match just on attribute name. In some elements @value is a
pointer, in others it’s text. Sometimes @class is a pointer, sometimes not.

Is there a master list of attributes that can be pointers and the
elements in which they are?

(Even better if the list is in a form that makes it easy to use in an XSL
match. Double extra better if you can show me the very XSLT that will copy
the element and add an attribute “haspointer=’true’”!)

Thanks!
John




--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297





Re: All teidata.pointer attributes

Syd Bauman-10
John --

Sorry it has taken so long for me to get to posting this. Was off at
the TEI Council meeting, a one day TEI conference, and a 3-day XML
conference.

Lou is indeed correct: to do this properly you need to chase class
references. (Because not only might a pointer attribute be a member
of a class, that class might be a member of a class.) You can do it
using any of the following as the base from which you figure out
which element & attribute pairs are pointers.

 * Your customization ODD -- very hard, as you have to
   "compile" the ODD by merging it with the P5 ODD first;

 * Your compiled ODD; or

 * Your RELAX NG schema.
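
To illustrate what chasing class references involves, here is a rough XSLT 2.0 sketch. It is not taken from the stylesheets linked in this message. It assumes the input document is p5subset.xml or a compiled ODD, that attribute classes appear as classSpec elements with type="atts", and that membership is recorded in classes/memberOf/@key. The function name f:attDefs and its namespace are made up for the example, and any @mode overrides (e.g. an element deleting an inherited attribute) are ignored.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    xmlns:f="urn:x-example:functions">

  <xsl:output method="text"/>

  <!-- Look up attribute classes by name -->
  <xsl:key name="att-class" match="t:classSpec[@type='atts']" use="@ident"/>

  <!-- Hypothetical helper: every attDef available to a spec, i.e. its own
       plus those of each attribute class it belongs to, recursively.
       The union removes duplicates reached by more than one path. -->
  <xsl:function name="f:attDefs" as="element(t:attDef)*">
    <xsl:param name="spec" as="element()"/>
    <xsl:sequence select="
        $spec/t:attList//t:attDef
        | $spec/t:classes/t:memberOf
            /key('att-class', @key, root($spec))
            /f:attDefs(.)"/>
  </xsl:function>

  <!-- Emit one "element/@attribute" line per pointer-valued pair -->
  <xsl:template match="/">
    <xsl:for-each select="//t:elementSpec">
      <xsl:variable name="element-name" select="@ident"/>
      <xsl:for-each select="f:attDefs(.)[t:datatype/t:dataRef/@key = 'teidata.pointer']">
        <xsl:value-of select="concat($element-name, '/@', @ident, '&#10;')"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Memberships in model classes fall out naturally, because the key only indexes classSpec elements of type "atts".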

No matter which base you choose, with XSLT 1 or XSLT 2 you have to
use a 2-step process. (I think you can do it in 1 step in XSLT 3, but
I'm not sure as I've never tried. :-)

Step 1 reads in the compiled ODD or the RELAX NG schema and writes
out an XSLT stylesheet. Step 2 runs that XSLT stylesheet using your
document instance as input. The output is (hopefully) exactly what
you want: a copy of that document instance decorated with attributes
that say "this element has a pointer attribute".
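
As a rough idea of what step 1 might write out, the stylesheet below is a hand-written sketch of a possible step 2 stylesheet, not actual output of the generators linked in this message. The element/attribute pairs are a small illustrative sample (typeNote/@scribeRef is the pair mentioned in note [2]); a generated version would list every pair found in the ODD or schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Identity template: copy everything that is not matched below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- One pattern per element/attribute pair (illustrative sample only).
       These patterns carry predicates, so they outrank the identity
       template by default priority. -->
  <xsl:template match="
      tei:ref[@target]
      | tei:ptr[@target]
      | tei:graphic[@url]
      | tei:typeNote[@scribeRef]">
    <xsl:copy>
      <xsl:attribute name="haspointer">true</xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because each pattern tests for the attribute itself, only elements that actually carry a pointer attribute in the instance are flagged.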

I have written sample implementations of step 1 for both of those
inputs. You can find them at
  http://paramedic.wwp.neu.edu/~syd/temp/TEI/JPM_generate_ptr_attr_flagger_from_ODD.xslt
and
  http://paramedic.wwp.neu.edu/~syd/temp/TEI/JPM_generate_ptr_attr_flagger_from_RNG.xslt

IIRC Lou has already explained how to get your hands on the RNG;[1]
and generating the compiled ODD can be a bit tricky, so I suspect
you're going to prefer the latter.[2] But Peter insisted I write the
former, too, as it is somewhat more appropriate and cleaner. (The RNG
version, e.g., relies on the fact that TEI attribute class names
start with "att.". So it's somewhat more fragile. On the other hand,
I don't think anyone wants to change that naming convention. :-)

Caveats:
 * I have not tested these carefully
 * These are not necessarily the "right" or "best" way to do this,
   nor particularly good examples of XSLT.[3]

If someone has a better way to do this, or can suggest improvements
to my XSLT, I'm all ears. If there is community interest I could
spiff these up a bit and put them up on the TEI wiki.

Anyway, there are a variety of ways of combining those two steps into
one, the most obvious of which is to use XProc. (But one could also
use a shell script, ant, or make.)

Notes
-----
 [1] Given that you use oXygen, run the "TEI ODD to RelaxNG XML"
     transformation scenario using your customization ODD as the input
     document. (If you don't see the various ODD transformations when
     you click "Configure Transformation Scenario(s)" (cmd-shift-C or
     ctl-shift-C), remember to choose "Show all scenarios" from the
     little blue gear in the upper right corner of the dialog box.) At
     the moment the output is put into a directory called out/ that is
     in the same directory as the input.
     
 [2] Furthermore, I've just discovered a minor problem with use of the
     compiled ODD, which some may consider a bug. The declarations for
     at least some elements that are never used are still in the ODD.
     (E.g., the specification of the <typeNote> element is still in
     the compiled "tei_drama.odd.odd" file, even though there is no
     <typeDesc> element, so <typeNote> can never occur.) They have
     been removed by the time you get to the RELAX NG. Worth noting
     that these won't change your output -- you just end up trying to
     match an element/attribute combination (e.g.,
     typeNote/@scribeRef) that can never be there. Oh well, there's
     another 18 microseconds of your life you will never get back. :-)

 [3] E.g., I suspect that in the RNG version, the template that
     matches "data" in "class" mode and the template named
     "chase-attr-class" could be combined into a single more readable
     template. And in that "data" mode "class" template it might be
     clearer to test for something to return by checking count($all)
     rather than the length of the joined string.

Re: All teidata.pointer attributes

John P. McCaskey-2
Thanks so much for this Syd. It is certainly more involved than I first thought.

I had been looking at the recursive code that generates the Attributes section of an element's description. That wasn't getting me anywhere fast. Then I did a quick inventory and realized the great majority of pointer attributes are such across all elements. So I just manually typed those in, handled the exceptions brute-force, and moved on.

What I have now, then, is fragile. Any schema changes will require manually tweaking my XSLT. When I get back to generalizing, I'll come back to what you have here—or maybe that wiki article. ;-)

Thanks again!
John


On 2/14/2017 2:58 AM, Syd Bauman wrote:
[...]
