Encoding key-value pairs in an <abstract>

John P. McCaskey-2
What is the best way to encode abstracts that are essentially just key-value pairs?

Consider formulaic abstracts such as these:

This document is a [writ|deed|charter]. The court of authority is [papal|secular]. The witness to the document is [insert someone’s name].

The inventor was [insert name]. The researched drug was [insert chemical name]. The test facility was [hospital|military facility|university].

Only the parameters are important. I don’t really need the introductory words.

I’m looking for something like this:
<abstract>
  <list>
    <item type="type">charter</item>
    <item type="court">secular</item>
    <item type="witness">someone’s name</item>
  </list>
</abstract>

But @type isn’t an attribute of <item> and <item> seems semantically wrong. <keywords> doesn’t work for key-value pairs. <classCode> doesn’t seem right. <catRef> seems overly complicated.

What is the best approach, sticking to the stock schema?

--

Re: Encoding key-value pairs in an <abstract>

Martin Holmes
Hi there,

The way we do this sort of thing is:

1. Create a centralized set of <taxonomy> elements, one for each type of
classification we want to apply to documents, with a <category> for each
possible value, so something like:

<taxonomy xml:id="taxDocType">
   <category xml:id="dtWrit"><catDesc>Writ....</catDesc></category>
[...]
</taxonomy>

2. Then, in the <textClass> of each document, add any relevant category
references:

<textClass>
   <catRef target="tdt:dtWrit"/>
[...]
</textClass>

The private URI prefix is documented using a <prefixDef> element.
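
For illustration, a prefix such as tdt: might be declared roughly as follows. This is only a sketch and not necessarily how the project declares it: the file name taxonomies.xml and both patterns are assumptions made for the example.

```xml
<encodingDesc>
  <listPrefixDef>
    <!-- Hypothetical mapping: tdt:dtWrit resolves to taxonomies.xml#dtWrit -->
    <prefixDef ident="tdt" matchPattern="(.+)" replacementPattern="taxonomies.xml#$1">
      <p>Pointers with the tdt: prefix refer to category elements
         in the document-type taxonomy.</p>
    </prefixDef>
  </listPrefixDef>
</encodingDesc>
```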

The advantage of this is that you can easily combine a whole set of
different types of categorizations from different taxonomies; your
taxonomies are centrally defined, clearly documented, and easily
maintained; and in our case at least, we can process the taxonomies to
create <valList>s in our ODD file so that the range of options available
for e.g. catRef/@target is constrained and documented in the schema,
making encoding much easier.
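
As a rough idea of what such a generated constraint could look like in an ODD, here is a minimal sketch. Only the value tdt:dtWrit comes from the example above; the description text is invented, and a real ODD generated from the taxonomies may be structured differently.

```xml
<elementSpec ident="catRef" module="header" mode="change">
  <attList>
    <attDef ident="target" mode="change">
      <!-- Closed list: only the values enumerated here are permitted -->
      <valList type="closed" mode="replace">
        <valItem ident="tdt:dtWrit">
          <desc>Writ (hypothetical description)</desc>
        </valItem>
        <!-- one valItem per category in the taxonomy would follow -->
      </valList>
    </attDef>
  </attList>
</elementSpec>
```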

You can see example taxonomies in this file:

<http://mapoflondon.uvic.ca/includes.xml>

and example catRefs in this:

<http://mapoflondon.uvic.ca/ABCH1.xml>

Hope this helps,
Martin

On 2016-12-08 10:24 AM, John P. McCaskey wrote:

> What is the best approach, sticking to the stock schema?

Re: Encoding key-value pairs in an <abstract>

Frederik Elwert
In reply to this post by John P. McCaskey-2
Dear John,

You should probably take a look at the feature structure system. That seems
to match your description.
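
As a minimal sketch of how the second formulaic abstract might look as a feature structure (the feature names and the value of @type are assumptions chosen for illustration, and the bracketed placeholders are kept as in the original question):

```xml
<fs type="abstract">
  <!-- Free-text values go in <string> -->
  <f name="inventor"><string>[insert name]</string></f>
  <f name="drug"><string>[insert chemical name]</string></f>
  <!-- Values from a fixed set go in <symbol>; @value may not contain
       spaces, so "military facility" would need a form such as
       militaryFacility -->
  <f name="facility"><symbol value="university"/></f>
</fs>
```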

Best,
Frederik



On Thursday, 8 December 2016, 19:24:00 CET, John P. McCaskey
<[hidden email]> wrote:

> What is the best approach, sticking to the stock schema?

--
Dr. Frederik Elwert
Centrum für Religionswissenschaftliche Studien
Ruhr-Universität Bochum

Re: Encoding key-value pairs in an <abstract>

John P. McCaskey-2

Yes, that’s exactly what I’m looking for. There is a complete, general-purpose, data-typed key-value data structure in TEI and I never noticed it. Wow.

Maybe the name threw me off. I’ve seen “key-, name- or attribute-value” but never “feature-value”. Is that specific to linguistic analysis? Maybe “18 Feature Structures” could be edited to highlight the general-purpose nature of this part of TEI.

Using feature-value pairs in <profileDesc> could be very helpful to those describing documents in Omeka. But there, <fs> can be used only far down, inside <profileDesc><textClass><classCode>. Could it be allowed directly in <textClass> or even <profileDesc>?
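
To make the nesting concrete, this is roughly what the placement inside <classCode> would look like. It is a sketch under assumptions: the @scheme pointer #docProfile and the feature names are hypothetical, not anything defined by TEI.

```xml
<profileDesc>
  <textClass>
    <!-- @scheme is required on classCode; this target is made up -->
    <classCode scheme="#docProfile">
      <fs type="docProfile">
        <f name="type"><symbol value="charter"/></f>
        <f name="court"><symbol value="secular"/></f>
        <f name="witness"><string>someone’s name</string></f>
      </fs>
    </classCode>
  </textClass>
</profileDesc>
```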

Thanks,
John


On 12/8/2016 2:12 PM, Frederik Elwert wrote:

> Dear John,
>
> you should probably take a look at the feature structure system. That seems to match your description.
>
> Best,
> Frederik


Re: Encoding key-value pairs in an <abstract>

John P. McCaskey-2

To close this out, in case it helps anyone else:

I was too quick to dismiss <keywords>. Elli Mylonas reminded me that <term> can (of course!) have @type. So I can use:

<textClass>
  <keywords>
    <term type="type">charter</term>
    <term type="court">secular</term>
    <term type="witness">someone’s name</term>
  </keywords>
</textClass>

Thanks, Elli.

John


On 12/8/2016 9:58 PM, John P. McCaskey wrote:

> Yes, that’s exactly what I’m looking for. There is a complete, general-purpose, data-typed key-value data structure in TEI and I never noticed it. Wow.
