Interchange of different kinds

Interchange of different kinds

Julia Flanders
This may be a really silly question, or a silly way of approaching the  
topic, but as long as we're thinking things over I would like to get  
the thoughts of the group on this:

To what extent does the TEI's successful functioning as an interchange  
language depend on its vocabulary and the way it associates semantics  
with specific terms, and to what extent does it depend on the specific  
structures (i.e. where elements can go and what they can contain) that  
are specified in the TEI schema?

In using the phrase "successful functioning" I don't mean to set aside  
the debate about whether it *does* function successfully as an  
interchange language--I'm really curious as to whether vocabulary or  
grammar is really the key to whatever success we do attribute to, or
seek from, the TEI.

It seems to me that the answer to this question might affect how we  
approach the problem of consistency and constraint. But I may be wrong  
about this and I'd be glad to know more about it than I do.

best, Julia

Re: Interchange of different kinds

Martin Holmes
On 11-08-25 09:02 AM, Julia Flanders wrote:
> This may be a really silly question, or a silly way of approaching the
> topic, but as long as we're thinking things over I would like to get
> the thoughts of the group  on this:
>
> To what extent does the TEI's successful functioning as an interchange
> language depend on its vocabulary and the way it associates semantics
> with specific terms, and to what extent does it depend on the specific
> structures (i.e. where elements can go and what they can contain) that
> are specified in the TEI schema?

This is a really important point. For me, the vocabulary is key, and the
constraints are often annoyingly restrictive -- why can't element X
appear inside element Y, when my document has obvious instances of
feature X inside feature Y?

For instance, why can't I have this?

<div>
  <p></p>
  <head></head>
</div>

Here, I'm trying to use the vocabulary accurately to describe what
appears in my document (a heading after a paragraph), but the "grammar"
constraints are frustrating me, and I begin to wonder why they exist at
all. But there are strong feelings in the community about how documents
are expected to be structured, and many would, I'm sure, tell me that my
document _can't_ actually be like that; a <head> after a <p> must imply
a new <div>.

Cheers,
Martin

--
Martin Holmes
University of Victoria Humanities Computing and Media Centre
([hidden email])

Re: Interchange of different kinds

Doug Reside
Very much agree with Martin.  Just yesterday, I really wanted to be
able to put two heads inside one div, because two distinct headings
clearly exist.
Doug



Re: Interchange of different kinds

Syd Bauman
> Very much agree with Martin. Just yesterday, I really wanted to be
> able to put two heads inside one div, because two distinct headings
> clearly exist.

Pray tell, what stopped you?

(Not the vanilla TEI schema, for sure, which permits any number of
<head> elements, which may be differentiated by type= and subtype=.)

Re: Interchange of different kinds

Doug Reside
Using the tei_drama.dtd schema, Eclipse complained it wasn't valid.
Perhaps I should have used another flavor?

Doug



Re: Interchange of different kinds

Mandell, Laura C. Dr.
In reply to this post by Julia Flanders
Julia:

I think that's a fascinating question.  It must be a little of both, yes?
Sometimes the TEI's grammatical functioning seems to be wiser than I am as
an editor--it makes me think about what I'm really designating as "X"
because "Xs" don't usually go inside "Ys."  That structure is really
irritating mainly when I'm trying to use TEI for something that it really
wasn't designed to encode (page layout, e.g.).  On the other hand, the
ease with which TEI of all sorts can be mapped onto NINES RDF is always a
joy to me: virtually everything else seems hard by comparison.  And that's a
matter of semantics (I think).

What do you make of the difference, in terms of the interoperability
question?

Laura




Re: Interchange of different kinds

Sebastian Rahtz
In reply to this post by Doug Reside
On 25 Aug 2011, at 20:23, Doug Reside wrote:

> Using the tei_drama.dtd schema, Eclipse complained it wasn't valid.
> Perhaps I should have used another flavor?


I think you need to show us your XML file. Multiple <head>s at the top of the <div>
are entirely normal. tei_drama is a very simple customization which just
selects some modules.

This:

<div>
 <head>...</head>
 <p>...</p>
 <head>...</head>
 ..
</div>

would not be legal.
--
Sebastian Rahtz      
Head of Information and Support Group, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

I only ask of God
that I not be indifferent to the future

Re: Interchange of different kinds

Julia Flanders
In reply to this post by Julia Flanders
...and, we should add, there's a good reason in this case why <head>  
following <p> is not legal: in TEI, the presence of a <head> signals  
the start of a new division.

This is relevant to Laura's point about TEI structures sometimes  
conveying a kind of wisdom--the structural requirements can serve in  
some cases as a kind of control on semantics as well. So if I look at  
a document and I think I see

<head>
<p>
<p>
<p>
<head>
<p>
<p>

the TEI wants to educate me to see that in TEI terms as

<div>
        <head/>
        <p/>
        <p/>
        <p/>
</div>
<div>
        <head/>
        <p/>
        <p/>
</div>

and this is reasonable, I think. But where the TEI prohibits me from  
putting a figure in an odd place because no one anticipated that any  
book would ever have a figure in such a place (or whatever), then  
that's a restraint I'd rather be without. The WWP early on had trouble  
with the content model of <note> because it couldn't contain the kinds  
of things that apparently early modern women wanted to put into notes.
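
A minimal sketch of the regrouping described above, in which each <head> is read as opening a new division. It is an illustration only, not something the TEI provides: the function name group_into_divs is hypothetical, and the sample is un-namespaced for brevity (real TEI documents would carry the TEI namespace).

```python
import xml.etree.ElementTree as ET


def group_into_divs(flat_xml: str) -> ET.Element:
    """Wrap each <head> and the elements that follow it in its own <div>."""
    source = ET.fromstring(flat_xml)
    body = ET.Element("body")
    current = None
    for child in source:
        # A <head> starts a new division; so does any content that
        # appears before the first <head>.
        if child.tag == "head" or current is None:
            current = ET.SubElement(body, "div")
        current.append(child)
    return body


# Hypothetical flat input: head, three paragraphs, head, two paragraphs.
flat = (
    "<body><head>A</head><p>1</p><p>2</p><p>3</p>"
    "<head>B</head><p>4</p><p>5</p></body>"
)
grouped = group_into_divs(flat)
print(ET.tostring(grouped, encoding="unicode"))
```

The sketch encodes the rule "a <head> implies a new <div>" as a mechanical rewrite, which is exactly the reading that an encoder with a different theory of the text might want to reject.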

From the standpoint of interchange, though, I could imagine the
following as a good situation:

1. People/projects for whom structure matters enough to take it  
seriously use the schema to constrain their document structures

2. Groups of people/projects who need to interchange data and use  
common tools agree on structural constraints; if these are eccentric,  
they use a customization to do so, and if they're really eccentric,  
that customization won't be conformant.

3. People/projects for whom structure doesn't matter, or who have
other structural considerations in play, can still use the vocabulary
meaningfully to say "I have one of those things you call an <epigraph>."

Some people writing tools to process TEI might want to take advantage  
of structural predictability ("Only process <author> inside  
<titleStmt>") and for them, conformance would matter.
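
As a hedged illustration of a tool that relies on structural predictability, the sketch below collects only the <author> elements that sit directly inside <titleStmt>. The sample document is a hypothetical fragment (not a complete, valid TEI file) and the function name is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment: one author in the title statement,
# another inside a bibliographic citation.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc>
    <titleStmt><title>Sample</title><author>Header Author</author></titleStmt>
    <sourceDesc><bibl><author>Cited Author</author></bibl></sourceDesc>
  </fileDesc></teiHeader>
</TEI>"""


def title_statement_authors(doc: str) -> list[str]:
    """Return the text of each <author> that is a child of <titleStmt>."""
    root = ET.fromstring(doc)
    found = root.findall(".//tei:titleStmt/tei:author", TEI_NS)
    return [(author.text or "").strip() for author in found]


authors = title_statement_authors(SAMPLE)
print(authors)
```

A tool written this way silently misses authors recorded anywhere else, which is why conformance to the expected structure matters to it.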

Others (e.g. groups of consenting adults in item 2 above) might need  
to develop tools that take advantage of structural predictability of a  
sort not anticipated/covered by the TEI, and they could still use the  
customization mechanism to provide useful constraint in support of  
their own tools, and their data would still use the same recognizable  
vocabulary for non-structure-based interchange. (And the use of the  
customization mechanism would make the specifics of their deviance  
visible and to some extent processable.)

Still other tool-writers might only need consistent vocabularies to  
accomplish their goals: for instance, tools that are interested in  
searching for/processing named entities--they don't care whether my  
<persName>s are in an <lg> that is nested inside a <div> or not. For  
them, all of this data (conformant or not) is perfectly intelligible  
from an interchange perspective.
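
By contrast, a vocabulary-only tool can be sketched in a few lines. The example below gathers every <persName> wherever it occurs; the sample is deliberately loose about structure (one name is inside an <lg> within a <div>, the other in a bare <p>) and is a hypothetical fragment, as is the function name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment whose structure the function never inspects.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0"><body>
  <div><lg><l>Sing, <persName>Achilles</persName></l></lg></div>
  <p>Then <persName>Hector</persName> spoke.</p>
</body></text>"""


def person_names(doc: str) -> list[str]:
    """Collect the text of every <persName>, regardless of its ancestors."""
    root = ET.fromstring(doc)
    return [
        "".join(element.itertext()).strip()
        for element in root.iter(TEI + "persName")
    ]


names = person_names(SAMPLE)
print(names)
```

Nothing in this function depends on the content model, so it behaves the same on conformant and non-conformant data as long as the element name is used with its usual meaning.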

I think it's really useful to break down "interchange" into a more  
specific set of tasks and requirements and to think carefully about  
what conditions are really needed to bring it about. I also think that  
in many cases the real enemy of interchange isn't the design of the  
system (i.e. the TEI schema) but the implementation (i.e. the level of  
consistency achieved by any given project in its encoding) and I think  
it would be very helpful to:

--improve and extend the tools we offer people for creating  
customizations
--place greater emphasis on using customization as a way of avoiding  
inconsistency and enforcing local encoding decisions
--provide tools for the detection of inconsistency (e.g.
Schematron-for-novices). It's easy to say "oh, you can test for that with
Schematron" but that may be quite difficult at the moment for someone
starting a small project with limited access to technical support.
Simple GUI Schematron editor for TEI, anyone?
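
To make the idea of inconsistency detection concrete, here is a small sketch in plain Python. It is a stand-in for the kind of check one might otherwise write in Schematron, not Schematron itself, and the rule it applies (flagging type values that differ only in case or surrounding spaces) is a hypothetical local convention chosen for illustration.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def inconsistent_type_values(doc: str, tag: str = "div") -> dict[str, set[str]]:
    """Group @type values on `tag` that differ only in case or whitespace."""
    variants = defaultdict(set)
    for element in ET.fromstring(doc).iter(tag):
        value = element.get("type")
        if value is not None:
            variants[value.strip().lower()].add(value)
    # Keep only the normalized values that were spelled more than one way.
    return {key: forms for key, forms in variants.items() if len(forms) > 1}


# Hypothetical, un-namespaced sample with one inconsistent spelling.
SAMPLE = '<body><div type="chapter"/><div type="Chapter"/><div type="letter"/></body>'
report = inconsistent_type_values(SAMPLE)
print(report)
```

A check like this says nothing about whether "chapter" is the right value, only that the project has not applied its own decision uniformly.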

Best, Julia


Re: Interchange of different kinds

Martin Holmes
On 11-08-25 01:38 PM, Julia Flanders wrote:

> ...and, we should add, there's a good reason in this case why <head>
> following <p> is not legal: in TEI, the presence of a <head> signals
> the start of a new division.
>
> This is relevant to Laura's point about TEI structures sometimes
> conveying a kind of wisdom--the structural requirements can serve in
> some cases as a kind of control on semantics as well. So if I look at
> a document and I think I see
>
> <head>
> <p>
> <p>
> <p>
> <head>
> <p>
> <p>
>
> the TEI wants to educate me to see that in TEI terms as
>
> <div>
> <head/>
> <p/>
> <p/>
> <p/>
> </div>
> <div>
> <head/>
> <p/>
> <p/>
> </div>
>
> and this is reasonable, I think.

I don't actually think so. If I'm encoding the document, I'm encoding MY
theory of the text, not the TEI's theory of it, and if I think my text
consists of a <div> that has a <head>, a brief paragraph with some kind
of motif or whatever, and then another <head>, I think I should be able
to do that.

But as your comments below illustrate, we all have particular bugbears
along these lines, all different; the overriding question for me is why
the TEI should be so resistant to loosening up such restrictions, more
or less as soon as anyone demonstrates a need for them. We can't
continually be telling people that the TEI understands the structure of
their document better than they do (or expressly forbids their
particular theory of its structure); that seems heavy-handed and
shortsighted to me.

On the other hand, allowing all tags everywhere would be a kind of
anarchy, I suppose, and Sebastian's job maintaining the stylesheets
would be impossible.

Cheers,
Martin


--
Martin Holmes
University of Victoria Humanities Computing and Media Centre
([hidden email])

Re: Interchange of different kinds

Sebastian Rahtz
> I don't actually think so. If I'm encoding the document, I'm encoding MY
> theory of the text, not the TEI's theory of it, and if I think my text
> consists of a <div> that has a <head>, a brief paragraph with some kind
> of motif or whatever, and then another <head>, I think I should be able
> to do that.

Sounds like your motif is one of a class of objects the TEI classifies as model.divTop,
and which includes <head>. So you are not disagreeing with the TEI,
and it supports what you want to do.

>
> the overriding question for me is why
> the TEI should be so resistant to loosening up such restrictions, more
> or less as soon as anyone demonstrates a need for them.

I don't think the TEI _is_ "so resistant". Feature requests come in, they are looked
at by the Council as is right and proper, and usually agreed to once it's clear
(often after dialogue with the originator) that a) the need is genuine, not a misunderstanding,
and b) it does not break anything. The number of occasions when the TEI Elder Statesmen
say "pshaw! that does not correspond to Our Theory of Text" is really rather small.

>
> On the other hand, allowing all tags everywhere would be a kind of
> anarchy, I suppose, and Sebastian's job maintaining the stylesheets
> would be impossible.


Because I deal in presentation, often to HTML, what confuses me is when things cross
the boundary in unforeseen ways between the broad groups of div-like, block-like, inline, and floating.
If suddenly one has to allow for <div> inside <del>, it's pretty unpleasant.  And what really makes
me sweat is recursive things like

<p> ..  <note> <p>.... <q>... <p> ... <q> .<note>...</note>..</q> ...</p> ...</q> ... </p> .. </note> .. </p>

which I am sure is legal TEI (and almost certainly used in TCP texts :-} ) - the problem there is the lack
of clues over how <note> and <q> are to be regarded in presentation.
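
One way a stylesheet author might at least detect such recursion before rendering is sketched below. The sample reproduces the nesting pattern above with the ellipses replaced by placeholder letters, and the helper name max_nesting is hypothetical; it reports how deeply each element is nested inside elements of the same name, and does nothing to resolve how they should be presented.

```python
import xml.etree.ElementTree as ET


def max_nesting(element: ET.Element, tag: str, depth: int = 0) -> int:
    """Return the deepest level at which `tag` occurs inside itself."""
    here = depth + 1 if element.tag == tag else depth
    return max([here] + [max_nesting(child, tag, here) for child in element])


# Same shape as the example above, un-namespaced, with placeholder text.
SAMPLE = "<p>a<note><p>b<q>c<p>d<q>e<note>f</note></q></p></q></p></note></p>"
root = ET.fromstring(SAMPLE)
depths = {tag: max_nesting(root, tag) for tag in ("p", "note", "q")}
print(depths)
```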

But I do like Julia's point that merely mining a <TEI> document for all occurrences
of <persName> is an entirely reasonable use case, and one where
content model restrictions are not needed.
--
Sebastian Rahtz      
Head of Information and Support Group, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

I only ask of God
that I not be indifferent to the future

Re: Interchange of different kinds

Doug Reside
In reply to this post by Sebastian Rahtz
I think I must have had a tag in between two heads.  I'm trying to
reproduce the problem now and it validates nicely as you say (unless I
stick a tag in-between somewhere).  I stand corrected (and hang my
perfectly-valid multiple heads in shame)

Doug





Re: Interchange of different kinds

Espen S. Ore
In reply to this post by Martin Holmes
On 25.08.2011 19:05, Martin Holmes wrote:

> On 11-08-25 09:02 AM, Julia Flanders wrote:
>> This may be a really silly question, or a silly way of approaching the
>> topic, but as long as we're thinking things over I would like to get
>> the thoughts of the group on this:
>>
>> To what extent does the TEI's successful functioning as an interchange
>> language depend on its vocabulary and the way it associates semantics
>> with specific terms, and to what extent does it depend on the specific
>> structures (i.e. where elements can go and what they can contain) that
>> are specified in the TEI schema?
>
> This is a really important point. For me, the vocabulary is key, and the
> constraints are often annoyingly restrictive -- why can't element X
> appear inside element Y, when my document has obvious instances of
> feature X inside feature Y?
>
> For instance, why can't I have this?
>
> <div>
> <p></p>
> <head></head>
> </div>
>

And not to forget our old friend:

<div>
<p></p>
<div><p></p></div>
<p></p>
</div>

And floatingText is not the same as a <div>, so it has a different
content model and is not an equivalent.

Espen Ore
University of Oslo