<text type=?> - <div type=?>

Georg Vogeler-2
Hello,

A similar case may already have been discussed, but I wasn't able to
find an answer in the P5 Guidelines or the archive:

I'm encoding the scholarly editions of a collection of medieval
charters. In the view of the historian and the medieval scribe, each
charter was a single text with legal value in its own right.
Similarly, the editing scholar and the medieval scribe take the charter
as part of a collection, sometimes referring to an earlier inscribed text
("testes ut supra" ...).
In terms of TEI P5: would the edition be a group of texts
(text/group/text) or one single text with divisions (text/div)?

Best

Georg


Re: <text type=?> - <div type=?>

Peter Boot-2
Hello Georg,

At the Huygens Institute we encode medieval miscellanies and consider
each text to be a <text> rather than a <div>. I'd say your charters are
<text>s too.

Peter



Re: <text type=?> - <div type=?>

Dot Porter
In reply to this post by Georg Vogeler-2
I'm working on a collection of Carolingian laws, and we're using <div>
for each law. But these laws are divided into books, and each book is
its own <text>. I'd say it depends on how much of a collection your
collection is. If there is an internal organization, better to use
<div>; if it's more random, use <text>.

Dot

On 6/27/06, Georg Vogeler <[hidden email]> wrote:

> Hello,
>
> A similar case may already have been discussed, but I wasn't able to
> find an answer in the P5 Guidelines or the archive:
>
> I'm encoding the scholarly editions of a collection of medieval
> charters. In the view of the historian and the medieval scribe, each
> charter was a single text with legal value in its own right.
> Similarly, the editing scholar and the medieval scribe take the charter
> as part of a collection, sometimes referring to an earlier inscribed text
> ("testes ut supra" ...).
> In terms of TEI P5: would the edition be a group of texts
> (text/group/text) or one single text with divisions (text/div)?
>
> Best
>
> Georg
>


--
***************************************
Dot Porter, Program Coordinator
Collaboratory for Research in Computing for Humanities
University of Kentucky
351 William T. Young Library
Lexington, KY  40506

[hidden email]          859-257-9549
***************************************


Re: <text type=?> - <div type=?>

Lou Burnard-5
In reply to this post by Georg Vogeler-2
This is one of those recurrent questions.

My view is that a div is by definition incomplete. So if you are dealing
with something that only makes sense if regarded as part of something
else and could not be taken as a free standing item, then it's a div.
Otherwise it is a text.

So I agree with Peter that it makes good sense to treat individual texts
in a miscellany, or individual charters in a cartulary, as distinct
texts. In the same way, I would advocate treating separate poems in a
collection of poems as texts, not divs.




Re: <text type=?> - <div type=?>

ron.vandenbranden
Administrator
In reply to this post by Georg Vogeler-2
>
>So I agree with Peter that it makes good sense to treat individual texts
>in a miscellany, or individual charters in a cartulary, as distinct
>texts. In the same way, I would advocate treating separate poems in a
>collection of poems as texts, not divs.
>

For the encoding of the complete works of a 16th-century Flemish poetess, I
am considering going even further:
* treat every poem as a <TEI> document
* treat every bundle as a <teiCorpus> element, including the poems via
entity references

This strategy is informed by the desire to assign every poem a high level of
autonomy, so that
1) poems can be transcribed and thus validated as unitary texts
2) poems that come in different versions in different bundles / manuscript
collections can in a later phase more easily be collated against each other

However, I am running into problems with the non-poem contents of bundles,
such as title pages, prefatory matter and back matter. There doesn't seem to
be an option to encode <front> and <back> matter for an entire corpus. The
only thing I can think of within standard TEI is to encode e.g. the title
page, table of contents, ... as separate <TEI> documents as well, but then
this has to be done as <body> content. This does not seem very satisfactory,
if not straight TEI abuse. Or would it make sense to give up the <teiCorpus>
approach for the bundles, and instead encode every bundle as a <TEI> element
containing <group> elements in which links are encoded to the relevant poem
<text> parts?

Does anyone have a better suggestion to deal with either
1) front / back matter at <teiCorpus> level
2) the encoding of the same texts at different levels (as autonomous texts
AND parts of a bundle)

Those would be greatly appreciated,

Ron


Re: <text type=?> - <div type=?>

Lou Burnard-5
In reply to this post by Georg Vogeler-2
Ron Van Den Branden wrote:

>> So I agree with Peter that it makes good sense to treat individual texts
>> in a miscellany, or individual charters in a cartulary, as distinct
>> texts. In the same way, I would advocate treating separate poems in a
>> collection of poems as texts, not divs.
>>
>>
>
> For the encoding of the complete works of a 16th-century Flemish poetess, I
> am considering going even further:
> * treat every poem as a <TEI> document
> * treat every bundle as a <teiCorpus> element, including the poems via
> entity references
>
I am not sure that I understand correctly what you mean by "bundle". Is
a bundle a specific manuscript or early edition, for example?

If so, it would make sense to treat each one as a distinct <TEI>
element, with its own header, and a degree of autonomy. If it contains
its own front and back matter, that would be represented within the
outermost <text>; if it contains multiple poems or other things they
would be represented within a <group> which would replace the <body> of
the outermost text. See the discussion of <group> in the Guidelines for
some examples.

Then you can have a whole bunch of those inside a <teiCorpus>.
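
A minimal sketch of the arrangement described above, assuming P5 element names and the TEI namespace. The titles, the front and back matter, and the poem lines are placeholders, not taken from any real edition:

```xml
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <!-- header describing the corpus as a whole -->
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Placeholder: all sources</title></titleStmt>
      <publicationStmt><p>Unpublished sketch</p></publicationStmt>
      <sourceDesc><p>Placeholder</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- one <TEI> per manuscript or early edition, each with its own header -->
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt><title>Placeholder: one manuscript</title></titleStmt>
        <publicationStmt><p>Unpublished sketch</p></publicationStmt>
        <sourceDesc><p>Placeholder</p></sourceDesc>
      </fileDesc>
    </teiHeader>
    <text>
      <!-- front matter of the manuscript as a whole -->
      <front>
        <div type="preface"><p>Placeholder prefatory matter</p></div>
      </front>
      <!-- <group> takes the place of <body> in the outermost <text> -->
      <group>
        <text>
          <body><lg><l>Placeholder line of the first poem</l></lg></body>
        </text>
        <text>
          <body><lg><l>Placeholder line of the second poem</l></lg></body>
        </text>
      </group>
      <!-- back matter of the manuscript as a whole -->
      <back>
        <div type="index"><p>Placeholder back matter</p></div>
      </back>
    </text>
  </TEI>
</teiCorpus>
```

Further <TEI> elements for other manuscripts or editions would follow the first one inside the same <teiCorpus>.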

The mapping of all this structure to system entities (files etc) is a
different matter however.

Or have I completely misunderstood the problem?




Re: <text type=?> - <div type=?>

Gautier Poupeau
In reply to this post by Georg Vogeler-2
Hi,

Georg already knows my position on this question. Like Peter and Lou, I
think each charter represents a particular text, so I too use the <text>
element for our edition (e.g.
http://elec.enc.sorbonne.fr/cartulaireblanc/, cf.
http://elec.enc.sorbonne.fr/cartulaireblanc/xml/tremblay.xml). Using
this element makes it possible to record the charter's metadata (analysis,
dating, bibliography...) in the <front> element.
I prefer this solution to a <TEI> element for each charter and a
<teiCorpus> element to group the texts, because I don't think a
cartulary or a book of charters is an anthology but rather a
documentary unit.
But it's true that the type attribute is missing from this element; it
could be useful when we encode a cartulary which contains charters and a
rental. Is it possible to add this attribute to the <text> element in P5?
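
As an illustration only, the approach described above might look roughly like the following fragment. This is a sketch, not the markup of the edition linked above: the type values on <div>, the n values and all content are invented placeholders.

```xml
<text>
  <group>
    <!-- each charter is a <text> of its own -->
    <text n="1">
      <!-- editorial metadata for this charter goes in <front> -->
      <front>
        <div type="analysis"><p>Placeholder summary of the charter</p></div>
        <div type="bibliography"><p>Placeholder reference</p></div>
      </front>
      <body>
        <p>Placeholder text of the charter</p>
      </body>
    </text>
    <text n="2">
      <front>
        <div type="analysis"><p>Placeholder summary of the next charter</p></div>
      </front>
      <body>
        <p>Placeholder text of the next charter</p>
      </body>
    </text>
  </group>
</text>
```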
Best regards
Gautier Poupeau



Re: <text type=?> - <div type=?>

ron.vandenbranden
Administrator
In reply to this post by Georg Vogeler-2
Thank you Lou,

>I am not sure that I understand correctly what you mean by "bundle". Is
>a bundle a specific manuscript or early edition, for example?
>

Exactly. Sorry for the vagueness.

In this project, the poems are the central locus of interest. The poems have
considerable length, and those occurring in multiple sources (ie.
manuscripts / early editions) will be collated against each other. In the
first phase, they will be transcribed by different people, so it would be
good to be able to validate the poems autonomously.

>If so, it would make sense to treat each one as a distinct <TEI>
>element, with its own header, and a degree of autonomy. If it contains
>its own front and back matter, that would be represented within the
>outermost <text>; if it contains multiple poems or other things they
>would be represented within a <group> which would replace the <body> of
>the outermost text. See the discussion of <group> in the Guidelines for
>some examples.
>
>Then you can have a whole bunch of those inside a <teiCorpus>.
>

That would be a corpus of manuscripts / editions, which is not what we're
aiming at. The problem with the <group> approach, in my opinion, is that the
poems have to be <text> fragments, that can't stand on their own.

Ron


Re: <text type=?> - <div type=?>

Michael Beddow-4
In reply to this post by Georg Vogeler-2
Ron Van Den Branden wrote:

> I am considering going even further:
> * treat every poem as a <TEI> document
> * treat every bundle as a <teiCorpus> element, including the
> poems via entity references

In this sort of case, I would tend to choose between  encoding as a
tei.Corpus with component documents on the one hand and a single document
with <group>ed <text>s on the other on the basis of metadata requirements.
The big difference between the corpus and the group approach is that the
latter allows only one teiHeader, whereas a corpus not only has its own
teiHeader on the container, but each constituent document has a teiHeader of
its own.  How much that matters is very much dependent on what the metadata
needs of the project in question are. Plainly, large corpora of generically,
historically, linguistically and/or structurally diverse texts which
tei.Corpus is primarily intended to encode will benefit from separate
teiHeaders. The less diverse and divergent the metadata requirements of a
collection are, the less likely they are to need multiple teiHeaders.

And as you yourself mention, it is possible to use entity inclusion (or
XIncludes) to assemble <group>ed <text>s. Inclusion via such techniques
isn't confined to corpus encoding.
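
A minimal sketch of the XInclude variant, under stated assumptions: the file names poem-001.xml and poem-002.xml are hypothetical, each file is assumed to have a <text> element in the TEI namespace as its root, and the parser is assumed to resolve the inclusions before validation takes place.

```xml
<text xmlns="http://www.tei-c.org/ns/1.0"
      xmlns:xi="http://www.w3.org/2001/XInclude">
  <group>
    <!-- each included file supplies one <text> member of the group -->
    <xi:include href="poem-001.xml"/>
    <xi:include href="poem-002.xml"/>
  </group>
</text>
```

If the poems were stored as complete documents instead, the inclusion would additionally need an xpointer attribute to pick out the <text> from each file.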

On the more specific issues

> This strategy is informed by the desire to assign every poem a high level of
> autonomy, so that
> 1) poems can be transcribed and thus validated as unitary texts
> 2) poems that come in different versions in different bundles / manuscript
> collections can in a later phase more easily be collated against each other

It seems to me that 1) is fully met by making each poem a <text> within a
group. I don't really see how treating each poem as a document within a
corpus would make any difference here. At the level of encoding management,
there is no reason why a transcriber-encoder shouldn't work on a document
instance where the <group> has only one <text> member if that seems
desirable. I don't see how 2) is made any easier by a multiple discrete
document approach, rather than a <group>ed multiple <text> one. So I can't
really see the gains that would justify the strain placed upon the idea of a
corpus by the attempt to accommodate front and back matter at macro level.

Perhaps things would become clearer if you could say more about what you see
as the issues behind
> the encoding of the same texts at different levels (as autonomous texts
> AND parts of a bundle)
and in what respects <group>ing of <text>s is inadequate to meet what you
need to express.

Michael Beddow


Re: <text type=?> - <div type=?>

ron.vandenbranden
Administrator
In reply to this post by Georg Vogeler-2
I agree that if metadata requirements are met, the <group> proposal seems
most natural. Yet, this seems to imply that the transcription has to occur
at collection level. Since only full <TEI> texts can be validated, there's
no way to validate single poems (ie. <text>s)?

>At the level of encoding management,
>there is no reason why a transcriber-encoder shouldn't work on a document
>instance where the <group> has only one <text> member if that seems
>desirable.

I'm afraid your hint isn't clear to me...

>Perhaps things would become clearer if you could say more about what you see
>as the issues behind
>> the encoding of the same texts at different levels (as autonomous texts
>> AND parts of a bundle)
>and in what respects <group>ing of <text>s is inadequate to meet what you
>need to express.
>

Well, some of the poems are unique; others occur in different versions. We
intend to create one parallel-segmented collated version for each of the
latter poems, thus 'unifying' different <text>s. A collection would then
consist of the unique poems + the unified versions. In this respect, it
seemed easier to think of the constituent parts as <TEI> documents. I must
confess, however, that this is still at the conception phase, so I might be
on the wrong track.

Ron


Re: <text type=?> - <div type=?>

Lou Burnard-5
In reply to this post by Georg Vogeler-2
Ron Van Den Branden wrote:
> I agree that if metadata requirements are met, the <group> proposal seems
> most natural. Yet, this seems to imply that the transcription has to occur
> at collection level. Since only full <TEI> texts can be validated, there's
> no way to validate single poems (ie. <text>s)?
>

This isn't true, if by "validate" you mean "what an XML
parser/validator does".

You can specify "text" as the start point when validating, in RELAX NG
or XML DTD terms (sorry, I don't do XSD, but I'd be surprised if the same
facility isn't there too).
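
In DTD terms, a sketch of what this could look like: the document type declaration names "text" as the document element, so a validating parser checks a lone <text> against the DTD. The P4 DTD location and the verse tag set used here are assumptions for the sake of illustration, and the poem line is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- the name after DOCTYPE is the element the parser expects as root -->
<!DOCTYPE text SYSTEM "http://www.tei-c.org/P4X/DTD/tei2.dtd" [
<!ENTITY % TEI.XML   "INCLUDE" >
<!ENTITY % TEI.verse "INCLUDE" >
]>
<text>
  <body>
    <lg>
      <l>A short poem</l>
    </lg>
  </body>
</text>
```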


>
> Well, some of the poems are unique; others occur in different versions. We
> intend to create one parallel-segmented collated version for each of the
> latter poems, thus 'unifying' different <text>s. A collection would then
> consist of the unique poems + the unified versions. In this respect, it
> seemed easier to think of the constituent parts as <TEI> documents.

Don't forget that <group> can be nested within <group>s.
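
One possible reading of this hint, sketched with placeholder content: a unique poem sits directly in the outer <group>, while the versions of a multiply attested poem are gathered in a nested <group>. The n values "A" and "B" are hypothetical source labels.

```xml
<text>
  <group>
    <!-- a poem that survives in one source only -->
    <text>
      <body><lg><l>Placeholder line of a unique poem</l></lg></body>
    </text>
    <!-- a poem that survives in several versions: one nested group -->
    <group>
      <text n="A">
        <body><lg><l>Placeholder line, version from source A</l></lg></body>
      </text>
      <text n="B">
        <body><lg><l>Placeholder line, version from source B</l></lg></body>
      </text>
    </group>
  </group>
</text>
```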




Re: <text type=?> - <div type=?>

Michael Beddow-4
In reply to this post by Georg Vogeler-2
[RVDB]

>>> this seems to imply that the transcription has to occur
>>> at collection level. Since only full <TEI> texts can be validated, there's
>>> no way to validate single poems (ie. <text>s)?

[MB]
>> At the level of encoding management,
>> there is no reason why a transcriber-encoder shouldn't work on a document
>> instance where the <group> has only one <text> member if that seems
>> desirable.

[RVDB]

>I'm afraid your hint isn't clear to me...

At the simplest level, I really meant only that a document instance like the
following is valid and can be edited with the full constraints of the DTD
applied by an editing application. Hence, if it is operationally convenient,
it's all an encoder needs to work on (and/or a validator module needs to
check). Merging of such documents (or rather of their innermost <text>
elements) into larger documents (with real teiHeader content etc.) can be
handled elsewhere in the workflow. And extraction of such stub documents
from a canonical repository with multiple <text>s is also an easily
automated task. A very simple XSLT pass will do it: and if you had, say,
oXygen integrated with eXist (I know you are interested in the latter from
Another Place...) you could probably arrange for extraction and
re-integration of such stubs to be transparent to front-line encoders.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI.2 SYSTEM "http://www.tei-c.org/P4X/DTD/tei2.dtd" [
<!ENTITY % TEI.XML   "INCLUDE" >
<!ENTITY % TEI.verse "INCLUDE" >
]>
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt><title></title></titleStmt>
      <editionStmt><p></p></editionStmt>
      <publicationStmt><p></p></publicationStmt>
      <sourceDesc><p></p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <group>
      <text>
        <body>
          <lg>
            <l>A short poem</l>
          </lg>
        </body>
      </text>
    </group>
  </text>
</TEI.2>
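
The extraction step mentioned above could be sketched in XSLT 1.0 roughly as follows. This is an illustration under assumptions, not a tested part of any workflow: the parameter name poem-id is hypothetical, and every <text> inside a <group> is assumed to carry an id attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical parameter: the id of the poem to extract -->
  <xsl:param name="poem-id"/>

  <xsl:output method="xml" encoding="UTF-8"
              doctype-system="http://www.tei-c.org/P4X/DTD/tei2.dtd"/>

  <!-- identity template: copy everything by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- inside a group, keep only the member text with the requested id -->
  <xsl:template match="group">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="text[@id = $poem-id]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Two limitations of this sketch: XSLT cannot write the internal DTD subset, so the tag-set entity declarations would have to be restored by some other step, and nested <group>s are not handled.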


Of course, you could add any text-specific front and back material into that
inmost <text> if desired. I don't see what would be gained by adopting a
fairly radically new markup scheme (single poem in a collection = TEI
document) or in what sense(s) it would allow validation that couldn't
equally well be done on the above model.

But maybe I haven't understood what is behind your remark about what can be
validated.

> Well, some of the poems are unique; others occur in different versions. We
> intend to create one parallel-segmented collated version for each of the
> latter poems, thus 'unifying' different <text>s. A collection would then
> consist of the unique poems + the unified versions. In this respect, it
> seemed easier to think of the constituent parts as <TEI> documents. I must
> confess, however, that this is still at the conception phase, so I might be
> on the wrong track.

These are familiar enough goals for an electronic critical edition --  which
doesn't mean that they, or their encoding, are trivial to achieve, of
course; but I don't know of anyone who has found that the associated tasks
were made easier by a "corpus" approach per se. My hunch is that going down
that route might bring a bunch of additional problems while taking you into
territory where you would be less able to draw on the prior experience of
others.

Michael Beddow


Re: <text type=?> - <div type=?>

Michael Beddow-4
In reply to this post by Georg Vogeler-2
Ron Van Den Branden wrote:

> In the
> first phase, they will be transcribed by different people, so it would be
> good to be able to validate the poems autonomously.
>

But you *can* do that, as I outlined in my previous posting.

> The problem with the <group> approach, in my opinion, is that the
> poems have to be <text> fragments, that can't stand on their own.
>

Ditto. I am at a loss to see what the problem is, at least in this part of
the issue, which seems to be the driving point behind the notion of encoding
each single *poem* within a given "bundle" as a separate TEI document and
treating the bundle as a corpus. And anyway, how would you then bundle up
the separate bundles to create the edition? By inventing a teiSuperCorpus
element?

Michael Beddow