Has anyone converted tei.2 files (ideally, those from the Library of Congress's American Memory project, DTD developed circa 1997) to TEI P5? I'm looking at a batch of ~1,700 files with transcripts of oral history interviews, and while I could rig up a script to perform a basic conversion, I thought I'd ask whether there is any previous work in this area.
Walking back to my hotel at the TEI Council meeting in Prague. There is a p4top5 stylesheet at
It isn't perfect, especially where people have customised, and the P5 it produces might be slightly out of date, but that is where I'd start if someone hasn't converted it already.
Dr James Cummings, Academic IT Services, University of Oxford
On 7 Feb 2017 4:10 p.m., Joe Wicentowski <[hidden email]> wrote:
There is an entire category of xsl scripts in the wiki for handling different aspects of this transition. http://wiki.tei-c.org/index.php/Category:P4toP5
The one I personally used is at http://wiki.tei-c.org/index.php/P4toP5NZETC and is a mix of strict migration issues, fixing local issues and fixing things in the text that weren't checked for by the previous schema.
...let us be heard from red core to black sky
On 8 February 2017 at 04:20, James Cummings <[hidden email]> wrote:
Thanks, James, Stuart, and those who replied off list. These look like great resources.
Having only started working with TEI just after the release of P5 in late 2007, I blindly assumed that these LOC documents, whose root element was <TEI2>, were necessarily circa P2. But I gather from the comments that these appear P4-esque.
I now see from the Vault that P4 used <TEI.2>, P3 used <tei.2>, P2 used <TEI.2>, and P1 used <teidoc> (!). Some guideposts for the TEI forensicists...
The DTD comments note that the DTD was updated to P3, but P3 didn't use <TEI.2>. The American Memory DTD notes: "LC staff renamed TEI element <TEI.2> as <TEI2>." And, to boot, the samples I've looked at use <tei2>, not <TEI2>; the entity files referenced in each TEI file are all missing; and the DOCTYPE declarations are encased in comments.
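As a rough way to check which root element names actually occur across a batch like this, something along the following lines might help. It is only a sketch: the function names `root_element_name` and `survey_roots` are made up for illustration, and it assumes each file parses as well-formed XML without needing the missing entity files.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def root_element_name(xml_text: str) -> str:
    """Return the root element's name exactly as written (case preserved)."""
    # A DOCTYPE wrapped in a comment is ignored by the parser, so only
    # the real root element is reported.
    return ET.fromstring(xml_text).tag


def survey_roots(documents) -> Counter:
    """Count how often each root element name occurs in a batch of documents."""
    return Counter(root_element_name(doc) for doc in documents)


if __name__ == "__main__":
    # Hypothetical sample mimicking the commented-out DOCTYPE described above.
    sample = "<!-- <!DOCTYPE tei2 SYSTEM 'example.dtd'> --><tei2><text/></tei2>"
    print(survey_roots([sample, "<TEI2><text/></TEI2>"]))
```

A tally like this would show quickly whether the batch is consistent (all `tei2`) or mixes several spellings of the root element.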
So it seems these LOC files represent a rather significant customization - some kind of hybrid.
On the plus side, the DTD is well documented and the TEI files themselves appear to be all well-formed XML - so assuming they're internally consistent, it shouldn't be too hard to pull the data out into a basic P5-compliant form.
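To illustrate what a "basic P5-compliant form" could start from, here is a minimal sketch in Python that renames the root element to `TEI` and moves every element into the TEI P5 namespace. The function name `to_p5_skeleton` is hypothetical, and this is not a complete migration: element and attribute renames between versions, the header structure, and any LOC-specific customizations would all need their own handling.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def to_p5_skeleton(xml_text: str) -> str:
    """Rename the root to TEI and put every element in the TEI namespace.

    Element names are otherwise kept as-is; a real conversion would also
    need to map renamed or re-cased elements and attributes.
    """
    root = ET.fromstring(xml_text)
    for el in root.iter():
        # Only touch elements that are not already namespace-qualified.
        if isinstance(el.tag, str) and not el.tag.startswith("{"):
            el.tag = f"{{{TEI_NS}}}{el.tag}"
    root.tag = f"{{{TEI_NS}}}TEI"
    # Serialize with TEI as the default namespace rather than an ns0: prefix.
    ET.register_namespace("", TEI_NS)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_p5_skeleton("<tei2><text><body><p>Hello</p></body></text></tei2>"))
```

Validating the output against a P5 schema afterwards would show which elements still need mapping.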
On Tue, Feb 7, 2017 at 2:54 PM, Stuart A. Yeates <[hidden email]> wrote:
Just a brief update: I've completed a basic conversion of the data in question and posted the conversion script, along with the resulting data and an eXist app generated with TEI Publisher, at https://github.com/joewiz/adst with some screenshots and directions.
The direct link to the conversion script is: https://github.com/joewiz/adst/blob/master/import-scripts/04-convert-to-tei-p5.xq.
If you haven't seen TEI Publisher, its project homepage is http://teipublisher.com.
Thanks again for everyone's help,
On Tue, Feb 7, 2017 at 4:12 PM, Joe Wicentowski <[hidden email]> wrote: