I’m setting up a basic pipeline for converting Word to TEI and I’m stuck on a couple of things.
teitodocx VSK.P11.docx test.xml
Warning: XML resolver not found; external catalogs will be ignored
What do these warnings actually mean?
/Users/ttasovac/Development/tei-stylesheets/docx/build-to.xml:35: The following error occurred while executing this line:
/Users/ttasovac/Development/tei-stylesheets/common/teianttasks.xml:159: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: file:/Users/ttasovac/Dropbox/-2-%20External%20Shares/VSK.P11/VSK.P11.docx; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
Now, this "Content is not allowed in prolog" error probably means that we have something before the XML declaration in the Word file. The docx file is a package containing many XML files, and I don’t know which file exactly to start looking at. Any thoughts?
Prof. Dr. Elisabeth Burr
On 06/08/17 08:00, Toma Tasovac wrote:
This one means you haven't said where the convertor is to look for the P5 source. So it's going to get a copy from the Vault. Which is fine.
I have no idea what this one means, beyond what it actually says.
If you're converting a docx file, the useful content is in the file called document.xml inside the folder called word.
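To look at that file directly, here is a minimal sketch, assuming Python 3 and relying only on the fact that a .docx is a ZIP package. The function names list_package_parts and read_document_xml are hypothetical helpers made up for illustration; they are not part of the TEI stylesheets.

```python
import zipfile


def list_package_parts(docx_path):
    """Return the names of all files stored inside the .docx package."""
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from the .docx package."""
    with zipfile.ZipFile(docx_path) as package:
        return package.read("word/document.xml")
```

Calling list_package_parts("VSK.P11.docx") should show what is in the package, and the first few bytes returned by read_document_xml should be the XML declaration if that file is sound.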
If you have a copy of oXygen, you can use that to step through the conversion, which might be an easier way to track down where the problem is. I wrote a little tutorial for that (but it's in French): http://lb42.github.io/oxyTransforms.html
If the command you're issuing is "teitodocx VSK.P11.docx test.xml", then it's probably not working because that command tries to turn a TEI file into a .docx file. You probably want docxtotei. As Lou says, the warnings aren't important. The first really only matters when you're working with ODDs, and maybe should be suppressed if you aren't. The second is a DTD thing, and maybe should also be suppressed. I haven't used an XML Catalog in over a decade I think...
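That would also fit the error message: its systemId is the .docx itself, so the XML parser was handed the ZIP package, which begins with the bytes "PK" rather than "<". A quick way to check what kind of file a conversion is about to receive, as a rough sketch assuming Python 3 (describe_input is a hypothetical helper, not something the stylesheets provide):

```python
import zipfile


def describe_input(path):
    """Classify a file as 'zip' (e.g. a .docx package), 'xml', or 'unknown'."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as handle:
        # Skip an optional UTF-8 byte order mark and leading whitespace.
        head = handle.read(64).lstrip(b"\xef\xbb\xbf").lstrip()
    return "xml" if head.startswith(b"<") else "unknown"
```

A file reported as "zip" is a candidate for docxtotei; one reported as "xml" is what teitodocx expects as its input.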
All the best,
On Sun, Aug 6, 2017 at 5:06 PM, Lou Burnard <[hidden email]> wrote: