teitodocx warnings & errors


Toma Tasovac-3
Hi.

I’m setting up a basic pipeline for converting Word to TEI, and I’m stuck on a couple of things.

teitodocx VSK.P11.docx test.xml

1. Warnings

WARNING: No localsource set. Will get a copy from http://www.tei-c.org/Vault/P5/.

Warning: XML resolver not found; external catalogs will be ignored

What do these warnings actually mean?

2. Error

BUILD FAILED
/Users/ttasovac/Development/tei-stylesheets/docx/build-to.xml:35: The following error occurred while executing this line:
/Users/ttasovac/Development/tei-stylesheets/common/teianttasks.xml:159: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: file:/Users/ttasovac/Dropbox/-2-%20External%20Shares/VSK.P11/VSK.P11.docx; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.


Now, this “Content is not allowed in prolog” message probably means that there is something before the XML declaration in the Word file. The docx file is a package containing many XML files, and I don’t know which file to start looking at. Any thoughts?

All best,
T. 
--
Belgrade Center for Digital Humanities


Re: teitodocx warnings & errors

Elisabeth Burr-2

Dear Toma,

This is just to avoid duplicated work: there is already the DHConvalidator, which does what you have in mind. At the ADHO steering meeting we will discuss how the DHConvalidator can be developed further. I think it would be great if we could concentrate on one tool instead of developing new ones all the time.

Best,
Elisabeth (from Montréal)

Prof. Dr. Elisabeth Burr
Universität Leipzig

Re: teitodocx warnings & errors

Lou Burnard-6
In reply to this post by Toma Tasovac-3
On 06/08/17 08:00, Toma Tasovac wrote:

> WARNING: No localsource set. Will get a copy from http://www.tei-c.org/Vault/P5/.


This one means you haven't said where the convertor is to look for the P5 source. So it's going to get a copy from the Vault. Which is fine.

> Warning: XML resolver not found; external catalogs will be ignored


I have no idea what this one means, beyond what it actually says.

> What do these warnings actually mean?
>
> 2. Error
>
> BUILD FAILED
> /Users/ttasovac/Development/tei-stylesheets/docx/build-to.xml:35: The following error occurred while executing this line:
> /Users/ttasovac/Development/tei-stylesheets/common/teianttasks.xml:159: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: file:/Users/ttasovac/Dropbox/-2-%20External%20Shares/VSK.P11/VSK.P11.docx; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
>
> Now, this content not allowed in prolog thing probably means that we have something before the xml declaration in the Word file. The docx file is a package containing many xml files and I don’t know which file exactly to start looking at. Any thoughts?

If you're converting a docx file, the useful content is in the file called document.xml inside the folder called word.

If you have a copy of oXygen, you can use it to step through the conversion, which might be an easier way to track down where the problem is. I wrote a little tutorial for that (but it's in French): http://lb42.github.io/oxyTransforms.html
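
For anyone who wants to look inside the package without oXygen: a .docx file is a ZIP archive, so the parts can be listed and word/document.xml checked for well-formedness with a few lines of Python. This is only a minimal sketch using the standard library; the function names are made up for illustration and are not part of the TEI Stylesheets.

```python
import xml.etree.ElementTree as ET
import zipfile


def list_docx_parts(source):
    """Return the names of all files stored inside a .docx package.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return package.namelist()


def read_main_document(source):
    """Parse word/document.xml and return its root element.

    Raises KeyError if the part is missing and ET.ParseError if the
    part is not well-formed XML.
    """
    with zipfile.ZipFile(source) as package:
        with package.open("word/document.xml") as part:
            return ET.parse(part).getroot()


# Example use (the file name is taken from the original post):
#   print(list_docx_parts("VSK.P11.docx"))
#   print(read_main_document("VSK.P11.docx").tag)
```

If `read_main_document` succeeds, the main part of the Word file is well-formed, and the "prolog" error is more likely to come from how the file was handed to the converter than from the document itself.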



Re: teitodocx warnings & errors

Hugh Cayless-2
If the command you're issuing is "teitodocx VSK.P11.docx test.xml", then it's probably not working because that command tries to turn a TEI file into a .docx file. You probably want docxtotei. As Lou says, the warnings aren't important. The first really only matters when you're working with ODDs, and maybe should be suppressed if you aren't. The second is a DTD thing, and maybe should also be suppressed. I haven't used an XML Catalog in over a decade I think...
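
To illustrate why the parser complains at line 1, column 1: a .docx file is a ZIP archive and begins with the bytes "PK", not with "<", so anything that tries to read it as XML fails immediately. Here is a rough sketch of a check on a file's first bytes that suggests which script to run. The helper name is hypothetical, and it assumes that any ZIP input is a .docx and any XML input is TEI, which it does not verify.

```python
ZIP_MAGIC = b"PK\x03\x04"   # signature at the start of a non-empty ZIP archive
UTF8_BOM = b"\xef\xbb\xbf"  # optional byte order mark before XML content


def suggest_converter(first_bytes):
    """Suggest a conversion script name from the first bytes of a file.

    Returns "docxtotei" for ZIP input (assumed to be a .docx) and
    "teitodocx" for input that starts like XML (assumed to be TEI).
    Raises ValueError for anything else.
    """
    if first_bytes.startswith(ZIP_MAGIC):
        return "docxtotei"
    data = first_bytes
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    if data.lstrip().startswith(b"<"):
        return "teitodocx"
    raise ValueError("input looks like neither a ZIP package nor XML")


# Example use (the file name is taken from the original post):
#   with open("VSK.P11.docx", "rb") as handle:
#       print(suggest_converter(handle.read(16)))
```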

All the best,
Hugh
