LaTeX to TEI transformation tools?

LaTeX to TEI transformation tools?

Rosanna Cantavella
Dear list,

Would anybody kindly tell me whether any advances have been made in the automatic conversion of files from TeX to TEI-XML? We would be interested in using such a process to convert a synoptic edition of the complete works of the Catalan poet Ausiàs March (many thousands of lines in many witnesses).

Enjoy your summer/winter! Best,

Rosanna
--

Prof. Rosanna Cantavella
Universitat de València
https://uv.academia.edu/RosannaCantavella
Life Member, Clare Hall University of Cambridge
Editor, Magnificat Cultura i Literatura Medievals
https://ojs.uv.es/index.php/MCLM/

Re: LaTeX to TEI transformation tools?

Imsieke, Gerrit, le-tex
Hi Rosanna,

We have written a custom XSLT that works on customized LaTeXML [1]
output and produces TEI conforming to a customization that extends
the DTA-Basisformat [2]. The custom XSLT is on GitHub [3], but it has
not been released as open source yet. The LaTeX input contains some
macro definitions that the original typesetters devised, but it also
contains macros familiar to users of the (rel)edmac packages, such as
\edtext, \Afootnote, \lemma, and \Bfootnote.

So there’s a lot of “custom” in the paragraph above. I’d say if we
convinced BBAW to make our conversion public (or at least make it
available to you) and if we included our LaTeXML customization in the
git repo, you could use it as a starting point. But it’s nothing that
works out of the box, and I doubt that anything in this context will
work out of the box, mostly because of custom TeX macro definitions in
the input that LaTeXML does not support well yet, and because of
peculiarities in the target TEI specification.
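
To give a feel for the kind of mapping involved with the (rel)edmac
macros mentioned above, here is a minimal Python sketch. It is not the
XSLT/LaTeXML pipeline described in this message. It assumes the simplest
possible input, a non-nested \edtext{lemma}{\Afootnote{reading}}, and it
assumes, purely for illustration, that the footnote text can be dropped
into a single <rdg>. The function name edtext_to_app is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Matches only the simplest, non-nested form:
#   \edtext{lemma}{\Afootnote{reading}}
# Nested braces, \lemma overrides and \Bfootnote are deliberately NOT handled.
EDTEXT = re.compile(r"\\edtext\{([^{}]*)\}\{\\Afootnote\{([^{}]*)\}\}")


def edtext_to_app(line: str) -> str:
    """Rewrite simple edtext calls in one line as TEI <app> elements."""
    parts = []
    pos = 0
    for match in EDTEXT.finditer(line):
        # Text between apparatus entries is XML-escaped and passed through.
        parts.append(escape(line[pos:match.start()]))
        lemma, reading = match.group(1), match.group(2)
        # Assumption: the whole footnote becomes one <rdg>; real entries mix
        # sigla and readings and would need further parsing.
        parts.append(
            f"<app><lem>{escape(lemma)}</lem><rdg>{escape(reading)}</rdg></app>"
        )
        pos = match.end()
    parts.append(escape(line[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(edtext_to_app(r"one \edtext{two}{\Afootnote{too B}} three"))
```

Anything beyond this flat case (nesting, custom macros, witness sigla)
is where a real TeX parser such as LaTeXML becomes necessary.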

An additional difficulty of our project is that there are many
mathematical formulas with occasional hacks to get critical apparatus
markup into the formulas. These hacks are problematic both in the source
and in the target: the LaTeX apparatus markup is often not allowed in
math mode, and since the output is MathML within TEI, there may be no
<app>, <note>, etc. inside that MathML markup.

Gerrit

[1] https://dlmf.nist.gov/LaTeXML/
[2] http://www.deutschestextarchiv.de/doku/basisformat/
[3] https://github.com/telota/Leibniz-TEI/tree/master/latexml2tei/xsl

--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
[hidden email], http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt

Re: LaTeX to TEI transformation tools?

Peter Flynn-8
On 06/08/18 12:46, Rosanna Cantavella wrote:
> Dear list,
>
> Would anybody kindly tell me if advances in automatic conversion of
> files from TeX to TEI-XML have been made?

Not that I am aware of, and certainly not for plain TeX. For
LaTeX-to-XML, see
http://latex.silmaril.ie/formattinginformation/latexto.html and the note
about Pandoc.

Iff the LaTeX is well-formed (no custom macros, no redefining of
primitives, etc.) and uses the standard default commands as described in
the LaTeX Book, then Pandoc will do an excellent transformation to a
variety of XML formats including TEI Simple.

However...(you saw that coming :-) most LaTeX is anything but
well-formed in that sense. On the VERY rare occasions when I have to do
this (and I have one pending at the moment), I use a combination of
Emacs macros, global replaces, Linux shell scripts, and maybe Pandoc.
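
For the Pandoc route on well-formed LaTeX, a minimal Python sketch of
the call might look like this. It assumes pandoc is installed and on the
PATH; the helper names and the file names are hypothetical. The options
used (--from, --to, --standalone, --output) are standard Pandoc options,
and "tei" is Pandoc's TEI output format.

```python
import subprocess
from pathlib import Path


def pandoc_tei_command(src: Path, dest: Path) -> list[str]:
    """Build the pandoc command line for a LaTeX-to-TEI conversion."""
    return [
        "pandoc",
        "--from=latex",
        "--to=tei",
        "--standalone",  # emit a complete TEI document, not a fragment
        f"--output={dest}",
        str(src),
    ]


def convert(src: Path, dest: Path) -> None:
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_tei_command(src, dest), check=True)


if __name__ == "__main__":
    # Hypothetical file names; prints the command instead of running it.
    print(" ".join(pandoc_tei_command(Path("edition.tex"), Path("edition.xml"))))
```

With custom macros in the source, expect the result to need the kind of
manual or scripted clean-up described above.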

> We would be interested in converting a synoptic edition of the
> complete works of the Catalan poet Ausiàs March (many thousands of
> lines on many witnesses) using this process.
If it is regularly-formed (does things the same way every time) then it
is certainly possible.
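
One cheap way to judge how regularly formed a source is would be to
inventory the macros it uses before attempting any conversion. The
following Python sketch is only an illustration under simple
assumptions: macro names consist of letters only, and a % that is not
preceded by a backslash starts a comment. The function name
macro_inventory is hypothetical.

```python
import re
from collections import Counter

# Control words only (backslash followed by letters); control symbols
# such as \& or \% are not counted.
MACRO = re.compile(r"\\([A-Za-z]+)")
# A % not preceded by a backslash starts a comment that runs to end of line.
COMMENT = re.compile(r"(?<!\\)%.*")


def macro_inventory(tex: str) -> Counter:
    """Count how often each control word occurs outside comments."""
    without_comments = COMMENT.sub("", tex)
    return Counter(MACRO.findall(without_comments))


if __name__ == "__main__":
    sample = (
        r"\edtext{a}{\Afootnote{b}} \edtext{c}{\Afootnote{d}} % \ignored"
        "\n"
        r"\stanza x"
    )
    for name, count in macro_inventory(sample).most_common():
        print(f"{count:6d}  \\{name}")
```

A short list dominated by a handful of macros suggests a scripted
conversion is realistic; a long tail of one-off macros suggests a lot of
hand work.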

One route you might want to investigate, if the volume is large, and if
there is some money available, is using one of the many companies in the
Pacific Rim who do this kind of conversion.

///Peter

Re: LaTeX to TEI transformation tools?

Rosanna Cantavella
Thank you so much, Gerrit and Peter. What a luxury this list is: I'm being answered by the best!

OK, I can see that, although there are no miracles yet, the process has more possibilities than it had some years ago.

We'll study your excellent proposals (thanks again for them!) and I'm sure we'll be able to make a start from here on these texts.

Best wishes,


Rosanna


Re: LaTeX to TEI transformation tools?

David Farmer

For the math part of (La)TeX, one idea that has been used successfully
is to use XML delimiters for math mode (such as <m> or <me> instead
of $ and $$ or \(\) and \[\]), and leave the actual math in (La)TeX
markup. That keeps it human-readable, and no information has been lost.
Later you can convert just the math, in the unlikely event that you want
that content in MathML.

That is the approach used by the PreTeXt authoring markup language:

http://pretextbook.org

Regards,

David

P.S. You do have to convert the & and < in math mode to \amp and \lt.
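
As a rough illustration of the delimiter swap, here is a minimal Python
sketch. It assumes math is delimited only by $...$, \(...\), $$...$$ or
\[...\], with no escaped \$ in the text and no \\[2pt]-style line-break
spacing; the function names are hypothetical. It writes \amp and \lt,
the macros PreTeXt provides for & and <, and leaves everything else in
the math untouched. Text outside math is not XML-escaped here.

```python
import re

# Display math first, so that $$...$$ is not mistaken for two inline spans.
DISPLAY = re.compile(r"\$\$(.+?)\$\$|\\\[(.+?)\\\]", re.DOTALL)
INLINE = re.compile(r"\$(.+?)\$|\\\((.+?)\\\)", re.DOTALL)


def escape_math(tex_math: str) -> str:
    """Replace the two XML-hostile characters; keep the rest as LaTeX."""
    # The trailing space keeps the macro from running into a following letter.
    return tex_math.replace("&", r"\amp ").replace("<", r"\lt ")


def wrap_math(text: str) -> str:
    """Swap TeX math delimiters for <m> (inline) and <me> (display)."""

    def display(match: re.Match) -> str:
        body = match.group(1) or match.group(2)
        return f"<me>{escape_math(body)}</me>"

    def inline(match: re.Match) -> str:
        body = match.group(1) or match.group(2)
        return f"<m>{escape_math(body)}</m>"

    return INLINE.sub(inline, DISPLAY.sub(display, text))


if __name__ == "__main__":
    print(wrap_math(r"If $a<b$ then \[x&y\]"))
```

The math content itself stays as (La)TeX, so nothing is lost and it can
still be converted to MathML later if that is ever wanted.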


On Tue, 7 Aug 2018, Rosanna Cantavella wrote:

> Thank you so much, Gerrit and Peter. What a luxury this list is: I'm
> being answered by the best!