Generating stripspace.xsl.model from ODD

Generating stripspace.xsl.model from ODD

Jakub Simek
Dear list members,

May I ask around again about generating the stripspace.xsl.model file in the TEI infrastructure? I posted this question here in June and have had no replies so far.

Could anyone please give me a hint as to how one would generate, from an ODD file, the list of TEI elements in which whitespace-only text nodes are considered insignificant and can therefore be removed?

The current standard TEI-All list is contained in the stripspace.xsl.model file, as stated in https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ST.html#STGAxs. This file can be found under http://www.tei-c.org/release/xml/tei/odd/stripspace.xsl.model.

But I wonder how I would generate such a list for a customized ODD where the content model of some elements has been changed (e.g. in order to prevent mixed content). I understand that I would probably need to compile my ODD to make it self-contained, but even then it still contains references to macros in the content specifications of elements. So just looking for content models that lack <textNode/> does not do the job; one would probably need to resolve the macro references first.

But is there perhaps already a script around to do this?

Best wishes,
Jakub

---
Jakub Simek
Heidelberg University Library

Re: Generating stripspace.xsl.model from ODD

David Maus-2
Hi Jakub,

I have only just started to trawl through the Guidelines and Stylesheets, but
found out that stripspace.xsl.model is created by the XSL
Transformation 'odd2xslstripspace.xsl' in the Stylesheets package [1].

Maybe this is of some help?

Best,
  -- David

[1] https://github.com/TEIC/Stylesheets/blob/dev/odds/odd2xslstripspace.xsl

On Mon, 14 Sep 2020 12:50:43 +0200,
Jakub Simek wrote:

> [...]

--
David Maus M.A.

Www: http://dmaus.name
Twitter: @_dmaus

Re: Generating stripspace.xsl.model from ODD

Bauman, Syd
In reply to this post by Jakub Simek
Apologies. Not sure why I missed your very interesting question back in June. (It is obviously there in my Inbox, so it was me.)

My short answer is “I don’t know”, but I think it is something worth looking into. I don’t have time to do so right now, but hope to have something more intelligent to say over the weekend.

Please feel free to bug me directly (i.e., off-list, so as not to pester others) if you haven't heard from me by Tue 22 Sep.





Re: Generating stripspace.xsl.model from ODD

Bauman, Syd
In reply to this post by Jakub Simek
Jakub —

The good news is I have figured out exactly how that list is generated. The bad news is I have no idea how that list is generated. :-)

That is, the file TEI/P5/stripspace.xsl.model is generated by running the program Stylesheets/odds/odd2xslstripspace.xsl with P5 (either p5.xml or p5subset.xml, should not matter) as input.

However, that program is somewhat complex and has no comments at all. Thus the joke above that I have no idea how it works.

In truth, I have a clue about the main structure, but have not figured out all the details. It reads in the P5 source and iterates over all of the <elementSpec> elements. For each one, it processes the declared content and, using various tests and algorithms (some of which are pretty simple, some of which I have not yet figured out) decides whether or not to add it to the list.
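To make the general shape concrete, here is a minimal Python sketch of that kind of loop. It is not a port of odd2xslstripspace.xsl, whose actual tests I have not worked out. It assumes a compiled ODD whose content models use Pure ODD elements only (no embedded RELAX NG), treats <textNode/> and <dataRef/> as the only signs of character data, and resolves <macroRef/> recursively. The function names and the sample specifications are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def allows_text(content, macros, seen=frozenset()):
    """Return True if a <content> model may contain character data."""
    if content is None:
        return False
    for node in content.iter():
        # Assumption: these two elements are the only markers of text content.
        if node.tag in (TEI + "textNode", TEI + "dataRef"):
            return True
        if node.tag == TEI + "macroRef":
            key = node.get("key")
            if key not in macros:
                # Unresolvable macro: be conservative and assume text is allowed.
                return True
            if key not in seen and allows_text(macros[key], macros, seen | {key}):
                return True
    return False


def strip_space_candidates(odd_root):
    """List idents of elementSpecs whose content model never allows text."""
    macros = {
        spec.get("ident"): spec.find(TEI + "content")
        for spec in odd_root.iter(TEI + "macroSpec")
    }
    return sorted(
        spec.get("ident")
        for spec in odd_root.iter(TEI + "elementSpec")
        if not allows_text(spec.find(TEI + "content"), macros)
    )


# Invented sample, far simpler than real TEI specifications.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <macroSpec ident="macro.phraseSeq">
    <content><alternate><textNode/><classRef key="model.phrase"/></alternate></content>
  </macroSpec>
  <elementSpec ident="p"><content><macroRef key="macro.phraseSeq"/></content></elementSpec>
  <elementSpec ident="div"><content><classRef key="model.pLike"/></content></elementSpec>
  <elementSpec ident="note"><content><textNode/></content></elementSpec>
</TEI>"""

print(strip_space_candidates(ET.fromstring(SAMPLE)))
```

For the invented sample only div is reported, because p reaches <textNode/> through its macro and note contains it directly. A real implementation would have to handle more cases than this, which is presumably where the stylesheet's extra tests come in.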

Anyway, to generate the appropriate list for a customized ODD, you would first need to generate the “compiled ODD” (also called the “processed”, “flattened”, “expanded”, or “merged” ODD) from the customization ODD, and then feed that compiled ODD to the odd2xslstripspace.xsl program.

How you generate the compiled ODD depends a bit on how you already generate outputs from your customization ODD. It boils down to running Stylesheets/odds/odd2odd.xsl. E.g., if you are used to using the “teitorelax” program (with the --odd switch), you could just add a --debug switch. That will cause two changes to processing: first, thousands of lines of debugging output (which is probably not useful to you); and second, an extra output file, INPUT.odd.processedodd, which will be exactly what you want.
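If you would rather call the two stylesheets directly, the two steps could be scripted roughly as follows. This is only a sketch: it assumes Saxon is run from a jar with its usual -s:, -xsl: and -o: options, the jar location and output file names are hypothetical, and odd2odd.xsl may well need additional stylesheet parameters that are not shown here. The helper only builds the command lines and does not execute anything.

```python
from pathlib import Path


def saxon_command(saxon_jar, stylesheet, source, output):
    """Build the Saxon command line for one XSLT step (nothing is executed)."""
    return [
        "java", "-jar", str(saxon_jar),
        f"-s:{source}",
        f"-xsl:{stylesheet}",
        f"-o:{output}",
    ]


def stripspace_pipeline(saxon_jar, stylesheets_dir, odd_file):
    """Return the two commands: compile the ODD, then derive the strip-space list."""
    odds = Path(stylesheets_dir) / "odds"
    odd_file = Path(odd_file)
    # Hypothetical output names, chosen only for this example.
    compiled = odd_file.with_name(odd_file.name + ".compiled")
    model = odd_file.with_name("stripspace.xsl.model")
    return [
        saxon_command(saxon_jar, odds / "odd2odd.xsl", odd_file, compiled),
        saxon_command(saxon_jar, odds / "odd2xslstripspace.xsl", compiled, model),
    ]


for cmd in stripspace_pipeline("saxon-he.jar", "Stylesheets", "my.odd"):
    print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the paths match your own setup.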

HTH. Let me know if you need more information or assistance.


