John Baima asks whether there is a BNF form of the TEI DTD. A while back, Richard Goerwitz asked whether there was a BNF definition (actually, any 'concise syntactic summary' or 'clean, short summary') of (the TEI subset of) SGML. These are distinct questions, and both should be answered.

Richard Goerwitz first. There is definitely a formal grammar defining the syntax of SGML -- ISO 8879 uses formal grammar productions to define the form of an SGML document. Though not strictly BNF, it's fairly close. The difficulties in writing a BNF equivalent for use in syntax-driven programs are that the grammar is clearly not written with automatic parser generators in mind, and that some productions, while clear enough in their intent, present difficulties for automatic parser generation. Also, a lot of details are conveyed only in the accompanying prose, not in the formal productions. In a couple of cases, the productions seem to me to be downright misleading and to contradict the prose (but I'm not really an expert).

A formal description of the TEI subset of SGML does sound like a good idea; I've been working on something similar for a while, when I get the chance (i.e. rarely), and if there is serious interest I will try to finish it.

John Baima's question I interpret to mean "is there a BNF definition for TEI documents?" and not "... for TEI DTDs", since the DTDs are described by the formal grammar of ISO 8879, and that part of the grammar is relatively clean and simple. In some sense, the formal grammar of ISO 8879 describes TEI documents, and one should be able to parse TEI documents using it or some facsimile. Validating the documents, however, is more complicated. The DTD itself provides a formal description of TEI documents, using a regular-right-part grammar (that is, the right-hand side of a production can contain regular expressions, which is a slight enrichment over Backus's normal form, I think).
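To make "regular right part" concrete, here is a rough sketch in Python. It treats a content model as a regular expression over child element names and checks a sequence of children against it. The function names (`model_to_regex`, `is_valid`) and the sample model are invented for illustration, and the sketch covers only the `,` and `|` connectors and the `?`, `*`, `+` occurrence indicators. It ignores the `&` connector, `#PCDATA`, and exceptions, and it does not check that the model is unambiguous, so it is a simplification and not an SGML validator.

```python
import re

# One token of a simplified content model: an element name, a connector,
# an occurrence indicator, or a parenthesis.
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9.-]*|[(),|?*+])")
OPERATORS = {"(", ")", "|", "?", "*", "+"}


def model_to_regex(model):
    """Translate a simplified content model into a compiled regex.

    Each element name becomes a group matching the name followed by one
    space. The sequence connector "," becomes plain concatenation; the
    other operators already have the same meaning in regex syntax.
    """
    model = model.strip()
    parts = []
    pos = 0
    while pos < len(model):
        match = TOKEN.match(model, pos)
        if match is None:
            raise ValueError("unsupported content model syntax at offset %d" % pos)
        token = match.group(1)
        pos = match.end()
        if token == ",":
            continue  # sequence is just concatenation
        if token in OPERATORS:
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def is_valid(children, model):
    """Return True if the list of child element names satisfies the model."""
    encoded = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None


# Hypothetical content model: an optional head, then any mix of p and list.
MODEL = "(head?, (p | list)*)"
print(is_valid(["head", "p", "list", "p"], MODEL))
print(is_valid(["p", "head"], MODEL))
```

A strict BNF version of the same model would need auxiliary nonterminals for the optional head and for the repetition, which is the sense in which the regular right part is only a notational enrichment.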
Some other complications (notably inclusion and exclusion exceptions) can make the production of strict BNF equivalents rather complicated, and largely as a result I doubt that BNF parser generators are going to be as useful a tool for SGML validation as they are in other contexts. This is one reason many computer scientists shake their heads mournfully when you mention SGML to them.

Since the document has the right to modify the standard TEI DTD in any case, any software for TEI validation must be able to parse from a formal grammar presented at run time -- this is like building yacc into your application program. It's not impossible, but it is simpler to work with an existing SGML processor.

So: no BNF in the strict sense, but something close as to the structure of tags and content, and something less close as to the legal combinations of tags.

Michael Sperberg-McQueen