SGML-like, having a DTD, and being SGML-conformant

Robert A Amsler-2
I hate to mention the OED2 again, but the issues of having SGML-tags,
having a DTD, and being SGML-conformant are worth elaborating upon.

Having SGML-tags is a somewhat generic property. It doesn't tell you a
document's pedigree any more than looking at a race horse tells you its
pedigree. There are indeed some documents, such as the OED2, which are only
in a so-called `SGML-like' markup, which means that they have the
appearance of SGML, with the usual <tag> ... </tag> style of typically
content-based markup, but lack the official credentials of a DTD. Most of
you have seen text like this and noted ``So, THAT is SGML'', which
just goes to show that in text markup, as in other things, you can't
be certain of how something was made just by looking at it.

Having a DTD means that somewhere there is a description of the tags and
their legal attributes and values, how they may be nested inside one another
and how often they may appear. In short, the pedigree records. Finally,
being SGML-conformant means that a text has been TESTED relative to the DTD
by an SGML-parser and found to conform to the description given in the DTD.
I.e., someone has certified that indeed the pedigree applies to THIS
individual document.
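
To make the distinction concrete, here is a minimal sketch in Python. It is
not an SGML parser: it assumes XML-style text (so the standard library can
read it), and it uses a hypothetical dictionary, CONTENT_MODEL, as a
stand-in for a DTD. The tag names are invented for illustration and are not
the OED2's tags. A real DTD also constrains order, frequency, and
attributes; this sketch checks nesting only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a DTD: each tag maps to the set of
# child tags it is allowed to contain (nesting rules only).
CONTENT_MODEL = {
    "entry": {"headword", "sense"},
    "headword": set(),
    "sense": {"definition", "quote"},
    "definition": set(),
    "quote": set(),
}


def conformance_errors(text, model=CONTENT_MODEL):
    """Parse tagged text and list every violation of the nesting rules."""
    root = ET.fromstring(text)  # raises ParseError if tags are unbalanced
    errors = []
    for parent in root.iter():
        if parent.tag not in model:
            errors.append(f"undeclared tag <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in model[parent.tag]:
                errors.append(
                    f"<{child.tag}> not allowed inside <{parent.tag}>"
                )
    return errors


# Both texts "look like SGML"; only the first passes the test.
good = ("<entry><headword>cat</headword>"
        "<sense><definition>a feline</definition></sense></entry>")
bad = "<entry><headword>cat</headword><quote>q</quote></entry>"

print(conformance_errors(good))
print(conformance_errors(bad))
```

Both sample texts have the same tagged appearance. Only running the check
against the declared rules tells you which one actually conforms.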

Now, to the OED2. The OED2 doesn't have a DTD since the printed work (the
OED and its supplements) was already written before the tags were invented
and added into its text. Since the people doing the tagging weren't free to
change anything (Gee, I think Murray made a mistake here, I'll just fix
it... No!!!) they couldn't guarantee that there was a simple DTD for the
work, i.e. they couldn't guarantee that they knew where they would have to
use every tag. This is significant since it may prove to be the case for
other works of historic origin, i.e. works without any guarantee that they
are error-free and whose contents cannot be changed even if errors are
detected.