If I've been following the argument about text-file formats correctly (and you can shoot me if I haven't), there are actually two distinct (well, semi-distinct) issues. The question of whether SGML-encoded files should be distributed in ASCII format is relevant to the idea that the TEI project is supposed to create a device- and application-independent coding scheme (I don't have the TEI guidelines yet, so I may have the details wrong). In order to achieve the goal of device-independent text interchange, files should be distributed in a form that (ideally, at least) any system could read. So far, plain ASCII text files come closest to that (as far as I know). Most word processors, editors and system utilities (like DOS TYPE) can read ASCII files, even if they use a different format for their own work. But since SGML is a coding scheme, even if the codes are ASCII, a particular application is not going to be able to make any use of the encoded information unless it has a filter/interpreter which understands the code. So on a practical level, the only software or devices that really need to be able to read SGML files are those that have the appropriate filters. Now, -some- standard file format -does- need to be defined, so that application programmers can build those filters into their software, and make the software output files that other SGML-sensitive software can use.
BANG!
The TEI's goal isn't device-independent text interchange; it is presentation-independent content representation (for text interchange). `Devices' are concerned with tasks like `printing', which, while a desirable capability for text, is not at all a necessary precondition for TEI text. It doesn't matter whether anyone knows how to print something; what matters is whether they know what the information items in the text signify. Likewise, TEI text might just as well be described as database-independent or even application-independent (though I suppose that is a BIT strong, as there is no telling what your application for the text might be).

ASCII is after all only an alphabet. Saying a text is in ASCII is saying little more than that it uses alphanumerics and some punctuation. The TEI, for example, doesn't assume anything about the control characters, the ``unprintable'' characters. Even carriage-return is optional. In some sense the TEI and its standards go further than ASCII, in that they assume only a printable subset of it. SGML really doesn't care about `some file format', as it doesn't deal with physical things at all. Of course, there is no such thing as an abstract magnetic medium, and when you render text machine-readable it does matter how and on what you enter it. However, the TEI doesn't intend to tell you how to render it machine-readable, since the TEI doesn't intend to actually create any text, only the standards for the abstract (ASCII-subset) representation of the content in the text.

The TEI is really just like the style guides you buy to help you write documents that conform to good writing practices. Style guides don't tell you whether to use a word processor or a typewriter or even a pencil and paper. They only address things like how to represent the name of a musical note in text, what the abbreviation for `und so weiter' is, what the difference between a figure and a table is, how to denote the elements in a two-level index entry, etc.
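
To make the point about a printable subset of ASCII concrete, here is a rough Python sketch of a check that a text stays inside such a subset. The exact repertoire is my assumption (ASCII letters, digits, punctuation and the space character), not something taken from the TEI guidelines, and the function names are made up for illustration.

```python
import string

# Assumption: the "printable subset" is ASCII letters, digits, punctuation
# and the space character. Control characters (tab, carriage-return,
# line-feed, ...) are deliberately left out.
PRINTABLE_SUBSET = set(
    string.ascii_letters + string.digits + string.punctuation + " "
)


def non_portable_chars(text):
    """Return (index, character) pairs that fall outside the printable subset."""
    return [(i, ch) for i, ch in enumerate(text) if ch not in PRINTABLE_SUBSET]


def is_portable(text):
    """True when every character of text belongs to the printable subset."""
    return not non_portable_chars(text)


if __name__ == "__main__":
    sample = "<title>Moby Dick</title>"
    print(is_portable(sample))
    print(non_portable_chars("a\tb\r\n"))
```

Note that the check says nothing about what the tags mean. It only confirms that the characters themselves make no demands on a particular device or file format, which is about all that `being in ASCII' buys you.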