My requests for information on markup-stripping text editors and on
automatic SGML markup have proved fruitful. To sum up what seems to
emerge from the correspondence so far:

1. It is impossible to perform these functions quickly and easily on widely used microcomputer word processors, as used by the majority of academics or others currently writing books or editing texts (where they are using computers at all). Remember, in the U.K. most people working in the humanities can't even afford a PC: they get by with an Amstrad PCW. The situation must be worse in e.g. India.

2. This restricts effective use of SGML markup at present, and reception of SGML-marked texts, to the elite who have access to more powerful systems, and the know-how and skill to use them. People who already have texts prepared by a word processor aren't going to be willing to spend months inserting SGML markup by hand; nor will those who want human-readable texts be willing to accept them in marked-up form, as Michael Hart points out.

3. Even on UNIX etc. systems, it seems doubtful whether the text editors are up to the straightforward tasks I mentioned. (To the average academic user, it will seem that the tasks *ought* to be straightforward, even if from a programming viewpoint they aren't). The optimistic suggestion of a simple strip-out has been refuted. How many of the people who are interested in *texts*, not computer programming, will be up to writing an Emacs macro?

I suspect that the only effective way is a brute-force translation-table system, on the lines of that drafted by Richard Goerwitz. (And why hadn't Icon been mentioned by the experts before? This sort of thing is what it's for, surely).

For two or three years now I have been using a general-purpose translation-table utility, running on a PC, for converting WP text, which we can't expect the authors to mark up for themselves, to generically-coded input for a variety of typesetting systems (and now DTP software). The SGML conversion is similar in principle, though a good deal more complex.
I had to write the software myself as we couldn't find anything that would do the job on equipment that a small department could afford. Richard Goerwitz's program is much more elegant.

What's needed now are draft translation tables which users can modify to their special requirements. Even then, users will have to be taught what strings to include to represent the 'hidden' word-processor codes, which they are not aware of in WYSIWYG systems.

It's essential that the standard be easy to implement, even if it's flexible enough to cover a wide range of conceivable requirements. Otherwise people simply won't use it, any more than, for example, they follow standards on citation of documents by bibliographical references. Nor will most academics be willing to use a second text editor simply to handle SGML. They will want those which they are already using to be upgraded to be compatible.

Christopher
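The translation-table approach described in the message above can be sketched roughly as follows. This is only an illustration in Python, not Christopher's utility and not Richard Goerwitz's Icon program. The control codes on the left of the table and the generic tags on the right are invented for the example; a real table would have to list the strings a particular word processor actually writes to its files.

```python
# Hypothetical translation table: word-processor control strings mapped to
# generic codes. Every entry here is an assumption made for illustration.
TABLE = {
    "\x02": "<b>",          # assumed "bold on" code
    "\x03": "</b>",         # assumed "bold off" code
    "\x13": "<emph>",       # assumed "underline on" code
    "\x14": "</emph>",      # assumed "underline off" code
    "\r\n\r\n": "\n<p>",    # assumed paragraph break
}


def translate(text, table):
    """Replace every table key found in text with its table value.

    At each position the longest matching key wins, so a multi-character
    code is never broken up by a shorter code that happens to be its prefix.
    """
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No code starts here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, `translate("An \x13easy\x14 way", TABLE)` gives `An <emph>easy</emph> way`. Because the table is plain data, a user could adapt it to another word processor or another target coding scheme without touching the function.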
Christopher,
I don't understand your pessimism at all. I particularly don't understand what the lack of PCs in India has to do with this. We can't start doing heavy-duty computing by complaining about the lack of PCs in India, anyhow. Nothing good comes of complaining about how bad the worst possible conditions might be.

I suggested that we sit down and figure out how certain things should look on a display unit with few, if any, modes of graphical rendition. Anders Thulin and others have brought forth the same idea, and every time it looks as if they are the first to do so, because each time, the idea is nearly drowned in "but it won't fit in/run on/etc my matchbox computer". Who cares? First we make it work, then we make it work on lots of machines.

If it is not possible to make people accept that they won't get SGML software on their tiny portable Z80-based CP/M machines with a 14K floppy disk and nineteen bits of memory, how about using the minimization features of SGML to the extreme, where we accept a certain code to mean "<keyword>". I mean, SGML is *powerful*.

Opinion: You won't have any use for SGML on your Amstrad PCW or Matchbox PC-2000, anyway, so why bother?

> 3. Even on UNIX etc. systems, it seems doubtful whether the text
> editors are up to the straightforward tasks I mentioned. (To the
> average academic user, it will seem that the tasks *ought* to be
> straightforward, even if from a programming viewpoint they
> aren't). The optimistic suggestion of a simple strip-out has been
> refuted. How many of the people who are interested in *texts*,
> not computer programming, will be up to writing an Emacs macro?

I wonder, do you think SGML is the worst thing to happen to texts? I'm interested in both texts and computer programming. I may be able to do some of the things that people who are interested in texts, only, would like to see happen. I see SGML as containing _information_, that should _not_ be stripped away.
You can represent it in many different ways, of which SGML is but one, but you don't strip it out.

I just don't get the main thrust of your suggestions. Some counter-suggestions to your "strip it" position:

We can make an empty line mean paragraph start. We can make "_word_" mean "<emph>word</emph>". We can make ">>" mean "<head>" and make the "</head>" omissible in the DTD. So you would have:

    >> Easy Markup with SGML

    An easy way to achieve widespread use of SGML _now_ is to let lots
    of people use existing software for the task of entering text. The
    typewriter model has been used in the SGML document, for example.

    The already widespread use of word processors may allow us to
    increase the expectations of what software can do, provided that
    sufficiently powerful ideas about text management can make it to
    the producers of the widespread software.

to mean:

    <head>Easy Markup with SGML</head>
    <p>An easy way to achieve widespread use of SGML <emph>now</emph>
    is to let lots of people ... </p>
    <p>The already widespread use of word ...</p>

The former is undoubtedly easier to read, unless you have lots of training in reading text marked up with SGML.

Finally, I think we're discussing a non-problem. Let's pick up on the important things, and get something which will make people see, by themselves, that it's smarter to use generalized markup than their existing specific things. I did it with a bunch of journalists in a financial newspaper, and they are delighted with the results. They produce text for fax transmission, for electronic news to brokers and others, _and_ for newspaper columns, and now they can enter the text only once. Major incentive.

What do we have to offer people? What can we make TEI mean to people who use texts in more diverse ways than reading them, or would do if it were available?

We must _not_ let ourselves be entrenched in existing technology. It's not existing technology we're dealing with, in the first place.

[Erik Naggum]
Naggum Software, Oslo, Norway
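The three conventions proposed in the message above (an empty line starts a paragraph, "_word_" marks emphasis, ">>" marks a heading) are simple enough to expand mechanically. The following is a minimal sketch in Python, assuming exactly those three rules and nothing more. The function name `to_sgml` is made up for the example. It always writes the closing tags rather than relying on omissible end tags in a DTD, and it does not escape any "<" or "&" that may already occur in the text.

```python
import re

# "_word_" or "_several words_", but not underscores inside a word.
EMPH = re.compile(r"(?<!\w)_([^_]+)_(?!\w)")


def to_sgml(text):
    """Expand the three typing conventions into explicit tags."""
    # An empty line (possibly containing stray spaces) separates blocks.
    blocks = re.split(r"\n\s*\n", text.strip())
    out = []
    for block in blocks:
        # Rejoin the hard-wrapped lines of a block into one line.
        body = " ".join(line.strip() for line in block.splitlines())
        body = EMPH.sub(r"<emph>\1</emph>", body)
        if body.startswith(">>"):
            out.append("<head>" + body[2:].strip() + "</head>")
        else:
            out.append("<p>" + body + "</p>")
    return "\n".join(out)
```

The same rules could presumably be expressed inside SGML itself through its markup minimization features, as the message suggests; the sketch only shows that nothing beyond plain typing is being asked of the person entering the text.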