SGML and the restoration of case information to a file


Robert A Amsler-2
I'm working on restoring a text file that was keypunched in the 1960s
to complete upper/lower case ASCII. (It currently is ALL UPPER CASE).
I would like to `do it right', so that more than just the upper/lower
case, italics, accent marks, paragraphs, and headings are correctly
identified. That is, capitalization seems to me to reflect a deeper
semantic reason: things are capitalized because they are a particular
type of proper noun, such as the name of a person, country, company,
etc. Italics likewise reflect roles such as book, play, or movie titles;
foreign words; quoted material; emphasis; etc.

Does anyone have any suggestions as to how to do such tagging?  I.e.,
if one encounters something such as,

`PRESIDENT KENNEDY TOLD PRIME MINISTER MACMILLAN...'

and wants to restore the reasons it should be capitalized as,

`President Kennedy told Prime Minister Macmillan...'

(Note: it is Macmillan, not MacMillan) what should one do?

Thus, I can mark up the text as,

 <NationLeader country=USA>
  <Title> President </Title>
  <LName id=John_F._Kennedy> Kennedy </LName> </NationLeader>
 <NationLeader country=GB>
  <Title> Prime Minister </Title>
  <LName id=Harold_Macmillan> Macmillan </LName> </NationLeader>
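
One way to produce markup of this shape mechanically is a lexicon lookup over the upper-case text. The following is only a minimal sketch in Python, not a recommended tool: the TITLES and PEOPLE tables and the tag_leaders function are hypothetical names invented for illustration, the lexicon would have to be built by hand or from a reference work, and text outside a recognized title-plus-surname pair is simply lower-cased.

```python
import re

# Hypothetical hand-built lexicon (an assumption, not part of the original post).
# Keys are the upper-case forms found in the keypunched file.
TITLES = {
    "PRESIDENT": "President",
    "PRIME MINISTER": "Prime Minister",
}
PEOPLE = {
    "KENNEDY": {"name": "Kennedy", "id": "John_F._Kennedy", "country": "USA"},
    "MACMILLAN": {"name": "Macmillan", "id": "Harold_Macmillan", "country": "GB"},
}

# Longest titles first, so a longer title wins over any shorter one it contains.
_titles = "|".join(re.escape(t) for t in sorted(TITLES, key=len, reverse=True))
_names = "|".join(re.escape(n) for n in PEOPLE)
PATTERN = re.compile(rf"\b({_titles}) ({_names})\b")


def tag_leaders(text):
    """Wrap each known TITLE SURNAME pair in NationLeader markup.

    Everything outside a recognized pair is lower-cased as a crude default.
    """
    out, pos = [], 0
    for m in PATTERN.finditer(text):
        out.append(text[pos:m.start()].lower())
        person = PEOPLE[m.group(2)]
        out.append(
            f"<NationLeader country={person['country']}>"
            f"<Title> {TITLES[m.group(1)]} </Title>"
            f"<LName id={person['id']}> {person['name']} </LName>"
            f"</NationLeader>"
        )
        pos = m.end()
    out.append(text[pos:].lower())
    return "".join(out)


if __name__ == "__main__":
    print(tag_leaders("PRESIDENT KENNEDY TOLD PRIME MINISTER MACMILLAN..."))
```

Because the restored spelling comes from the lexicon entry rather than from a capitalization rule, `Macmillan' versus `MacMillan' is settled once per person. Sentence-initial capitals and names missing from the lexicon are not handled and would still need rules or manual review.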