SGML and the restoration of case information to a file

Robert A Amsler-2
I'm working on restoring a text file that was keypunched in the 1960s
to full upper/lower case ASCII. (It is currently ALL UPPER CASE.)
I would like to `do it right', so that more than just upper/lower
case is recovered: italics, accent marks, paragraphs, and headings should
also be correctly identified. That is, capitalization seems to me to reflect
a deeper semantic reason for a thing being capitalized, such as its being
a particular type of proper noun: the name of a person, country, company,
etc. Italics likewise reflect roles such as titles of books, plays, movies,
etc.; foreign words; quoted material; emphasis; etc.

Does anyone have any suggestions as to how to do such tagging?  I.e.,
if one encounters something such as,

`PRESIDENT KENNEDY TOLD PRIME MINISTER MACMILLAN ...'

and wants to restore the reasons it should be capitalized as,

`President Kennedy told Prime Minister Macmillan...'

(Note: it is Macmillan, not MacMillan) what should one do?

Thus, I can mark up the text as,

<Sentence>
 <NationLeader country=USA>
  <Title> President </Title>
  <LName id=John_F._Kennedy> Kennedy </LName> </NationLeader>
 told
 <NationLeader country=GB>
  <Title> Prime Minister </Title>
  <LName id=Harold_Macmillan> Macmillan </LName> </NationLeader>
   ...
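
One possible way to produce markup of this shape mechanically is to look up
known phrases in a hand-built lexicon and lower-case everything else. The
following is only a minimal sketch in Python under stated assumptions: the
LEXICON table, the render() and tag_sentence() helpers, and strip_tags() are
all hypothetical names invented for illustration, and the element and
attribute names simply copy the markup above. It does not attempt to discover
names it has not been told about.

```python
import re

# Hypothetical lexicon (an assumption, not part of any real tool):
# upper-case phrase -> (element name, country, [(tag, id or None, restored text)])
LEXICON = {
    "PRESIDENT KENNEDY": (
        "NationLeader", "USA",
        [("Title", None, "President"),
         ("LName", "John_F._Kennedy", "Kennedy")]),
    "PRIME MINISTER MACMILLAN": (
        "NationLeader", "GB",
        [("Title", None, "Prime Minister"),
         ("LName", "Harold_Macmillan", "Macmillan")]),
}


def render(entry):
    """Turn one lexicon entry into markup in the style shown above."""
    element, country, parts = entry
    inner = " ".join(
        f"<{tag} id={ident}> {text} </{tag}>" if ident
        else f"<{tag}> {text} </{tag}>"
        for tag, ident, text in parts)
    return f"<{element} country={country}> {inner} </{element}>"


def tag_sentence(upper_text):
    """Tag known phrases and lower-case the remaining upper-case text."""
    # Try longer phrases first so a long key is preferred over a shorter one.
    keys = sorted(LEXICON, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    out, pos = [], 0
    for match in pattern.finditer(upper_text):
        out.append(upper_text[pos:match.start()].lower())
        out.append(render(LEXICON[match.group(1)]))
        pos = match.end()
    out.append(upper_text[pos:].lower())
    return "<Sentence> " + "".join(out).strip() + " </Sentence>"


def strip_tags(marked_up):
    """Drop the tags again to recover the plain mixed-case text."""
    text = re.sub(r"<[^>]+>", " ", marked_up)
    return " ".join(text.split())


if __name__ == "__main__":
    tagged = tag_sentence("PRESIDENT KENNEDY TOLD PRIME MINISTER MACMILLAN ...")
    print(tagged)
    print(strip_tags(tagged))
```

The point of keeping the restored spelling in the lexicon entry, rather than
computing it from the upper-case form, is that a rule such as `capitalize the
letter after MAC' cannot be trusted for every name. The sketch ignores
sentence-initial capitals, italics, and accents; those would need further
lexicon fields or rules.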
