Accents and character sets

Dear Colleagues,

A question was posted on my list ([hidden email], a primarily
French-speaking list in Computational Linguistics) about how to encode
accents in French. I read the TEI Guidelines looking for an answer, but I
found it hard to work out what the answer is. My impression is that you can
ultimately use any "standard" character set, provided it is properly
declared, but that this is discouraged in favour of entities such as
&eacute;. Is my interpretation correct?

If I am right, a French text would look like this:

La linguistique informatique mod&eacute;lise les ph&eacute;nom&egrave;nes
li&eacute;s &agrave; l'interpr&eacute;tation et &agrave; la production du
langage, de mani&egrave;re &agrave; etc.

I am not sure I'll be able to sell this to many of my subscribers...
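
The conversion itself could presumably be automated, so nobody would have to
type the entities by hand. Here is a minimal sketch in Python, assuming that
the entity names in the standard html.entities table match the ones the
Guidelines intend for these letters (I have not checked this against the TEI
entity sets). The helper name to_entities is purely illustrative.

```python
import unicodedata
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with an entity reference.

    Hypothetical helper: uses a named entity when Python's HTML entity
    table has one, and falls back to a numeric character reference.
    """
    # Compose base letter + combining accent into one code point first,
    # so that "e" followed by U+0301 is treated as a single "é".
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")
        else:
            out.append(f"&#{code};")  # no named entity known
    return "".join(out)


print(to_entities("modélise les phénomènes liés à l'interprétation"))
```

Running the reverse mapping at display time would let subscribers keep
reading and typing accented text while only the interchange form carries
entities.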

Could someone provide a simple, straightforward, tutorial-style explanation of
the accent problem and its TEI solution?

Jean Veronis