oh not not another message about calendars

lou

oh not not another message about calendars

lou
Thanks to all for an interesting discussion! I will take the liberty of making a few comments on the points raised, and then (probably) shut up.

Christian provides a convincing use case:

"A user might want to find out: When did Humboldt stop to use the French Rev. calendar, although still in France during the revolution. So I will have to use @calendar to indicate which calendar is used in the source."

Precisely. The question turns on the way in which Humboldt (or whoever) expresses the date. Sometimes (I hypothesize) he uses the Revolutionary Calendar, sometimes the Gregorian. And sometimes -- I am guessing not so often -- he expresses a single date using both systems. To find all the Revolutionary cases we need a way of labelling the dates, and this is what @calendar provides.

"in the primary source text surrounded by the <date>-tag, which calendar is present? Answer: Both, as noted in the @calendar"

This is where we are disagreeing. What does @calendar="#julian #gregorian" actually mean? For me, it means that the content is both Julian and Gregorian at the same time, which is imho nonsensical. For if it is (say) Julian, it should have the same structure as other "pure" Julian dates, which it clearly does not -- it has two day numbers for a start. And mutatis mutandis similarly for Gregorian. So I argue that we need a special "gregorian-and-julian" calendar label for cases like this, where the content is not expressed purely in one or the other date form but instead in some special Humboldtian way, from which a reader can derive an expression in either of the two calendars.

Of course, it will be a lot simpler for downstream processing if the labels I use for all the calendars, pure or combined, make it possible to detect their affinities, for example by using substring matches. If I call the combined julian-revolutionary calendar something like "julian-revo" and the revolutionary calendar "revo", I can find every time Humboldt uses the revolutionary calendar by looking for the substring "revo".
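
A minimal sketch of that substring lookup, in Python. The TEI fragment, the date texts and the helper name dates_using are all invented for illustration; only the labels "revo" and "julian-revo" come from the paragraph above.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample: three dates, each carrying a hypothetical calendar label.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <p><date calendar="#revo">12 floréal an VII</date></p>
  <p><date calendar="#gregorian">3 June 1799</date></p>
  <p><date calendar="#julian-revo">an invented combined expression</date></p>
</body>"""


def dates_using(root, fragment):
    """Return the text of every <date> whose @calendar label contains fragment."""
    return [
        d.text
        for d in root.iterfind(".//tei:date", TEI_NS)
        if fragment in d.get("calendar", "")
    ]


root = ET.fromstring(SAMPLE)
revolutionary = dates_using(root, "revo")
print(revolutionary)
```

This only works if the labels are chosen so that no unrelated calendar name happens to contain "revo"; that is a property of the naming scheme, not something the code can guarantee.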

And equally of course, this is a completely different issue from the need to provide a normalization of the date (or several), for which other attributes are provided, and concerning which no-one seems to disagree.

If I read him correctly, Martin Holmes suggests that the number of calendars to be defined risks being exponentially large, either because a writer (like Stow) uses differing styles within a single calendar system, or because there are many different possible combinations. But I think this is a misunderstanding. We only need to define an extra calendar for dates whose content is clearly and systematically expressed using more than one calendar convention. And we only ever need to refer to the calendar system used by a specific example. What evidence is there that any writer ever expresses or might express a single date using all possible combinations? What sort of convention might they apply? If they wished to do so, my guess is that they would actually use multiple date expressions. But we need to see some examples. (The Stow case is interesting though, because it suggests that we might want to go beyond simply identifying the calendar system, maybe using @rend to indicate subvariants in the way it is expressed.)

Martin also raises the question of whether or not multiple values for an attribute might be an issue for processing. But I think this is also a red herring: it makes no difference to my argument whether you express the calendar value as "julian gregorian" or "#julGreg" or "#julian_gregorian". Just, please, not "#julian #gregorian" or "#julian#gregorian", either of which will fail validation and therefore break existing tools.
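
To make the distinction concrete, here is a rough Python sketch of a check that treats @calendar as a single pointer to a declared <calendar>. It is not schema validation, only an approximation of the single-pointer reading described above; the fragment is stripped down and invented (not a valid TEI document), and bad_calendar_values is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented fragment: three declared calendars and three labelled dates.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <calendarDesc>
    <calendar xml:id="julian"><p>Julian</p></calendar>
    <calendar xml:id="gregorian"><p>Gregorian</p></calendar>
    <calendar xml:id="julian_gregorian"><p>Combined form</p></calendar>
  </calendarDesc>
  <p><date calendar="#julian_gregorian">A</date>
     <date calendar="#julian #gregorian">B</date>
     <date calendar="#julian#gregorian">C</date></p>
</TEI>"""


def bad_calendar_values(root):
    """List @calendar values that are not one '#id' pointing at a declared calendar."""
    declared = {c.get(XML_ID) for c in root.iter(TEI + "calendar")}
    problems = []
    for d in root.iter(TEI + "date"):
        value = d.get("calendar")
        if value is None:
            continue
        # Assumption: exactly one whitespace-free token starting with '#'.
        single_pointer = value.startswith("#") and len(value.split()) == 1
        if not single_pointer or value[1:] not in declared:
            problems.append(value)
    return problems


problems = bad_calendar_values(ET.fromstring(SAMPLE))
print(problems)
```

Under these assumptions the combined label passes, while the two-pointer and run-together forms are reported.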

But I promised to shut up, so I will.

Lou


 



Re: oh not not another message about calendars

Christian Thomas (HU Berlin)

Dear Lou, I also (sort of) promised to shut up, and obviously arguments don't get stronger by being repeated, but since you ask, let me just quickly add one more thought:

On 28.08.2020 at 11:02, Lou Burnard wrote:
> This is where we are disagreeing. What does @calendar="#julian #gregorian" actually mean? For me, it means that the content is both Julian and Gregorian at the same time, which is imho nonsensical. For if it is (say) Julian, it should have the same structure as other "pure" Julian dates, which it clearly does not -- it has two day numbers for a start. And mutatis mutandis similarly for Gregorian. So I argue that we need a special "gregorian-and-julian" calendar label for cases like this, where the content is not expressed purely in one or the other date form but instead in some special Humboldtian way, from which a reader can derive an expression in either of the two calendars.

To me, this is a matter of interpretation. When Humboldt writes "Dienstag, 1./12. May", one being Gregorian and one being Julian, has he written one date or two? I (if I were the editor of said manuscript) would answer this question with "one". You would answer "two"; that's your interpretation, and that's completely fine.

You also ask, "What evidence is there that any writer ever expresses or might express a single date using all possible combinations? What sort of convention might they apply?" As with many interesting questions in our field, there is no indisputable evidence. My (transcription and) tagging of the source I edit is my interpretation (striving to be true to the source and 'objective', another impossible ideal, but a usable guiding star), expressed in TEI-XML and offered to you, human and machine readers, for your analyses.

In the example above, and in the hundred, maybe thousand, other instances where Humboldt uses this "convention" of notation in his notebooks (the other thing you asked about), in my interpretation Humboldt is not taking down two different dates, but one. For this reason, and for efficiency, I want to use only one <date> tag, but still express in some way that both calendars are present in the source. But I said that already, and will say no more.

Just one more thing: I do see your point, and actually think both suggestions -- @calendar="#julian #gregorian" and @calendar="#julian-gregorian" -- are not ideal. But that's OK. I just happen to favour the first option, which is not really an option, since the TEI does not offer it at the moment. We can now discuss Martin's feature request https://github.com/TEIC/TEI/issues/2028 (thanks Martin!) and see how it goes (at the moment 2:2). In the meantime, and for the edition humboldt digital, https://edition-humboldt.de/, we will use Dr. Frankenstein's @calendar="#julian-gregorian", since we need to proceed within the current TEI frame.

Best wishes
Christian





 



-- 
Christian Thomas
[hidden email]
@dta_cthomas
--

Re: oh not not another message about calendars

Stuart A. Yeates
As my sole contribution to this discussion, allow me to recommend
"Calendrical Calculations" by Edward M. Reingold and Nachum
Dershowitz, which contains LISP code for converting between all these
calendars. Converting the LISP code to XSLT might be relatively
straightforward, since they're both functional languages.

At the very least reliable conversion enables easier spotting of errors.

[Disclaimer: My recommendation is based on the first edition and I see
there have been several since then.]
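
As a rough illustration of the error-spotting idea, here is a short Python sketch that converts a Julian-calendar date to the Gregorian calendar. It uses the widely known integer Julian Day Number formula, not the book's own LISP code, and the function names and example years are assumptions made for this sketch, not taken from any source discussed in the thread.

```python
from datetime import date

# Julian Day Number of proleptic Gregorian 0001-01-01, minus its Python ordinal (1).
JDN_OFFSET = 1721425


def julian_to_jdn(year, month, day):
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar date to a (proleptic) Gregorian datetime.date."""
    return date.fromordinal(julian_to_jdn(year, month, day) - JDN_OFFSET)


# The offset between the calendars depends on the century, so a double date
# such as "1./12. May" can be checked once an editor has settled on a year.
# Both years below are hypothetical.
print(julian_to_gregorian(1799, 5, 1))
print(julian_to_gregorian(1829, 5, 1))
```

A transcription in which the two day numbers disagree with the computed offset for the assumed year would then be a candidate for a second look.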

cheers
stuart
--
...let us be heard from red core to black sky


Re: oh not not another message about calendars

Serge Heiden-2
There are other implementations of "Calendrical Calculations" in Java,
JavaScript, Python, Lua and C++:
https://github.com/moshekaplan/calendrical_calculations

best, --serge


--
Dr. Serge Heiden, slh AT ens-lyon.fr, http://textometrie.ens-lyon.fr
Équipe de recherche Cactus, laboratoire IHRIM UMR5317, ENS de Lyon
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33(0)622003883