Uncertainty of authorship, and attribution methods

Kevin McMullen
Dear TEI folks,

The Walt Whitman Archive is currently working with newspaper editorials that Whitman likely wrote while he was the editor of several newspapers in New York in the 1840s and 50s. As with much 19th-century journalism, the editorials carried no byline, so attribution has been based on a number of other factors (manuscript evidence, anecdotal accounts of Whitman's involvement with the paper, identification of "Whitmanian" style in the text, and, in recent years, computational stylistic analysis). Yet debate continues over whether Whitman wrote some of these editorials, with scholars more certain of his involvement in some pieces and less certain in others. We therefore want to acknowledge and foreground, in some way, the uncertainty inherent in dealing with anonymously published periodical material. To that end, we are trying to develop a machine-readable way to express uncertainty about authorship with reference to a series of evidentiary categories. In other words, we want to be able to say how certain we are that Whitman is the author of a piece, and then point to the specific sources/methods that led us to that determination.

We have come up with the following possible solution, which would appear in the <bibl> or <biblStruct> of the TEI header:

<author xml:id="ww">Walt Whitman</author>
<certainty target="#ww" cert="high" given="#wp #ss">
     <ref target="#ss01"/>
     <ref target="#ss02"/>
</certainty>

The value of @given would be a set of codes that we would develop (and record and describe in an external file), each relating to a different type of "evidence" that has led us to our claim of Whitman's authorship. In the above example, "wp" stands for "Whitman pseudonym" (meaning the piece was published under a known Whitman pseudonym) and "ss" stands for "scholarly source" (a published source that identifies Whitman as the author of a piece). To tie the determination to a specific scholarly source (or, in this case, two scholarly sources), we would include <ref>s that would point to an external document containing a list of bibliographic entries. There would be additional codes for other evidentiary categories.
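
As a rough sketch only, the external file might look something like the fragment below. The choice of <taxonomy>/<category> for the evidence codes and <listBibl> for the sources is an assumption made for illustration, not a settled design; the xml:id on the taxonomy is hypothetical, and the bibliographic entry is a placeholder rather than a real citation.

```xml
<!-- Hypothetical external file: evidence codes plus cited sources. -->
<!-- Element choices here are assumptions, not a finished encoding. -->
<taxonomy xml:id="evidence">
  <!-- Each category is one evidentiary code that @given can point to. -->
  <category xml:id="wp">
    <catDesc>Whitman pseudonym: the piece was published under a known
      Whitman pseudonym.</catDesc>
  </category>
  <category xml:id="ss">
    <catDesc>Scholarly source: a published source identifies Whitman as
      the author of the piece.</catDesc>
  </category>
  <!-- Additional evidentiary categories would be added here. -->
</taxonomy>

<listBibl>
  <!-- Target of <ref target="#ss01"/>; placeholder text, not a real citation. -->
  <bibl xml:id="ss01">[full citation of a scholarly source]</bibl>
  <!-- Further entries would follow the same pattern. -->
</listBibl>
```

If the codes and entries do live in a separate file, the pointers in @given and on each <ref> would presumably need to include that file's name or URI before the "#" rather than being bare fragment identifiers.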

We're curious what others make of this approach. Have other projects dealt with uncertain authorship? And if so, how was it handled? Would there be a different or better way of tying these certainty claims to externally defined rationales and/or sources?

Many thanks in advance for any feedback!

Kevin McMullen


