(no subject)


Hayim Lapin

Dear Colleagues

My colleague Daniel Stoekl and I are building a model digital edition of a chunk of the Mishnah, a Hebrew text from ca. 200 CE. As part of this project we are generating a morphological analysis of the words in the text (with syntactic analysis as a desideratum) as well as a dictionary. This query relates to the dictionary.

  1. [Specifically relating to the representation of Semitic languages].

We are interested in best practices for structuring entries and super-entries. Some roots appear in inflected verb stems or forms, but also in nouns, some of which are derived from the verbal forms while others are built on noun-only patterns. (Just to make things more complicated, in Hebrew and ancient Aramaic, unlike Arabic, the participle is used to render the present tense ….)
We are discussing how best to organize entries. For example:

  • Group everything related to a root under a superEntry for the root.
  • Follow the practice of the Comprehensive Aramaic Lexicon (effectively; it is not TEI compliant) and group verb forms under the root (superEntry, with sub-entries), and cross-reference (<re>) to “dependent forms” for nouns.
  • Have entries for the root whose sole purpose is to point to related entries (<re>) for verbs, nouns, etc.
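
To make the first and third options concrete, here is a minimal sketch of each. It is illustrative only and rests on several assumptions: the root כתב with the noun כתובה is a stock example rather than project data, the xml:id values are invented, type="root" is not one of the suggested values for <form>, and the pointer-only variant uses <xr>/<ref> instead of <re> because <re> nests the related entry rather than pointing to it.

```xml
<!-- Option 1 (sketch): one superEntry per root; verbs and nouns are sibling entries -->
<superEntry xml:id="root.ktb">
  <form type="root"><orth>כתב</orth></form>
  <entry xml:id="ktb.v">
    <form type="lemma"><orth>כתב</orth></form>
    <gramGrp><pos>verb</pos></gramGrp>
    <sense><def>to write</def></sense>
  </entry>
  <entry xml:id="ktwbh.n">
    <form type="lemma"><orth>כתובה</orth></form>
    <gramGrp><pos>noun</pos></gramGrp>
    <sense><def>marriage contract</def></sense>
  </entry>
</superEntry>

<!-- Option 3 (sketch): a root entry that only points to free-standing entries -->
<entry xml:id="root.ktb.index" type="root">
  <form type="root"><orth>כתב</orth></form>
  <xr type="verb"><ref target="#ktb.v"/></xr>
  <xr type="noun"><ref target="#ktwbh.n"/></xr>
</entry>
```

The two fragments are alternatives and reuse the same identifiers, so they would not sit in one file together. The second option in the list is a hybrid: the verb stays nested as in the first sketch, while the nouns become free-standing entries reached by pointers as in the second.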

Some of this involves editorial judgment about how best to represent the target language for users. But we are also hoping the collective experience on this list may help us determine whether there are performance or usability issues with a structure that is either too broad and shallow, or too narrow and deep.

  2. [Integrating morphological analysis of multiple witnesses with the dictionary].

For the morphological analysis we have been working on a set of features and feature structures based on the ISO MAF standard (shout out to Laurent Romary). We would like to be able to integrate the relevant features (POS, root, stem or noun form, etc.) into the dictionary. I am interested in how this can be done compactly. (FN: It bugs me to have a gramGrp with a single gram in it under each entry.) Two examples are set out below (1. a gramGrp for a superEntry that lists the grammatical analyses relevant to the child entries; 2. linking from each entry to features), but we are new to dictionaries, and I am not sure these are the best way to structure the entries.

Many thanks, Hayim (and Daniel)


<superEntry>
  <form>
    <!-- 1. a gramGrp that includes, inter alia, the grammatical -->
    <!-- forms referenced in entries; points to features ...  -->
    <gramGrp>
      <gram xml:id="gram1" ana="#features1"></gram>
      <gram xml:id="gram2" ana="#features2"></gram>
    </gramGrp>
  </form>
  <!-- then in child entry point back to grammatical -->
  <!-- analysis in top-level form  -->
  <entry ana="#gram1">
    <form>
      <orth type="lexeme"></orth>
      <orth type="vocalized"></orth>
      <orth type="consonantal"></orth>
    </form>
    <sense>
      <def></def>
    </sense>
  </entry>
  <entry ana="#gram2">
    <form>
      <orth type="lexeme"></orth>
      <orth type="vocalized"></orth>
      <orth type="consonantal"></orth>
    </form>
    <sense>
      <def></def>
    </sense>
  </entry>
  <!-- 2. or alternatively, in the sub entry -->
  <entry>
    <form>
      <orth></orth>
      <gram ana="#features3"></gram>
    </form>
    <sense>
      <def></def>
    </sense>
  </entry>
</superEntry>
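
For reference, the targets of the ana pointers above might look roughly like the following. This is a hedged sketch, not the project's MAF-based feature set: the feature names (pos, root, stem), their values, and the type="morph" label are all hypothetical. The last fragment shows one possible way around the single-gram gramGrp: since ana is already used on <entry> in the first example, an entry could point straight at a feature structure without an intermediate <gram>.

```xml
<!-- Hypothetical feature library; names and values are illustrative only -->
<fvLib>
  <fs xml:id="features1" type="morph">
    <f name="pos"><symbol value="verb"/></f>
    <f name="root"><string>כתב</string></f>
    <f name="stem"><symbol value="qal"/></f>
  </fs>
  <fs xml:id="features2" type="morph">
    <f name="pos"><symbol value="noun"/></f>
    <f name="root"><string>כתב</string></f>
  </fs>
</fvLib>

<!-- An entry pointing directly at a feature structure, with no gram or gramGrp -->
<entry ana="#features1">
  <form>
    <orth></orth>
  </form>
  <sense>
    <def></def>
  </sense>
</entry>
```

Whether the direct pointer is preferable depends on whether the dictionary should remain readable without resolving the feature library; a <gramGrp> with an explicit <pos> keeps each entry self-describing, at the cost of the redundancy noted in the message.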



Robert H. Smith Professor of Jewish Studies and
Professor of History
Department of History
University of Maryland
2115 Francis Scott Key Hall
College Park, MD 20742
301 405 4296 | [hidden email]
www.digitalmishnah.org | www.eRabbinica.org

Director
Joseph and Rebecca Meyerhoff
Program and Center for Jewish Studies
University of Maryland
4141 Susquehanna Hall
College Park, MD 20742
301 405 4975 | [hidden email]
www.jewishstudies.umd.edu