Relative databases vs. XML technologies

Relative databases vs. XML technologies

Mandell, Laura C. Dr.
Dear TEI-L:

Do people directly query their TEI files using xml technologies, as opposed to storing data in a relational database?  If not, why not?

Laura

Re: Relative databases vs. XML technologies

Sebastian Rahtz
On 30 Jun 2010, at 17:20, Mandell, Laura C. Dr. wrote:

>
> Do people directly query their TEI files using xml technologies, as opposed to storing data in a relational database?  If not, why not?

Because relational databases are more mature, more widely deployed, more efficient, and use a more widely understood query language.
It depends on the type of TEI XML and the type of query.

--
Sebastian Rahtz
Information Manager, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

Sólo le pido a Dios
que el futuro no me sea indiferente


Re: Relative databases vs. XML technologies

Martin Mueller
In reply to this post by Mandell, Laura C. Dr.
Joseph Wicentowski at the State Department's Office of the Historian and the folks at syntactica.com are doing some very interesting work with a publishing solution that uses eXist and XQuery to do just about everything. I'm marginally and not very competently involved in this enterprise, but conceptually it looks very promising, and the web site of the Office of the Historian is certainly a site of some scale.

The big question here is whether this technology can be brought to a level at which non-programmers can learn it in a reasonable time. A "reasonable time" is more than five minutes, but it's almost closer to learning how to ride a bicycle than how to play the violin.

MM


Re: Relative databases vs. XML technologies

Martin Holmes
In reply to this post by Mandell, Laura C. Dr.
On 10-06-30 09:20 AM, Mandell, Laura C. Dr. wrote:
> Dear TEI-L:
>
> Do people directly query their TEI files using xml technologies, as
> opposed to storing data in a relational database? If not, why not?

Both approaches are commonly used -- you'll find many TEI-ers using the
eXist XML database, for instance, and others using tools such as
Philologic, which uses a MySQL back end.

My preference is for tools that are designed specifically for XML (Cocoon
and eXist, in my case) because XPath and XQuery allow for more powerful
and precise interrogation and manipulation of XML data structures. If
your XML is very sparsely encoded, then you may find the search
capabilities of a relational database system just as useful, or more so.

Cheers,
Martin

--
Martin Holmes
University of Victoria Humanities Computing and Media Centre
Half-Baked Software, Inc.


Re: Relative databases vs. XML technologies

Frédéric Glorieux
In reply to this post by Mandell, Laura C. Dr.
Le 30/06/10 18:20, Mandell, Laura C. Dr. a écrit :
> Dear TEI-L:
>
> Do people directly query their TEI files using xml technologies, as
> opposed to storing data in a relational database? If not, why not?
>
> Laura

Dear Laura,

We have used very different technologies to query TEI. I was personally
involved for a while in the development of eXist (a native XML database,
<http://exist.sourceforge.net/>).

Native XML databases can be very slow. Maturity is not the only
problem: the complexity of the XML tree (especially in TEI) cannot be
reduced by magic. One way to get better performance on some queries is
to prepare indexes, or to simplify the original XML, but then one
promise of this technology is lost (producing and querying the same XML
without knowledge of the schema). If time and computing resources are
not an everyday problem for you, then putting your TEI corpora in a
native XML database (eXist, Berkeley DB XML
<http://www.oracle.com/technology/products/berkeley-db/xml/index.html>
...) may help a lot, provided the XQuery language can become second
nature to you. If your tagging allows it, you can quickly get answers
to questions like: How many sentences are not in quotes? Who said "I
love you" in this drama? I remember an idea in a note, but where? But
it is not a good idea to open that up on the web. A classroom of 15
students can freeze an XML database on a decent server (depending on
the queries).
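
As a rough illustration of the kind of question listed above, here is a minimal Python sketch using only the standard library rather than a native XML database or XQuery. The sample document, the speaker identifiers, and the function names are invented for the example; it assumes speeches are encoded as <sp who="..."> containing <l> lines.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature drama; real TEI files are far larger and deeper.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#romeo"><speaker>Romeo</speaker><l>I love you, and the night is long.</l></sp>
    <sp who="#juliet"><speaker>Juliet</speaker><l>The lark sings too soon.</l></sp>
    <sp who="#nurse"><speaker>Nurse</speaker><l>Madam, I love you dearly.</l>
      <note>The nurse asks for money here.</note></sp>
  </body></text>
</TEI>"""


def speakers_saying(root, phrase):
    """Return the @who value of every <sp> whose verse lines contain the phrase."""
    found = []
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        lines = " ".join(
            "".join(line.itertext()) for line in sp.iterfind("tei:l", TEI_NS)
        )
        if phrase.lower() in lines.lower():
            found.append(sp.get("who"))
    return found


def notes_mentioning(root, word):
    """Return the text of every <note> that mentions the word."""
    texts = ("".join(n.itertext()) for n in root.iterfind(".//tei:note", TEI_NS))
    return [t for t in texts if word.lower() in t.lower()]


root = ET.fromstring(SAMPLE)
print(speakers_saying(root, "I love you"))
print(notes_mentioning(root, "money"))
```

This walks the whole tree for every question, which is exactly the cost that indexes in a real XML database are meant to avoid.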

SQL is fast and robust, but you have to project your corpora into flat
tables, so for us TEI corpora can't be stored as they are in a
relational model. They are reduced, and some information is usually
lost; this is the price to pay for efficiency in prepared queries.

Another computational model you have not mentioned is full-text indexing.
With tools like Lucene, it's possible to keep records with repeatable
fields, lighter than SQL but still efficient, and with obvious queries
(Notes talking about money? If a field "note" has been indexed, you can
ask "note:money").

We have no conclusion here, except: understand the data models, find
good implementations, and apply the best one to each corpus, for
scholarly reasons as well as practical ones.
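
The fielded full-text query mentioned above ("note:money") can be sketched with a toy inverted index in Python. This is not Lucene and does not use its API; the records, field names, and the search function are assumptions made up for the illustration, and only single-term "field:term" queries are handled.

```python
import re
from collections import defaultdict

# Hypothetical records with repeatable fields (each field holds a list of values).
records = [
    {"id": "doc1",
     "title": ["Letters"],
     "note": ["He owed money to the printer.", "See the second edition."]},
    {"id": "doc2",
     "title": ["Money and Trade"],
     "note": ["A remark on the weather."]},
]


def build_index(records):
    """Map (field, token) pairs to the set of record ids containing them."""
    index = defaultdict(set)
    for rec in records:
        for field, values in rec.items():
            if field == "id":
                continue
            for value in values:
                for token in re.findall(r"\w+", value.lower()):
                    index[(field, token)].add(rec["id"])
    return index


def search(index, query):
    """Answer a single 'field:term' query with a sorted list of record ids."""
    field, _, term = query.partition(":")
    return sorted(index.get((field, term.lower()), set()))


index = build_index(records)
print(search(index, "note:money"))
print(search(index, "title:money"))
```

The lookup is a single dictionary access, which is why this model stays fast even when the underlying documents are deeply nested.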

--
Frédéric


Re: Relative databases vs. XML technologies

Sebastian Rahtz
In reply to this post by Mandell, Laura C. Dr.
For a project here, we recently switched from a setup based on eXist and XQuery to one using an SQL database; the speed of operation and the speed of development rocketed overnight :-} This worked because our TEI file consisted of 250,000 "records" (TEI <person>), which we stored untouched in one column of the table, and added as many columns as we needed to index the data. Then we used XSLT to format the <person> records which came back from a query. It's not a new technique.

Of course, this is not a traditional use of TEI, but it demonstrates a) that there are applications which are TEI but look more like a database, and b) you can combine XML tools with SQL databases.

For those who care, the problem with eXist was purely speed - we were never able to achieve acceptable response times when all the data was loaded. Yes, we did try pretty hard indeed :-} For other types of project I'd very happily use it again.
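
A minimal sketch of the technique described above, assuming SQLite as the SQL database and a plain Python function standing in for the XSLT formatting step (the Python standard library has no XSLT processor). The table layout, the column names, and the two sample <person> records are invented for the illustration and are not the project's actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical records; the TEI namespace is omitted to keep the example short.
PERSONS = [
    '<person xml:id="p1"><persName><forename>Ada</forename>'
    '<surname>Lovelace</surname></persName><birth when="1815"/></person>',
    '<person xml:id="p2"><persName><forename>Mary</forename>'
    '<surname>Shelley</surname></persName><birth when="1797"/></person>',
]

conn = sqlite3.connect(":memory:")
# The XML is stored untouched in one column; the other columns exist only to index it.
conn.execute(
    "CREATE TABLE person (id TEXT PRIMARY KEY, surname TEXT, birth_year INTEGER, xml TEXT)"
)
conn.execute("CREATE INDEX person_surname ON person (surname)")

for raw in PERSONS:
    el = ET.fromstring(raw)
    conn.execute(
        "INSERT INTO person VALUES (?, ?, ?, ?)",
        (el.get(XML_ID), el.findtext("persName/surname"),
         int(el.find("birth").get("when")), raw),
    )


def born_before(year):
    """Query the index columns with SQL and return the stored XML strings."""
    rows = conn.execute(
        "SELECT xml FROM person WHERE birth_year < ? ORDER BY birth_year", (year,)
    )
    return [row[0] for row in rows]


def format_person(xml_text):
    """Stand-in for the XSLT step: turn one stored <person> into display text."""
    el = ET.fromstring(xml_text)
    name = f"{el.findtext('persName/forename')} {el.findtext('persName/surname')}"
    return f"{name} (b. {el.find('birth').get('when')})"


for xml_text in born_before(1800):
    print(format_person(xml_text))
```

The database only ever sees flat columns, while the full record survives as XML for whatever formatting is needed later.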
--
Sebastian Rahtz
Information Manager, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

Sólo le pido a Dios
que el futuro no me sea indiferente


Re: Relative databases vs. XML technologies

stuart yeates
In reply to this post by Mandell, Laura C. Dr.
Mandell, Laura C. Dr. wrote:
> Do people directly query their TEI files using xml technologies, as
> opposed to storing data in a relational database?  If not, why not?

I certainly don't see these as being in opposition.

At the NZETC we:

* use XSLT to harvest (bibliographic, document structure, named-entity
references, etc) metadata out of our TEI documents and store it in a
semantic web (which is backed by a relational database, but we never
deal directly with the database except for house-keeping).

* use XSLT to generate complex Solr indexes from our TEI documents which
can then be used for faceted browsing.

* use XSLT to generate HTML fragments from our TEI documents for serving
to end users.
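
The harvesting step in the list above can be sketched as follows. This is a Python stand-in for illustration only, not the NZETC's XSLT; the sample document, the @type and @key values, and the record layout are assumptions, and the facet counts are computed in memory rather than sent to Solr.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical document with named-entity references.
DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample Letters</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <p><name type="person" key="grey">Sir George Grey</name> wrote to
       <name type="place" key="wellington">Wellington</name>.</p>
    <p><name type="person" key="grey">Grey</name> replied.</p>
  </body></text>
</TEI>"""


def harvest(xml_text):
    """Pull the title and all name references out of one TEI document."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)
    names = [(n.get("type"), n.get("key")) for n in root.iterfind(".//tei:name", TEI_NS)]
    return {"title": title, "names": names}


record = harvest(DOC)
# Facet counts of the kind a faceted-browsing index would expose.
facets = Counter(kind for kind, _ in record["names"])
print(record["title"], dict(facets))
```

The TEI file stays the source of truth; the harvested record is a disposable derivative that can be regenerated whenever the encoding changes.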

cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/       New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/     Institutional Repository


Re: Relative databases vs. XML technologies

stuart yeates
In reply to this post by Mandell, Laura C. Dr.
Sebastian Rahtz wrote:
> For a project here, we recently switched from a setup based on eXist and XQuery to one using an SQL database; the speed of operation and the speed of development rocketed overnight :-} This worked because our TEI file consisted of 250,000 "records" (TEI <person>), which we stored untouched in one column of the table, and added as many columns as we needed to index the data. Then we used XSLT to format the <person> records which came back from a query. It's not a new technique.
>
> Of course, this is not a traditional use of TEI, but it demonstrates a) that there are applications which are TEI but look more like a database, and b) you can combine XML tools with SQL databases.

We have ~1300 documents (letters/books/volumes) with 689344 tei:name
tags in them, so you don't need to go outside the "traditional" uses of
TEI to get those kinds of numbers.

The only data that we actually serve to users from our database is
authoritative names and cross-references; all actual texts are served
from the TEI XML (or from the cache, as the case may be).

cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/       New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/     Institutional Repository


Re: Relative databases vs. XML technologies

Martin Holmes
In reply to this post by Mandell, Laura C. Dr.
I would agree with this: if your data looks like records in a database,
then a database is the right tool for the job. eXist is unlikely ever to
compete with db speeds for this kind of operation, although it can come
pretty close if you tune the indexes carefully enough.

The strength of an XML db is that it can handle arbitrarily complex tree
structures with massive levels of nesting, and let you navigate around
these sorts of structures in your query. You can frame queries such as:

Give me every <l> (line) element which is the third in its stanza, whose
stanza is nested inside another stanza which is the last of its
siblings, where the line contains text in Latin, where the containing
poem was published after 1753... etc.

In other words, if structure and hierarchy are intrinsic parts of your
data, XML databases are very useful indeed; if structure is an arbitrary
organizing principle which is not inherently interesting, and especially
if it's simple, then relational dbs are a better option.
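
To make the example query above concrete, here is a sketch using the limited XPath support in Python's standard library instead of XQuery in eXist. The sample poems, the use of <head><date when="..."> for the publication year, and @xml:lang="la" on the line itself are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical poems: outer <lg> elements contain nested stanzas.
POEMS = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="poem">
    <head><date when="1760">1760</date></head>
    <lg>
      <lg><l>a1</l><l>a2</l><l xml:lang="la">carpe diem</l></lg>
    </lg>
    <lg>
      <lg><l>b1</l><l>b2</l><l xml:lang="la">tempus fugit</l></lg>
      <lg><l>c1</l><l>c2</l><l>plain English</l></lg>
    </lg>
  </div>
  <div type="poem">
    <head><date when="1740">1740</date></head>
    <lg>
      <lg><l>d1</l><l>d2</l><l xml:lang="la">ars longa</l></lg>
    </lg>
  </div>
</body>"""


def third_lines(body, after_year):
    """Third line of each stanza nested in the last outer stanza, Latin only,
    restricted to poems dated after the given year."""
    hits = []
    for poem in body.iterfind("tei:div[@type='poem']", NS):
        year = int(poem.find("tei:head/tei:date", NS).get("when"))
        if year <= after_year:
            continue
        # lg[last()] = last outer stanza among its siblings; l[3] = third line.
        for line in poem.iterfind("tei:lg[last()]/tei:lg/tei:l[3]", NS):
            if line.get(XML_LANG) == "la":
                hits.append(line.text)
    return hits


body = ET.fromstring(POEMS)
print(third_lines(body, 1753))
```

The structural conditions live in the path expression itself, which is the part that is awkward to express once the text has been flattened into tables.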

Cheers,
Martin

On 10-06-30 02:28 PM, Sebastian Rahtz wrote:

> For a project here, we recently switched from a setup based on eXist and XQuery to one using an SQL database; the speed of operation and the speed of development rocketed overnight :-} This worked because our TEI file consisted of 250,000 "records" (TEI<person>), which we stored untouched in one column of the table, and added as many columns as we needed to index the data. Then we used XSLT to format the<person>  records which came back from a query. It's not a new technique.
>
> Of course, this is not a traditional use of TEI, but it demonstrates a) that there are applications which are TEI but look more like a database, and b) you can combine XML tools with SQL databases.
>
> For those who care, the problem with eXist was purely speed - we were never able to achieve acceptable response times when all the data was loaded. Yes, we did try pretty hard indeed :-} for other types of project I'd very happily use it again.
> --
> Sebastian Rahtz
> Information Manager, Oxford University Computing Services
> 13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>
> Sólo le pido a Dios
> que el futuro no me sea indiferente
>

--
Martin Holmes
University of Victoria Humanities Computing and Media Centre
([hidden email])
Half-Baked Software, Inc.
([hidden email])
[hidden email]


Re: Relative databases vs. XML technologies

Mandell, Laura C. Dr.
In reply to this post by Mandell, Laura C. Dr.
Dear TEI-List:

Thank you all for this great info.

Best, Laura




Re: Relative databases vs. XML technologies

Brett Zamir
In reply to this post by Mandell, Laura C. Dr.
> The big question here is whether this technology can be brought to a level at which non-programmers can learn it in a reasonable time. A "reasonable time" is more than five minutes, but it's almost closer to learning how to ride a bicycle than how to play the violin.

I believe so. In the process of arguing in favor of XML databases for client-side HTML storage*, I've also proposed allowing a jQuery-like syntax against client-side databases. jQuery is a hugely popular and easy-to-learn JavaScript library, and the CSS Selectors it uses should, I believe, be fully convertible into XPath (John Resig has done the conversion in the other direction for simple XPath), which is itself a subset of the yet more powerful XQuery. Even if such an interface dumbs things down compared to XQuery or even XPath, it may be a more comfortable way to get people used to it (and jQuery has its own XQuery-like functions for easily iterating over nodes, etc., anyway).
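
To illustrate the claim that simple CSS selectors map onto XPath, here is a small Python sketch. It is not jQuery and not the proposed browser API; the css_to_xpath function is hypothetical and handles only element names, one [attr=value] test per step, and the descendant and child combinators, with no spaces or ">" allowed inside attribute values.

```python
import re
import xml.etree.ElementTree as ET

# One selector step: a tag name (or *) with an optional [attr=value] test.
STEP = re.compile(
    r"^(?P<tag>[\w-]+|\*)(?:\[(?P<attr>[\w:-]+)=[\"']?(?P<val>[^\"'\]]*)[\"']?\])?$"
)


def css_to_xpath(selector):
    """Translate a very small subset of CSS selectors into an XPath string."""
    xpath = "."
    axis = "//"  # a space in CSS means "any descendant"
    for token in selector.replace(">", " > ").split():
        if token == ">":
            axis = "/"  # ">" in CSS means "direct child"
            continue
        match = STEP.match(token)
        if match is None:
            raise ValueError(f"unsupported selector step: {token!r}")
        step = match.group("tag")
        if match.group("attr"):
            step += f"[@{match.group('attr')}='{match.group('val')}']"
        xpath += axis + step
        axis = "//"
    return xpath


play = ET.fromstring(
    "<play><sp who='#hamlet'><l>To be</l></sp>"
    "<sp who='#ophelia'><l>My lord</l></sp></play>"
)
path = css_to_xpath("sp[who=#hamlet] > l")
print(path, [line.text for line in play.findall(path)])
```

Classes, pseudo-classes, and sibling combinators are left out; a full translation needs considerably more machinery than this.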

Client-side usage might look like this (where the client-side database "Classics" had been created earlier by some other web API function call):

[code sample not preserved in the archive]

The above finds every ironic passage of Shakespeare in the locally-stored collection and adds the passages to an HTML element.

Querying remote XML/HTML stores should be possible for client-side HTML too (as it is already available to server-side languages), especially if websites were not forced to obtain explicit permission from the remote site but could instead make cross-domain requests (via HTML Ajax), conditional on user permission: https://bugzilla.mozilla.org/show_bug.cgi?id=573886 . This would allow any website (including TEI ones) to be treated as a data store, with the burden shifted away from the server to the user, who could query to their heart's content without slowing down an intermediate server (while still allowing that third party to offer them a user interface in HTML). Because there are security issues when incorporating content from remote sites, this would have to require permission from the user.

For example, the remote querying might look as simple as this:

[code sample not preserved in the archive]

As above, this finds every ironic passage of Shakespeare in the specified works and adds the passages to an HTML element, but in this case, the files have not been created locally, but are available live from remote. (The best of both worlds could be possible if the client-side storage could be made to check periodically for updates.)

If you XML fans want to see this kind of functionality available to browsers (which could either use jQuery or ideally XQuery itself), so that any web designer could make an interface which would work against your TEI, whether stored on a remote server, or designed to be installable via the web into a web-accessible client-side database, voice your support in the HTML5 WHATWG email list (http://www.whatwg.org/mailing-list )!

Brett

* In the discussion thread about IndexedDB, the proposed standard for allowing client-side database storage inside HTML5 at http://hacks.mozilla.org/2010/06/comparing-indexeddb-and-webdatabase/comment-page-1/#comment-96635


Re: Relative databases vs. XML technologies

Brett Zamir
In reply to this post by Mandell, Laura C. Dr.
> If time and computing resources is not your every day problem, then
> put your TEI corpora in an XML native database (Exist, Berkeley DB XML
> <http://www.oracle.com/technology/products/berkeley-db/xml/index.html>
> ...) may help a lot if XQuery language can become for you a second
> nature. If your tagging allow it, you can have fastly answers like :
> How much sentences are not in quotes? Who said "I love you" in this
> drama? I remember an idea in a note but where? But, it's not a good
> idea to open that on the web. A classroom of 15 students can freeze an
> XML database on a decent server (depending on queries).
Unless, as per my other post just now, it is client-side storage
accessible to websites, which can handle both privacy and server-load
concerns while still allowing web applications to install and access
the data, according to user preference.

This would work like cookies (which are also stored locally) but allow
much richer client-side database features with much larger storage
capacity, whether that is IndexedDB as currently proposed for HTML5 or,
my preference, a native XML database supporting XQuery. I have proposed
the latter (and am currently working on a Firefox add-on to demonstrate
the concept and, I hope, make it usable until such time as it could
become standardized). While eXist or BDBXML would be great for those
willing to make a fairly big download, I've recently discovered a very
small XML database, Sedna, which I think could be small enough to
include with an extension or possibly as part of a browser like Firefox
itself, though ideally the HTML database API would be generic (maybe
using XQJ, the Java API for XQuery?) so as to work with any database
the user installed.

best wishes,
Brett


Re: Relative databases vs. XML technologies

Brett Zamir
In reply to this post by Mandell, Laura C. Dr.
On 7/1/2010 6:50 AM, Martin Holmes wrote:
> I would agree with this: if your data looks like records in a
> database, then a database is the right tool for the job. eXist is
> unlikely ever to compete with db speeds for this kind of operation,
> although it can come pretty close if you tune the indexes carefully
> enough.

I would think that this is merely an implementation issue, rather than
an inherent problem that XML databases could not overcome.  If an XML
file is known to have a strictly tabular structure (with no processing
instructions, comments, etc., at least none relevant to a deliberately
"relational" query for which a separate table structure could be created
internally), then whatever storage principles are applied in relational
databases can be applied to the XML data.  I see absolutely no reason,
at least in theory, why relational databases would be needed as
something separate from XML databases, unless there is some benefit to
an XML database allowing "relational"-aware/optimizing queries inside
XQuery (as some XQuery extensions do by allowing SQL inside XQuery).
That is, if the database can't be made to pre-optimize automatically on
its own, as I would think should be possible.

Since XML in concept can mimic a relational database perfectly (while
the converse is not true), an XML database could even outsource this
work to a relational database component, I would imagine. I'm no
expert here, I'll admit, but I just don't see any reason it couldn't work.

The advantage of focusing on XML databases I think is that it provides a
common query mechanism (XQuery) which can work with either hierarchical
or tabular data. In particular, I hope the web will not be deprived of
this single means of querying.

Brett



Re: Relative databases vs. XML technologies

richard light
In reply to this post by Mandell, Laura C. Dr.
In message <[hidden email]>,
Sebastian Rahtz <[hidden email]> writes

>For a project here, we recently switched from a setup based on eXist
>and XQuery to one using an SQL database; the speed of operation and the
>speed of development rocketed overnight :-} This worked because our TEI
>file consisted of 250,000 "records" (TEI <person>), which we stored
>untouched in one column of the table, and added as many columns as we
>needed to index the data. Then we used XSLT to format the <person>
>records which came back from a query. It's not a new technique.
>
>Of course, this is not a traditional use of TEI, but it demonstrates a)
>that there are applications which are TEI but look more like a
>database, and b) you can combine XML tools with SQL databases.

Just to mention another hybrid approach: we also use a relational
database engine and store XML fragments as BLOBs.  We then add an
indexing plugin to the database.  This allows us to specify multiple
indexes which use XPath expressions to index XML content within the
BLOBs. However, as far as the database engine is concerned, these are
"normal" indexes.

We use parent and child processing instructions to indicate the position
of each XML fragment within the original document. This allows the
re-creation of this document as part of a report generation "pipe".
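
A rough sketch of the hybrid approach described above, assuming SQLite and a hand-rolled index table in place of the indexing plugin, which is not named here. The index names, the XPath expressions, the table layout, and the sample chunks are all invented for the illustration; the parent/child processing instructions are not modelled.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical index definitions: index name -> XPath evaluated inside each chunk.
INDEX_DEFS = {"speaker": "./speaker", "place": ".//placeName"}

CHUNKS = {
    "c1": "<sp><speaker>Hamlet</speaker><l>I am bound for "
          "<placeName>England</placeName>.</l></sp>",
    "c2": "<sp><speaker>Horatio</speaker><l>Welcome back to "
          "<placeName>Elsinore</placeName>.</l></sp>",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunk (id TEXT PRIMARY KEY, xml BLOB)")
conn.execute("CREATE TABLE chunk_index (name TEXT, value TEXT, chunk_id TEXT)")
conn.execute("CREATE INDEX chunk_index_lookup ON chunk_index (name, value)")


def store(chunk_id, xml_text):
    """Store the fragment as a BLOB and index whatever the XPaths select."""
    conn.execute("INSERT INTO chunk VALUES (?, ?)", (chunk_id, xml_text.encode("utf-8")))
    root = ET.fromstring(xml_text)
    for name, path in INDEX_DEFS.items():
        for node in root.iterfind(path):
            value = "".join(node.itertext()).strip()
            conn.execute("INSERT INTO chunk_index VALUES (?, ?, ?)", (name, value, chunk_id))


def lookup(name, value):
    """Use the relational index to find chunks, then return their XML."""
    rows = conn.execute(
        "SELECT c.xml FROM chunk_index i JOIN chunk c ON c.id = i.chunk_id "
        "WHERE i.name = ? AND i.value = ?",
        (name, value),
    )
    return [row[0].decode("utf-8") for row in rows]


for chunk_id, xml_text in CHUNKS.items():
    store(chunk_id, xml_text)
print(lookup("place", "Elsinore"))
```

As far as SQLite is concerned, chunk_index is an ordinary indexed table; only the store function knows that its values came from XPath expressions.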

This approach gives us the benefit of holding our TEI as a shared
updateable resource (with the usual relational record-locking and
transaction support, and real-time indexing of content).

On the retrieval front, this approach doesn't help much with external
querying, since SQL doesn't deal in indexes, but we have built custom
search mechanisms which use the XPath indexes directly, and are happy
enough with that.  Nor does it give you the ability to put ad hoc XPath
queries to the entire document as a native XML database would.

One advantage of this approach is that it will support any type of TEI
document, not just "record-like" ones.  The one requirement is that you
have to decide on a "chunking" policy, and assign a unique identifier to
each chunk/record.

Richard
--
Richard Light


Re: Relative databases vs. XML technologies

Sebastian Rahtz
In reply to this post by Mandell, Laura C. Dr.
I suspect we all agree that managing the source data as TEI XML is the important thing. From there we can choose
to use XSLT, Postgres, eXist or Jena triplestore as meets our needs today, pretty confident that we can switch
to something else next year if it seems appropriate.
--
Sebastian Rahtz
(acting) Information and Support Group Manager
Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

Sólo le pido a Dios
que el futuro no me sea indiferente


Re: Relative databases vs. XML technologies

Graham, Wayne (wsg4w)
In reply to this post by Mandell, Laura C. Dr.
I'll throw one more in here. We've been using Solr a lot for the discovery
layer, splitting XML into different types of Solr documents as needed, then
using client XML/XSLT libraries to provide more granular search results on a
per-page basis as needed. You can see an example of the technique at
http://raven.scholarslab.org/. If you're really interested, you can check
out some code we have on GitHub: http://github.com/mwmitchell/raven

HTH,
Wayne




Re: Relative databases vs. XML technologies

stuart yeates
In reply to this post by Mandell, Laura C. Dr.
> I'll throw one more in here. We've been using Solr a lot for the discovery
> layer, splitting XML in to different types of Solr documents as needed, then
> using client XML/XSLT libraries to provide more granular search results on a
> per-page basis as needed. You can see an example of the technique at
> http://raven.scholarslab.org/; If you're really interested, you can check
> out some code we have at the githubs: http://github.com/mwmitchell/raven

We do something completely different with Solr: http://www.nzetc.org/tm/scholarly/facets/search

See also http://www.nzetc.org/tm/scholarly/books.rss which is an alias to our complete list of works, as an RSS feed of downloadable ePubs, through Solr.

cheers
stuart


Re: Relative databases vs. XML technologies

James Cummings
In reply to this post by Mandell, Laura C. Dr.
On 30/06/10 18:05, Martin Mueller wrote:

> Joseph Wicentowski at the State Department's Office of the Historian
> and the folks at syntactica.com are doing some very interesting work
> with a publishing solution that uses eXist and xquery to do just
> about everything. I'm marginally and not very competently involved in
> this enterprise, but conceptually it looks very promising, and the
> web site of the Office of the Historian is certainly a site of some
> scale.
>
> The big question here is whether this technology can be brought to a
> level at which non-programmers can learn it in a reasonable time. A
> "reasonable time" is more than five minutes, but it's almost closer
> to learning how to ride a bicycle than how to play the violin.

Just to comment on this. Joe is giving a 2.5 workshop as part of our TEI
Summer School introducing this kind of thing to people.  I've taught
people XQuery and basic eXist enough to have them indexing and doing
several types of query in 3/4s of a day. Obviously though it takes much
longer to feel familiar implementing it for real.  In a poorly created
analogy, I can teach anyone to juggle in 45 minutes, but it will take at
least a couple weeks of regular practice before it feels normal.

I use eXist/XQuery for several websites, and proper indexing (with
Lucene full-text) certainly makes a substantial difference to retrieval
times. The XQuery URL rewriting in recent versions is also quite
interesting.

-James