your eXist-db is an open proxy

your eXist-db is an open proxy

Mathias Göbel

Dear TEI-Community,

Thank you for offering an increasing number of documents stored in outstanding databases like eXist-db and made available via REST. Would those of you using eXist-db consider capturing and redirecting the "_query" parameter (or at least a set of function names) to avoid offering an open proxy, as in this example:

https://tei2016app.acdh.oeaw.ac.at/data/?_query=xquery%20version%20%223.1%22;response:stream-binary(%20xs:base64Binary(%20data(httpclient:get(xs:anyURI(%22http://24.media.tumblr.com/tumblr_lt8vrdas9o1qb8xalo1_400.jpg%22),%20false(),%20())//httpclient:body))%20,%20%22image/jpg%22)
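
To see what that URL actually asks the server to do, it helps to decode the "_query" value. Below is a minimal Python sketch using only the standard library; the function name decode_query_param is made up for illustration, and it assumes the parameter comes first in the query string, as it does in the example.

```python
from urllib.parse import urlsplit, unquote

# The example URL from the message above, split across lines for readability.
EXAMPLE_URL = (
    "https://tei2016app.acdh.oeaw.ac.at/data/?_query="
    "xquery%20version%20%223.1%22;response:stream-binary("
    "%20xs:base64Binary(%20data(httpclient:get(xs:anyURI("
    "%22http://24.media.tumblr.com/tumblr_lt8vrdas9o1qb8xalo1_400.jpg%22),"
    "%20false(),%20())//httpclient:body))%20,%20%22image/jpg%22)"
)


def decode_query_param(url, name="_query"):
    """Return the percent-decoded value of `name`, or None if it is absent.

    Hypothetical helper: it only looks at the first parameter and does not
    split on '&' or ';', because the XQuery text itself contains ';'.
    """
    query = urlsplit(url).query
    key, sep, value = query.partition("=")
    if sep and key == name:
        return unquote(value)
    return None


print(decode_query_param(EXAMPLE_URL))
```

The decoded text is an XQuery that fetches a remote URL with httpclient:get and streams the body back with response:stream-binary, which is what makes the server usable as a proxy.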

If you are using Apache, you might want to add:

        RewriteEngine on
        RewriteCond %{QUERY_STRING} _query=
        RewriteRule (.*) $1? [R=permanent]
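
For those not fluent in mod_rewrite, the effect of the rule above can be sketched in Python: when the raw query string contains "_query=", the client is redirected to the same path with the query string removed. This is only an illustration of the logic; the function name redirect_target is hypothetical and nothing here talks to Apache.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url):
    """Mimic the rewrite rule: return the URL to redirect to, or None.

    Hypothetical helper. The condition is a plain substring test on the
    raw (still percent-encoded) query string, like the RewriteCond above.
    """
    parts = urlsplit(url)
    if "_query=" not in parts.query:
        return None  # condition not met, request passes through unchanged
    # "$1?" in the rule keeps the path and drops the whole query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


print(redirect_target("https://example.org/data/?_query=1"))
print(redirect_target("https://example.org/data/?page=2"))
```

Because the condition is an unanchored match, it also fires for parameters whose names merely end in "_query", such as "my_query=".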

Best,
Mathias
--
Mathias Göbel
Research and Development

Georg-August-Universität Göttingen
Göttingen State and University Library
D-37070 Göttingen

Papendiek 14 (hist. building, room 2.408)
+49 551 39-20184 (Tel.)
+49 551 39-33856 (Fax.)

[hidden email]
http://www.sub.uni-goettingen.de

--

Re: your eXist-db is an open proxy

Piotr Bański-2
Dear Mathias,

What a pretty cat(ch)! Thanks for sharing :-)

Best regards,

   Piotr

--
Piotr Bański, Ph.D.
Senior Researcher,
Institut für Deutsche Sprache,
R5 6-13
68-161 Mannheim, Germany

Re: your eXist-db is an open proxy

Joe Wicentowski
Hi all,

For anyone considering moving an application from your own computer and putting it on the public internet, the eXist documentation offers a helpful admonition:

"For any live application it is recognised best practice to keep the attack surface of the application as small as possible. There are two aspects to this: 1. Reducing the application itself to the absolute essentials. 2. Limiting access routes to the application. eXist-db is no exception and should be configured for your production systems so that it provides only what you need and no more." (from http://exist-db.org/exist/apps/doc/production_good_practice.xml)

As applied to the oeaw.ac.at server, the issue is that the eXist server's REST interface is exposed to the public. Essentially, the "/data" URL at https://tei2016app.acdh.oeaw.ac.at/data/ is being mapped onto the eXist server's own URL, http://localhost:8080/exist/rest/db/apps/tei-abstracts/data. eXist's REST interface (http://exist-db.org/exist/apps/doc/devguide_rest.xml) is a convenient way to expose the documents in your collection for browsing and downloading. But this powerful interface does allow users with access to it to execute arbitrary XQuery. Mathias's solution (already applied, it appears!) keeps the original "/rest" URLs exposed, while filtering requests to prevent users from executing arbitrary code. This is a good step, but in general, good practice is to prevent these "/rest" URLs from being exposed to the public, using eXist's robust URL rewriting functions to limit what visitors are able to see and access.

As with many open source projects, the built-in documentation is uneven. For anyone getting started with eXist, I'd highly recommend Adam Retter and Erik Siegel's book, _eXist_ (O'Reilly, 2014).

The whole book is really well done and approachable.  I wrote a review at http://joewiz.org/2014/12/28/exist-the-indispensable-guide/.  

Joe


Re: your eXist-db is an open proxy

Omar Siam-2

Why does this work?

https://exist-curation.minerva.arz.oeaw.ac.at/exist/apps/does_not_matter/what/path/even_if_it_does_not_exist.badending?_query=xquery%20version%20%223.1%22;response:stream-binary(%20xs:base64Binary(%20data(httpclient:get(xs:anyURI(%22http://24.media.tumblr.com/tumblr_lt8vrdas9o1qb8xalo1_400.jpg%22),%20false(),%20())//httpclient:body))%20,%20%22image/jpg%22)

It also works perfectly like this:

http://localhost:8080/exist/apps/does_not_matter/what/path/even_if_it_does_not_exist.badending?_query=xquery%20version%20%223.1%22;response:stream-binary(%20xs:base64Binary(%20data(httpclient:get(xs:anyURI(%22http://24.media.tumblr.com/tumblr_lt8vrdas9o1qb8xalo1_400.jpg%22),%20false(),%20())//httpclient:body))%20,%20%22image/jpg%22)

It works under /exist/apps/... and /exist/rest/...; it does not work under /exist/xmlrpc or /xml/restxq.

So I sincerely doubt that it is a misconfiguration of our proxy servers.

Who thought that the _query parameter needs to work *everywhere*?

Also have a look at this: http://exist-db.org/exist/apps/doc/?_query=xquery%20version%20%221.0%22;response:stream(%20xs:base64Binary(%20data(httpclient:get(xs:anyURI(%22http://24.media.tumblr.com/tumblr_lt8vrdas9o1qb8xalo1_400.jpg%22),%20false(),%20())//httpclient:body))%20,%20%22image/jpg%22)

Or if you prefer:

view-source:http://exist-db.org/exist/apps/doc/?_query=response:stream(httpclient:get(xs:anyURI(%27http://www.example.org/%27),%20false(),())//httpclient:body/*,%27%27)

Where in http://exist-db.org/exist/apps/doc/production_good_practice.xml did you state that it is absolutely mandatory to strip _query= before the request hits eXist-db, as the Apache config snippet does?

What sort of trap is this? Please be explicit about what "service, servlet or filter" I need to disable to stop this.

Best Regards

Omar


Re: your eXist-db is an open proxy

Joe Wicentowski
Omar,

Could I propose that you post this question over on the eXist mailing list?  The discussion here is getting a little off topic from TEI.  I believe you're already subscribed to exist-open, but for anyone who isn't, please join at https://lists.sourceforge.net/lists/listinfo/exist-open.

Joe


Re: your eXist-db is an open proxy

Omar Siam-2
Hi!

You are right, sorry. I will post that to the correct list. Just a note
to everyone who runs a public-facing eXist: this is *not a REST
issue*. It is *not solvable* by most, if any, of the tips in the
eXist-db docs. It is something scarier.

Everyone using eXist to serve content needs to use RewriteCond or
something similar.

Best Regards

Omar

Re: your eXist-db is an open proxy

Joe Wicentowski
Great point, Omar.  I'd forgotten about the one additional setting needed to lock down an eXist server from processing requests with these _query parameters.  I'll quote from Retter & Siegel, ch. 8:

"To remove the REST Server's ability to directly receive web requests, you can modify the parameter `hidden` in `$EXIST_HOME/webapp/WEB-INF/web.xml`:

  <init-param>
    <param-name>hidden</param-name>
    <param-value>true</param-value>
  </init-param>

Once you change the default value of <param-value> from "false" to "true" as shown here, requests with the ?_query parameter are blocked.  For example take this request:


Before applying the setting above, you'd get this in response (the query "1" evaluates, obviously, to the number "1"):

<exist:result xmlns:exist="http://exist.sourceforge.net/NS/exist" exist:hits="1"
    exist:start="1" exist:count="1" exist:compilation-time="6"
    exist:execution-time="6">
    <exist:value exist:type="xs:integer">1</exist:value>
</exist:result>

After applying the setting, you'd get this:

> HTTP ERROR 403
> Problem accessing /exist/apps/. Reason:
>   Not allowed to read collection

This is also explained in the web.xml file - see https://github.com/eXist-db/exist/blob/develop/webapp/WEB-INF/web.xml.tmpl#L87-L101.  This would be a great topic to be covered in the eXist prose documentation too, so I've filed an issue to ensure this idea is captured: https://github.com/eXist-db/documentation/issues/98.
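
To double-check the setting on a given installation, one could read the value back out of web.xml. The following is a minimal Python sketch using only the standard library; the embedded XML is a hypothetical, trimmed-down fragment (including the servlet name), not a copy of the real file, and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for $EXIST_HOME/webapp/WEB-INF/web.xml.
SAMPLE_WEB_XML = """
<web-app>
  <servlet>
    <servlet-name>EXistServlet</servlet-name>
    <init-param>
      <param-name>hidden</param-name>
      <param-value>true</param-value>
    </init-param>
  </servlet>
</web-app>
"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def hidden_setting(web_xml_text):
    """Return the text of the first 'hidden' init-param, or None if absent."""
    root = ET.fromstring(web_xml_text.strip())
    for elem in root.iter():
        if local_name(elem.tag) != "init-param":
            continue
        # Collect the child elements of this init-param as name -> text.
        values = {local_name(c.tag): (c.text or "").strip() for c in elem}
        if values.get("param-name") == "hidden":
            return values.get("param-value")
    return None


print(hidden_setting(SAMPLE_WEB_XML))
```

On a real server one would pass in the contents of the actual web.xml instead of the sample string; the sketch returns the first "hidden" parameter it finds, so a file with several servlets would need a closer look.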

Again, I'd welcome anyone interested in further discussion on this topic to move it over to exist-open.  
