Description of problem:
Currently, a program that wants to make HTTP requests to Beaker must first log in using the auth.login_* XML-RPC methods and then reuse the authentication cookie when sending subsequent requests. It cannot simply send standard HTTP authentication headers with the request itself.
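For illustration, the current two-step workflow looks roughly like the sketch below. It assumes the XML-RPC endpoint lives at /RPC2 and that auth.login_password takes (username, password); both are assumptions here, as are the helper names, so check them against your Beaker version.

```python
import http.cookiejar
import urllib.request
import xmlrpc.client


def build_login_request(base_url, username, password):
    """Build the XML-RPC login call as a plain HTTP POST request.

    Assumption: the endpoint path (/RPC2) and the method name
    (auth.login_password) are illustrative, not confirmed by this report.
    """
    body = xmlrpc.client.dumps(
        (username, password), methodname="auth.login_password"
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/RPC2",
        data=body,
        headers={"Content-Type": "text/xml"},
    )


def login_and_post(base_url, username, password, path, data):
    """Log in over XML-RPC, then reuse the session cookie for a POST."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    # Step 1: log in. The cookie jar captures the authentication cookie.
    with opener.open(build_login_request(base_url, username, password)) as resp:
        xmlrpc.client.loads(resp.read())  # raises xmlrpc.client.Fault on failure

    # Step 2: the same opener sends the cookie along with the real request.
    req = urllib.request.Request(
        base_url.rstrip("/") + path, data=data, method="POST"
    )
    with opener.open(req) as resp:
        return resp.status, resp.read()
```

With standard HTTP authentication, step 1 would disappear and the client would only need to attach credentials to the request in step 2.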
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Make a POST request to /systems/<fqdn>/loans/
Actual results:
401 response, but without a WWW-Authenticate header (the HTTP specification requires one on every 401 response). The result is the same even if valid HTTP authentication headers are included in the request.
Expected results:
An unauthenticated request should result in a 401 with a proper WWW-Authenticate header permitting HTTP Basic or Negotiate. If the request carries valid authentication headers, they should be accepted and the request should succeed.
This is made complicated by the fact that we currently rely on Apache for authentication, but only for the /login path. The web UI redirects to there if a user is not authenticated, then Apache handles the authentication.
If my understanding of Apache's authentication is correct, a possible implementation would be:
* remove "Require valid-user" from the Apache config
* move the auth stuff to apply to the entire app instead of just /login
* instead of redirecting to /login, Beaker would need
This preserves the existing behaviour where some information is displayed to unauthenticated users without forcing them to log in (this has previously been requested as a useful feature). However, it would break the ability to do form-based auth entirely in Beaker rather than making Apache handle it. We could handle HTTP Basic in Beaker itself if we detect that Apache is not doing any authentication for us.
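A minimal sketch of that last idea, assuming a plain WSGI environ: trust REMOTE_USER when Apache has set it, otherwise parse an HTTP Basic Authorization header ourselves. The function names and the check_password callback are hypothetical and are not existing Beaker code.

```python
import base64


def parse_basic_credentials(header_value):
    """Return (username, password) from an Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 as well as bytes that are not valid UTF-8.
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


def resolve_user(environ, check_password):
    """Prefer Apache's REMOTE_USER; otherwise fall back to HTTP Basic.

    check_password is a hypothetical callable (username, password) -> bool
    supplied by the application.
    """
    remote_user = environ.get("REMOTE_USER")
    if remote_user:
        return remote_user  # Apache already authenticated this request
    credentials = parse_basic_credentials(environ.get("HTTP_AUTHORIZATION"))
    if credentials and check_password(*credentials):
        return credentials[0]
    return None
```

Note that mod_wsgi does not pass the Authorization header through to the application unless WSGIPassAuthorization On is set, so a fallback like this would need that directive when running under Apache.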
(In reply to Dan Callaghan from comment #0)
> If my understanding of Apache's authentication is correct, a possible
> implementation would be:
> * remove "Require valid-user" from the Apache config
> * move the auth stuff to apply to the entire app instead of just /login
> * instead of redirecting to /login, Beaker would need
... to return a 401, and Apache will fill in the WWW-Authenticate. At least, I think it will; I need to investigate.
A potentially lower-impact trick I used to do something similar for PulpDist is to define an alternative endpoint in Apache, with a different auth setup, for scripts to access.
I agree it would be preferable to just fix it properly, though.
(In reply to Dan Callaghan from comment #1)
> (In reply to Dan Callaghan from comment #0)
> > If my understanding of Apache's authentication is correct, a possible
> > implementation would be:
> > * remove "Require valid-user" from the Apache config
> > * move the auth stuff to apply to the entire app instead of just /login
> > * instead of redirecting to /login, Beaker would need
> ... to return a 401, and Apache will fill in the WWW-Authenticate. At least,
> I think it will, need to investigate.
I spent some more time on this today. Apache does *not* work this way. Returning 401 from a mod_wsgi handler does not automagically cause Apache to fill in any WWW-Authenticate header. Moreover, if there is no "Require valid-user" directive applying to a request, Apache will not process any authentication at all -- REMOTE_USER will be unset even if the request included fully valid authentication headers.
Another possibility I looked at was using ErrorDocument, inspired by some related (but not directly applicable) techniques described here:
If we set WSGIErrorOverride On, and then ErrorDocument 401 /login, that will cause Apache to internally redirect any 401 coming from the application back to /login which will do normal authentication handling. We could theoretically then make /login do an internal redirect back to whatever the original requested URL was if authentication is successful.
The problem with WSGIErrorOverride On is that it will discard all our error responses, which breaks the text/plain errors we use for showing proper messages for AJAX requests and bkr CLI. It also prevents us from ever returning *any* kind of useful error messages in future, such as JSON-formatted errors.
While poking around in mod_wsgi.c I noticed another little-known feature: the application can return a 200 response with a relative Location header, and mod_wsgi will perform an Apache internal redirect. That means the request will pass through all the proper Apache machinery, including authentication.
Using 200+Location gets us closer because it avoids all the problems with WSGIErrorOverride On. Instead of returning 401 when the user is not authenticated, the application would return 200 with Location pointing at /login to trigger an Apache internal redirect. Apache would then do the authentication handling and then /login in the application could do an internal redirect back to the right place if authentication is successful. This makes things quite messy on the server side, but would work nicely for the client.
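As a rough sketch of the application side of the 200+Location idea (the internal redirect itself is mod_wsgi behaviour as described above and is not something this snippet can demonstrate on its own):

```python
def app(environ, start_response):
    """WSGI app that hands unauthenticated requests back to Apache.

    Assumption: /login is the only path covered by Apache authentication,
    as in the current Beaker setup described in this report.
    """
    if environ.get("REMOTE_USER"):
        body = b"authenticated\n"
        start_response(
            "200 OK",
            [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
        )
        return [body]

    # Not authenticated: a 200 with a relative Location header asks
    # mod_wsgi for an internal redirect rather than a client-visible one.
    start_response("200 OK", [("Location", "/login")])
    return [b""]
```

How /login would learn the originally requested URL, so that it can redirect back after a successful login, is left open here.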
The downside of the 200+Location approach is that the internal redirects are always forced to GET, because the request body might have been consumed. That means the technique would not work for any POST/PATCH/PUT requests, which make up a large part of the API. So it would still not achieve the aim here, which is to support standard HTTP authentication for API requests.
I'm increasingly convinced that the only viable options for authentication in web apps are:
(1) protect the entire URL space consistently with authentication, and let Apache do it -- note this precludes ever having anonymous read-only access to anything in the app
(2) handle authentication entirely in the application itself and not Apache, using libraries like Flask-Kerberos, etc
The future seems to be tending towards option 2, with things like OpenShift where you load-balance a bunch of Gunicorn workers running your application. Apache is typically not even involved anymore.
Option 2 is really the only solution to this bug too, assuming that we want to continue offering read-only anonymous access to parts of Beaker such as the system info and distro info (and I think we do).
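To make option 2 a bit more concrete, here is a hedged sketch of a WSGI middleware that keeps anonymous read-only access while challenging everything else. The class name, the "Beaker" realm, the authenticate callback, and the choice to treat GET/HEAD/OPTIONS as the read-only set are all assumptions for illustration; a real implementation would validate Basic or Negotiate credentials inside authenticate.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}  # assumed to be the read-only set


class AuthForWrites:
    """Allow anonymous reads; require authentication for everything else."""

    def __init__(self, app, authenticate, realm="Beaker"):
        self.app = app
        # Hypothetical callable: environ -> username, or None if the
        # request carries no valid credentials.
        self.authenticate = authenticate
        self.realm = realm

    def __call__(self, environ, start_response):
        user = self.authenticate(environ)
        method = environ.get("REQUEST_METHOD", "GET")

        if user is None and method not in SAFE_METHODS:
            # The application itself emits the challenges, since Apache
            # will not add them for us.
            body = b"Authentication required\n"
            start_response(
                "401 Unauthorized",
                [
                    ("WWW-Authenticate", "Negotiate"),
                    ("WWW-Authenticate", 'Basic realm="%s"' % self.realm),
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]

        if user is not None:
            environ["REMOTE_USER"] = user
        return self.app(environ, start_response)
```

Because the 401 body is plain text, this stays compatible with the text/plain error messages used for AJAX requests and the bkr CLI.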