Description of problem:
If the Beaker server is down for an extended period (longer than visit.timeout, which defaults to 6 hours), beaker-provision's authentication cookie may expire. In that case beaker-provision will keep attempting to talk with the expired cookie and never figure out why it is failing. I think the other LC daemons are affected by the same problem but I'm not sure.

Workaround is to restart beaker-provision so that it authenticates afresh.

Version-Release number of selected component (if applicable):
0.16.1

How reproducible:
fairly easily

Steps to Reproduce:
1. Set visit.timeout to some small value on the Beaker server
2. Stop httpd on the Beaker server
3. Wait for longer than visit.timeout
4. Start httpd on the Beaker server

Actual results:
beaker-provision keeps trying to make HTTP requests but they all fail like this, because its identity cookie has expired:

Fault: <Fault 1: "<class 'bkr.server.identity.IdentityFailure'>: Anonymous access denied">

Expected results:
beaker-provision should recover once the server comes back.
There are a few things we can do here...

We should use a much longer timeout for cookies that are given to the LC daemons because they are basically trusted to be quite secure. Something like two weeks would be more suitable. We can make it as long as we like, there is no storage cost, because we don't have database-backed sessions anymore -- it's just a signed token. We could sign it for 10 years if we wanted. The tokens can still be invalidated by rolling over the secret key on the server side.

The other thing is that beaker-provision should notice when it needs to reauthenticate. Ideally we would be using standard HTTP authentication mechanisms which means we would get a 401 when we were no longer authenticated. In that case a sane client-side library would handle that for us (e.g. requests if we used that). However the Kobo-derived XMLRPC cruft we are using currently does not allow for any of that stuff. The lamer approach would be to just look for XML-RPC faults with a particular string (such as "Anonymous access denied") and trigger a reauthentication in that case.
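To illustrate why a signed token has no storage cost and can still be invalidated by rolling over the secret key, here is a minimal sketch. It is not Beaker's actual token format: the `user|expiry|signature` layout and the names `sign_token` and `verify_token` are made up for this example.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_token(secret: bytes, user: str, expires_at: int) -> str:
    """Build a token of the (hypothetical) form user|expiry|signature."""
    payload = "%s|%d" % (user, expires_at)
    sig = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return "%s|%s" % (payload, sig)


def verify_token(secret: bytes, token: str, now: Optional[int] = None) -> Optional[str]:
    """Return the user name if the token is valid and unexpired, else None."""
    if now is None:
        now = int(time.time())
    try:
        user, expires_text, sig = token.rsplit("|", 2)
        expires_at = int(expires_text)
    except ValueError:
        return None  # malformed token
    payload = "%s|%d" % (user, expires_at)
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison; the server keeps no per-session state.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None  # wrong key (e.g. after a key rollover) or tampered token
    if now >= expires_at:
        return None  # expired
    return user
```

The server only needs the secret key to check a token, so the lifetime can be two weeks or ten years at the same cost. Replacing the key makes every previously issued token fail verification.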
I have patches for those two things.

> We should use a much longer timeout for cookies that are given to the LC
> daemons because they are basically trusted to be quite secure. Something
> like two weeks would be more suitable. We can make it as long as we like,
> there is no storage cost, because we don't have database-backed sessions
> anymore -- it's just a signed token. We could sign it for 10 years if we
> wanted. The tokens can still be invalidated by rolling over the secret key
> on the server side.

https://gerrit.beaker-project.org/#/c/4881/

> The other thing is that beaker-provision should notice when it needs to
> reauthenticate. Ideally we would be using standard HTTP authentication
> mechanisms which means we would get a 401 when we were no longer
> authenticated. In that case a sane client-side library would handle that for
> us (e.g. requests if we used that). However the Kobo-derived XMLRPC cruft we
> are using currently does not allow for any of that stuff. The lamer approach
> would be to just look for XML-RPC faults with a particular string (such as
> "Anonymous access denied") and trigger a reauthentication in that case.

https://gerrit.beaker-project.org/#/c/4882
Beaker 23.0 has been released.
This is not fixed in 23.0. Actually the situation is now worse: beaker-provision claims to be polling for queued commands like normal but actually every request will fail silently, it never picks any new commands.

+        if 'Anonymous access denied' in fault.faultString:
+            # Trigger a reauthentication if the server is down longer than visit.timeout
+            # which defaults to 2 weeks. Then beaker-provision will recover
+            # once the server is back.
+            poller.hub.auth.renew_session()

The problem here is that the auth.renew_session call is actually asking "do I need to renew my session? True/False", it's not saying "renew my session". So in case the session expires, this will be returning False, but beaker-provision still never re-authenticates -- each iteration of the polling loop just fails and hits this code path again.

It becomes a bit more obvious if you put in a log message inside that if block:

Aug 23 15:13:44 lab beaker-provision[29199]: bkr.common.xmlrpc WARNING XML-RPC connection to beaker.dcallagh.beakerdevs.lab.eng.bne.redhat.com failed: Connection refused, 1 retry left
Aug 23 15:14:15 lab beaker-provision[29199]: bkr.labcontroller.provision DEBUG Session expired, re-authenticating
Aug 23 15:14:35 lab beaker-provision[29199]: bkr.labcontroller.provision DEBUG Polling for queued commands
Aug 23 15:14:36 lab beaker-provision[29199]: bkr.labcontroller.provision DEBUG Session expired, re-authenticating
Aug 23 15:14:57 lab beaker-provision[29199]: bkr.labcontroller.provision DEBUG Polling for queued commands
Aug 23 15:14:57 lab beaker-provision[29199]: bkr.labcontroller.provision DEBUG Session expired, re-authenticating
[...]

beaker-provision needs to call the login() method to reauthenticate, not renew_session.

I was looking at this stuff (renew_session in particular) as part of bug 1368509 and I actually think it really just needs to die in a fire...
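A rough sketch of the intended behaviour, for illustration only: catch the fault, actually log in again, then retry the call once. This is not the Beaker patch. `call_with_reauth` and its `call` and `login` arguments are hypothetical stand-ins for the polling request and the hub's login() method. It uses Python 3's `xmlrpc.client`; the equivalent module on Python 2 is `xmlrpclib`.

```python
import xmlrpc.client

# Substring of the fault string the server sends once the cookie has expired.
AUTH_FAULT_MARKER = "Anonymous access denied"


def call_with_reauth(call, login):
    """Run call(); on an expired-session fault, log in again and retry once.

    `call` and `login` are hypothetical zero-argument callables supplied by
    the caller. Any other fault is re-raised unchanged.
    """
    try:
        return call()
    except xmlrpc.client.Fault as fault:
        if AUTH_FAULT_MARKER not in fault.faultString:
            raise
        # Actually re-authenticate, rather than only asking whether a
        # renewal is needed.
        login()
        return call()
```

Retrying only once keeps a persistently failing login from turning into a tight loop; if the second attempt also faults, the exception propagates to the normal polling-loop error handling.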
Hmm I really should have filed this as a separate bug instead of messing with this already-closed one. See bug 1369305.
(In reply to Dan Callaghan from comment #8)
> This is not fixed in 23.0. Actually the situation is now worse:
> beaker-provision claims to be polling for queued commands like normal but
> actually every request will fail silently, it never picks any new commands.
>
> +        if 'Anonymous access denied' in fault.faultString:
> +            # Trigger a reauthentication if the server is down longer
> than visit.timeout
> +            # which defaults to 2 weeks. Then beaker-provision will
> recover
> +            # once the server is back.
> +            poller.hub.auth.renew_session()
>
> The problem here is that the auth.renew_session call is actually asking "do
> I need to renew my session? True/False", it's not saying "renew my session".
> So in case the session expires, this will be returning False, but
> beaker-provision still never re-authenticates -- each iteration of the
> polling loop just fails and hits this code path again.

Yes, renew_session is quite confusing. If Python supported ? in method names like Ruby does, that would be great :-P