If there is an error connecting to the database during application startup (even a transient error like a hostname lookup failure) the VisitIdentity extension will fail to start: 2012-08-29 08:24:46,024 turbogears.startup ERROR Error starting TurboGears extension 'identity = turbogears.identity.visitor': (OperationalError) (2005, "Unknown MySQL server host 'galangal.usersys.redhat.com' (1)") None None Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/turbogears/startup.py", line 298, in startTurboGears ext.start_extension() File "/usr/lib/python2.6/site-packages/turbogears/identity/visitor.py", line 45, in start_extension create_extension_model() File "/usr/lib/python2.6/site-packages/turbogears/identity/visitor.py", line 65, in create_extension_model provider.create_provider_model() File "/usr/lib/python2.6/site-packages/turbogears/identity/saprovider.py", line 179, in create_provider_model class_mapper(user_class).local_table.create(checkfirst=True) File "/usr/lib64/python2.6/site-packages/sqlalchemy/schema.py", line 477, in create bind.create(self, checkfirst=checkfirst) File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1698, in create connection=connection, **kwargs) File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1735, in _run_visitor conn = self.contextual_connect(close_with_result=False) File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1829, in contextual_connect self.pool.connect(), File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 182, in connect return _ConnectionFairy(self).checkout() File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 369, in __init__ rec = self._connection_record = pool.get() File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 213, in get return self.do_get() File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 734, in do_get con = self.create_connection() File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 147, in create_connection return _ConnectionRecord(self) File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 253, in __init__ self.connection = self.__connect() File "/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py", line 319, in __connect connection = self.__pool._creator() File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/strategies.py", line 82, in connect return dialect.connect(*cargs, **cparams) File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py", line 249, in connect return self.dbapi.connect(*cargs, **cparams) File "/usr/lib64/python2.6/site-packages/MySQLdb/__init__.py", line 81, in Connect return Connection(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 187, in __init__ super(Connection, self).__init__(*args, **kwargs2) OperationalError: (OperationalError) (2005, "Unknown MySQL server host 'galangal.usersys.redhat.com' (1)") None None The whole application is then unusable and will never recover. Most requests will fail like this: 2012-08-29 08:33:52,807 cherrypy.msg INFO HTTP: Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/CherryPy-2.3.0-py2.6.egg/cherrypy/_cphttptools.py", line 121, in _run self.main() File "/usr/lib/python2.6/site-packages/CherryPy-2.3.0-py2.6.egg/cherrypy/_cphttptools.py", line 264, in main body = page_handler(*virtual_path, **self.params) File "<string>", line 3, in index ... File "/usr/lib/python2.6/site-packages/bkr/server/controllers.py", line 367, in index return self._systems(systems=System.all(identity.current.user), File "/usr/lib/python2.6/site-packages/turbogears/identity/base.py", line 167, in __getattr__ identity = self.identity() File "/usr/lib/python2.6/site-packages/turbogears/identity/base.py", line 161, in identity raise IdentityManagementNotEnabledException() IdentityManagementNotEnabledException: An attempt was made to use a facility of the TurboGears Identity Management framework, but identity management hasn't been enabled in the config file [via identity.on]. Bearing in mind that in a mod_wsgi deployment new worker processes are started on demand and recycled every N requests, this is quite a serious problem, because it means a single transient failure at the right moment can leave a worker process unusable for the next N (1000 or so) requests.
To fix this we will need to patch TurboGears not to create its visit tables on startup. We have beaker-init for that anyway. (Or we could stop using TurboGears' visit tracking...)
On Gerrit: http://gerrit.beaker-project.org/1339
Beaker 0.9.4 has been released.
All, This issue was reproduced here: [] https://beaker.engineering.redhat.com/jobs/339147 http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2012/11/3391/339147/717859/console.log <-SNIP-> 2012-11-25 19:00:41,899 backend __on_error: ERROR --- ERROR: Waiting for result_id from LC for given id (9f68e407-e899-43e7-ae9f-7ecadae5ec7a). at [' File "/usr/bin/beah-beaker-backend", line 9, in <module>\n load_entry_point(\'beah==0.6.41.dev201211060159\', \'console_scripts\', \'beah-beaker-backend\')()\n', ' File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 2007, in main\n debug.runcall(reactor.run)\n', ' File "/usr/lib/python2.6/site-packages/beah/core/debug.py", line 11, in runcall\n a_callable(*args, **kwargs)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/base.py", line 1128, in run\n self.mainLoop()\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/base.py", line 1140, in mainLoop\n self.doIteration(t)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/selectreactor.py", line 140, in doSelect\n _logrun(selectable, _drdw, selectable, method, dict)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/python/log.py", line 84, in callWithLogger\n return callWithContext({"system": lp}, func, *args, **kw)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/python/log.py", line 69, in callWithContext\n return context.call({ILogContext: newCtx}, func, *args, **kw)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/python/context.py", line 59, in callWithContext\n return self.currentContext().callWithContext(ctx, func, *args, **kw)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/python/context.py", line 37, in callWithContext\n return func(*args,**kw)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/selectreactor.py", line 146, in _doReadOrWrite\n why = getattr(selectable, method)()\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/tcp.py", line 463, in doRead\n return self.protocol.dataReceived(data)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/protocols/basic.py", line 239, in dataReceived\n return self.rawDataReceived(data)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/web/http.py", line 467, in rawDataReceived\n self.handleResponseEnd()\n', ' File "/usr/lib64/python2.6/site-packages/twisted/web/http.py", line 430, in handleResponseEnd\n self.handleResponse(b)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/web/xmlrpc.py", line 279, in handleResponse\n self.factory.parseResponse(contents)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/web/xmlrpc.py", line 310, in parseResponse\n deferred.errback(failure.Failure())\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 269, in errback\n self._startRunCallbacks(fail)\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 312, in _startRunCallbacks\n self._runCallbacks()\n', ' File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 328, in _runCallbacks\n self.result = callback(self.result, *args, **kw)\n', ' File "/usr/lib/python2.6/site-packages/beah/wires/internals/repeatingproxy.py", line 302, in on_error\n self.send_next()\n', ' File "/usr/lib/python2.6/site-packages/beah/wires/internals/repeatingproxy.py", line 325, in send_next\n self.on_idle()\n', ' File "/usr/lib/python2.6/site-packages/beah/core/backends.py", line 105, in set_idle\n self._next_evt()\n', ' File "/usr/lib/python2.6/site-packages/beah/core/backends.py", line 139, in _next_evt\n ExtBackend.proc_evt(self, evt, **flags)\n', ' File "/usr/lib/python2.6/site-packages/beah/core/backends.py", line 56, in proc_evt\n answ = BasicBackend.proc_evt(self, evt, **flags)\n', ' File "/usr/lib/python2.6/site-packages/beah/core/backends.py", line 36, in proc_evt\n answ = f(evt)\n', ' File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 1776, in proc_evt_relation\n result.attach(evt.task.file(evt.arg(\'id2\')), attach_as=evt.arg(\'title2\'))\n', ' File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 1273, in attach\n self.log_error("Waiting for result_id from LC for given id (%s)." % self.id)\n', ' File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 802, in log_error\n self.backend().on_error(message)\n', ' File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 1626, in on_error\n self.__on_error("ERROR", msg, traceback.format_stack(), *args, **kwargs)\n'] 2012-11-25 19:00:42,510 backend simple_recipe: ERROR Encoutnered problem while running task '9367077'. Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 535, in simple_recipe result = task.getResult() File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 584, in getResult self.result.raiseException() File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 605, in _deferGenerator result = g.next() File "/usr/lib/python2.6/site-packages/beah/backends/beakerlc.py", line 605, in _run_task thingy.getResult() File "/usr/lib64/python2.6/site-packages/twisted/internet/defer.py", line 584, in getResult self.result.raiseException() File "/usr/lib64/python2.6/site-packages/twisted/web/xmlrpc.py", line 307, in parseResponse response = xmlrpclib.loads(contents)[0][0] File "/usr/lib64/python2.6/xmlrpclib.py", line 1184, in loads return u.close(), u.getmethodname() File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close raise Fault(**self._stack[0]) Fault: <Fault 1: "<class 'turbogears.identity.exceptions.IdentityManagementNotEnabledException'>:An attempt was made to use a facility of the TurboGears Identity Management framework, but identity management hasn't been enabled in the config file [via identity.on]."> <-SNIP-> -pbunyan
In this case the error was a different one, with the same result (TG identity failed to start). From /var/log/beaker/server-debug.log: 2012-11-25 19:00:40,519 turbogears.identity INFO Identity starting 2012-11-25 19:00:40,519 root ERROR tg.utils: Could not import ldapsa because the class was not found Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/turbogears/util.py", line 512, in load_class return getattr(module, classname) AttributeError: 'module' object has no attribute 'ldapsa' 2012-11-25 19:00:40,520 turbogears.startup ERROR Error starting TurboGears extension 'identity = turbogears.identity.visitor': Unknown Identity configuration error Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/turbogears/startup.py", line 298, in startTurboGears ext.start_extension() File "/usr/lib/python2.6/site-packages/turbogears/identity/visitor.py", line 45, in start_extension create_extension_model() File "/usr/lib/python2.6/site-packages/turbogears/identity/visitor.py", line 64, in create_extension_model provider = create_default_provider() File "/usr/lib/python2.6/site-packages/turbogears/identity/base.py", line 68, in create_default_provider "IdentityProvider plugin missing: %s" % provider_plugin) IdentityConfigurationException: Unknown Identity configuration error
(In reply to comment #7) This happened right around the time of the 0.10.3 hotfix upgrade. TurboGears uses setuptools entry points to look up its identity plugins. So I'm guessing if the application happens to start up at the exact moment that rpm is upgrading the beaker-server package (and replacing the setuptools junk that lives in /usr/lib/python2.6/site-packages/bkr.server-*.egg-info) it might fail to find the right entry point. TurboGears also supports specifying the identity plugin by module + class name, which is actually what we have in bkr/server/config/app.cfg. It's not even supposed to be specified in server.cfg at all, it's been left in there accidentally. So the fix is just to remove the identity.provider setting from server.cfg.
Re-closing this bug as comments 6-8 are a separate bug (cloned as bug 880424).