Description of problem:

Installed rlx-1-18 with the 4/24 ISO on 4/25. Everything seemed fine until 05/03/2009 04:03 AM. At that point Taskomatic sent the following traceback email (note the time: it matches the cron job time):

-------- Original Message --------
Subject: WEB TRACEBACK from rlx-1-18.rhndev.redhat.com
Date: Sun, 3 May 2009 04:03:02 -0400
From: RHN Satellite <dev-null>
To: cperry

com.redhat.rhn.manager.kickstart.cobbler.NoCobblerTokenException: We had an error trying to login.
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerLoginCommand.login(CobblerLoginCommand.java:57)
        at com.redhat.rhn.frontend.integration.IntegrationService.authorize(IntegrationService.java:113)
        at com.redhat.rhn.frontend.integration.IntegrationService.getAuthToken(IntegrationService.java:73)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand.<init>(CobblerCommand.java:72)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerDistroSyncCommand.<init>(CobblerDistroSyncCommand.java:49)
        at com.redhat.rhn.taskomatic.task.CobblerSyncTask.execute(CobblerSyncTask.java:83)
        at com.redhat.rhn.taskomatic.task.SingleThreadedTestableTask.execute(SingleThreadedTestableTask.java:54)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: redstone.xmlrpc.XmlRpcFault: cobbler.cexceptions.CX:'login failed: taskomatic_user'
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:443)
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376)
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:165)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerXMLRPCHelper.invokeMethod(CobblerXMLRPCHelper.java:69)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerLoginCommand.login(CobblerLoginCommand.java:52)
        ... 8 more

After this event, Taskomatic generated the following traceback email every 10 minutes:

-------- Original Message --------
Subject: WEB TRACEBACK from rlx-1-18.rhndev.redhat.com
Date: Sun, 3 May 2009 04:10:01 -0400
From: RHN Satellite <dev-null>
To: cperry

com.redhat.rhn.manager.kickstart.cobbler.NoCobblerTokenException: We had an error trying to login.
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerLoginCommand.login(CobblerLoginCommand.java:57)
        at com.redhat.rhn.frontend.integration.IntegrationService.authorize(IntegrationService.java:113)
        at com.redhat.rhn.frontend.integration.IntegrationService.getAuthToken(IntegrationService.java:73)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerXMLRPCHelper.getConnection(CobblerXMLRPCHelper.java:92)
        at com.redhat.rhn.taskomatic.task.KickstartFileSyncTask.execute(KickstartFileSyncTask.java:66)
        at com.redhat.rhn.taskomatic.task.SingleThreadedTestableTask.execute(SingleThreadedTestableTask.java:54)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: redstone.xmlrpc.XmlRpcFault: cobbler.cexceptions.CX:'login failed: taskomatic_user'
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:443)
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376)
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:165)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerXMLRPCHelper.invokeMethod(CobblerXMLRPCHelper.java:69)
        at com.redhat.rhn.manager.kickstart.cobbler.CobblerLoginCommand.login(CobblerLoginCommand.java:52)
        ... 7 more

Version-Release number of selected component (if applicable):
Satellite-5.3.0-RHEL5-re20090424.1-i386-embedded-oracle.iso

How reproducible:
I bet - VERY :/

Steps to Reproduce:
1. Install, let it run; cron kicks in and does something; Taskomatic errors out and is unable to re-authenticate.

Actual results:
Traceback emails every 10 minutes.

Expected results:
No traceback.
Additional info:

Gut feelings:
1) Cron was doing something that caused an error, and Taskomatic for some reason was never able to recover gracefully. I have *NOT* restarted anything on rlx-1-18 and am leaving it alone for devel review.
or
2) Some authenticated session token expired after a week (maybe a Taskomatic or cron job cleared the session token), and Taskomatic was then unable to detect this and re-negotiate correctly with cobblerd.
or
3) Something else: maybe cobblerd just has a habit of dying on a weekly basis and needing restarts.
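To make gut feeling 2 concrete, here is a minimal sketch of what "detect an expired token and re-negotiate" could look like on the client side. It is not the Satellite/Taskomatic code and not Cobbler's API: FakeServer, Client, InvalidToken and LoginFailed are hypothetical names invented for illustration, and the server is an in-memory stand-in.

```python
class InvalidToken(Exception):
    """Raised by the hypothetical server when a cached token is rejected."""


class LoginFailed(Exception):
    """Raised by the hypothetical server when credentials are rejected."""


class FakeServer:
    """In-memory stand-in for a token-based service (assumption, not Cobbler)."""

    def __init__(self, accept_logins=True):
        self.accept_logins = accept_logins
        self.valid_tokens = set()
        self.counter = 0

    def login(self, user, password):
        if not self.accept_logins:
            raise LoginFailed(user)
        self.counter += 1
        token = "token-%d" % self.counter
        self.valid_tokens.add(token)
        return token

    def call(self, token, method):
        if token not in self.valid_tokens:
            raise InvalidToken(token)
        return "%s ok" % method


class Client:
    """Caches a token and logs in again once if the server rejects it."""

    def __init__(self, server, user, password):
        self.server = server
        self.user = user
        self.password = password
        self.token = None

    def invoke(self, method):
        if self.token is None:
            self.token = self.server.login(self.user, self.password)
        try:
            return self.server.call(self.token, method)
        except InvalidToken:
            # Cached token expired or was cleared: drop it and log in again.
            # If this login fails too, LoginFailed propagates to the caller
            # and the token stays unset, so the next run retries the login.
            self.token = None
            self.token = self.server.login(self.user, self.password)
            return self.server.call(self.token, method)
```

With a client like this, an expired token alone would be recovered transparently. Repeated errors would only appear if the fresh login itself keeps being rejected, which is the pattern the traceback emails show.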
From /var/log/cobblerd/cobblerd.log.1

2009-04-26 04:03:01,866 - api - authenticate; ['taskomatic_user', True]
2009-04-26 04:03:01,869 - api - login succeeded; user(taskomatic_user)

From /var/log/cobblerd/cobblerd.log

2009-05-03 04:03:00,064 - api - invalid token; user(???)
2009-05-03 04:03:00,065 - api - Exception occured: cobbler.cexceptions.CX
2009-05-03 04:03:00,065 - api - Exception value: 'invalid token: 9QJ7RWy+z6jyJ26OUBlMOibg8G1HatOTqA=='
2009-05-03 04:03:00,087 - api - Exception Info:
  File "/usr/lib/python2.4/site-packages/cobbler/remote.py", line 1567, in _dispatch
    return method_handle(*params)
  File "/usr/lib/python2.4/site-packages/cobbler/remote.py", line 1060, in token_check
    self.__validate_token(token)
  File "/usr/lib/python2.4/site-packages/cobbler/remote.py", line 968, in __validate_token
    raise CX(_("invalid token: %s" % token))
2009-05-03 04:03:00,094 - api - login attempt; user(taskomatic_user)
2009-05-03 04:03:00,491 - api - authenticate; ['taskomatic_user', False]
2009-05-03 04:03:00,492 - api - login failed; user(taskomatic_user)
2009-05-03 04:03:00,493 - api - Exception occured: cobbler.cexceptions.CX
2009-05-03 04:03:00,494 - api - Exception value: 'login failed: taskomatic_user'
2009-05-03 04:03:00,494 - api - Exception Info:
  File "/usr/lib/python2.4/site-packages/cobbler/remote.py", line 1567, in _dispatch
    return method_handle(*params)
  File "/usr/lib/python2.4/site-packages/cobbler/remote.py", line 1033, in login
    raise CX(_("login failed: %s") % login_user)

So Taskomatic seems to try its old session token first, which fails; it then attempts a fresh user authentication, which fails as well.
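For anyone who wants to pull the same sequence out of a longer cobblerd.log, the following is a rough sketch that extracts only the authentication outcomes. The regular expression is an assumption derived from the few lines quoted above, not from any documented log format, and auth_events is a made-up helper name.

```python
import re

# Pattern inferred from the log lines quoted in this report (assumption).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} - api - "
    r"(?P<event>login succeeded|login failed|invalid token); "
    r"user\((?P<user>[^)]*)\)$"
)

# Lines copied from the excerpts above.
SAMPLE = [
    "2009-04-26 04:03:01,866 - api - authenticate; ['taskomatic_user', True]",
    "2009-04-26 04:03:01,869 - api - login succeeded; user(taskomatic_user)",
    "2009-05-03 04:03:00,064 - api - invalid token; user(???)",
    "2009-05-03 04:03:00,094 - api - login attempt; user(taskomatic_user)",
    "2009-05-03 04:03:00,492 - api - login failed; user(taskomatic_user)",
]


def auth_events(lines):
    """Return (timestamp, event, user) for each token/login outcome line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            events.append(
                (match.group("ts"), match.group("event"), match.group("user"))
            )
    return events


if __name__ == "__main__":
    for event in auth_events(SAMPLE):
        print(event)
```

Lines such as "authenticate" and "login attempt" are deliberately skipped, so what remains is the week-apart pair of a successful login followed by an invalid token and a failed login.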
Cliff,

After looking at this, it seems tomcat is simply dead. This stops cobbler from being able to authenticate and causes taskomatic to spew errors. All of the tracebacks above look like they are from taskomatic (none from tomcat). When I run /etc/init.d/tomcat5 status I get:

[root@rlx-1-18 ~]# /etc/init.d/tomcat5 status
lock file found but no process running for pid 16615

"ps aux" shows no tomcat process either. So tomcat died hard for sure, and I don't see any indication as to why. The only thing I know to do is restart tomcat and see if it happens again. Your thoughts, Cliff?
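The "lock file found but no process running" state can also be checked by hand. Below is a small POSIX-only sketch of that kind of check; it is not what the tomcat5 init script actually runs, and pid_is_running and classify are hypothetical helper names. The pid text would come from whatever pid file the service writes, whose path is not given in this report.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def classify(pid_text, is_running=pid_is_running):
    """Classify a service from its pid file contents.

    pid_text is the file contents, or None if the file does not exist.
    Returns "stopped", "running", or "stale" (file present, no live process).
    """
    if pid_text is None:
        return "stopped"
    try:
        pid = int(pid_text.strip())
    except ValueError:
        return "stale"
    return "running" if is_running(pid) else "stale"
```

A "stale" result corresponds to what the status command reported here: the bookkeeping file survived but the process did not, which points to an abnormal exit rather than a clean shutdown.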
After talking with Cliff, we're going to restart taskomatic and see if it happens again.
So far, did not happen again weekend of May 16/17
Still no replication. Going to close this though.