Bug 1421674

Summary: Repodata stop being built, "Maximum number of workers already put"
Product: Red Hat Satellite 5 Reporter: Lukáš Hellebrandt <lhellebr>
Component: ServerAssignee: Michael Mráka <mmraka>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Red Hat Satellite QA List <satqe-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 570CC: dyordano, ktordeur, mmraka, tlestach, wayne.johnson, wpinheir
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-20 12:40:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1450111, 1450940    

Description Lukáš Hellebrandt 2017-02-13 12:19:11 UTC
Description of problem:
Sometimes, repodata stop being built in channel-repodata-bunch. Tail of log:

INFO   | jvm 1    | 2016/12/04 13:56:00 | 2016-12-04 13:56:00,568 [DefaultQuartzScheduler_Worker-6] INFO  com.redhat.rhn.taskomatic.task.ChannelRepodata - In the queue: 1
INFO   | jvm 1    | 2016/12/04 13:56:00 | 2016-12-04 13:56:00,569 [DefaultQuartzScheduler_Worker-9] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 2
INFO   | jvm 1    | 2016/12/04 13:56:00 | 2016-12-04 13:56:00,612 [Thread-27703] INFO  com.redhat.rhn.taskomatic.task.repomd.RepositoryWriter - Removing yum-exit-code-channel_sun04dec2016_18_49_57_31807
INFO   | jvm 1    | 2016/12/04 13:57:07 | 2016-12-04 13:57:05,623 [DefaultQuartzScheduler_Worker-2] INFO  com.redhat.rhn.taskomatic.task.ChannelRepodata - In the queue: 1
INFO   | jvm 1    | 2016/12/04 13:57:07 | 2016-12-04 13:57:05,350 [DefaultQuartzScheduler_Worker-5] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 4
INFO   | jvm 1    | 2016/12/04 13:58:00 | 2016-12-04 13:58:00,193 [DefaultQuartzScheduler_Worker-3] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 13:58:00 | 2016-12-04 13:58:00,582 [DefaultQuartzScheduler_Worker-10] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 1
INFO   | jvm 1    | 2016/12/04 13:59:00 | 2016-12-04 13:59:00,123 [DefaultQuartzScheduler_Worker-9] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 13:59:00 | 2016-12-04 13:59:00,237 [DefaultQuartzScheduler_Worker-5] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 1
INFO   | jvm 1    | 2016/12/04 14:00:00 | 2016-12-04 14:00:00,173 [DefaultQuartzScheduler_Worker-1] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 14:00:02 | 2016-12-04 14:00:02,184 [DefaultQuartzScheduler_Worker-5] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 3
INFO   | jvm 1    | 2016/12/04 14:01:00 | 2016-12-04 14:01:00,073 [DefaultQuartzScheduler_Worker-1] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 14:01:00 | 2016-12-04 14:01:00,241 [DefaultQuartzScheduler_Worker-2] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 6
INFO   | jvm 1    | 2016/12/04 14:01:41 | 2016-12-04 14:01:40,966 [DefaultQuartzScheduler_Worker-3] ERROR com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand - Error calling cobbler.
INFO   | jvm 1    | 2016/12/04 14:01:41 | redstone.xmlrpc.XmlRpcFault: <class 'cobbler.cexceptions.CX'>:'internal error, unknown profile name ks-qemu-auto-2016_12_04_1314_1044:1:RedHatSatTeamQA'
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:443)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:165)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerXMLRPCHelper.invokeMethod(CobblerXMLRPCHelper.java:69)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand.invokeXMLRPC(CobblerCommand.java:100)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand.invokeXMLRPC(CobblerCommand.java:115)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand.syncProfileToSpacewalk(CobblerProfileSyncCommand.java:153)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand.store(CobblerProfileSyncCommand.java:109)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.taskomatic.task.CobblerSyncTask.execute(CobblerSyncTask.java:123)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.taskomatic.task.RhnJavaJob.execute(RhnJavaJob.java:88)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at com.redhat.rhn.taskomatic.TaskoJob.execute(TaskoJob.java:170)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
INFO   | jvm 1    | 2016/12/04 14:01:41 |       at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
INFO   | jvm 1    | 2016/12/04 14:01:41 | 2016-12-04 14:01:41,006 [DefaultQuartzScheduler_Worker-3] ERROR com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand - Cobbler doesn't know about this profile any more!
INFO   | jvm 1    | 2016/12/04 14:02:00 | 2016-12-04 14:02:00,070 [DefaultQuartzScheduler_Worker-6] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 14:02:23 | 2016-12-04 14:02:23,103 [DefaultQuartzScheduler_Worker-3] ERROR com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand - Error calling cobbler.
INFO   | jvm 1    | 2016/12/04 14:02:23 | redstone.xmlrpc.XmlRpcFault: <class 'cobbler.cexceptions.CX'>:'internal error, unknown profile name ks-qemu-auto-2016_12_04_1334_7604:1:RedHatSatTeamQA'
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:443)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:165)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerXMLRPCHelper.invokeMethod(CobblerXMLRPCHelper.java:69)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand.invokeXMLRPC(CobblerCommand.java:100)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerCommand.invokeXMLRPC(CobblerCommand.java:115)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand.syncProfileToSpacewalk(CobblerProfileSyncCommand.java:153)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand.store(CobblerProfileSyncCommand.java:109)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.taskomatic.task.CobblerSyncTask.execute(CobblerSyncTask.java:123)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.taskomatic.task.RhnJavaJob.execute(RhnJavaJob.java:88)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at com.redhat.rhn.taskomatic.TaskoJob.execute(TaskoJob.java:170)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
INFO   | jvm 1    | 2016/12/04 14:02:23 |       at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
INFO   | jvm 1    | 2016/12/04 14:02:23 | 2016-12-04 14:02:23,103 [DefaultQuartzScheduler_Worker-3] ERROR com.redhat.rhn.manager.kickstart.cobbler.CobblerProfileSyncCommand - Cobbler doesn't know about this profile any more!
INFO   | jvm 1    | 2016/12/04 14:03:00 | 2016-12-04 14:03:00,058 [DefaultQuartzScheduler_Worker-7] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 14:03:00 | 2016-12-04 14:03:00,204 [DefaultQuartzScheduler_Worker-2] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 1
INFO   | jvm 1    | 2016/12/04 14:03:58 | 2016-12-04 14:03:58,254 [DefaultQuartzScheduler_Worker-9] INFO  com.redhat.rhn.taskomatic.task.ErrataCacheTask - In the queue: 2
INFO   | jvm 1    | 2016/12/04 14:04:00 | 2016-12-04 14:04:00,056 [DefaultQuartzScheduler_Worker-6] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.
INFO   | jvm 1    | 2016/12/04 14:05:00 | 2016-12-04 14:05:00,195 [DefaultQuartzScheduler_Worker-7] WARN  com.redhat.rhn.taskomatic.task.ChannelRepodata - Maximum number of workers already put ... skipping.

This "Maximum number of workers" message is output to log every minute and repodata are not being built anymore.

How reproducible:
Sometimes, happens regularly on our stable Satellite.

Actual results:
Repodata not being built, the above warnings logged every minute. (The warnings are fine, they can appear sometimes.)

Expected results:
Repodata built correctly.

Additional info:
This is a bug created as a follow-up of bug 1240597.

Comment 2 Tomas Lestach 2017-09-18 08:40:51 UTC
Actually I do not think this is a bug. The message about Maximum number of workers says, repodata are being built with the maximum number of threads (as configured) and new repodata generation requests are still waiting to be processed.
What is the problem?

Comment 3 Jan Dobes 2017-09-19 08:48:57 UTC
Problem is that taskomatic thinks there is a worker (or workers) working on repodata but there isn't any. Worker counter seems corrupted after certain time and events. We were able to reproduce it but never found the bug in code.

Comment 4 Tomas Lestach 2017-09-19 14:40:26 UTC
Thanks. Are there any special reproducer steps?