Bug 1388814

Summary: Weekly sync plan does not work
Product: Red Hat Satellite Reporter: Sasha Segal <ssegal>
Component: Sync Plans Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE QA Contact: Katello QA List <katello-qa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2.0 CC: abraverm, bbuckingham, bkearney, dkliban, mhrivnak, ssegal
Target Milestone: Unspecified   
Target Release: Unused   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-13 15:59:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1338516    
Attachments:
Description Flags
Satellite logs none
Screenshot none

Description Sasha Segal 2016-10-26 08:41:18 UTC
Created attachment 1214207 [details]
Satellite logs

The product has a weekly sync plan.
The sync plan is not running weekly.

Comment 1 Brad Buckingham 2016-10-31 19:19:37 UTC
Hello Sasha, 

Can you please provide the detailed steps to reproduce the issue?

Are there any errors observed in the UI or logs (e.g. /var/log/messages, /var/log/foreman/production.log, /var/log/httpd/*)?

Comment 2 Sasha Segal 2016-11-01 08:36:45 UTC
Hi,

There are no errors; the sync plan is simply not executed. Please see the attached screenshot. The issue persists all the time, so there are no specific steps to reproduce. I can give you access to the server if you want to take a look. My IRC nick is ssegal.

Comment 3 Sasha Segal 2016-11-01 08:37:18 UTC
Created attachment 1216004 [details]
Screenshot

Comment 4 Brad Buckingham 2016-11-03 16:39:14 UTC
Michael,

I am not seeing any errors on the Katello side. Can you have someone from Pulp assist? I want to confirm whether the sync schedules are being created in Pulp correctly or whether there are other errors.

Comment 5 Brad Buckingham 2016-11-03 17:53:48 UTC
Dennis Kliban from the pulp team is helping to investigate.

Comment 6 Dennis Kliban 2016-11-03 19:36:41 UTC
It looks like the celerybeat process got stuck and then it was restarted today. After the restart, celerybeat dispatched a lot of tasks that it was supposed to dispatch while it was stuck. It looks like the syncs are running now. Since there were a lot of them dispatched at once, it is taking a while for Pulp to catch up.

The next scheduled sync is supposed to occur on Nov 8. If it does not happen, please don't restart the process so I can take a look at it to confirm that it is stuck due to a known issue with Qpid client libraries.

Comment 8 Brad Buckingham 2016-11-10 14:51:22 UTC
This may be a duplicate of bug 1377195.

Comment 9 Dennis Kliban 2016-11-10 21:31:51 UTC
I examined the box and have determined that pulp_worker-0 has gotten into a hung state. This is a known issue with Qpid[0]. The Qpid team is working on a fix.

In your case, worker-0 received the task from the scheduler, but then got stuck when trying to dispatch the actual tasks that perform the sync and publish. As a result, the database says that the scheduler fired off a message to do the sync on Nov. 8, but a task_status document was never created because the sync task was never dispatched.

I can only recommend that you stop the pulp_workers service(s). Verify using ps that all pulp_worker-X processes are stopped. Then start the service again.
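
The stop, verify, start sequence above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the service is managed through `systemctl` under the name `pulp_workers`, and that worker processes can be recognised in `ps` output by the pattern `pulp_worker-<N>`. The helper names `find_worker_pids` and `restart_workers` are hypothetical. On a given installation the worker command line may look different, so check what `ps` actually prints and adjust the pattern before relying on it.

```python
import re
import subprocess

# Assumption: worker processes can be recognised by this pattern in the
# `ps` command line. Check real `ps` output first and adjust as needed.
WORKER_PATTERN = r"pulp_worker-\d+"


def find_worker_pids(ps_output, pattern=WORKER_PATTERN):
    """Return PIDs from `ps -eo pid,args` output whose command matches pattern."""
    regex = re.compile(pattern)
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line without a command column.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        if regex.search(parts[1]):
            pids.append(int(parts[0]))
    return pids


def restart_workers(service="pulp_workers"):
    """Stop the service, confirm no worker is left running, then start it."""
    # Assumption: the service is controlled via systemctl under this name.
    subprocess.run(["systemctl", "stop", service], check=True)
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    leftover = find_worker_pids(listing)
    if leftover:
        # A hung worker may survive the stop; do not start on top of it.
        raise RuntimeError("workers still running: %s" % leftover)
    subprocess.run(["systemctl", "start", service], check=True)
```

Calling `restart_workers()` as root would run the whole sequence. The sketch deliberately refuses to start the service while a matching process is still alive, since a hung worker left behind is the situation this step is meant to clear.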


[0] https://issues.apache.org/jira/browse/QPID-7317

Comment 10 Bryan Kearney 2016-12-13 15:59:51 UTC
The root cause of this, as discussed in comment 9, is being tracked by https://bugzilla.redhat.com/show_bug.cgi?id=1377195. I am closing this out as a dupe and we will track the fix there.

*** This bug has been marked as a duplicate of bug 1377195 ***