Bug 1609928
Summary: Pulp monthly maintenance not being run

| Field | Value | Field | Value |
|---|---|---|---|
| Product | Red Hat Satellite | Reporter | Mike McCune <mmccune> |
| Component | Pulp | Assignee | satellite6-bugs <satellite6-bugs> |
| Status | CLOSED ERRATA | QA Contact | jcallaha |
| Severity | high | Docs Contact | |
| Priority | unspecified | | |
| Version | 6.3.2 | CC | andrew.schofield, bmbouter, cdonnell, daviddavis, dkliban, fgarciad, ggainey, ipanova, jentrena, kabbott, pcreech, rchan, rjerrido, ttereshc |
| Target Milestone | Unspecified | Keywords | PrioBumpField, Triaged |
| Target Release | Unused | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | pulp-2.13.4.11-2, pulp-2.13.4.12-1 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | Clones | 1612964, 1628782 (view as bug list) |
| Last Closed | 2018-08-22 20:07:12 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | | Bug Blocks | 1628782 |
Description (Mike McCune, 2018-07-30 20:42:29 UTC)
You can kick off the call from your terminal:

```
# celery -A pulp.server.async.app call pulp.server.maintenance.monthly.queue_monthly_maintenance
```

Note: this can be very slow on large Satellites with millions of rows in db.repo_profile_applicability.

The Pulp upstream bug status is at NEW. Updating the external tracker on this bug. The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

On a pulp database with 60k consumers we had 1.5 million entries in db.repo_profile_applicability:

```
> db.repo_profile_applicability.find().size()
1,586,607
> db.consumers.find().size()
59,248
```

In this state, a bulk RegenerateApplicability on one of the larger repos took about 66 minutes to complete. I ran the monthly job via:

```
# celery -A pulp.server.async.app call pulp.server.maintenance.monthly.queue_monthly_maintenance
```

This cleanup took 9.5 hours and brought repo_profile_applicability down to 56,088 items. After the cleanup, I re-ran RegenerateApplicability and it completed in 23 minutes. This is a huge improvement, and it will make a big difference for customers who have large numbers of repos and consumers when we need to run multiple RegenerateApplicability calls.

The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

In terms of the symptom that the maintenance task is not being run, it is likely not an "easyfix" issue because it involves the integration code between Celery and pulp_celerybeat. It would be helpful to have `rpm -qa` output from one of these environments. Regarding the 16mb cap error in Comment 6, that is an issue with the task code itself, so that is a different root cause entirely.

Brian, any reason we don't just set up a weekly (or monthly) OS-level cron job and skip Celery scheduling of this? Will get an `rpm -qa` as well.

Created attachment 1472150 [details]
rpm -qa output from customer environment
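To put the figures reported above in perspective (1,586,607 applicability rows before cleanup, 56,088 after; regeneration dropping from 66 to 23 minutes), a quick back-of-the-envelope calculation — numbers taken directly from the comment, not new measurements:

```python
# Figures reported above for a 60k-consumer Satellite.
entries_before = 1_586_607   # db.repo_profile_applicability rows before cleanup
entries_after = 56_088       # rows after the monthly maintenance run
regen_before_min = 66        # bulk RegenerateApplicability, large repo, before
regen_after_min = 23         # same call after cleanup

entry_reduction = 1 - entries_after / entries_before
time_reduction = 1 - regen_after_min / regen_before_min

print(f"profile rows removed: {entry_reduction:.1%}")  # ~96.5%
print(f"regen time saved:     {time_reduction:.1%}")   # ~65.2%
```

So the maintenance pass removed roughly 96% of the rows and cut regeneration time by about two thirds, which is why skipping it for months compounds so badly.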
That sounds like a great idea, but the issue is that you have to dispatch that specific task to the tasking system, and Pulp has no API endpoint that can do that. A variation on the idea is to write a script that imports the tasking code and calls apply_async_with_reservation on it, which would cause the dispatch to occur. Cron could call that script.

So calling from the shell:

```
# celery -A pulp.server.async.app call pulp.server.maintenance.monthly.queue_monthly_maintenance
```

is not sufficient? This definitely seemed to initiate the job, watching the journal as well as various collections in the database shrink while it was being run:

```
Jul 30 13:42:34 sat-r220-07.lab.eng.rdu2.redhat.com pulp[24857]: celery.worker.strategy:INFO: Received task: pulp.server.maintenance.monthly.monthly_maintenance[486008fb-dac9-4698-bb50-6b79b510dba1]
```

```
> db.repo_profile_applicability.count()
5735477
> db.repo_profile_applicability.count()
4984406
> db.repo_profile_applicability.count()
4845218
> db.repo_profile_applicability.count()
4767190
> db.repo_profile_applicability.count()
4762954
> db.repo_profile_applicability.count()
2683751
> db.repo_profile_applicability.count()
2271209
> db.repo_profile_applicability.count()
2105499
> ...
```

Mike, you're right! It is that simple, because those task types don't go through the scheduler with apply_async_with_reservation(), so you can generically dispatch them. +1 to your great workaround!

The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug. The Pulp upstream bug priority is at High. Updating the external tracker on this bug. The Pulp upstream bug status is at POST. Updating the external tracker on this bug. The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

The current solution is to create a cron job to run the maintenance task at whatever interval you think is best. Simply create a new Python script with the following two lines, then run it via cron:

```
from pulp.server.maintenance.monthly import queue_monthly_maintenance
queue_monthly_maintenance.apply_async()
```

```
-bash-4.2# cat test1609928.py
from pulp.server.maintenance.monthly import queue_monthly_maintenance
print("adding monthly maintenance job")
queue_monthly_maintenance.apply_async()
print("done")
-bash-4.2# python test1609928.py && echo "--------------------------" && tail -f /var/log/messages | grep monthly_maintenance
adding monthly maintenance job
done
--------------------------
Aug 14 09:54:35 hp-ml350egen8-01 pulp: celery.worker.strategy:INFO: Received task: pulp.server.maintenance.monthly.queue_monthly_maintenance[c642f21e-cc5a-43a9-a09b-b85a469aa944]
Aug 14 09:54:35 hp-ml350egen8-01 pulp: celery.worker.strategy:INFO: Received task: pulp.server.maintenance.monthly.monthly_maintenance[3470fe62-dd1c-4cca-9154-c5beb7ac840d]
Aug 14 09:54:35 hp-ml350egen8-01 pulp: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.queue_monthly_maintenance[c642f21e-cc5a-43a9-a09b-b85a469aa944] succeeded in 0.0409811839927s: None
Aug 14 09:54:35 hp-ml350egen8-01 pulp: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.monthly_maintenance[3470fe62-dd1c-4cca-9154-c5beb7ac840d] succeeded in 0.0406332570128s: None
```

This is the command you can run from the shell as an alternative to the Python approach; I believe this is what the RPM will install in cron:

```
celery -A pulp.server.async.app call pulp.server.maintenance.monthly.queue_monthly_maintenance
```

Verified in Satellite 6.3.3 Snap 3. The additional fix, for larger systems, was previously applied and tested on a customer system. With no issues encountered, we are marking this as verified.
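To confirm the maintenance actually ran without eyeballing `tail -f` output, the success lines quoted above can be parsed mechanically. A small sketch, assuming the journal/syslog line format shown in this comment (not part of the shipped fix):

```python
import re

# Matches celery success lines as quoted above, e.g.
# "... Task pulp.server.maintenance.monthly.monthly_maintenance[<uuid>] succeeded in 0.04s: None"
TASK_RE = re.compile(
    r"Task (?P<name>pulp\.server\.maintenance\.monthly\.\w+)"
    r"\[(?P<uuid>[0-9a-f-]+)\] succeeded in (?P<secs>[\d.]+)s"
)


def maintenance_runs(log_lines):
    """Yield (task_name, task_uuid, seconds) for each completed maintenance task."""
    for line in log_lines:
        m = TASK_RE.search(line)
        if m:
            yield m.group("name"), m.group("uuid"), float(m.group("secs"))
```

Feeding it lines from /var/log/messages would surface both the queue_monthly_maintenance dispatch and the monthly_maintenance run itself, along with their durations.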
We are delivering a new subpackage with this bug called pulp-maintenance, which needs to be included in our composes as well as pulled in as a dependency by either the 'satellite' meta RPM or some other package. As of right now, it is not installed when upgrading to the latest Satellite 6.3.3 build:

```
# rpm -q satellite
satellite-6.3.3-1.el7sat.noarch
# yum install pulp-maintenance
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
Sat6-CI_Red_Hat_Satellite_6_3_Composes_Satellite_6_3_RHEL7 | 2.5 kB 00:00:00
Sat6-CI_Red_Hat_Satellite_Puppet_4_6_3_Composes_Satellite_Puppet_4_6_3_RHEL7_x86_64 | 2.1 kB 00:00:00
Sat6-CI_Red_Hat_Satellite_Tools_6_3_Composes_Satellite_Tools | 2.1 kB 00:00:00
qemu-firmware-jenkins | 2.9 kB 00:00:00
No package pulp-maintenance available.
Error: Nothing to do
```

Requesting needsinfo from upstream developer ttereshc because the 'FailedQA' flag is set. Requesting needsinfo from upstream developer bbouters because the 'FailedQA' flag is set. Clearing needinfo; this is being resolved by pcreech.

Added to the satellite package to be auto-installed. Updated to satellite.noarch 0:6.3.3-1.el7sat, which pulled in the dependency:

```
Installing for dependencies:
 pulp-maintenance  noarch  2.13.4.12-1.el7sat  Sat6-CI_Red_Hat_Satellite_6_3_Composes_Satellite_6_3_RHEL7  60 k
..
```
Check the cron config:

```
# file /etc/cron.weekly/pulp-maintenance
/etc/cron.weekly/pulp-maintenance: POSIX shell script, ASCII text executable
```

Run cron:

```
# /etc/cron.weekly/pulp-maintenance
a372f3d9-5ace-422e-a626-6537041e8fa3
```

Check the journal:

```
Aug 20 17:26:42 sat-r220-07.lab.eng.rdu2.redhat.com pulp[8563]: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.queue_monthly_maintenance[a372f3d9-5ace-422e-a626-6537041e8fa3] succeeded in 0.0403493009508s: None
Aug 20 17:26:42 sat-r220-07.lab.eng.rdu2.redhat.com pulp[15514]: pulp.server.managers.consumer.applicability:INFO: [4b14b8a0] Orphaned consumer profiles to process: 0
Aug 20 17:26:42 sat-r220-07.lab.eng.rdu2.redhat.com pulp[8560]: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.monthly_maintenance[4b14b8a0-c593-4116-9c07-2570348b2401] succeeded in 0.0408377237618s: None
```

Looks good to me.

Verified in Satellite 6.3.3 Snap 4. The pulp-maintenance package is now installed as part of satellite. This creates the cron job in /etc/cron.weekly:

```
-bash-4.2# rpm -q pulp-maintenance
pulp-maintenance-2.13.4.12-1.el7sat.noarch
-bash-4.2# file /etc/cron.weekly/pulp-maintenance
/etc/cron.weekly/pulp-maintenance: POSIX shell script, ASCII text executable
-bash-4.2# cat /etc/cron.weekly/pulp-maintenance
#!/bin/sh
celery -A pulp.server.async.app call pulp.server.maintenance.monthly.queue_monthly_maintenance
-bash-4.2# bash /etc/cron.weekly/pulp-maintenance
7d5e2483-cc51-41f2-865f-c9148820b0db
-bash-4.2# tail -50 /var/log/messages
...
```
```
Aug 21 10:52:00 intel-canoepass-12 pulp: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.queue_monthly_maintenance[7d5e2483-cc51-41f2-865f-c9148820b0db] succeeded in 0.0490260770002s: None
Aug 21 10:52:00 intel-canoepass-12 pulp: pulp.server.managers.consumer.applicability:INFO: [a15c899d] Orphaned consumer profiles to process: 0
Aug 21 10:52:00 intel-canoepass-12 pulp: celery.worker.job:INFO: Task pulp.server.maintenance.monthly.monthly_maintenance[a15c899d-9d90-40cc-bd7d-92d96b34ff05] succeeded in 0.0570463659997s: None
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2550

The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.