Bug 2027086
| Summary: | The "katello:pulp3_migration" reports wrong failed component names if one or all pulp3 related services has failed to start during content-migration process | ||
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Sayan Das <saydas> |
| Component: | Repositories | Assignee: | satellite6-bugs <satellite6-bugs> |
| Status: | CLOSED ERRATA | QA Contact: | Stephen Wadeley <swadeley> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.9.6 | CC: | ahumbe, jpasqual, jsherril, juwatts, osousa, wpinheir |
| Target Milestone: | 6.9.9 | Keywords: | Triaged, Upgrades |
| Target Release: | Unused | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | tfm-rubygem-katello-3.18.1.51-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-04-20 20:34:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
I have now observed this on multiple occasions, and I would like to reiterate the concern from the support side.

This particular message is neither helpful nor correct:

Prepare content for Pulp 3: rake aborted! The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth

or,

Prepare content for Pulp 3: rake aborted! The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp3

A) It is true that some service is down or not working, but in none of these scenarios does it have anything to do with the candlepin or foreman_tasks components.

B) In the first scenario, where "pulp, pulp_auth" was printed, we would assume that there is some issue with the pulp2 services, but it was exactly the opposite, i.e. the issues were related to the pulpcore services.

C) In the second scenario, where "pulp3" was printed, we would assume that there is some issue with the pulp3/pulpcore services, but it was again exactly the opposite, i.e. all pulp2 services, including squid and mongo, were disabled and stopped.

So while the troubleshooting is on us, i.e. support, we expect the error message to convey some meaningful information that can help with the troubleshooting.

It looks like this was already fixed upstream as part of https://projects.theforeman.org/issues/32058 (https://bugzilla.redhat.com/show_bug.cgi?id=1937403), but one of the two PRs was not cherry-picked for some reason.
So we just need to cherry-pick the second change: https://github.com/Katello/katello/pull/9287/files

Hello,

Testing on 6.9.9-1.0

Fixed in version says: tfm-rubygem-katello-3.18.1.51-1

I have:

~]# rpm -q tfm-rubygem-katello
tfm-rubygem-katello-3.18.1.53-1.el7sat.noarch
~]# rpm -q satellite
satellite-6.9.9-1.el7sat.noarch
~]#

We can see that the fix as per the upstream PR, in comment 4, is in this snap:

~]# grep -r -A2 failed_services /opt
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb: def failed_services(result)
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb- result[:services].select do |_name, details|

and it tests OK using the method in https://github.com/Katello/katello/pull/9287

~]# systemctl stop tomcat
~]# foreman-rake console
Loading production environment (Rails 6.0.3.4)
irb(main):001:0> Katello::Ping.ping!(services: [:candlepin])
Traceback (most recent call last):
3: from lib/tasks/console.rake:5:in `block in <top (required)>'
2: from (irb):1
1: from katello (3.18.1.53) app/models/katello/ping.rb:35:in `ping!'
RuntimeError (The following services have not been started or are reporting errors: candlepin)
irb(main):002:0>

----------------------

If you try to log into the web UI at this point you will see:

Oops, we're sorry but something went wrong
A backend service [ Candlepin ] is unreachable

~]# systemctl start tomcat

--------------------------

Testing as per comment 0:

I added a manifest and synced three repos (Ansible, Tools, Maintenance).
Made and promoted TestCV with two of the repos.

~]# pip3 install --upgrade click
<snip>
Successfully installed click-8.0.4
~]# pip3 install --upgrade chardet
<snip>
Successfully installed chardet-4.0.0
~]# satellite-maintain content prepare
Running Prepare content for Pulp 3
================================================================================
Enable applicable services:
Enabling the following service(s): pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4
\ enabling pulpcore-resource-manager
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-api.service to /etc/systemd/system/pulpcore-api.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-content.service to /etc/systemd/system/pulpcore-content.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-resource-manager.service to /etc/systemd/system/pulpcore-resource-manager.service.
\ enabling pulpcore-worker@4
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.
| All services enabled [OK]
--------------------------------------------------------------------------------
Start applicable services:
Starting the following service(s): rh-mongodb34-mongod, rh-redis5-redis, postgresql, qdrouterd, qpidd, squid, pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, smart_proxy_dynflow_core, tomcat, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
\ All services started [OK]
--------------------------------------------------------------------------------
Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: pulp3 <-------NOTE-------- That is good, only pulp3 is listed, and none of "candlepin, foreman_tasks, pulp, pulp_auth"
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb:35:in `ping!'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/lib/katello/tasks/pulp3_migration.rake:13:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
Checking for valid Katello configuraton. [FAIL]
Failed executing preserve_output=true foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.
The following steps ended up in failing state:
[content-prepare]
Resolve the failed steps and rerun the command.
In case the failures are false positives, use --whitelist="content-prepare"

[root@dhcp-3-138 ~]# hammer ping
database:
    Status: ok
    Server Response: Duration: 0ms
candlepin:
    Status: ok
    Server Response: Duration: 34ms
candlepin_events:
    Status: ok
    message: 7 Processed, 0 Failed
    Server Response: Duration: 0ms
candlepin_auth:
    Status: ok
    Server Response: Duration: 29ms
katello_events:
    Status: ok
    message: 0 Processed, 0 Failed
    Server Response: Duration: 1ms
pulp:
    Status: ok
    Server Response: Duration: 195ms
pulp_auth:
    Status: ok
    Server Response: Duration: 84ms
foreman_tasks:
    Status: ok
    Server Response: Duration: 7ms

[root@dhcp-3-138 ~]# foreman-maintain service status -b | grep -v OK
Running Status Services
================================================================================
Get status of applicable services:
Displaying the following service(s): rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
/ displaying pulpcore-api [FAIL]
/ displaying pulpcore-content [FAIL]
/ displaying pulpcore-resource-manager [FAIL]
/ displaying pulpcore-worker [FAIL]
/ displaying pulpcore-worker [FAIL]
/ All services displayed [FAIL]
Some services are not running (pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker, pulpcore-worker)
--------------------------------------------------------------------------------
Scenario [Status Services] failed.
The following steps ended up in failing state:
[service-status]
Resolve the failed steps and rerun the command.
In case the failures are false positives, use --whitelist="service-status"
~]#

Looks good to me.
@Sayan Das: The first part looks correct; I am not sure whether you are happy with the output of "foreman-maintain service status -b".

On the grounds that this change is worth keeping, I will mark this VERIFIED. Please open a new bug if further improvements are required.

Thank you

Hello,

I believe this looks fine now. It was expected that if "pulp3" is reported as down, then the pulpcore-specific services will not be running either.

-- Sayan

Thank you, Sayan

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Satellite 6.9.9 Async Bug Fix Update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1478 |
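For readers following the verification above: the `failed_services` helper found by the grep selects entries from `result[:services]`. The snippet below is only a rough Python sketch of that idea, not the Katello Ruby code. The shape of the result dictionary, the `"ok"` status string (borrowed from the `hammer ping` output) and the `ping_error` helper are assumptions made for illustration.

```python
OK_STATUS = "ok"  # assumption: mirrors the "Status: ok" lines printed by hammer ping


def failed_services(result):
    """Keep only the services whose reported status is not OK."""
    return {
        name: details
        for name, details in result["services"].items()
        if details.get("status") != OK_STATUS
    }


def ping_error(result):
    """Build the error text from the failed services only (hypothetical helper)."""
    failed = failed_services(result)
    if not failed:
        return None
    return (
        "The following services have not been started or are reporting errors: "
        + ", ".join(failed)
    )


# Hypothetical ping result: only pulp3 is unhealthy.
sample = {
    "services": {
        "candlepin": {"status": "ok"},
        "foreman_tasks": {"status": "ok"},
        "pulp3": {"status": "FAIL"},
    }
}

print(ping_error(sample))
```

With this kind of filtering, healthy components such as candlepin and foreman_tasks never appear in the message, which matches the behaviour seen in the verification run.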
Description of problem:

The "katello:pulp3_migration" task reports the wrong failed component names if one or all of the pulp3-related services fail to start during the content-migration process.

Version-Release number of selected component (if applicable):

* Satellite 6.9.6+
* tfm-rubygem-katello

How reproducible:

Always if forced (and in customer's infra)

Steps to Reproduce:

1. Install a Satellite 6.9 with some repos enabled and synced, and some CV created, published and promoted.
2. Install the "python3-pip" package if it is not already installed, and use the "/usr/bin/pip3" command line to update the "click" module to 8.0.1 or above and "chardet" to 4.0.0 or above.
3. Run the "satellite-maintain content prepare" step.
4. Check the status of "hammer ping" and "satellite-maintain service status -b".

Actual results:

Step 3 shows "The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth", even though it shows OK for "All services started".

# satellite-maintain content prepare
Running Prepare content for Pulp 3
================================================================================
Enable applicable services:
Enabling the following service(s): pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4
- All services enabled [OK]
--------------------------------------------------------------------------------
Start applicable services:
Starting the following service(s): rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker ce, pulpcore-worker, smart_proxy_dynflow_core, tomcat, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker,
dynflow-sidekiq@worker-hosts-queue, foreman-proxy, foreman-cockpit
| All services started [OK]
--------------------------------------------------------------------------------
Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.46/app/models/katello/ping.rb:35:in `ping!'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.46/lib/katello/tasks/pulp3_migration.rake:13:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
Checking for valid Katello configuraton. [FAIL]
Failed executing preserve_output=true foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.
The following steps ended up in failing state:
[content-prepare]
Resolve the failed steps and rerun the command.
In case the failures are false positives, use --whitelist="content-prepare"

Step 4: Clearly shows nothing failed in "hammer ping", but all of the pulp3-related services are down.
# hammer ping
database:
    Status: ok
    Server Response: Duration: 0ms
candlepin:
    Status: ok
    Server Response: Duration: 36ms
candlepin_events:
    Status: ok
    message: 139 Processed, 0 Failed
    Server Response: Duration: 0ms
candlepin_auth:
    Status: ok
    Server Response: Duration: 24ms
katello_events:
    Status: ok
    message: 0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp:
    Status: ok
    Server Response: Duration: 123ms
pulp_auth:
    Status: ok
    Server Response: Duration: 77ms
foreman_tasks:
    Status: ok
    Server Response: Duration: 16ms

# foreman-maintain service status -b | grep -v OK
Running Status Services
================================================================================
Get status of applicable services:
Displaying the following service(s): rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker ce, pulpcore-worker, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy, foreman-cockpit
\ displaying pulpcore-api [FAIL]
\ displaying pulpcore-content [FAIL]
\ displaying pulpcore-resource-manager [FAIL]
\ displaying pulpcore-worker [FAIL]
\ displaying pulpcore-worker [FAIL]
\ displaying pulpcore-worker [FAIL]
| All services displayed [FAIL]
Some services are not running (pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker, pulpcore-worker ce, pulpcore-worker)
--------------------------------------------------------------------------------
Scenario [Status Services] failed.

Expected results:

The failed components should be detected correctly at this stage; otherwise, troubleshooting the problem becomes more difficult than it needs to be.
~~~
| All services started [OK]
--------------------------------------------------------------------------------
Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth
~~~

Additional info:

Obviously, we don't expect anyone to use pip to upgrade individual Python modules, but if someone does so intentionally (as explained in the Reproducer section), then the first impression we get about the issue is a wrong one.

I was able to troubleshoot this problem only after going through the syslog entries to find out why those pulpcore* services would not start, i.e.

Nov 23 07:01:18 satellite-kv-0 pulpcore-api: raise VersionConflict(dist, req).with_context(dependent_req)
Nov 23 07:01:18 satellite-kv-0 pulpcore-api: pkg_resources.ContextualVersionConflict: (click 8.0.1 (/usr/local/lib/python3.6/site-packages), Requirement.parse('click<8'), {'pulpcore'})

and

Nov 23 12:23:38 satellite-kv-0 pulpcore-content: pkg_resources.ContextualVersionConflict: (chardet 4.0.0 (/usr/local/lib/python3.6/site-packages), Requirement.parse('chardet<4.0,>=2.0'), {'aiohttp'})
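
Both syslog entries come down to an installed version falling outside a pinned range (`click<8`, `chardet<4.0,>=2.0`). The following is a minimal, standard-library-only Python sketch of that comparison, intended just to illustrate why click 8.0.1 and chardet 4.0.0 trip the checks. It is not how `pkg_resources` implements it: the `satisfies` helper is hypothetical and only handles plain dotted integer versions, whereas real tools follow the full Python version-specifier rules.

```python
import operator
import re

# Comparison operators supported by this simplified sketch.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


def parse_version(text):
    """Turn '8.0.1' into (8, 0, 1); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def pad(left, right):
    """Right-pad the shorter tuple with zeros so '8' compares like '8.0.0'."""
    width = max(len(left), len(right))
    return (left + (0,) * (width - len(left)),
            right + (0,) * (width - len(right)))


def satisfies(installed, requirement):
    """Return True if the installed version meets every comma-separated specifier."""
    # Split 'chardet<4.0,>=2.0' into the name and the specifier part.
    _name, rest = re.match(r"([A-Za-z0-9_.\-]+)(.*)", requirement).groups()
    for spec in (s for s in rest.split(",") if s.strip()):
        op, wanted = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*([0-9.]+)\s*", spec).groups()
        have, need = pad(parse_version(installed), parse_version(wanted))
        if not OPS[op](have, need):
            return False
    return True


# Versions and requirements taken from the syslog entries above.
print(satisfies("8.0.1", "click<8"))              # False
print(satisfies("4.0.0", "chardet<4.0,>=2.0"))    # False
```

A check of this kind against the pulpcore requirements before running "satellite-maintain content prepare" would point at the pip-upgraded modules directly, rather than leaving it to the syslog.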