Bug 2027086 - The "katello:pulp3_migration" reports wrong failed component names if one or all pulp3 related services have failed to start during content-migration process
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Repositories
Version: 6.9.6
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: 6.9.9
Assignee: satellite6-bugs
QA Contact: Stephen Wadeley
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-11-28 06:51 UTC by Sayan Das
Modified: 2022-04-20 20:35 UTC
CC: 6 users

Fixed In Version: tfm-rubygem-katello-3.18.1.51-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-20 20:34:52 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID                         Last Updated
Github                  Katello/katello pull 9287  2022-02-22 21:33:16 UTC
Red Hat Product Errata  RHSA-2022:1478             2022-04-20 20:35:14 UTC

Description Sayan Das 2021-11-28 06:51:48 UTC
Description of problem:

The "katello:pulp3_migration" reports wrong failed component names if one or all pulp3 related services has failed to start during content-migration process


Version-Release number of selected component (if applicable):

* Satellite 6.9.6+ 
* tfm-rubygem-katello


How reproducible:

Always, if forced (and observed in customers' infrastructure).


Steps to Reproduce:

1. Install a Satellite 6.9 with some repositories enabled and synced, and some Content Views created, published, and promoted.

2. Install the "python3-pip" package if that is not al;ready installed and use "/usr/bin/pip3" commandline to update the "click" module to 8.0.1 or above and "chardet" to 4.0.0 or above.

3. Run the "satellite-maintain content prepare" step

4. Check the status of "hammer ping" and "satellite-maintain service status -b"


Actual results:

Step 3 shows "The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth", even though it reports [OK] for "All services started".


# satellite-maintain content prepare
Running Prepare content for Pulp 3
================================================================================
Enable applicable services:

Enabling the following service(s):
pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4
- All services enabled                                                [OK]
--------------------------------------------------------------------------------
Start applicable services:

Starting the following service(s):
rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, smart_proxy_dynflow_core, tomcat, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy, foreman-cockpit
| All services started                                                [OK]
--------------------------------------------------------------------------------
Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.46/app/models/katello/ping.rb:35:in `ping!'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.46/lib/katello/tasks/pulp3_migration.rake:13:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
Checking for valid Katello configuraton.
                                                                      [FAIL]
Failed executing preserve_output=true foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.

The following steps ended up in failing state:

  [content-prepare]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="content-prepare"



Step 4 clearly shows that nothing failed in "hammer ping", even though all of the pulp3-related services are down.

# hammer ping
database:
    Status:          ok
    Server Response: Duration: 0ms
candlepin:
    Status:          ok
    Server Response: Duration: 36ms
candlepin_events:
    Status:          ok
    message:         139 Processed, 0 Failed
    Server Response: Duration: 0ms
candlepin_auth:
    Status:          ok
    Server Response: Duration: 24ms
katello_events:
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp:
    Status:          ok
    Server Response: Duration: 123ms
pulp_auth:
    Status:          ok
    Server Response: Duration: 77ms
foreman_tasks:
    Status:          ok
    Server Response: Duration: 16ms


# foreman-maintain service status -b | grep -v OK
Running Status Services
================================================================================
Get status of applicable services:

Displaying the following service(s):
rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy, foreman-cockpit

\ displaying pulpcore-api                          [FAIL]
\ displaying pulpcore-content                      [FAIL]
\ displaying pulpcore-resource-manager             [FAIL]

\ displaying pulpcore-worker             [FAIL]
\ displaying pulpcore-worker             [FAIL]

\ displaying pulpcore-worker             [FAIL]

| All services displayed                                              [FAIL]
Some services are not running (pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker, pulpcore-worker, pulpcore-worker)
--------------------------------------------------------------------------------
Scenario [Status Services] failed.




Expected results:

Failed components should be detected and reported correctly at this stage; otherwise, troubleshooting the problem becomes difficult.

~~~

| All services started                                                [OK]
--------------------------------------------------------------------------------
Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth
~~~


Additional info:

Obviously, we don't expect anyone to use pip to upgrade individual Python modules, but if someone does so intentionally (as explained in the reproducer section), then the first impression the error gives about the issue is a wrong one.

I was able to troubleshoot this problem only after going through the syslog entries to find out why those pulpcore* services would not start, i.e.:

Nov 23 07:01:18 satellite-kv-0 pulpcore-api: raise VersionConflict(dist, req).with_context(dependent_req)
Nov 23 07:01:18 satellite-kv-0 pulpcore-api: pkg_resources.ContextualVersionConflict: (click 8.0.1 (/usr/local/lib/python3.6/site-packages), Requirement.parse('click<8'), {'pulpcore'})

and 

Nov 23 12:23:38 satellite-kv-0 pulpcore-content: pkg_resources.ContextualVersionConflict: (chardet 4.0.0 (/usr/local/lib/python3.6/site-packages), Requirement.parse('chardet<4.0,>=2.0'), {'aiohttp'})

Comment 3 Sayan Das 2022-01-06 07:13:10 UTC
I have now observed this happening on multiple occasions and would like to reiterate the concern from the support end.

This particular message is not at all helpful, or even correct:

Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp, pulp_auth


or,


Prepare content for Pulp 3:
rake aborted!
The following services have not been started or are reporting errors: candlepin, foreman_tasks, pulp3


A) It is true that some service is down or not working, but in none of the scenarios does it have anything to do with the candlepin or foreman_tasks components.

B) In the first scenario, where "pulp, pulp_auth" was printed, we would assume that there is some issue with the pulp2 services, but it was exactly the opposite, i.e. the issues were related to the pulpcore services.

C) In the second scenario, where "pulp3" was printed, we would assume that there is some issue with the pulp3/pulpcore services, but it was again exactly the opposite, i.e. all pulp2 services were disabled and stopped, including squid and mongo.


So while the troubleshooting is on us, i.e. support, we expect the error message to convey some meaningful information that can help with the troubleshooting.
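
To make the failure mode concrete, here is a purely hypothetical Ruby sketch (not the actual pre-fix Katello code) of why naming a fixed set of components, instead of the services whose checks actually failed, produces misleading output like the above:

~~~
# Hypothetical illustration only -- NOT the actual pre-fix Katello code.

# Buggy pattern: on any failure, name every component in a fixed list,
# regardless of which checks actually failed.
CHECKED = %w[candlepin foreman_tasks pulp pulp_auth].freeze

def report_failure(_results)
  raise "The following services have not been started or are reporting errors: #{CHECKED.join(', ')}"
end

# Correct pattern: derive the names from the per-service results.
def report_failure_fixed(results)
  failed = results.select { |_name, details| details[:status] != 'ok' }.keys
  return if failed.empty?
  raise "The following services have not been started or are reporting errors: #{failed.join(', ')}"
end

# Example: only the pulpcore check failed, so only it should be named.
results = {
  candlepin:     { status: 'ok' },
  foreman_tasks: { status: 'ok' },
  pulpcore:      { status: 'FAIL' }
}
report_failure_fixed(results) # => RuntimeError naming only "pulpcore"
~~~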

Comment 4 Justin Sherrill 2022-02-22 21:33:17 UTC
It looks like this was already fixed upstream as part of https://projects.theforeman.org/issues/32058 (https://bugzilla.redhat.com/show_bug.cgi?id=1937403), but one of the two PRs was not cherry-picked for some reason.

So we just need to cherry pick the second change:   https://github.com/Katello/katello/pull/9287/files

Comment 6 Stephen Wadeley 2022-03-24 15:03:20 UTC
Hello

Testing on 6.9.9-1.0

Fixed in version says: tfm-rubygem-katello-3.18.1.51-1

I have:

 ~]# rpm -q tfm-rubygem-katello
tfm-rubygem-katello-3.18.1.53-1.el7sat.noarch
 ~]# rpm -q satellite
satellite-6.9.9-1.el7sat.noarch
 ~]# 


We can see the fix, as per the upstream PR in comment 4, is in this snap:
~]# grep -r -A2 failed_services /opt

/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb:      def failed_services(result)
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb-        result[:services].select do |_name, details|

and it tests OK using method in https://github.com/Katello/katello/pull/9287
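
For context, a minimal sketch of what the fixed reporting logic looks like, reconstructed from the grep output above; the failed_services body beyond the two grepped lines, and the surrounding ping! wiring, are assumptions based on the error raised at ping.rb:35, not a verbatim copy of the shipped code:

~~~
# Sketch of the fixed logic in app/models/katello/ping.rb (katello 3.18.1.53).
module Katello
  class Ping
    FAIL_RETURN_CODE = 'FAIL'.freeze

    # The real per-service checker lives elsewhere in ping.rb; stubbed here
    # only so this sketch is self-contained and runnable.
    def self.ping(services: nil)
      { status: FAIL_RETURN_CODE,
        services: { 'candlepin' => { status: FAIL_RETURN_CODE } } }
    end

    # Select only the services whose own check failed, so the error message
    # names the components that are actually down.
    def self.failed_services(result)
      result[:services].select do |_name, details|
        details[:status] == FAIL_RETURN_CODE
      end
    end

    def self.ping!(services: nil)
      result = ping(services: services)
      if result[:status] == FAIL_RETURN_CODE
        names = failed_services(result).keys.join(', ')
        fail "The following services have not been started or are reporting errors: #{names}"
      end
      result
    end
  end
end
~~~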

 ~]# systemctl stop tomcat
 ~]# foreman-rake console
Loading production environment (Rails 6.0.3.4)
irb(main):001:0> Katello::Ping.ping!(services: [:candlepin])
Traceback (most recent call last):
        3: from lib/tasks/console.rake:5:in `block in <top (required)>'
        2: from (irb):1
        1: from katello (3.18.1.53) app/models/katello/ping.rb:35:in `ping!'
RuntimeError (The following services have not been started or are reporting errors: candlepin)
irb(main):002:0>


----------------------

If you try to log in to the web UI at this point you will see:

 Oops, we're sorry but something went wrong A backend service [ Candlepin ] is unreachable


 ~]# systemctl start tomcat


--------------------------

Testing as per comment 0

I added a manifest and synced three repos (Ansible, Tools, Maintenance)

Made and promoted TestCV with two of the repos



~]# pip3 install --upgrade click
<snip>
Successfully installed click-8.0.4


~]# pip3 install --upgrade chardet
<snip>
Successfully installed chardet-4.0.0



 ~]# satellite-maintain content prepare
Running Prepare content for Pulp 3
================================================================================
Enable applicable services: 

Enabling the following service(s):
pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4
\ enabling pulpcore-resource-manager                                            
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-api.service to /etc/systemd/system/pulpcore-api.service.

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-content.service to /etc/systemd/system/pulpcore-content.service.

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-resource-manager.service to /etc/systemd/system/pulpcore-resource-manager.service.
\ enabling pulpcore-worker@4                                                    
Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.

Created symlink from /etc/systemd/system/multi-user.target.wants/pulpcore-worker to /etc/systemd/system/pulpcore-worker@.service.
| All services enabled                                                [OK]      
--------------------------------------------------------------------------------
Start applicable services: 

Starting the following service(s):
rh-mongodb34-mongod, rh-redis5-redis, postgresql, qdrouterd, qpidd, squid, pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, smart_proxy_dynflow_core, tomcat, pulpcore-worker@1, pulpcore-worker@2, pulpcore-worker@3, pulpcore-worker@4, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
\ All services started                                                [OK]      
--------------------------------------------------------------------------------
Prepare content for Pulp 3: 
rake aborted!
The following services have not been started or are reporting errors: pulp3                                   <-------NOTE-------- That is good, only pulp3 is listed, and none of " candlepin, foreman_tasks, pulp, pulp_auth" 
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/app/models/katello/ping.rb:35:in `ping!'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.1.53/lib/katello/tasks/pulp3_migration.rake:13:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
Checking for valid Katello configuraton.
                                                                      [FAIL]
Failed executing preserve_output=true foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.

The following steps ended up in failing state:

  [content-prepare]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="content-prepare"


[root@dhcp-3-138 ~]# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 0ms
candlepin:        
    Status:          ok
    Server Response: Duration: 34ms
candlepin_events: 
    Status:          ok
    message:         7 Processed, 0 Failed
    Server Response: Duration: 0ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 29ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 1ms
pulp:             
    Status:          ok
    Server Response: Duration: 195ms
pulp_auth:        
    Status:          ok
    Server Response: Duration: 84ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 7ms

[root@dhcp-3-138 ~]# foreman-maintain service status -b | grep -v OK
Running Status Services
================================================================================
Get status of applicable services: 

Displaying the following service(s):
rh-mongodb34-mongod, rh-redis5-redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-resource-manager, qdrouterd, qpidd, squid, pulp_celerybeat, pulp_resource_manager, pulp_streamer, pulp_workers, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, smart_proxy_dynflow_core, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, puppetserver, dynflow-sidekiq@worker, dynflow-sidekiq@worker-hosts-queue, foreman-proxy
/ displaying pulpcore-api                          [FAIL]                       
/ displaying pulpcore-content                      [FAIL]                       
/ displaying pulpcore-resource-manager             [FAIL]                       
/ displaying pulpcore-worker             [FAIL]                       
/ displaying pulpcore-worker             [FAIL]                       
/ All services displayed                                              [FAIL]    
Some services are not running (pulpcore-api, pulpcore-content, pulpcore-resource-manager, pulpcore-worker, pulpcore-worker)
--------------------------------------------------------------------------------
Scenario [Status Services] failed.

The following steps ended up in failing state:

  [service-status]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="service-status"

 ~]# 


Looks good to me.

@Sayan Das: The first part looks correct; I am not sure if you are happy with the output of "foreman-maintain service status -b".

On the grounds that this change is worth keeping, I will mark this VERIFIED. Please open a new bug if further improvements are required.

Thank you

Comment 7 Sayan Das 2022-03-24 15:09:09 UTC
Hello,

I believe this looks fine now. 

It was expected that if "pulp3" comes up as down, then the pulpcore-specific services would not be running either.


-- Sayan

Comment 8 Stephen Wadeley 2022-03-24 16:28:39 UTC
Thank you Sayan

Comment 13 errata-xmlrpc 2022-04-20 20:34:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.9.9 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1478

