2048470 – Leapp upgrade fails after reboot with disabled postgresql redis tomcat services

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Upgrades
Sub Component:
Version:	6.11.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	6.11.0
Assignee:	Evgeni Golov
QA Contact:	Lukas Pramuk
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-01-31 10:35 UTC by Lukas Pramuk
Modified:	2022-07-05 14:33 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-07-05 14:32:43 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2022:5498	0	None	None	None	2022-07-05 14:33:00 UTC

Description Lukas Pramuk 2022-01-31 10:35:52 UTC

Description of problem:
After the Leapp upgrade and reboot, the postgresql, redis and tomcat services are left disabled, so Satellite does not come up.
Enabling the services is not enough to fix the issue.
PostgreSQL is not only disabled but also misconfigured: it refuses to start.
Only running satellite-installer explicitly fixes the issues.


Version-Release number of selected component (if applicable):
Satellite 7.0.0 Snap7

How reproducible:
deterministic


Steps to Reproduce:
1. Prepare a Satellite 7.0 server on RHEL 7 for the Leapp upgrade

2. Perform the Leapp upgrade to RHEL 8 and reboot

3. After the reboot, check the Satellite status

# systemctl status tomcat postgresql redis
● tomcat.service - Apache Tomcat Web Application Container
   Loaded: loaded (/usr/lib/systemd/system/tomcat.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/postgresql.service.d
           └─postgresql.conf
   Active: inactive (dead)

● redis.service - Redis persistent key-value database
   Loaded: loaded (/usr/lib/systemd/system/redis.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/redis.service.d
           └─limit.conf
   Active: inactive (dead)


Actual results:
disabled services: postgresql, redis, tomcat
misconfigured: postgresql

Expected results:
all satellite services run successfully
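
The check from step 3 can be scripted. This is only a minimal sketch, assuming a POSIX shell and systemd's `systemctl is-enabled`; the function names `unit_states` and `list_not_enabled` are hypothetical helpers and not part of Satellite or Leapp:

```shell
#!/bin/sh
# Hypothetical helper: read "<unit> <state>" lines on stdin and
# print the units whose enablement state is not "enabled".
list_not_enabled() {
    awk '$2 != "enabled" { print $1 }'
}

# Hypothetical helper: print "<unit> <state>" for each given unit,
# using "systemctl is-enabled" to query the state.
unit_states() {
    for unit in "$@"; do
        state=$(systemctl is-enabled "$unit" 2>/dev/null)
        printf '%s %s\n' "$unit" "${state:-unknown}"
    done
}

# Example invocation on the upgraded system (uncomment to run):
# unit_states tomcat postgresql redis | list_not_enabled
```

On the system from this report, the pipeline would be expected to list all three units. Note that it only reports the enablement state and would not detect the PostgreSQL misconfiguration.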

Comment 1 Lukas Pramuk 2022-01-31 10:41:11 UTC

After enabling postgresql it still fails to run: 

# systemctl status  postgresql
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/postgresql.service.d
           └─postgresql.conf
   Active: failed (Result: exit-code) since Mon 2022-01-31 04:58:35 EST; 38min ago
  Process: 1301 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=1/FAILURE)

Jan 31 04:58:35 sat.example.com systemd[1]: Starting PostgreSQL database server...
Jan 31 04:58:35 sat.example.com systemd[1]: postgresql.service: Control process exited, code=exited status=1
Jan 31 04:58:35 sat.example.com systemd[1]: postgresql.service: Failed with result 'exit-code'.
Jan 31 04:58:35 sat.example.com systemd[1]: Failed to start PostgreSQL database server.

Apache doesn't look healthy either:

# systemctl status  httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2022-01-31 04:58:35 EST; 39min ago
     Docs: man:httpd.service(8)
  Process: 1314 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 1314 (code=exited, status=1/FAILURE)

Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.544207 2022] [so:warn] [pid 1314] AH01574: module proxy_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.565848 2022] [so:warn] [pid 1314] AH01574: module proxy_http_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.571191 2022] [so:warn] [pid 1314] AH01574: module proxy_wstunnel_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.572644 2022] [so:warn] [pid 1314] AH01574: module ssl_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.573149 2022] [so:warn] [pid 1314] AH01574: module systemd_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: [Mon Jan 31 04:58:35.577067 2022] [so:warn] [pid 1314] AH01574: module cgi_module is already loaded, skipping
Jan 31 04:58:35 sat.example.com httpd[1314]: AH00534: httpd: Configuration error: More than one MPM loaded.
Jan 31 04:58:35 sat.example.com systemd[1]: httpd.service: Main process exited, code=exited, status=1/FAILURE
Jan 31 04:58:35 sat.example.com systemd[1]: httpd.service: Failed with result 'exit-code'.
Jan 31 04:58:35 sat.example.com systemd[1]: Failed to start The Apache HTTP Server.
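
The AH00534 message above indicates that the Apache configuration loads more than one MPM module. A rough sketch for locating the offending `LoadModule` lines follows; the helper name `find_mpm_loads` is hypothetical, a grep with `-r` support (such as GNU grep) is assumed, and `/etc/httpd` is assumed to be the configuration root:

```shell
#!/bin/sh
# Hypothetical helper: recursively search the given files or directories
# and print every active (uncommented) LoadModule line that loads an MPM.
find_mpm_loads() {
    grep -rE '^[[:space:]]*LoadModule[[:space:]]+mpm_[a-z]+_module' "$@"
}

# Example invocation on the upgraded system (uncomment to run):
# find_mpm_loads /etc/httpd
```

More than one matching line would be consistent with the error. This only helps to locate the duplicate; it does not repair the configuration.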

Comment 3 Evgeni Golov 2022-01-31 12:57:41 UTC

The *underlying* issue is in foreman-maintain, for which I filed BZ#2048517.

I'll still take this BZ to make the code in leapp more robust.

Comment 4 Evgeni Golov 2022-02-22 15:19:24 UTC

The code was made more robust and the latest builds in our repo have it. Moving to ON_DEV.

Comment 5 Lukas Pramuk 2022-03-17 14:05:23 UTC

VERIFIED.

@Satellite 7.0.0 Snap13
leapp-0.13.0-100.202203021701Z.8d426bb.master.el7.noarch
leapp-upgrade-el7toel8-0.15.0-100.202203031950Z.9d7f141.master.el7.noarch

by the manual reproducer described in comment#0:

3) After the reboot, once the leapp_resume service has finished, check the Satellite status

# journalctl -qg 'leapp_resume.service: Succeeded'
Mar 17 07:52:37 sat.example.com systemd[1]: leapp_resume.service: Succeeded.

# satellite-maintain service status -b
Running Status Services
================================================================================
Get status of applicable services: 

Displaying the following service(s):
redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, pulpcore-worker, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1, foreman-proxy
- displaying redis                                 [OK]                         
- displaying postgresql                            [OK]                         
- displaying pulpcore-api                          [OK]                         
- displaying pulpcore-content                      [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying pulpcore-worker             [OK]                         
\ displaying tomcat                                [OK]                         
\ displaying dynflow-sidekiq@orchestrator          [OK]                         
\ displaying foreman                               [OK]                         
\ displaying httpd                                 [OK]                         
\ displaying dynflow-sidekiq@worker-1              [OK]                         
\ displaying dynflow-sidekiq@worker-hosts-queue-1  [OK]                         
\ displaying foreman-proxy                         [OK]                         
\ All services are running                                            [OK]      
--------------------------------------------------------------------------------

# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 0ms
candlepin:        
    Status:          ok
    Server Response: Duration: 382ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 68ms
candlepin_events: 
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 1ms
pulp3:            
    Status:          ok
    Server Response: Duration: 222ms
pulp3_content:    
    Status:          ok
    Server Response: Duration: 198ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 5ms

>>> after LEAPP upgrade to RHEL8 all Satellite services are running successfully

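The `hammer ping` check from the verification above could be automated along these lines. This is a sketch under assumptions: it relies on the output layout shown in comment 5 (one `Status:` line per component), and `all_status_ok` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read "hammer ping" output on stdin and succeed
# only if at least one "Status:" line is present and all of them say "ok".
all_status_ok() {
    awk '$1 == "Status:" { seen = 1; if ($2 != "ok") bad = 1 }
         END { if (seen && !bad) exit 0; exit 1 }'
}

# Example invocation on the upgraded system (uncomment to run):
# hammer ping | all_status_ok && echo "all components report ok"
```

Empty input is treated as a failure so that a `hammer ping` that prints nothing is not mistaken for a healthy system.
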
Comment 9 errata-xmlrpc 2022-07-05 14:32:43 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498
