Bug 2102825

Summary: satellite-clone fails to adjust ownership of /var/lib/pulp if it's owned by non-existing user/group
Product: Red Hat Satellite
Reporter: mithun kalyat <mkalyat>
Component: Satellite Clone
Assignee: Evgeni Golov <egolov>
Status: CLOSED ERRATA
QA Contact: Ladislav Vasina <lvasina>
Severity: high
Priority: high
Version: 6.10.6
CC: egolov, gsulliva, lvasina
Target Milestone: 6.12.0
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2022-11-16 13:34:14 UTC
Type: Bug

Description mithun kalyat 2022-06-30 18:22:04 UTC
Description of problem:

There are many other packages "missing" from the Target server (relative to the Source server) after running "satellite-clone" 
Example - just looking for packages with "ansible" in the name, found 14 packages on the Source server, but only 7 packages on the Target server:

### SOURCE server ###
[root@satellite-test ~]# rpm -qa |grep ansible | sort
ansible-2.9.27-1.el7ae.noarch
ansible-collection-redhat-satellite-2.2.0-1.el7sat.noarch
ansiblerole-foreman_scap_client-0.2.0-1.el7sat.noarch
ansiblerole-insights-client-1.7.1-1.el7sat.noarch
ansiblerole-satellite-receptor-installer-0.6.15-1.el7sat.noarch
ansible-runner-1.4.6-1.el7ar.noarch
ansible-test-2.9.27-1.el7ae.noarch
python2-ansible-runner-1.4.6-1.el7ar.noarch
python3-pulp-ansible-0.9.0-1.el7pc.noarch
tfm-rubygem-foreman_ansible-6.3.4.1-1.el7sat.noarch
tfm-rubygem-foreman_ansible_core-4.2.0-1.el7sat.noarch
tfm-rubygem-hammer_cli_foreman_ansible-0.3.4-1.el7sat.noarch
tfm-rubygem-pulp_ansible_client-0.8.0-1.el7sat.noarch
tfm-rubygem-smart_proxy_ansible-3.1.1-1.el7sat.noarch
[root@satellite-test ~]#

### TARGET server ###
[root@satellite-test ~]# rpm -qa |grep ansible | sort
ansible-2.9.27-1.el7ae.noarch
ansiblerole-foreman_scap_client-0.2.0-1.el7sat.noarch
ansiblerole-insights-client-1.7.1-1.el7sat.noarch
ansiblerole-satellite-receptor-installer-0.6.15-1.el7sat.noarch
tfm-rubygem-foreman_ansible-6.3.4.1-1.el7sat.noarch
tfm-rubygem-hammer_cli_foreman_ansible-0.3.4-1.el7sat.noarch
tfm-rubygem-pulp_ansible_client-0.8.0-1.el7sat.noarch
[root@satellite-test ~]#

As a result, 'satellite-clone' failed with an error in the task "Correct ownership of /var/lib/pulp":

TASK [satellite-clone : Correct ownership of /var/lib/pulp] ************************************************************************
Friday 17 June 2022  15:06:20 -0400 (0:00:00.160)       0:06:32.481 *********** 
fatal: [localhost]: FAILED! => {"msg": "The conditional check 'pulp_stat.stat.pw_name != 'pulp' or pulp_stat.stat.gr_name != 'pulp'' failed. The error was: error while evaluating conditional (pulp_stat.stat.pw_name != 'pulp' or pulp_stat.stat.gr_name != 'pulp'): 'dict object' has no attribute 'pw_name'\n\nThe error appears to be in '/usr/share/satellite-clone/roles/satellite-clone/tasks/ensure_pulp_data_permissions.yml': line 21, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: 'Correct ownership of /var/lib/pulp'\n  ^ here\n"}

Tried to re-install 'satellite' on the target server (after unlocking packages and running "yum clean all"), but it did nothing, since the 'satellite' package was already installed.
Then ran "yum reinstall satellite", which worked, but it re-installed only 'satellite' itself, not its dependencies.

    # satellite-maintain packages unlock

    # yum clean all

    # yum install satellite

    # satellite-maintain packages lock

Looks like "satellite-clone" scripts are leaving out some important packages like httpd.

The directory was copied with the rsync command given in section 2.3 ("To back up without Pulp data") of https://access.redhat.com/documentation/en-us/red_hat_satellite/6.10/html/upgrading_and_updating_red_hat_satellite/cloning_satellite_server#sec-Pulp_Data_Considerations :

  # rsync --archive --partial --progress --compress /var/lib/pulp target_server.example.com:/var/lib/pulp
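
One plausible way for the copied directory to end up with an owner that has no name: "rsync --archive" preserves ownership, and when the owner's name does not exist on the destination, rsync falls back to the numeric ID from the source. If the target has no pulp user yet, that numeric ID has no passwd entry. The sketch below is only an illustration under that assumption; the helper name uid_name is hypothetical and not part of satellite-clone.

```shell
# Hypothetical helper, for illustration only.
# uid_name UID [PASSWD_FILE]: print the user name that owns UID,
# or exit non-zero when no entry exists for that UID.
# It reads a flat passwd-format file; "getent passwd UID" would also
# consult other NSS sources such as LDAP.
uid_name() {
    awk -F: -v uid="$1" '
        $3 == uid { print $1; found = 1; exit }
        END       { exit !found }
    ' "${2:-/etc/passwd}"
}
```

On the target server it could be called as:

    # uid_name "$(stat -c '%u' /var/lib/pulp)" || echo "owner UID has no name"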

Comment 1 Evgeni Golov 2022-07-01 10:27:09 UTC
(In reply to mithun kalyat from comment #0)

> There are many other packages "missing" from the Target server (relative to
> the Source server) after running "satellite-clone" 
> Example - just looking for packages with "ansible" in the name, found 14
> packages on the Source server, but only 7 packages on the Target server:

As long as satellite-clone has not finished successfully, Satellite is not yet installed on the target system and the package lists will obviously not match.
I see no problem here.

> As a result, 'satellite-clone' failed with an error in the task "Correct
> ownership of /var/lib/pulp":
> 
> TASK [satellite-clone : Correct ownership of /var/lib/pulp]
> ************************************************************************
> Friday 17 June 2022  15:06:20 -0400 (0:00:00.160)       0:06:32.481
> *********** 
> fatal: [localhost]: FAILED! => {"msg": "The conditional check
> 'pulp_stat.stat.pw_name != 'pulp' or pulp_stat.stat.gr_name != 'pulp''
> failed. The error was: error while evaluating conditional
> (pulp_stat.stat.pw_name != 'pulp' or pulp_stat.stat.gr_name != 'pulp'):
> 'dict object' has no attribute 'pw_name'\n\nThe error appears to be in
> '/usr/share/satellite-clone/roles/satellite-clone/tasks/
> ensure_pulp_data_permissions.yml': line 21, column 3, but may\nbe elsewhere
> in the file depending on the exact syntax problem.\n\nThe offending line
> appears to be:\n\n\n- name: 'Correct ownership of /var/lib/pulp'\n  ^
> here\n"}

What led you to the conclusion that this is a result of the above?

The real reason seems to be that /var/lib/pulp is owned by a UID that doesn't have a name, so the stat result has no 'pw_name' attribute and the conditional cannot be evaluated.
I'll post a patch for satellite-clone in a moment, but until then the customer can use a workaround:
chown root:root /var/lib/pulp

This will make /var/lib/pulp owned by root and trigger the "fix permission" logic in satellite-clone correctly.
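
The check-then-fix idea can be sketched in shell as follows. This is a minimal illustration, not the actual satellite-clone patch; the helper name needs_chown is hypothetical. It relies on GNU stat printing "UNKNOWN" for a UID or GID that has no name, so a nameless owner is handled like any other wrong owner instead of causing an error.

```shell
# Hypothetical helper, not part of satellite-clone.
# needs_chown OWNER GROUP: exit 0 when ownership differs from pulp:pulp.
# OWNER and GROUP are expected to come from GNU stat (%U and %G), which
# reports an ID without a passwd/group entry as "UNKNOWN".
needs_chown() {
    [ "$1" != "pulp" ] || [ "$2" != "pulp" ]
}
```

It could be used like this:

    # needs_chown "$(stat -c '%U' /var/lib/pulp)" "$(stat -c '%G' /var/lib/pulp)" && echo "ownership needs fixing"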

> Tried to re-install 'satellite' on the target server (after unlocking
> packages and running "yum clean all"), but it did nothing, since the
> 'satellite' package was already installed.
> Then ran "yum reinstall satellite", which worked, but it re-installed only
> 'satellite' itself, not its dependencies.

This has no effect on the problem (and satellite-clone refuses to operate on a system that already has satellite.rpm installed).

> Looks like "satellite-clone" scripts are leaving out some important packages
> like httpd.

It's not the job of satellite-clone to install that package. This will be done by the satellite-installer which is called later (or in your case, not at all, because the permission thing failed).

Comment 2 Ladislav Vasina 2022-09-08 12:52:06 UTC
VERIFIED
Tested on: Satellite 6.12 - Snap:9
I ran my satellite-clone test, and the task "Correct ownership of /var/lib/pulp", which was reported as failing in comment 0, no longer fails.
See satellite-clone output below:

TASK [satellite-clone : Correct ownership of /var/lib/pulp] ********************
    changed: [localhost]

Comment 6 errata-xmlrpc 2022-11-16 13:34:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.12 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8506