Bug 2005986

Summary: tripleo-latest-packages-version validation fails
Product: Red Hat OpenStack
Reporter: Uemit Seren <uemit.seren>
Component: validations-common
Assignee: Jiri Podivin <jpodivin>
Status: CLOSED ERRATA
QA Contact: nlevinki <nlevinki>
Severity: medium
Docs Contact:
Priority: medium
Version: 16.2 (Train)
CC: apetrich, gchamoul, jamsmith, jbuchta, jjoyce, jpodivin, jschluet, slinaber, tvignaud
Target Milestone: z2
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: validations-common-1.1.2-2.20211025164927.92f51ea.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-23 22:11:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
output of dnf list --available none

Description Uemit Seren 2021-09-20 15:42:26 UTC
Description of problem:

Running the tripleo-latest-packages-version validation fails in OSP 16.1

Version-Release number of selected component (if applicable):

openstack-tripleo-validations-11.6.1-2.20210612074808.8644a02.el8ost.1.noarch
validations-common-1.1.2-2.20210611010116.el8ost.2.noarch

How reproducible:

Run the tripleo-latest-packages-version validation

Steps to Reproduce:
1. openstack tripleo validator run --validation tripleo-latest-packages-version

+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| UUID                                 | Validations                     | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration    |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| 3173cb6c-2587-4d26-ac6f-42f503dd8cfc | tripleo-latest-packages-version | FAILED | undercloud | undercloud     |                   | 0:00:04.679 |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+


Actual results:

FAILED with:

    "validation_output": [
        {
            "task": {
                "hosts": {
                    "undercloud": {
                        "_ansible_no_log": false,
                        "action": "check_package_update",
                        "changed": false,
                        "failed": true,
                        "invocation": {
                            "module_args": {
                                "packages_list": [],
                                "pkg_mgr": null
                            }
                        },
                        "msg": "2021-09-20 15:07:00,375 [ERROR] dnf:698016:MainThread @logutil.py:194 - [Errno 13] Permission denied: '/var/log/rhsm/rhsm.log' - Further logging output will be written to stderr\n2021-09-20 15:07:00,376 [ERROR] dnf:698016:MainThread @identity.py:156 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 13] Permission denied: '/etc/pki/consumer/key.pem'\nError: No matching Packages to list\n"
                    }
                },
                "name": "Get available updates for packages",
                "status": "FAILED"
            }
        }
    ]


Expected results:

Should pass 

Additional info:

Comment 1 Gaël Chamoulaud 2021-09-21 10:56:22 UTC
Hi @uemit.seren

Comment 2 Gaël Chamoulaud 2021-09-21 10:59:24 UTC
Sorry for the first comment!

Hi Uemit,

Could you please confirm that this is happening in 16.2 and not 16.1?
The packages you've listed are provided for 16.2, and this validation has been released for 16.2 only.

And could you please run this command on the undercloud and paste the result in a comment here?

$ dnf list --available

Thanks for reporting this issue.

Gaël

Comment 3 Uemit Seren 2021-09-21 11:33:39 UTC
Created attachment 1824945 [details]
output of dnf list --available

Comment 4 Uemit Seren 2021-09-21 11:34:19 UTC
Hi Gaël, 

We did an in-place upgrade from OSP 16.1 to OSP 16.2 and then re-ran the validations, so I would say that this error is happening in OSP 16.2.
FYI: we are using Satellite 6.5 to manage the repositories.
I have attached the output of dnf list --available to this issue.

Comment 5 Jiri Podivin 2021-09-21 12:27:16 UTC
The problem seems to be with permissions, not the validation. 

If dnf isn't working correctly, as seems to be the case here, the validation is going to fail because the assumption it relies on (that the package manager is operational) is broken.

It would help to know the permissions for the files concerned:
- /var/log/rhsm/rhsm.log
- /etc/pki/consumer/cert.pem

Maybe you can try to repair them; that should clear the errors up.

The exact command issued by the validation is `dnf -q list --available`, but I don't see how `-q` could cause the issue.
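
To tell whether the command really fails or only prints noise on stderr, it can help to run it with the exit code and both output streams captured separately. The following is a minimal sketch, not the code of the check_package_update module; the helper name run_listing is hypothetical, and the default command is the one quoted above.

```python
import subprocess


def run_listing(cmd=("dnf", "-q", "list", "--available")):
    """Run the listing command and return (returncode, stdout, stderr).

    Hypothetical helper for manual debugging; the default command is the
    one the validation is reported to issue.
    """
    proc = subprocess.run(
        list(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode both streams as text
        check=False,              # do not raise on a non-zero exit code
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Running run_listing() as the stack user and again as root would show whether the package list on stdout differs, or whether only the stderr content changes.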

Comment 6 Uemit Seren 2021-09-21 14:11:04 UTC
If I run dnf list --available as the stack user, I get the permission denied errors, although it still lists the packages:

[[dev]stack@uc ~]$ dnf list --available
Failed to set locale, defaulting to C.UTF-8
2021-09-21 16:07:46,633 [ERROR] dnf:618894:MainThread @logutil.py:194 - [Errno 13] Permission denied: '/var/log/rhsm/rhsm.log' - Further logging output will be written to stderr
Not root, Subscription Management repositories not updated
2021-09-21 16:07:46,633 [ERROR] dnf:618894:MainThread @identity.py:156 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 13] Permission denied: '/etc/pki/consumer/key.pem'


Below are the permissions of the two files:

[[dev]stack@uc ~]$ ls -la /var/log/rhsm/rhsm.log
-rw-r--r--. 1 root root 27675 Sep 21 16:00 /var/log/rhsm/rhsm.log

[[dev]stack@uc ~]$ ls -ltha /etc/pki/consumer/cert.pem
-rw-r-----. 1 root root 1.8K Nov  5  2020 /etc/pki/consumer/cert.pem
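
The listings above show owner and mode, but what matters for the error messages is whether the current user can read or write each file. A small sketch of that check follows; the function name access_report is hypothetical, and the path list is taken from the error messages in this report.

```python
import os

# Paths taken from the error messages in this report.
RHSM_PATHS = (
    "/var/log/rhsm/rhsm.log",
    "/etc/pki/consumer/cert.pem",
    "/etc/pki/consumer/key.pem",
)


def access_report(paths=RHSM_PATHS):
    """Map each path to (exists, readable, writable) for the current user.

    Note: a file inside a directory the user cannot search is reported
    as not existing, because os.path.exists() cannot stat it.
    """
    report = {}
    for path in paths:
        report[path] = (
            os.path.exists(path),
            os.access(path, os.R_OK),
            os.access(path, os.W_OK),
        )
    return report
```

Calling access_report() as the stack user and as root gives a direct comparison of what each account is allowed to do with these files.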

Comment 7 Uemit Seren 2021-09-21 14:28:40 UTC
FYI: if dnf list --available is run as root, no permission error is thrown.

Comment 8 Jiri Podivin 2021-09-21 14:34:03 UTC
Are the permissions supposed to be that way? If not, it would be prudent to fix them; afterwards the validation should run fine again.
Otherwise, the validation can be changed to ignore those errors.
That said, I wonder whether the inability to access these files affects which packages are reported as available.

Comment 9 Uemit Seren 2021-09-21 16:49:42 UTC
Hi Jiri, 

I checked the permissions on some of our other RHEL 8.4 hosts that are registered with Red Hat Satellite, and all of them have the same permissions set.
In fact, if I run dnf list --available as a regular user, I get the same permission error messages in the output.

Maybe it's specific to Satellite-registered RHEL hosts?

FYI: If I add "become: true" to the /usr/share/ansible/validation-playbooks/tripleo-latest-packages-version.yaml playbook, the validation passes:



[[dev]stack@uc ~]$ undercloud tripleo validator run --validation tripleo-latest-packages-version
Running Validations without Overcloud settings.
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| UUID                                 | Validations                     | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration    |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| bebf2759-42e8-47af-ab8a-584f88ffae4d | tripleo-latest-packages-version | PASSED | undercloud | undercloud     |                   | 0:00:05.398 |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+

Comment 10 Gaël Chamoulaud 2021-09-21 18:18:06 UTC
(In reply to Uemit Seren from comment #9)
> Hi Jiri, 
> 
> I checked the permissions on some of our other RHEL 8.4 hosts that are
> registered with Red Hat Satellite, and all of them have the same
> permissions set.
> In fact, if I run dnf list --available as a regular user, I get the same
> permission error messages in the output.
> 
> Maybe it's specific to Satellite-registered RHEL hosts?

Yes, the validation does not properly handle hosts registered with a Satellite.
And yes, it is definitely not an issue with the permissions of the rhsm files here.

> FYI: If I add a "become: true" to the
> /usr/share/ansible/validation-playbooks/tripleo-latest-packages-version.yaml
> playbook the validation passes:
> 

@Jiri, Uemit is right and we have to ensure the check_package_update custom module is being called with sudo!

Comment 11 Jiri Podivin 2021-09-22 07:26:03 UTC
No, I definitely wouldn't go that way. Privilege escalation should be the last resort, not the first one.

There are other ways to solve this. We can patch the module either to ignore all errors or to add these to the set of errors we want to ignore.
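
The second option, ignoring only a known set of errors, could look roughly like the sketch below. This is an illustration under assumptions, not the patch that shipped: the names IGNORABLE_STDERR and significant_errors are hypothetical, and the patterns are copied from the messages quoted earlier in this report.

```python
import re

# Hypothetical patterns for stderr lines that appear when dnf runs
# unprivileged on a subscription-manager registered host. They are
# taken from the log output in this report.
IGNORABLE_STDERR = [
    re.compile(r"\[Errno 13\] Permission denied: '/var/log/rhsm/rhsm\.log'"),
    re.compile(r"\[Errno 13\] Permission denied: '/etc/pki/consumer/key\.pem'"),
    re.compile(r"^Not root, Subscription Management repositories not updated$"),
]


def significant_errors(stderr_text):
    """Return the stderr lines that match none of the ignorable patterns."""
    remaining = []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if any(pattern.search(line) for pattern in IGNORABLE_STDERR):
            continue
        remaining.append(line)
    return remaining
```

With a filter of this kind the module would fail only when something other than the listed messages is left on stderr, which keeps unrelated dnf errors visible without requiring privilege escalation.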

Comment 22 errata-xmlrpc 2022-03-23 22:11:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001