Description of problem:
Running the tripleo-latest-packages-version validation fails in OSP 16.1

Version-Release number of selected component (if applicable):
openstack-tripleo-validations-11.6.1-2.20210612074808.8644a02.el8ost.1.noarch
validations-common-1.1.2-2.20210611010116.el8ost.2.noarch

How reproducible:
Run the tripleo-latest-packages-version validation

Steps to Reproduce:
1. openstack tripleo validator run --validation tripleo-latest-packages-version

+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| UUID                                 | Validations                     | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration    |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| 3173cb6c-2587-4d26-ac6f-42f503dd8cfc | tripleo-latest-packages-version | FAILED | undercloud | undercloud     |                   | 0:00:04.679 |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+

Actual results:
FAILED with:

"validation_output": [
    {
        "task": {
            "hosts": {
                "undercloud": {
                    "_ansible_no_log": false,
                    "action": "check_package_update",
                    "changed": false,
                    "failed": true,
                    "invocation": {
                        "module_args": {
                            "packages_list": [],
                            "pkg_mgr": null
                        }
                    },
                    "msg": "2021-09-20 15:07:00,375 [ERROR] dnf:698016:MainThread @logutil.py:194 - [Errno 13] Permission denied: '/var/log/rhsm/rhsm.log' - Further logging output will be written to stderr\n2021-09-20 15:07:00,376 [ERROR] dnf:698016:MainThread @identity.py:156 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 13] Permission denied: '/etc/pki/consumer/key.pem'\nError: No matching Packages to list\n"
                }
            },
            "name": "Get available updates for packages",
            "status": "FAILED"
        }
    }
]

Expected results:
Should pass

Additional info:
Hi @uemit.seren
Sorry for the first comment!

Hi Uemit,

Could you please confirm that this is happening in 16.2 and not 16.1? The packages you've listed are provided for 16.2, and this validation has been released for 16.2 only.

Could you also run this command on the undercloud and paste the result in a comment here?

$ dnf list --available

Thanks for reporting this issue.

Gaël
Created attachment 1824945: output of dnf list --available
Hi Gaël,

We did an in-place upgrade from OSP 16.1 to OSP 16.2 and then re-ran the validations, so I would say that this error is happening in OSP 16.2.

FYI: we are using Satellite 6.5 to manage the repositories.

I have attached the output of dnf list --available to this issue.
The problem seems to be with permissions, not the validation. If dnf isn't working correctly, as seems to be the case here, the validation is going to fail because the assumption it works with (the package manager is operational) is broken.

It would help to know the permissions for the files concerned:
- /var/log/rhsm/rhsm.log
- /etc/pki/consumer/cert.pem

Maybe you can try to repair them; that should clear the errors up.

The exact command issued by the validation is `dnf -q list --available`, but I don't see how `-q` could cause the issue.
If I run dnf list --available as the stack user I get the permission denied issue, although it still shows the packages:

[[dev]stack@uc ~]$ dnf list --available
Failed to set locale, defaulting to C.UTF-8
2021-09-21 16:07:46,633 [ERROR] dnf:618894:MainThread @logutil.py:194 - [Errno 13] Permission denied: '/var/log/rhsm/rhsm.log' - Further logging output will be written to stderr
Not root, Subscription Management repositories not updated
2021-09-21 16:07:46,633 [ERROR] dnf:618894:MainThread @identity.py:156 - Reload of consumer identity cert /etc/pki/consumer/cert.pem raised an exception with msg: [Errno 13] Permission denied: '/etc/pki/consumer/key.pem'

Below are the permissions of the two files:

[[dev]stack@uc ~]$ ls -la /var/log/rhsm/rhsm.log
-rw-r--r--. 1 root root 27675 Sep 21 16:00 /var/log/rhsm/rhsm.log
[[dev]stack@uc ~]$ ls -ltha /etc/pki/consumer/cert.pem
-rw-r-----. 1 root root 1.8K Nov 5 2020 /etc/pki/consumer/cert.pem
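For anyone who wants to collect the same information on other hosts, the permission check above can be sketched in Python. This is only an illustration: the helper name describe_access is hypothetical and is not part of tripleo-validations. It reports the mode string that ls would show, plus whether the user running the script can read or write the file.

```python
import os
import stat


def describe_access(path):
    """Summarise ownership, mode and effective access for one file.

    Hypothetical helper for illustration only; it is not part of
    tripleo-validations or validations-common.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"path": path, "exists": False}
    except PermissionError:
        # A parent directory is not searchable by the current user,
        # so we cannot even tell whether the file exists.
        return {"path": path, "exists": None}
    return {
        "path": path,
        "exists": True,
        # Same string as the first column of `ls -l`, e.g. "-rw-r-----"
        "mode": stat.filemode(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        # os.access answers for the user running this script
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
```

Calling it as the stack user on /var/log/rhsm/rhsm.log, /etc/pki/consumer/cert.pem and /etc/pki/consumer/key.pem (the three paths named in the error messages) should show which of them that user cannot open.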
FYI: if dnf list --available is run as root no permission error is thrown
Are the permissions supposed to be that way? If not, it would be prudent to fix them; the validation should then run fine again. Otherwise, the validation can be changed to ignore those errors. That being said, I wonder whether the inability to access these files affects package availability.
Hi Jiri,

I checked the permissions of some of our other RHEL 8.4 hosts that are registered with Red Hat Satellite and all of them have the same permission set.
In fact, if I run dnf list --available as a regular user I get the same permission error messages in the output.

Maybe it's specific to Satellite-registered RHEL hosts?

FYI: If I add a "become: true" to the /usr/share/ansible/validation-playbooks/tripleo-latest-packages-version.yaml playbook the validation passes:

[[dev]stack@uc ~]$ undercloud tripleo validator run --validation tripleo-latest-packages-version
Running Validations without Overcloud settings.
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| UUID                                 | Validations                     | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration    |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| bebf2759-42e8-47af-ab8a-584f88ffae4d | tripleo-latest-packages-version | PASSED | undercloud | undercloud     |                   | 0:00:05.398 |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
(In reply to Uemit Seren from comment #9)
> Hi Jiri,
>
> I checked the permissions of some of our other RHEL 8.4 hosts that are
> registered with RedHat Satellite and all of them have the same permission
> set.
> In fact if I do dnf list --available as a regular user I get the same
> permission error messages in the ouput.
>
> Maybe it's specific to Satellite registered RHEL hosts ?

Yes, the validation does not properly handle hosts registered with a Satellite. And yes, it is definitely not an issue with the permissions of the rhsm files here.

> FYI: If I add a "become: true" to the
> /usr/share/ansible/validation-playbooks/tripleo-latest-packages-version.yaml
> playbook the validation passes:
>

@Jiri, Uemit is right and we have to ensure the check_package_update custom module is being called with sudo!
No, I definitely wouldn't go that way. Privilege escalation should be the last recourse, not the first one. There are other ways to solve this. We can patch the module to either ignore all errors, or to add these among the errors we want to ignore.
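To make the second option concrete, here is a minimal sketch of what "adding these among the errors we want to ignore" could look like. It is not the code of the check_package_update module and not necessarily the fix that shipped; the names IGNORABLE_PATTERNS and filter_dnf_errors are hypothetical, and the patterns cover only the two rhsm messages quoted in this report.

```python
import re

# Hypothetical allow-list, built from the two messages quoted in this bug.
IGNORABLE_PATTERNS = [
    re.compile(r"\[Errno 13\] Permission denied: '/var/log/rhsm/rhsm\.log'"),
    re.compile(r"\[Errno 13\] Permission denied: '/etc/pki/consumer/key\.pem'"),
]


def filter_dnf_errors(stderr_text):
    """Return the stderr lines that do not match a known-harmless pattern."""
    remaining = []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if any(pattern.search(line) for pattern in IGNORABLE_PATTERNS):
            continue  # known rhsm noise when dnf runs as a non-root user
        remaining.append(line)
    return remaining
```

Applied to the "msg" value from the original report, this drops the two rhsm lines and leaves only "Error: No matching Packages to list", so the module would still have to decide how to treat that remaining message. A narrow allow-list like this keeps unexpected errors visible, which ignoring all errors would not.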
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001