Bug 1913224
Summary: [Cinderlib] - Although the Configure local CinderLib database configured to True in global maintenance mode, the execution of optional-components setup failed on hosted engine environment
Product: Red Hat Enterprise Linux 8
Reporter: Shir Fishbain <sfishbai>
Component: libsemanage
Assignee: Petr Lautrbach <plautrba>
Status: CLOSED ERRATA
QA Contact: Milos Malik <mmalik>
Severity: high
Docs Contact: Khushbu Borole <kborole>
Priority: high
Version: 8.0
CC: asharov, bugs, didi, dwalsh, lvrabec, mjahoda, mmalik, plautrba, ssekidde, vmojzis
Target Milestone: rc
Keywords: Triaged
Target Release: 8.0
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Rebuilds of the SELinux policy store are now more resistant to power failures
Previously, SELinux-policy rebuilds were not resistant to power failures due to write caching. Consequently, the SELinux policy store may become corrupted after a power failure during a policy rebuild. With this update, the `libsemanage` library writes all pending modifications to metadata and cached file data to the file system that contains the policy store before using it. As a result, the policy store is now more resistant to power failures and other interruptions.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-18 15:05:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:

Description
Shir Fishbain
2021-01-06 10:11:07 UTC
Created attachment 1744844 [details]
setup_logs
Created attachment 1744845 [details]
engine_log
Created attachment 1744847 [details]
setup_reconfiguration_cinderlib_flow
I looked at the setup logs, and also logged into the machine, and this is what I see:

1. The failure was in running the command:

semodule -i /usr/share/ovirt-engine/selinux/ansible-runner-service.cil

meaning, it's not really related to cinderlib (or even to oVirt/RHV, likely).

2. stderr of the command was:

libsemanage.semanage_direct_get_module_info: Unable to read ansible-runner-service module lang ext file.
libsemanage.semanage_direct_get_module_info: Unable to read openvswitch-custom module lang ext file. (No such file or directory).
libsemanage.semanage_direct_get_module_info: Unable to read openvswitch-custom module lang ext file. (No such file or directory).
/usr/sbin/semodule: Failed on /usr/share/ovirt-engine/selinux/ansible-runner-service.cil!

I searched for (parts of) it, and found (also) bug 1756835.

3. Based on the discussion in that bug, I checked the contents of /var/lib/selinux/targeted. Indeed, most (all?) of the subdirs there were semi-empty - meaning, e.g.:

# ls -la --full-time /var/lib/selinux/targeted/active/modules/100/avahi
total 16
drwx------. 2 root root 44 2020-12-25 00:35:50.857712594 +0200 .
drwx------. 423 root root 12288 2020-12-25 00:35:50.922712961 +0200 ..
-rw-------. 1 root root 0 2020-12-25 00:35:50.857712594 +0200 cil
-rw-------. 1 root root 0 2020-12-25 00:35:50.857712594 +0200 hll
-rw-------. 1 root root 0 2020-12-25 00:35:50.857712594 +0200 lang_ext

IOW, had empty cil/hll/lang_ext files.

4. All of them had timestamps between '00:35:50.854712577' and '00:35:50.917712933'.
So I checked what might have happened at that time, and found, in the first (successful) engine-setup log:

==========================================================================
2020-12-25 00:35:49,025+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:813 execute: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project'), executable='None', cwd='None', env=None
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:863 execute-result: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project'), rc=0
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:921 execute-output: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project') stdout:
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:926 execute-output: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project') stderr:
2020-12-25 00:35:50,295+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:813 execute: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect'), executable='None', cwd='None', env=None
2020-12-25 00:35:52,690+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:863 execute-result: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect'), rc=0
2020-12-25 00:35:52,691+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:921 execute-output: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect') stdout:
2020-12-25 00:35:52,691+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:926 execute-output: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect') stderr:
==========================================================================

Meaning, we ran restorecon (on /usr/share/ovirt-engine/ansible-runner. Not sure why we need to, but anyway), this finished, and then: 'semanage boolean --modify --on httpd_can_network_connect'. This started at 00:35:50,295 and finished at 00:35:52,691 - meaning, all the timestamps of the mentioned files are in this range, as if this command somehow trashed them.

5. Based on other comments in bug 1756835, I tried gradually moving stuff away from /var/lib/selinux/targeted and running the semodule command (with different but similar error messages), and eventually, after it still failed when it was completely empty, did 'dnf reinstall selinux-policy-targeted' and then the semodule command succeeded.

So I suspect this is some issue with selinux/semodule/semanage.

Versions:

==========================================================================
# rpm -qfi /usr/sbin/semanage /usr/sbin/semodule
Name : policycoreutils-python-utils
Version : 2.9
Release : 9.el8
Architecture: noarch
Install Date: Tue 17 Nov 2020 11:54:26 AM IST
Group : Unspecified
Size : 140042
License : GPLv2
Signature : RSA/SHA256, Mon 20 Jan 2020 07:39:30 PM IST, Key ID 199e2f91fd431d51
Source RPM : policycoreutils-2.9-9.el8.src.rpm
Build Date : Fri 17 Jan 2020 08:01:05 PM IST
Build Host : arm64-033.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor : Red Hat, Inc.
URL : https://github.com/SELinuxProject/selinux
Summary : SELinux policy core python utilities
Description :
The policycoreutils-python-utils package contains the management tools use to manage an SELinux environment.
Name : policycoreutils
Version : 2.9
Release : 9.el8
Architecture: x86_64
Install Date: Tue 17 Nov 2020 11:53:55 AM IST
Group : Unspecified
Size : 673367
License : GPLv2
Signature : RSA/SHA256, Mon 20 Jan 2020 07:39:28 PM IST, Key ID 199e2f91fd431d51
Source RPM : policycoreutils-2.9-9.el8.src.rpm
Build Date : Fri 17 Jan 2020 07:58:30 PM IST
Build Host : x86-vm-08.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor : Red Hat, Inc.
URL : https://github.com/SELinuxProject/selinux
Summary : SELinux policy core utilities
Description :
Security-enhanced Linux is a feature of the Linux® kernel and a number of utilities with enhanced security functionality designed to add mandatory access controls to Linux. The Security-enhanced Linux kernel contains new architectural components originally developed to improve the security of the Flask operating system. These architectural components provide general support for the enforcement of many kinds of mandatory access control policies, including those based on the concepts of Type Enforcement®, Role-based Access Control, and Multi-level Security.

policycoreutils contains the policy core utilities that are required for basic operation of a SELinux system. These utilities include load_policy to load policies, setfiles to label filesystems, newrole to switch roles.

# rpm -qi selinux-policy-targeted
Name : selinux-policy-targeted
Version : 3.14.3
Release : 54.el8_3.2
Architecture: noarch
Install Date: Wed 06 Jan 2021 12:45:17 PM IST
Group : Unspecified
Size : 52520436
License : GPLv2+
Signature : RSA/SHA256, Thu 10 Dec 2020 07:13:41 PM IST, Key ID 199e2f91fd431d51
Source RPM : selinux-policy-3.14.3-54.el8_3.2.src.rpm
Build Date : Tue 08 Dec 2020 05:15:44 PM IST
Build Host : s390-071.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor : Red Hat, Inc.
URL : https://github.com/fedora-selinux/selinux-policy
Summary : SELinux targeted base policy
Description :
SELinux Reference policy targeted base module.
==========================================================================

So, how to continue?

1. Shir, please keep the machine as-is for now, for further investigation, and if possible, please try to reproduce on another machine.

2. For the selinux issue, setting needinfo on Daniel Walsh. Dan, please have a look. Thanks!

> So, how to continue?
>
> 1. Shir, please keep the machine as-is for now, for further investigation,
> and if possible, please try to reproduce on another machine.

On which environment do you recommend checking? In standalone environments it works fine. I found the bug in the HE environment only. Keeping the HE env for your investigation, ping me in G-chat for more details.

> 2. For the selinux issue, setting needinfo on Daniel Walsh. Dan, please have
> a look. Thanks!

I'm investigating it. Given that it's not the first report like this, it seems to be some kind of a race condition during libsemanage commit. But I guess it's un-reproducible.

(In reply to Petr Lautrbach from comment #6)
> I'm investigating it. Given that it's not the first report like this, it seems to
> be some kind of a race condition during libsemanage commit. But I guess it's
> un-reproducible.

Do you want/need access to Shir's systems? If not, she can use them for other things, and/or try to verify current bug again - and if it fails again similarly, you get another reproduction.

It would be helpful to have system logs from 2020-12-25 00:35:50 - at least a few minutes before. Other than that I don't need the system.

Is it ext4, xfs,... ? It looks like a problem with the filesystem, as if write() reported success while no data was actually written to the file. Maybe a missing sync.
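The "missing sync" suspicion above can be illustrated with a general write, fsync, rename pattern. This is only a minimal sketch of the technique in Python, not the libsemanage code or the eventual fix; the function name durable_write is hypothetical, and fsync on a directory descriptor is assumed to behave as it does on Linux.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace path with data, flushing to storage before and after the rename.

    Hypothetical helper for illustration only; it is not part of libsemanage.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so that the rename
    # below stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # move Python's buffer into the kernel
            os.fsync(tmp.fileno())  # ask the kernel to write the data out
        os.replace(tmp_path, path)  # atomically put the new file in place
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory as well, so the new directory entry is written out.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the two fsync calls, a successful write() only means the data reached the page cache, so a power failure shortly afterwards could plausibly leave a zero-length file behind, which matches the symptom reported here.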
Not sure if it's related but I found the following lines in the logs:

Dec 25 01:35:34 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:35:45 oncilla04 vdsm[26680]: WARN unhandled write event
Dec 25 01:35:47 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:35:57 oncilla04 vdsm[26680]: WARN unhandled write event
Dec 25 01:35:59 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:36:09 oncilla04 vdsm[26680]: WARN unhandled write event

Anyway.

It looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1838762 which is supposed to be fixed in libsemanage-2.9-3.el8 shipped by RHEL-8.3.

According to logs, libsemanage-2.9-3.el8 was already installed. But according to comment 9 - https://bugzilla.redhat.com/show_bug.cgi?id=1838762#c9 - libsemanage-2.9-3.el8 fixed only the problem with broken boot, while it left the broken selinux store for the user to repair by reinstalling selinux-policy-targeted.

It looks like we have a semi-reliable reproducer in #1838762, so we'll try to fix the store problem as well.

This patch https://lore.kernel.org/selinux/20210128104231.102470-1-plautrba@redhat.com/T/#u should prevent some issues with empty files in the module store after policy rebuild. It is waiting for upstream review now.

*** Bug 1917507 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (libsemanage bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1672
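The symptom analysed in this report, zero-length cil, hll and lang_ext files under /var/lib/selinux/targeted, can be checked for with a short script. The following is a minimal sketch, assuming the module store layout shown in the ls output in this report; the function name find_empty_module_files is hypothetical and is not part of any SELinux tool.

```python
import os
import stat
from typing import Iterable, List

# File names seen empty in the affected module directories in this report.
MODULE_FILES = ("cil", "hll", "lang_ext")


def find_empty_module_files(store_root: str,
                            names: Iterable[str] = MODULE_FILES) -> List[str]:
    """Return the sorted paths of zero-length module files under store_root."""
    wanted = set(names)
    empty = []
    # os.walk skips directories it cannot read, so run this as root on a
    # real policy store; otherwise the result may be misleadingly empty.
    for dirpath, _dirnames, filenames in os.walk(store_root):
        for name in filenames:
            if name not in wanted:
                continue
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # do not follow symlinks
            if stat.S_ISREG(info.st_mode) and info.st_size == 0:
                empty.append(path)
    return sorted(empty)
```

Calling find_empty_module_files("/var/lib/selinux/targeted/active/modules") on an affected system would be expected to list the truncated files. A non-empty result only indicates the symptom; it does not say which command caused it.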