Bug 1913224
| Summary: | [Cinderlib] - Although "Configure local CinderLib database" is set to True in global maintenance mode, the execution of optional-components setup fails on a hosted-engine environment |
|---|---|
| Product: | Red Hat Enterprise Linux 8 |
| Reporter: | Shir Fishbain <sfishbai> |
| Component: | libsemanage |
| Assignee: | Petr Lautrbach <plautrba> |
| Status: | CLOSED ERRATA |
| QA Contact: | Milos Malik <mmalik> |
| Severity: | high |
| Docs Contact: | Khushbu Borole <kborole> |
| Priority: | high |
| Version: | 8.0 |
| CC: | asharov, bugs, didi, dwalsh, lvrabec, mjahoda, mmalik, plautrba, ssekidde, vmojzis |
| Target Milestone: | rc |
| Keywords: | Triaged |
| Target Release: | 8.0 |
| Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified |
| OS: | Linux |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2021-05-18 15:05:26 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |

Doc Text:

.Rebuilds of the SELinux policy store are now more resistant to power failures
Previously, SELinux-policy rebuilds were not resistant to power failures due to write caching. Consequently, the SELinux policy store could become corrupted after a power failure during a policy rebuild. With this update, the `libsemanage` library writes all pending modifications to metadata and cached file data to the file system that contains the policy store before using it. As a result, the policy store is now more resistant to power failures and other interruptions.

Attachments:
Created attachment 1744844 [details]
setup_logs
Created attachment 1744845 [details]
engine_log
Created attachment 1744847 [details]
setup_reconfiguration_cinderlib_flow
I looked at the setup logs, and also logged into the machine, and this is what I see:

1. The failure was in running the command:

   semodule -i /usr/share/ovirt-engine/selinux/ansible-runner-service.cil

   meaning, it's not really related to cinderlib (or even to oVirt/RHV, likely).

2. stderr of the command was:

   libsemanage.semanage_direct_get_module_info: Unable to read ansible-runner-service module lang ext file.
   libsemanage.semanage_direct_get_module_info: Unable to read openvswitch-custom module lang ext file. (No such file or directory).
   libsemanage.semanage_direct_get_module_info: Unable to read openvswitch-custom module lang ext file. (No such file or directory).
   /usr/sbin/semodule: Failed on /usr/share/ovirt-engine/selinux/ansible-runner-service.cil!

   I searched for (parts of) it, and found (also) bug 1756835.

3. Based on the discussion in that bug, I checked the contents of /var/lib/selinux/targeted. Indeed, most (all?) of the subdirs there were semi-empty - meaning, e.g.:

   # ls -la --full-time /var/lib/selinux/targeted/active/modules/100/avahi
   total 16
   drwx------.   2 root root    44 2020-12-25 00:35:50.857712594 +0200 .
   drwx------. 423 root root 12288 2020-12-25 00:35:50.922712961 +0200 ..
   -rw-------.   1 root root     0 2020-12-25 00:35:50.857712594 +0200 cil
   -rw-------.   1 root root     0 2020-12-25 00:35:50.857712594 +0200 hll
   -rw-------.   1 root root     0 2020-12-25 00:35:50.857712594 +0200 lang_ext

   IOW, they had empty cil/hll/lang_ext files.

4. All of them had timestamps between '00:35:50.854712577' and '00:35:50.917712933', so I checked what might have happened at that time, and found, in the first (successful) engine-setup log:

==========================================================================
2020-12-25 00:35:49,025+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:813 execute: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project'), executable='None', cwd='None', env=None
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:863 execute-result: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project'), rc=0
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:921 execute-output: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project') stdout:
2020-12-25 00:35:50,294+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:926 execute-output: ('/usr/sbin/restorecon', '-r', '/usr/share/ovirt-engine/ansible-runner-service-project') stderr:
2020-12-25 00:35:50,295+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:813 execute: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect'), executable='None', cwd='None', env=None
2020-12-25 00:35:52,690+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.executeRaw:863 execute-result: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect'), rc=0
2020-12-25 00:35:52,691+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:921 execute-output: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect') stdout:
2020-12-25 00:35:52,691+0200 DEBUG otopi.plugins.ovirt_engine_common.base.system.selinux plugin.execute:926 execute-output: ('/usr/sbin/semanage', 'boolean', '--modify', '--on', 'httpd_can_network_connect') stderr:
==========================================================================

   Meaning, we ran restorecon (on /usr/share/ovirt-engine/ansible-runner. Not sure why we need to, but anyway), this finished, and then: 'semanage boolean --modify --on httpd_can_network_connect'. This started at 00:35:50,295 and finished at 00:35:52,691 - meaning, all the timestamps of the mentioned files are in this range, as if this command somehow trashed them.

5. Based on other comments in bug 1756835, I tried gradually moving stuff away from /var/lib/selinux/targeted and running the semodule command (with different but similar error messages), and eventually, after it still failed when the directory was completely empty, did 'dnf reinstall selinux-policy-targeted' and then the semodule command succeeded. So I suspect this is some issue with selinux/semodule/semanage.

Versions:
==========================================================================
# rpm -qfi /usr/sbin/semanage /usr/sbin/semodule
Name        : policycoreutils-python-utils
Version     : 2.9
Release     : 9.el8
Architecture: noarch
Install Date: Tue 17 Nov 2020 11:54:26 AM IST
Group       : Unspecified
Size        : 140042
License     : GPLv2
Signature   : RSA/SHA256, Mon 20 Jan 2020 07:39:30 PM IST, Key ID 199e2f91fd431d51
Source RPM  : policycoreutils-2.9-9.el8.src.rpm
Build Date  : Fri 17 Jan 2020 08:01:05 PM IST
Build Host  : arm64-033.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : https://github.com/SELinuxProject/selinux
Summary     : SELinux policy core python utilities
Description :
The policycoreutils-python-utils package contains the management tools use to manage an SELinux environment.

Name        : policycoreutils
Version     : 2.9
Release     : 9.el8
Architecture: x86_64
Install Date: Tue 17 Nov 2020 11:53:55 AM IST
Group       : Unspecified
Size        : 673367
License     : GPLv2
Signature   : RSA/SHA256, Mon 20 Jan 2020 07:39:28 PM IST, Key ID 199e2f91fd431d51
Source RPM  : policycoreutils-2.9-9.el8.src.rpm
Build Date  : Fri 17 Jan 2020 07:58:30 PM IST
Build Host  : x86-vm-08.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : https://github.com/SELinuxProject/selinux
Summary     : SELinux policy core utilities
Description :
Security-enhanced Linux is a feature of the Linux® kernel and a number of utilities with enhanced security functionality designed to add mandatory access controls to Linux. The Security-enhanced Linux kernel contains new architectural components originally developed to improve the security of the Flask operating system. These architectural components provide general support for the enforcement of many kinds of mandatory access control policies, including those based on the concepts of Type Enforcement®, Role-based Access Control, and Multi-level Security.

policycoreutils contains the policy core utilities that are required for basic operation of a SELinux system. These utilities include load_policy to load policies, setfiles to label filesystems, newrole to switch roles.

# rpm -qi selinux-policy-targeted
Name        : selinux-policy-targeted
Version     : 3.14.3
Release     : 54.el8_3.2
Architecture: noarch
Install Date: Wed 06 Jan 2021 12:45:17 PM IST
Group       : Unspecified
Size        : 52520436
License     : GPLv2+
Signature   : RSA/SHA256, Thu 10 Dec 2020 07:13:41 PM IST, Key ID 199e2f91fd431d51
Source RPM  : selinux-policy-3.14.3-54.el8_3.2.src.rpm
Build Date  : Tue 08 Dec 2020 05:15:44 PM IST
Build Host  : s390-071.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : https://github.com/fedora-selinux/selinux-policy
Summary     : SELinux targeted base policy
Description :
SELinux Reference policy targeted base module.
==========================================================================

So, how to continue?

1. Shir, please keep the machine as-is for now, for further investigation, and if possible, please try to reproduce on another machine.
2. For the selinux issue, setting needinfo on Daniel Walsh. Dan, please have a look. Thanks!

> So, how to continue?
>
> 1. Shir, please keep the machine as-is for now, for further investigation,
> and if possible, please try to reproduce on another machine.

On which environment do you recommend to check? In standalone environments it works fine; I found the bug in the HE environment only.
Keeping the HE env for your investigation, ping me in G-chat for more details.

> 2. For the selinux issue, setting needinfo on Daniel Walsh. Dan, please have
> a look. Thanks!

I'm investigating it. Given that it's not the first report like this, it seems to be some kind of a race condition during libsemanage commit. But I guess it's un-reproducible.

(In reply to Petr Lautrbach from comment #6)
> I'm investigating it. Given that it's not the first report like this, it seems
> to be some kind of a race condition during libsemanage commit. But I guess
> it's un-reproducible.

Do you want/need access to Shir's systems? If not, she can use them for other things, and/or try to verify the current bug again - and if it fails again similarly, you get another reproduction.

It would be helpful to have system logs from 2020-12-25 00:35:50 - at least a few minutes before. Other than that I don't need the system.

Is it ext4, xfs, ...? It looks like a problem with the filesystem, as if write() reported success while no data was actually written to the file. Maybe a missing sync.

Not sure if it's related, but I found the following lines in the logs:

Dec 25 01:35:34 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:35:45 oncilla04 vdsm[26680]: WARN unhandled write event
Dec 25 01:35:47 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:35:57 oncilla04 vdsm[26680]: WARN unhandled write event
Dec 25 01:35:59 oncilla04 vdsm[26680]: WARN unhandled close event
Dec 25 01:36:09 oncilla04 vdsm[26680]: WARN unhandled write event

Anyway, it looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1838762, which is supposed to be fixed in libsemanage-2.9-3.el8 shipped by RHEL-8.3. According to the logs, libsemanage-2.9-3.el8 was already installed. But according to comment 9 - https://bugzilla.redhat.com/show_bug.cgi?id=1838762#c9 - libsemanage-2.9-3.el8 fixed only the problem with the broken boot, while it left the broken selinux store for the user to fix by reinstalling selinux-policy-targeted. It looks like we have a semi-reliable reproducer in bug 1838762, so we'll try to fix the store problem as well.
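To make the "maybe a missing sync" hypothesis above concrete, here is a minimal sketch of the general write-then-fsync pattern that keeps a freshly written store file from ending up zero-length after a power failure. This is illustrative only - the helper name is hypothetical and it is not the actual libsemanage change:

```python
import os

def durable_write(path, data: bytes):
    """Illustrative sketch only: write a file so its contents survive a crash.

    File data is flushed with fsync(), and the containing directory is
    fsync()ed as well so the directory entry itself is durable. Without
    these syncs, a power failure or interruption can leave a zero-length
    file behind, matching the empty cil/hll/lang_ext files seen here.
    """
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())          # push file data and metadata to disk
    os.rename(tmp, path)              # atomically replace the target
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)              # make the rename itself durable
    finally:
        os.close(dir_fd)
```

The key point, matching the Doc Text above, is that pending data is forced out to the file system before the rebuilt store is relied upon.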
This patch - https://lore.kernel.org/selinux/20210128104231.102470-1-plautrba@redhat.com/T/#u - should prevent some issues with empty files in the module store after a policy rebuild. It is waiting for upstream review now.

*** Bug 1917507 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (libsemanage bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1672
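For anyone who wants to check whether their own policy store ended up in the state described here (zero-length cil/hll/lang_ext module files), a minimal read-only diagnostic sketch follows; the store path is the default targeted store observed above, and the script itself is not part of the original report:

```python
#!/usr/bin/env python3
# Diagnostic sketch: walk the SELinux module store and report zero-length
# module files (cil/hll/lang_ext), the symptom described in this bug.
# Assumes the default targeted store path; run as root, read-only.
import os

STORE = "/var/lib/selinux/targeted/active/modules"

def find_empty_module_files(store=STORE):
    empty = []
    for root, _dirs, files in os.walk(store):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getsize(path) == 0:
                    empty.append(path)
            except OSError:
                # Module files are root-only (mode 0600); skip unreadable ones.
                pass
    return empty

if __name__ == "__main__":
    for path in find_empty_module_files():
        print(path)
```

If this prints a long list of module files, the store is in the corrupted state described above; the workaround used in this bug was 'dnf reinstall selinux-policy-targeted'.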
Description of problem:

After configuring the Cinderlib DB as true:
[1] Configure Cinderlib integration (Currently in tech preview) (Yes, No) [No]: Yes
[2] Configured the database location as Local or Remote

While running the "engine-setup --reconfigure-optional-components" command, the execution of setup failed (setup run attached).

In addition to running the CinderLib DB configuration (when the engine was in global maintenance mode), I had already succeeded in running the following regular Cinderlib integration steps on the engine and got the option (just in the UI) to add a new MBD:

CinderLib integration steps:
1) Change "openstack-cinderlib-rpms" to enabled=1 under rhv-4.4.4-dependencies.repo.
2) Run dnf install python3-cinder
3) Run engine-config -s ManagedBlockDomainSupported=true -> 4.5
4) systemctl restart ovirt-engine

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.4.5-0.10.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run the regular Cinderlib integration steps as described above - Succeeded
2. Set global maintenance mode
3. Run "engine-setup --reconfigure-optional-components" manually on the hosted-engine environment - Failed
4. Run the POST request for creating an MBD via the REST API - Failed

POST https://<hosted_engine_environment_name>/ovirt-engine/api/storagedomains

Body:
<storage_domain>
  <name>ceph-cinderlib</name>
  <type>managed_block_storage</type>
  <storage>
    <type>managed_block_storage</type>
    <driver_options>
      <property>
        <name>rbd_ceph_conf</name>
        <value>/etc/ceph/ceph.conf</value>
        <name>rbd_keyring_conf</name>
        <value>/etc/ceph/ceph.client.admin.keyring</value>
      </property>
      <property>
        <name>rbd_pool</name>
        <value>volumes</value>
      </property>
      <property>
        <name>rbd_user</name>
        <value>admin</value>
      </property>
      <property>
        <name>use_multipath_for_image_xfer</name>
        <value>true</value>
      </property>
      <property>
        <name>volume_driver</name>
        <value>cinder.volume.drivers.rbd.RBDDriver</value>
      </property>
    </driver_options>
  </storage>
  <host>
    <name>host_mixed_1</name>
  </host>
</storage_domain>

Actual results:

1) The POST request for creating a new MBD failed because the Cinderlib database is required:

<fault>
  <detail>[Cannot add Storage. Cinderlib database is required., Cannot add Storage. Cinderlib database is required.]</detail>
  <reason>Operation Failed</reason>
</fault>

2) The engine setup reconfiguration of optional components failed with the following errors. From the setup run:

--== CONFIGURATION PREVIEW ==--

Default SAN wipe after delete           : False
Host FQDN                               : hosted-engine-11.lab.eng.tlv2.redhat.com
Firewall manager                        : firewalld
Update Firewall                         : True
CinderLib database host                 : localhost
CinderLib database port                 : 5432
CinderLib database secured connection   : False
CinderLib database host name validation : False
CinderLib database name                 : ovirt_cinderlib
CinderLib database user name            : ovirt_cinderlib
Set up Cinderlib integration            : True
Configure local CinderLib database      : True
Engine database host                    : localhost
Engine database port                    : 5432
Engine database secured connection      : False
Engine database host name validation    : False
Engine database name                    : engine
Engine database user name               : engine
Engine installation                     : True
PKI organization                        : lab.eng.tlv2.redhat.com
Set up ovirt-provider-ovn               : True
Grafana integration                     : False
Configure WebSocket Proxy               : True
DWH installation                        : True
DWH database host                       : localhost
DWH database port                       : 5432
DWH database secured connection         : False
DWH database host name validation       : False
DWH database name                       : ovirt_engine_history
DWH database user name                  : ovirt_engine_history
Backup DWH database                     : True
Configure VMConsole Proxy               : True

Please confirm installation settings (OK, Cancel) [OK]:
[ INFO ] Cleaning async tasks and compensations
[ INFO ] Unlocking existing entities
[ INFO ] Checking the Engine database consistency
[ INFO ] Stage: Transaction setup
[ INFO ] Stopping engine service
[ INFO ] Stopping ovirt-fence-kdump-listener service
[ INFO ] Stopping dwh service
[ INFO ] Stopping vmconsole-proxy service
[ INFO ] Stopping websocket-proxy service
[ INFO ] Stage: Misc configuration (early)
[ INFO ] Stage: Package installation
[ INFO ] Stage: Misc configuration
[ INFO ] Creating PostgreSQL 'ovirt_cinderlib' database
[ INFO ] Configuring PostgreSQL
[ INFO ] Upgrading CA
[ INFO ] Updating OVN SSL configuration
[ INFO ] Updating OVN timeout configuration
[ INFO ] Backing up database localhost:ovirt_engine_history to '/var/lib/ovirt-engine-dwh/backups/dwh-20210105160339.mgjszrw_.dump'.
[ INFO ] Creating/refreshing DWH database schema
[ INFO ] Configuring WebSocket Proxy
[ INFO ] Backing up database localhost:engine to '/var/lib/ovirt-engine/backups/engine-20210105160343.2iruju3t.dump'.
[ INFO ] Creating/refreshing Engine database schema
[ INFO ] Creating/refreshing Engine 'internal' domain database schema
         Unregistering existing client registration info.
[ INFO ] Install selinux module /usr/share/ovirt-engine/selinux/ansible-runner-service.cil
[ ERROR ] Failed to execute stage 'Misc configuration': Command '/usr/sbin/semodule' failed to execute
[ INFO ] DNF Performing DNF transaction rollback
[WARNING] Rollback of DWH database postponed to Stage "Clean up"
[ INFO ] Rolling back database schema
[ INFO ] Clearing Engine database engine
[ INFO ] Restoring Engine database engine
[ INFO ] Restoring file '/var/lib/ovirt-engine/backups/engine-20210105160343.2iruju3t.dump' to database localhost:engine.
[ INFO ] Stage: Clean up
         Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20210105160204-tpq803.log
[ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20210105160419-setup.conf'
[WARNING] Rollback of DWH database started
         This might be a long process, but it should be safe to start the engine service before it finishes, if needed.
[ INFO ] Clearing DWH database ovirt_engine_history
[ INFO ] Restoring DWH database ovirt_engine_history
[ INFO ] Restoring file '/var/lib/ovirt-engine-dwh/backups/dwh-20210105160339.mgjszrw_.dump' to database localhost:ovirt_engine_history.
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed

From setup.log:

2021-01-05 15:56:59,421+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Environment customization': SIG2
2021-01-05 15:56:59,493+0200 ERROR otopi.plugins.ovirt_engine_common.base.core.misc misc._terminate:153 Execution of setup failed
2021-01-05 15:58:15,076+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Misc configuration': Command '/usr/sbin/semodule' failed to execute
2021-01-05 15:58:34,561+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.restore:904 db restore rc 1 stderr ['pg_restore: while PROCESSING TOC:', 'pg_restore: from TOC entry 7272; 0 0 COMMENT EXTENSION "uuid-ossp" ', 'pg_restore: error: could not execute query: ERROR: must be owner of extension uuid-ossp', 'Command was: COMMENT ON EXTENSION "uuid-ossp" IS \'generate universally unique identifiers (UUIDs)\';', '', '', 'pg_restore: warning: errors ignored on restore: 1']
2021-01-05 15:58:43,555+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.restore:904 db restore rc 1 stderr ['pg_restore: while PROCESSING TOC:', 'pg_restore: from TOC entry 5166; 0 0 COMMENT EXTENSION "uuid-ossp" ', 'pg_restore: error: could not execute query: ERROR: must be owner of extension uuid-ossp', 'Command was: COMMENT ON EXTENSION "uuid-ossp" IS \'generate universally unique identifiers (UUIDs)\';', '', '', 'pg_restore: warning: errors ignored on restore: 1']
2021-01-05 15:58:43,606+0200 ERROR otopi.plugins.ovirt_engine_common.base.core.misc misc._terminate:153 Execution of setup failed
2021-01-05 16:04:00,128+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Misc configuration': Command '/usr/sbin/semodule' failed to execute
2021-01-05 16:04:19,605+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.restore:904 db restore rc 1 stderr ['pg_restore: while PROCESSING TOC:', 'pg_restore: from TOC entry 7274; 0 0 COMMENT EXTENSION "uuid-ossp" ', 'pg_restore: error: could not execute query: ERROR: must be owner of extension uuid-ossp', 'Command was: COMMENT ON EXTENSION "uuid-ossp" IS \'generate universally unique identifiers (UUIDs)\';', '', '', 'pg_restore: warning: errors ignored on restore: 1']
2021-01-05 16:04:28,409+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.restore:904 db restore rc 1 stderr ['pg_restore: while PROCESSING TOC:', 'pg_restore: from TOC entry 5166; 0 0 COMMENT EXTENSION "uuid-ossp" ', 'pg_restore: error: could not execute query: ERROR: must be owner of extension uuid-ossp', 'Command was: COMMENT ON EXTENSION "uuid-ossp" IS \'generate universally unique identifiers (UUIDs)\';', '', '', 'pg_restore: warning: errors ignored on restore: 1']
2021-01-05 16:04:28,461+0200 ERROR otopi.plugins.ovirt_engine_common.base.core.misc misc._terminate:153 Execution of setup failed

Expected results:
The operation of creating a new MBD should succeed.