Bug 1851986

Summary: SELinux prevents the iscsid container from working when deploying an overcloud
Product: Red Hat OpenStack
Reporter: Cédric Jeanneret <cjeanner>
Component: openstack-selinux
Assignee: Julie Pichon <jpichon>
Status: CLOSED WORKSFORME
QA Contact: nlevinki <nlevinki>
Severity: high
Docs Contact:
Priority: high
Version: 16.1 (Train)
CC: bdobreli, bfournie, lhh, lvrabec
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-11-25 11:50:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1846364
Bug Blocks:
Attachments:
  Denials listing (flags: none)
  full audit.log (flags: none)

Description Cédric Jeanneret 2020-06-29 14:51:31 UTC
Created attachment 1699176 [details]
Denials listing

Description of problem:
With SELinux enforcing on the undercloud, we can no longer deploy an overcloud due to SELinux denials (see attachment).

Version-Release number of selected component (if applicable):
openstack-selinux-0.8.20-0.20200428133425.3300746.el8ost.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy osp-16.1 undercloud node
2. Ensure SELinux is in enforcing mode
3. Try to deploy an overcloud

Actual results:
The overcloud deploy fails due to denials related to iscsiadm commands:
iscsiadm: Maybe you are not root?
iscsiadm: Could not lock discovery DB: /run/lock/iscsi/lock.write: Permission denied
iscsiadm: Maybe you are not root?
iscsiadm: Could not lock discovery DB: /run/lock/iscsi/lock.write: Permission denied
iscsiadm: Could not add new discovery record.

This can be seen in /var/log/containers/ironic/ironic-conductor.log

Expected results:
The denials shouldn't show up in audit.log, and the overcloud deploy should succeed.

Additional info:
Note: my node has the guestfs package installed, but its iscsid.service and iscsid.socket are disabled and not running.

Comment 1 Cédric Jeanneret 2020-06-29 14:57:20 UTC
The potential policy would be:

module iscsi-container 1.0;

require {
        type sysfs_t;
        type container_var_run_t;
        type fsadm_var_run_t;
        type iscsi_lock_t;
        type fixed_disk_device_t;
        type container_t;
        class blk_file { getattr ioctl open read write };
        class file { create link open read rename setattr unlink write };
        class dir { add_name remove_name write };
        class lnk_file read;
}

#============= container_t ==============
allow container_t container_var_run_t:lnk_file read;
allow container_t fixed_disk_device_t:blk_file { getattr ioctl open read write };
allow container_t fsadm_var_run_t:dir { add_name remove_name write };
allow container_t fsadm_var_run_t:file { create link open read rename setattr unlink write };
allow container_t iscsi_lock_t:dir { add_name remove_name write };
allow container_t iscsi_lock_t:file { link open read unlink write };
allow container_t sysfs_t:file write;



I don't see anything obviously bad in there - though the access to sysfs_t might be a source of issues imho. I'm not sure about "fsadm_var_run_t" either - the "fsadm" prefix sounds like a bad idea as well...
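
For reference, a module like this would typically be compiled and loaded along these lines - a minimal sketch, assuming the rules above are saved as iscsi-container.te (not something I'd actually install, given the reservations above):

# Build the policy module from the .te source (sketch only)
checkmodule -M -m -o iscsi-container.mod iscsi-container.te
# Package it as a loadable module
semodule_package -o iscsi-container.pp -m iscsi-container.mod
# Load it and confirm it is installed
semodule -i iscsi-container.pp
semodule -l | grep iscsi-container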

Comment 2 Cédric Jeanneret 2020-06-29 15:15:54 UTC
After some digging into the MLS values, it appears that:
- the c135,c671 pair is linked to another container:

[RedHat-8.2 - root@undercloud run]# ps faxZ | grep c135,c671                                                                                                                                                                                  
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 445481 pts/5 S+   0:00  |           \_ grep --color=auto c135,c671
system_u:system_r:container_t:s0:c135,c671 133141 ? Ss   0:00  \_ dumb-init --single-child -- kolla_start
system_u:system_r:container_t:s0:c135,c671 133166 ? S   0:56      \_ /usr/bin/python3 /usr/bin/ironic-conductor

This means ironic_conductor is actually doing the iscsiadm call.
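
As a side note, one way to cross-check which container owns a given category pair is to ask podman for the process label directly - a sketch, assuming the container is named ironic_conductor:

# Print the SELinux label podman assigned to the container process (container name assumed)
podman inspect --format '{{ .ProcessLabel }}' ironic_conductor
# Or list processes with their SELinux contexts and grep for the categories from the denial
ps -eZ | grep 'c135,c671'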

When we explore the mounts a bit more, we see this in tripleo-heat-templates/deployment/iscsi/iscsid-container-puppet.yaml:
        step_3:
          iscsid:
            start_order: 2
            image: {get_param: ContainerIscsidImage}
            net: host
            privileged: true
            restart: always
            healthcheck:
              test: /openstack/healthcheck
            volumes:
              list_concat:
                - {get_attr: [ContainersCommon, volumes]}
                -
                  - /var/lib/kolla/config_files/iscsid.json:/var/lib/kolla/config_files/config.json:ro
                  - /dev/:/dev/
                  - /run/:/run/
                  - /sys:/sys
                  - /lib/modules:/lib/modules:ro
                  - /etc/iscsi:/var/lib/kolla/config_files/src-iscsid:ro
                  - /var/lib/iscsi:/var/lib/iscsi:z
            environment:
              KOLLA_CONFIG_STRATEGY: COPY_ALWAYS

And, in deployment/ironic/ironic-conductor-container-puppet.yaml:
        step_4:
          map_merge:
            - if:
              - configure_swift_temp_url
              - create_swift_temp_url_key:
                  start_order: 70
                  image: &ironic_conductor_image {get_param: ContainerIronicConductorImage}
                  net: host
                  detach: false
                  volumes:
                    list_concat:
                      - {get_attr: [ContainersCommon, volumes]}
                      -
                        - /var/lib/config-data/puppet-generated/ironic/etc/ironic:/etc/ironic:ro
                        - /var/lib/container-config-scripts/create_swift_temp_url_key.sh:/create_swift_temp_url_key.sh:ro                                                                                                                     
                  user: root
                  command: "/usr/bin/bootstrap_host_exec ironic_conductor /create_swift_temp_url_key.sh"
              - {}
            - ironic_conductor:
                start_order: 80
                image: *ironic_conductor_image
                net: host
                privileged: true
                restart: always
                healthcheck: {get_attr: [ContainersCommon, healthcheck_rpc_port]}
                volumes:
                  list_concat:
                    - {get_attr: [ContainersCommon, volumes]}
                    -
                      - /var/lib/kolla/config_files/ironic_conductor.json:/var/lib/kolla/config_files/config.json:ro
                      - /var/lib/config-data/puppet-generated/ironic:/var/lib/kolla/config_files/src:ro
                      - /lib/modules:/lib/modules:ro
                      - /sys:/sys
                      - /dev:/dev
                      - /run:/run #shared?
                      - /var/lib/ironic:/var/lib/ironic:z
                      - /var/log/containers/ironic:/var/log/ironic:z
                environment:
                  KOLLA_CONFIG_STRATEGY: COPY_ALWAYS


Both are using /run as a shared volume, which is bad, especially since iscsiadm apparently wants to use a sub-directory: /run/lock/iscsi/lock

I'm wondering if we can't just share that precise location with a ":z" flag - though it might create an issue where iscsid_t can't access container_file_t (or something like that).
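
To illustrate what I mean, the extra mount could look roughly like this in the templates - a hypothetical, untested sketch:

            volumes:
              # hypothetical: share only the iscsi lock directory and let podman relabel it (":z")
              - /run/lock/iscsi:/run/lock/iscsi:z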

In any case... I wouldn't go for the proposed policy, since it opens some wide doors. I'm pretty sure there is something better (a domain transition, maybe?).
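
For the record, the "domain transition" idea would look roughly like this in a .te module - only a sketch; it assumes the iscsi tooling is labeled iscsid_exec_t, which would need to be verified against the installed policy:

# Hypothetical: let container_t transition to iscsid_t when executing the iscsi tooling,
# instead of granting container_t direct access to the lock files and disks.
require {
        type container_t;
        type iscsid_t;
        type iscsid_exec_t;
        class file { entrypoint execute getattr open read };
        class process transition;
}

allow container_t iscsid_exec_t:file { execute getattr open read };
allow container_t iscsid_t:process transition;
allow iscsid_t iscsid_exec_t:file entrypoint;
type_transition container_t iscsid_exec_t:process iscsid_t;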

Comment 3 Cédric Jeanneret 2020-06-29 15:55:29 UTC
Created attachment 1699188 [details]
full audit.log

Comment 4 Bob Fournier 2020-06-29 16:17:59 UTC
I installed compose RHOS-16.1-RHEL-8-20200625.n.0.

[stack@hardprov-dl360-g9-01 ~]$ getenforce 
Enforcing

I was able to deploy a node fine using iscsi:
$ openstack server list
+--------------------------------------+------+--------+-----------------------------+----------------+---------+
| ID                                   | Name | Status | Networks                    | Image          | Flavor  |
+--------------------------------------+------+--------+-----------------------------+----------------+---------+
| 08e6791e-da92-4734-9d07-191cb81e4de0 | dell | ACTIVE | ctlplane=fe32:dead:beef::d7 | overcloud-full | control |
+--------------------------------------+------+--------+-----------------------------+----------------+---------+

ironic-conductor.log
2020-06-29 12:07:59.044 7 DEBUG oslo_concurrency.processutils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e --logout execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372
2020-06-29 12:07:59.373 7 DEBUG oslo_concurrency.processutils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e --logout" returned: 0 in 0.329s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409
2020-06-29 12:07:59.374 7 DEBUG ironic.common.utils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] Execution completed, command line is "iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e --logout" execute /usr/lib/python3.6/site-packages/ironic/common/utils.py:77
2020-06-29 12:07:59.376 7 DEBUG oslo_concurrency.processutils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] Running cmd (subprocess): sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e -o delete execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372
2020-06-29 12:07:59.606 7 DEBUG oslo_concurrency.processutils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] CMD "sudo ironic-rootwrap /etc/ironic/rootwrap.conf iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e -o delete" returned: 0 in 0.230s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409
2020-06-29 12:07:59.607 7 DEBUG ironic.common.utils [req-b5444fa8-8fde-4acc-bc2c-55f5505f2cdc - - - - -] Execution completed, command line is "iscsiadm -m node -p [fe32:dead:beef::86]:3260 -T iqn.2008-10.org.openstack:04bb57f8-76f8-4371-82fc-163299845c3e -o delete" execute /usr/lib/python3.6/site-packages/ironic/common/utils.py:77

Comment 5 Cédric Jeanneret 2020-06-29 16:28:28 UTC
It might be a regression in the podman version I'm using - I forgot to mention it's a special 1.6.4 build with a patch for another issue. I'm redeploying with the "stock" podman to be sure.
Apparently the "--privileged" flag isn't properly handled in the podman version I'm using.

Comment 6 Cédric Jeanneret 2020-06-30 14:42:24 UTC
After many tests, here's the RCA:

My env was testing a patched podman, in order to ensure https://bugzilla.redhat.com/show_bug.cgi?id=1846364 was properly solved. It appears this podman version has a "nice" regression: --privileged doesn't seem to set the expected labels on the container's process, leading to the issue I just described.
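
A quick way to confirm that behaviour is to look at the label of the container processes - a sketch; with a correctly working --privileged, SELinux label separation is disabled and the processes should not show up confined as container_t with a category pair, as seen in comment 2:

# Compare the process label with the stock vs patched podman (sketch)
ps -efZ | grep ironic-conductor
# expected with a working --privileged: system_u:system_r:spc_t:s0 ...
# observed with the patched build:      system_u:system_r:container_t:s0:c135,c671 ...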

This is why Bob couldn't reproduce it with the "stock rhos-appstream" podman. The reason Compute QE didn't see it is infrared - it doesn't patch the undercloud with the new package before the undercloud is deployed, hiding the issue completely.

Thus I'll add a "depends-on" to the podman issue.

NOT a SELinux issue actually, since it's induced by a podman regression in a test build.

Comment 7 Cédric Jeanneret 2020-11-25 11:50:37 UTC
Closing since the depends-on bug is now resolved.