Bug 1587899 - ovs-vswitchd.service fails to start, causing the OpenShift install to fail in ovs29
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: selinux-policy
Version: 7.5
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 7.5
Assignee: Lukas Vrabec
QA Contact: Milos Malik
URL:
Whiteboard:
Duplicates: 1599240
Depends On:
Blocks: 1552827 1599240
 
Reported: 2018-06-06 09:32 UTC by DeShuai Ma
Modified: 2018-08-07 15:57 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1599240
Environment:
Last Closed: 2018-08-03 14:03:39 UTC
Target Upstream Version:
Embargoed:


Attachments
dmesg when the error happens (34.54 KB, text/plain), attached 2018-06-07 03:06 UTC by DeShuai Ma

Description DeShuai Ma 2018-06-06 09:32:18 UTC
Description of problem:
ovs-vswitchd.service fails to start, causing the OpenShift install to fail. After manually starting ovs-vswitchd, the environment recovers.

//openshift-ansible
TASK [openshift_node : Start and enable node] **********************************
Wednesday 06 June 2018  05:11:08 -0400 (0:00:00.051)       0:07:08.250 ******** 

FAILED - RETRYING: Start and enable node (1 retries left).

FAILED - RETRYING: Start and enable node (1 retries left).

fatal: [ec2-34-204-93-229.compute-1.amazonaws.com]: FAILED! => {"attempts": 1, "changed": false, "failed": true, "msg": "Unable to start service atomic-openshift-node: A dependency job for atomic-openshift-node.service failed. See 'journalctl -xe' for details.\n"}
...ignoring

fatal: [ec2-34-204-85-58.compute-1.amazonaws.com]: FAILED! => {"attempts": 1, "changed": false, "failed": true, "msg": "Unable to start service atomic-openshift-node: A dependency job for atomic-openshift-node.service failed. See 'journalctl -xe' for details.\n"}
...ignoring
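
To dig into why the dependency job failed on a node, the usual next step is to inspect the failing dependency directly; a sketch using standard systemd tooling (the unit names match the output above):

# systemctl list-dependencies atomic-openshift-node.service
# journalctl -b -u ovs-vswitchd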


Version-Release number of selected component (if applicable):
# rpm -qf /usr/lib/systemd/system/ovs-vswitchd.service
openvswitch-2.9.0-19.el7fdp.x86_64
# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)

docker version: docker-1.13.1-63.git94f4240.el7.x86_64

How reproducible:
Sometimes

Steps to Reproduce:
1. Check the ovs-vswitchd status when the openshift-ansible install fails.
[root@ip-172-18-4-249 ~]# systemctl status ovs-vswitchd
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
   Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2018-06-06 05:11:52 EDT; 5s ago
  Process: 13874 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random ${OVSUSER} start $OPTIONS (code=exited, status=1/FAILURE)
  Process: 13869 ExecStartPre=/usr/bin/chmod 0775 /dev/hugepages (code=exited, status=0/SUCCESS)
  Process: 13867 ExecStartPre=/bin/sh -c /usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages (code=exited, status=0/SUCCESS)

Jun 06 05:11:52 ip-172-18-4-249.ec2.internal ovs-ctl[13874]: Inserting openvswitch module [FAILED]
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal ovs-ctl[13874]: not removing bridge module because bridges exist (docker0) ... (warning).
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: Unit ovs-vswitchd.service entered failed state.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: ovs-vswitchd.service failed.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: ovs-vswitchd.service holdoff time over, scheduling restart.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: start request repeated too quickly for ovs-vswitchd.service
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: Unit ovs-vswitchd.service entered failed state.
Jun 06 05:11:52 ip-172-18-4-249.ec2.internal systemd[1]: ovs-vswitchd.service failed.
[root@ip-172-18-4-249 ~]# systemctl cat ovs-vswitchd
# /usr/lib/systemd/system/ovs-vswitchd.service
[Unit]
Description=Open vSwitch Forwarding Unit
After=ovsdb-server.service network-pre.target systemd-udev-settle.service
Before=network.target network.service
Requires=ovsdb-server.service
ReloadPropagatedFrom=ovsdb-server.service
AssertPathIsReadWrite=/var/run/openvswitch/db.sock
PartOf=openvswitch.service

[Service]
Type=forking
Restart=on-failure
Environment=HOME=/var/run/openvswitch
EnvironmentFile=/etc/openvswitch/default.conf
EnvironmentFile=-/etc/sysconfig/openvswitch
EnvironmentFile=-/run/openvswitch/useropts
ExecStartPre=-/bin/sh -c '/usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages'
ExecStartPre=-/usr/bin/chmod 0775 /dev/hugepages
ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
          --no-ovsdb-server --no-monitor --system-id=random \
          ${OVSUSER} \
          start $OPTIONS
ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server stop
ExecReload=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server \
          --no-monitor --system-id=random \
          ${OVSUSER} \
          restart $OPTIONS
TimeoutSec=300
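
The failure mode above is systemd's start rate limiting: with Restart=on-failure, each failed ExecStart schedules another attempt until the unit exceeds its start limit and lands in the failed (Result: start-limit) state, which is what the "start request repeated too quickly" messages mean. A sketch of clearing that state before retrying (standard systemctl commands, not part of the original report):

# systemctl reset-failed ovs-vswitchd
# systemctl start ovs-vswitchd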

2. Start ovs-vswitchd.service manually:
[root@ip-172-18-4-249 ~]# cat /run/openvswitch/useropts
OVSUSER=--ovs-user=openvswitch:hugetlbfs
[root@ip-172-18-4-249 ~]# OVSUSER=--ovs-user=openvswitch:hugetlbfs
[root@ip-172-18-4-249 ~]# /usr/share/openvswitch/scripts/ovs-ctl \
>           --no-ovsdb-server --no-monitor --system-id=random \
>           ${OVSUSER} \
>           start
Inserting openvswitch module                               [  OK  ]
Starting ovs-vswitchd PMD: net_mlx4: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
PMD: net_mlx4: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx4)
PMD: net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
PMD: net_mlx5: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx5)
                                                           [  OK  ]
Enabling remote OVSDB managers                             [  OK  ]


Actual results:
ovs-vswitchd.service fails to start ("Inserting openvswitch module [FAILED]") and hits systemd's start limit, so the OpenShift install fails.

Expected results:
The service should always start successfully.

Additional info:

Comment 1 DeShuai Ma 2018-06-06 09:39:00 UTC
openshift-ansible-3.9.30-1.git.0.a91a657.el7.noarch.rpm

Comment 3 DeShuai Ma 2018-06-06 10:02:31 UTC
This is a cri-o environment.

Comment 6 Ben Bennett 2018-06-06 17:21:16 UTC
Since the module is not loading, can you load it manually with 'modprobe openvswitch'?

I do not understand how it fails once and then succeeds.  Can you grab the output from 'dmesg' please after a failed load?

Comment 7 DeShuai Ma 2018-06-07 02:56:04 UTC
(In reply to Ben Bennett from comment #6)
> Since the module is not loading, can you load it manually with 'modprobe
> openvswitch'?
After manually loading the openvswitch module, restarting the service succeeds.

[root@ip-172-18-10-185 ~]# modprobe openvswitch
[root@ip-172-18-10-185 ~]# lsmod |grep openvswitch
openvswitch           114842  0 
nf_nat_ipv6            14131  1 openvswitch
nf_defrag_ipv6         35104  2 openvswitch,nf_conntrack_ipv6
nf_nat_ipv4            14115  2 openvswitch,iptable_nat
nf_nat                 26787  4 openvswitch,nf_nat_ipv4,nf_nat_ipv6,nf_nat_masquerade_ipv4
nf_conntrack          133053  8 openvswitch,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_conntrack_ipv6
libcrc32c              12644  4 xfs,openvswitch,nf_nat,nf_conntrack
[root@ip-172-18-10-185 ~]# 
[root@ip-172-18-10-185 ~]# systemctl start ovs-vswitchd
[root@ip-172-18-10-185 ~]# 
[root@ip-172-18-10-185 ~]# systemctl status ovs-vswitchd
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
   Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
   Active: active (running) since Wed 2018-06-06 22:43:11 EDT; 8s ago
  Process: 8045 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random ${OVSUSER} start $OPTIONS (code=exited, status=0/SUCCESS)
  Process: 8041 ExecStartPre=/usr/bin/chmod 0775 /dev/hugepages (code=exited, status=0/SUCCESS)
  Process: 8039 ExecStartPre=/bin/sh -c /usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages (code=exited, status=0/SUCCESS)
   Memory: 5.0M
   CGroup: /system.slice/ovs-vswitchd.service
           └─8077 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitc...

Jun 06 22:43:10 ip-172-18-10-185.ec2.internal systemd[1]: Starting Open vSwitch Forwarding Unit...
Jun 06 22:43:10 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: Starting ovs-vswitchd PMD: net_mlx4: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
Jun 06 22:43:10 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: PMD: net_mlx4: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx4)
Jun 06 22:43:10 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: PMD: net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory
Jun 06 22:43:10 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: PMD: net_mlx5: cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx5)
Jun 06 22:43:11 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: [  OK  ]
Jun 06 22:43:11 ip-172-18-10-185.ec2.internal ovs-ctl[8045]: Enabling remote OVSDB managers [  OK  ]
Jun 06 22:43:11 ip-172-18-10-185.ec2.internal systemd[1]: Started Open vSwitch Forwarding Unit.
Jun 06 22:43:11 ip-172-18-10-185.ec2.internal ovs-vsctl[8086]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=ip-172-18-10-185.ec2.internal

> 
> I do not understand how it fails once and then succeeds.  Can you grab the
> output from 'dmesg' please after a failed load?
Earlier, in step 2, I ran the ovs-ctl command directly. Its output contained 'Inserting openvswitch module                               [  OK  ]', so I think it effectively does the same thing as 'modprobe openvswitch'.

Comment 8 DeShuai Ma 2018-06-07 03:06:01 UTC
Created attachment 1448572 [details]
dmesg when the error happens

Comment 9 DeShuai Ma 2018-06-07 03:07:10 UTC
After manually running 'modprobe openvswitch', here is the diff against the previous dmesg:
# diff dmesg.txt dmesg-new.txt 
572a573
> [ 2398.945305] openvswitch: Open vSwitch switching datapath
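
The single added line shows the module registering its datapath and nothing else, so the kernel reports no error on a successful load. A quick way to watch for the same message (or its absence) right after a failed service start, assuming a root shell on the node:

# dmesg | grep -i openvswitch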

Comment 10 Anping Li 2018-06-21 05:20:57 UTC
I get a similar error when deploying v3.6 on AWS using the base image qe-rhel-7-release. cri-o is not enabled. After I loaded the openvswitch module, the deploy continued and succeeded.

Comment 11 Wenkai Shi 2018-06-22 07:49:09 UTC
Insert "set -x" in /usr/share/openvswitch/scripts/ovs-ctl, then try start ovs-vswitchd, it still failed.
Execute "modprobe openvswitch" by manual, then ovs-vswitchd start succeed.

# cat /usr/share/openvswitch/scripts/ovs-ctl
...
set -x
...

# systemctl start ovs-vswitchd
Job for ovs-vswitchd.service failed because the control process exited with error code. See "systemctl status ovs-vswitchd.service" and "journalctl -xe" for details.

# journalctl -xe 
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + local STRING rc
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + STRING='Inserting openvswitch module'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo -n 'Inserting openvswitch module '
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: Inserting openvswitch module + shift
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + modprobe openvswitch
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + failure 'Inserting openvswitch module'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + local rc=1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + '[' serial '!=' verbose -a -z '' ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo_failure
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + '[' serial = color ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo -n '['
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: [+ '[' serial = color ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo -n FAILED
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: FAILED+ '[' serial = color ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo -n ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: ]+ echo -ne '\r'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + '[' -x /bin/plymouth ']'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + rc=1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + echo
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + test -e /sys/module/bridge
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: ++ sed 's,/sys/class/net/,,g;s,/bridge,,g'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: ++ echo /sys/class/net/docker0/bridge
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + bridges=docker0
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + test docker0 '!=' '*'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + log_warning_msg 'not removing bridge module because bridges exist (docker0)'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + printf '%s ... (warning).\n' 'not removing bridge module because bridges exist (docker0)'
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: not removing bridge module because bridges exist (docker0) ... (warning).
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + return 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal ovs-ctl[14393]: + exit 1
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal systemd[1]: ovs-vswitchd.service holdoff time over, scheduling restart.
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal systemd[1]: start request repeated too quickly for ovs-vswitchd.service
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal systemd[1]: Failed to start Open vSwitch Forwarding Unit.
-- Subject: Unit ovs-vswitchd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit ovs-vswitchd.service has failed.
-- 
-- The result is failed.
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal systemd[1]: Unit ovs-vswitchd.service entered failed state.
Jun 22 03:09:19 ip-172-18-15-62.ec2.internal systemd[1]: ovs-vswitchd.service failed.

# modprobe openvswitch
# systemctl start ovs-vswitchd
# systemctl is-active ovs-vswitchd
active

Comment 12 Johnny Liu 2018-07-05 10:36:27 UTC
After digging more, I found this is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1508495.

After I turned off SELinux, openvswitch restarted successfully.

In /var/log/audit/audit.log, I get the following AVC denial messages:
type=AVC msg=audit(1530786231.099:578): avc:  denied  { read } for  pid=4458 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786231.099:579): avc:  denied  { read } for  pid=4458 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.596:622): avc:  denied  { read } for  pid=4620 comm="modprobe" name="modules.softdep" dev="dm-0" ino=9115101 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.596:623): avc:  denied  { read } for  pid=4620 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.596:624): avc:  denied  { read } for  pid=4620 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.596:625): avc:  denied  { read } for  pid=4620 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.885:629): avc:  denied  { read } for  pid=4659 comm="modprobe" name="modules.softdep" dev="dm-0" ino=9115101 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.885:630): avc:  denied  { read } for  pid=4659 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.885:631): avc:  denied  { read } for  pid=4659 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786313.885:632): avc:  denied  { read } for  pid=4659 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786314.091:636): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.softdep" dev="dm-0" ino=9115101 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786314.091:637): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786314.091:638): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.dep.bin" dev="dm-0" ino=9115098 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1530786314.091:639): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
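
These denials also explain why the manual modprobe in comment 7 succeeds while the service fails: a root shell runs unconfined, but the modprobe spawned by ovs-ctl runs in the service's openvswitch_t domain, which is denied read access to the unlabeled_t module index files. A sketch for confirming the mislabel, assuming the running kernel's module directory (ls -Z prints the actual context, matchpathcon the expected one):

# ls -Z /lib/modules/$(uname -r)/modules.dep.bin
# matchpathcon /lib/modules/$(uname -r)/modules.dep.bin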


Surprisingly, when using the same image to run instances on GCP/OpenStack, we did not hit this issue; it only reproduces on AWS.

This is blocking OpenShift QE testing on AWS.

This should be a regression too, because we hit this issue before.

Comment 15 Milos Malik 2018-07-09 09:58:36 UTC
I believe that files like modules.softdep or modules.alias.bin are mislabeled on your machine, because the SELinux denials contain tcontext=system_u:object_r:unlabeled_t:s0.

Please run the following command, which corrects the labels:

# restorecon -Rv /lib/modules

Comment 16 Johnny Liu 2018-07-09 10:54:16 UTC
(In reply to Milos Malik from comment #15)
> I believe that files like modules.softdep or modules.alias.bin are
> mislabeled on your machine, because the SELinux denials contain
> tcontext=system_u:object_r:unlabeled_t:s0.
> 
> Please run following command which corrects the labels:
> 
> # restorecon -Rv /lib/modules

Cool, restorecon fixes the SELinux issue. I am not sure why those files are not labeled correctly, or which rpm or script should label them automatically.

[root@ip-172-18-3-5 ~]# rpm -qf /lib/modules/3.10.0-862.6.3.el7.x86_64/modules.dep.bin
file /lib/modules/3.10.0-862.6.3.el7.x86_64/modules.dep.bin is not owned by any package
[root@ip-172-18-3-5 ~]# rpm -qf /lib/modules/3.10.0-862.6.3.el7.x86_64/
kernel-3.10.0-862.6.3.el7.x86_64
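
To see which files restorecon would relabel without changing anything yet, its dry-run flag can be used first; a general SELinux troubleshooting step, not from the original comments:

# restorecon -nvR /lib/modules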

Comment 17 Lukas Vrabec 2018-07-17 21:15:15 UTC
Johnny, 

Are you able to reproduce it? If yes, please feel free to re-open this bug.

Comment 18 Johnny Liu 2018-07-18 03:08:07 UTC
This issue is a little weird. I ran a new install via kickstart to get the image. After installation, all files under /lib/modules are labeled correctly. After uploading the image to Amazon, the labels on some files (such as modules.dep.bin) are lost. I am not sure of the root cause, but at least I can reproduce it.

Comment 30 Petr Lautrbach 2018-08-03 12:56:47 UTC
I think the important part is comment #18. It looks like AWS manipulates modules.dep.bin while SELinux is disabled, e.g. via an offline image mount, or during some early grub phase.

The workaround for this would be to enforce a filesystem relabel during first boot (touch /.autorelabel).
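
A minimal sketch of that workaround, assuming it is run on the affected instance; the flag file triggers a relabel of the entire filesystem during the next boot, which can take several minutes:

# touch /.autorelabel
# reboot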

I'm afraid that we can't fix this in selinux-policy.

Comment 31 Lukas Vrabec 2018-08-03 14:03:39 UTC
I agree with Petr. Closing as CANTFIX.

Comment 32 Casey Callendrello 2018-08-07 15:57:32 UTC
*** Bug 1599240 has been marked as a duplicate of this bug. ***

