Created attachment 1402281 [details] audit.log Description of problem: Note: this is not a fresh installation of osp13 with ovs2.9, instead it updates ovs from 2.6 to 2.9 in a osp10 dpdk environment and manually apply the necessary ovs configuration[1] needed for running ovs2.9(e.g user&group permission, dpdk-devargs etc etc), and finally reboot the compute node. after the compute node is rebooted, use 'nova start' to bring up vm instance(with vhost interface), vm failed to boot and stay in powering-on state, error messages in /var/log/libvirt/qemu/instance.log as beow: --------------------------------------------------- 2018-03-01T07:39:11.887429Z qemu-kvm: -netdev vhost-user,chardev=charnet0,id=hostnet0: Failed to read msg header. Read -1 instead of 12. Original request 1. 2018-03-01T07:39:11.887440Z qemu-kvm: -netdev vhost-user,chardev=charnet0,id=hostnet0: vhost_get_features failed: Input/output error (5) 2018-03-01T07:39:11.887452Z qemu-kvm: -netdev vhost-user,chardev=charnet0,id=hostnet0: failed to init vhost_net for queue 0 2018-03-01T07:39:11.887937Z qemu-kvm: -netdev vhost-user,chardev=charnet0,id=hostnet0: Failed to read msg header. Read -1 instead of 12. Original request 1. ... --------------------------------------------------- if disabling selinux with 'senforce 0', vm will then become active and running. which looks like a selinux policy issue blocking the connection/read of vhost socket.(the selinux-policy package version is at 3.13.1-192). the audit.log file is attached. log in /var/log/openvswitch/ovs-vswitchd.log shows: --------------------------------------------------- 2018-03-01T07:44:47.322Z|14530105|dpdk|ERR|VHOST_CONFIG: truncted msg 2018-03-01T07:44:47.322Z|14530106|dpdk|ERR|VHOST_CONFIG: vhost read message failed 2018-03-01T07:44:47.323Z|14530107|dpdk|INFO|VHOST_CONFIG: new vhost user connection is 402 2018-03-01T07:44:47.323Z|14530108|dpdk|INFO|VHOST_CONFIG: new device, handle is 0 2018-03-01T07:44:47.323Z|14530109|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_FEATURES 2018-03-01T07:44:47.323Z|14530110|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES 2018-03-01T07:44:47.323Z|14530111|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES 2018-03-01T07:44:47.323Z|14530112|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM 2018-03-01T07:44:47.323Z|14530113|dpdk|ERR|VHOST_CONFIG: truncted msg 2018-03-01T07:44:47.323Z|14530114|dpdk|ERR|VHOST_CONFIG: vhost read message failed 2018-03-01T07:44:47.323Z|14530115|dpdk|INFO|VHOST_CONFIG: new vhost user connection is 402 2018-03-01T07:44:47.323Z|14530116|dpdk|INFO|VHOST_CONFIG: new device, handle is 0 2018-03-01T07:44:47.323Z|14530117|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_FEATURES 2018-03-01T07:44:47.323Z|14530118|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES 2018-03-01T07:44:47.323Z|14530119|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES 2018-03-01T07:44:47.323Z|14530120|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM 2018-03-01T07:44:47.323Z|14530121|dpdk|ERR|VHOST_CONFIG: truncted msg --------------------------------------------------- Version-Release number of selected component (if applicable): [root@overcloud-compute-0 ~]# rpm -qa | grep 'qemu\|selinux\|openvswitch\|dpdk\|kernel' kernel-devel-3.10.0-514.21.1.el7.x86_64 kernel-3.10.0-693.17.1.el7.x86_64 libselinux-python-2.5-12.el7.x86_64 kernel-headers-3.10.0-858.el7.x86_64 container-selinux-2.48-1.el7.noarch ceph-selinux-10.2.10-17.el7cp.x86_64 erlang-kernel-18.3.4.7-1.el7ost.x86_64 kernel-3.10.0-514.21.1.el7.x86_64 python-openvswitch-2.6.1-16.git20161206.el7ost.noarch dpdk-17.11-7.el7.x86_64 openstack-neutron-openvswitch-9.4.1-12.el7ost.noarch libselinux-ruby-2.5-12.el7.x86_64 qemu-kvm-common-rhev-2.9.0-16.el7_4.13.x86_64 libselinux-utils-2.5-12.el7.x86_64 libselinux-2.5-12.el7.x86_64 selinux-policy-targeted-3.13.1-192.el7.noarch openvswitch-2.9.0-3.el7fdp.x86_64 qemu-img-rhev-2.9.0-16.el7_4.13.x86_64 libvirt-daemon-driver-qemu-3.9.0-13.el7.x86_64 libselinux-2.5-12.el7.i686 kernel-3.10.0-858.el7.x86_64 openstack-selinux-0.8.13-1.el7ost.noarch ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch kernel-tools-libs-3.10.0-858.el7.x86_64 kernel-devel-3.10.0-858.el7.x86_64 qemu-kvm-rhev-2.9.0-16.el7_4.13.x86_64 kernel-devel-3.10.0-693.17.1.el7.x86_64 kernel-tools-3.10.0-858.el7.x86_64 selinux-policy-3.13.1-192.el7.noarch [root@overcloud-compute-0 ~]# cat /etc/sysconfig/openvswitch OPTIONS="" DPDK_OPTIONS = "-l 4,6,20,22 -n 4 --socket-mem 20482048 -w 0000:06:00.1" [root@overcloud-compute-0 ~]# cat /etc/openvswitch/default.conf OVS_USER_ID="openvswitch:hugetlbfs" [root@overcloud-compute-0 ~]# cat /etc/libvirt/qemu.conf max_files = 32768 max_processes = 131072 user = "openvswitch" group = "hugetlbfs" [root@overcloud-compute-0 ~]# ls -al /var/run/openvswitch/ total 344 drwxr-xr-x. 2 openvswitch hugetlbfs 480 Mar 1 07:28 . drwxr-xr-x. 46 root root 1400 Mar 1 07:30 .. srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-ex.mgmt srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-ex.snoop srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-int.mgmt srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-int.snoop srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-isolation.mgmt srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-isolation.snoop srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:28 br-link.mgmt srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-link.snoop srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-tun.mgmt srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 br-tun.snoop srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 db.sock srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 ovsdb-server.1436.ctl -rw-r--r--. 1 openvswitch hugetlbfs 5 Mar 1 07:27 ovsdb-server.pid srwxr-x---. 1 openvswitch hugetlbfs 0 Mar 1 07:27 ovs-vswitchd.1529.ctl -rw-r--r--. 1 openvswitch hugetlbfs 5 Mar 1 07:27 ovs-vswitchd.pid -rw-r-----. 1 openvswitch hugetlbfs 208420 Mar 1 07:27 .rte_config -rw-r--r--. 1 openvswitch hugetlbfs 132608 Mar 1 07:27 .rte_hugepage_info srwxr-xr-x. 1 openvswitch hugetlbfs 0 Mar 1 07:27 .rte_mp_socket srwxr-xr-x. 1 openvswitch hugetlbfs 0 Mar 1 07:28 vhu04394507-c7 srwxr-xr-x. 1 openvswitch hugetlbfs 0 Mar 1 07:28 vhu46786d43-cf srwxr-xr-x. 1 openvswitch hugetlbfs 0 Mar 1 07:28 vhu6530141e-3c srwxr-xr-x. 1 openvswitch hugetlbfs 0 Mar 1 07:28 vhue12a098b-b1 [root@overcloud-compute-0 ~]# ls -al /etc/openvswitch/ total 180 drwxr-xr-x. 2 openvswitch hugetlbfs 152 Mar 1 07:27 . drwxr-xr-x. 143 root root 8192 Mar 1 07:30 .. -rw-r--r--. 1 openvswitch hugetlbfs 61236 Mar 1 07:29 conf.db -rw-r--r--. 1 root root 101880 Mar 1 07:27 conf.db.backup7.14.0-3974332717 -rw-------. 1 openvswitch hugetlbfs 0 Feb 28 11:18 .conf.db.~lock~ -rw-------. 1 openvswitch hugetlbfs 0 Mar 1 07:27 .conf.db.tmp.~lock~ -rw-r--r--. 1 openvswitch hugetlbfs 175 Mar 1 07:23 default.conf -rw-r--r--. 1 openvswitch hugetlbfs 37 Feb 28 11:18 system-id.conf [root@overcloud-compute-0 ~]# ls -al /var/log/openvswitch/ total 2886804 drwxr-xr-x. 2 openvswitch hugetlbfs 126 Mar 1 07:44 . drwxr-xr-x. 41 root root 4096 Mar 1 07:27 .. -rw-r--r--. 1 openvswitch hugetlbfs 674 Mar 1 07:27 ovsdb-server.log -rw-r--r--. 1 openvswitch hugetlbfs 381 Feb 28 11:23 ovsdb-server.log-20180301.gz -rw-r--r--. 1 openvswitch hugetlbfs 2806580006 Mar 1 08:02 ovs-vswitchd.log -rw-r--r--. 1 openvswitch hugetlbfs 4521 Mar 1 03:02 ovs-vswitchd.log-20180301.gz [root@overcloud-compute-0 ~]# ovs-vsctl show 84a7a13b-f8e8-4212-90cd-b52ebffe5492 Manager "ptcp:6640:127.0.0.1" is_connected: true Bridge br-tun Controller "tcp:127.0.0.1:6633" is_connected: true fail_mode: secure Port br-tun Interface br-tun type: internal Port "vxlan-ac11020f" Interface "vxlan-ac11020f" type: vxlan options: {df_default="true", in_key=flow, local_ip="172.17.2.22", out_key=flow, remote_ip="172.17.2.15"} Port "vxlan-ac11020d" Interface "vxlan-ac11020d" type: vxlan options: {df_default="true", in_key=flow, local_ip="172.17.2.22", out_key=flow, remote_ip="172.17.2.13"} Port "vxlan-ac110211" Interface "vxlan-ac110211" type: vxlan options: {df_default="true", in_key=flow, local_ip="172.17.2.22", out_key=flow, remote_ip="172.17.2.17"} Port patch-int Interface patch-int type: patch options: {peer=patch-tun} Port "vxlan-ac110213" Interface "vxlan-ac110213" type: vxlan options: {df_default="true", in_key=flow, local_ip="172.17.2.22", out_key=flow, remote_ip="172.17.2.19"} Bridge br-ex Controller "tcp:127.0.0.1:6633" is_connected: true fail_mode: secure Port phy-br-ex Interface phy-br-ex type: patch options: {peer=int-br-ex} Port br-ex Interface br-ex type: internal Bridge br-isolation fail_mode: standalone Port "vlan202" tag: 202 Interface "vlan202" type: internal Port "p5p1" Interface "p5p1" Port "vlan201" tag: 201 Interface "vlan201" type: internal Port br-isolation Interface br-isolation type: internal Port "vlan203" tag: 203 Interface "vlan203" type: internal Bridge br-link Controller "tcp:127.0.0.1:6633" is_connected: true fail_mode: secure Port phy-br-link Interface phy-br-link type: patch options: {peer=int-br-link} Port "dpdk0" Interface "dpdk0" type: dpdk options: {dpdk-devargs="0000:06:00.1"} Port br-link Interface br-link type: internal Bridge br-int Controller "tcp:127.0.0.1:6633" is_connected: true fail_mode: secure Port int-br-ex Interface int-br-ex type: patch options: {peer=phy-br-ex} Port patch-tun Interface patch-tun type: patch options: {peer=patch-int} Port br-int Interface br-int type: internal Port "vhu46786d43-cf" tag: 1 Interface "vhu46786d43-cf" type: dpdkvhostuser Port "vhu6530141e-3c" tag: 2 Interface "vhu6530141e-3c" type: dpdkvhostuser Port int-br-link Interface int-br-link type: patch options: {peer=phy-br-link} Port "vhue12a098b-b1" tag: 1 Interface "vhue12a098b-b1" type: dpdkvhostuser Port "vhu04394507-c7" tag: 2 Interface "vhu04394507-c7" type: dpdkvhostuser ovs_version: "2.9.0" [root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-dpdk0 # This file is autogenerated by os-net-config DEVICE=dpdk0 ONBOOT=yes HOTPLUG=no NM_CONTROLLED=no PEERDNS=no DEVICETYPE=ovs TYPE=OVSDPDKPort OVS_BRIDGE=br-link OVS_EXTRA="set Interface dpdk0 options:dpdk-devargs=0000:06:00.1" [1]: https://github.com/redhat-nfvpe/infrared_upgrade_config/blob/master/ovs_upgrade/tasks/main.yml
This issue is in OSP10 with ovs2.9, which means the default vhost-user mode is dpdkvhostuser (ovs in the server mode). And the vhost sockets will be created on the /var/run/openvswitch/ directory.
The config of --socket-mem looks odd '20482048', but I didn't configure or change it manually, is it auto-generated? [root@overcloud-compute-0 log]# cat /etc/sysconfig/openvswitch OPTIONS="" DPDK_OPTIONS = "-l 4,6,20,22 -n 4 --socket-mem 20482048 -w 0000:06:00.1" Getting other_config returns good value: [root@overcloud-compute-0 log]# ovs-vsctl get Open_vSwitch . other_config {dpdk-init="true", dpdk-lcore-mask="a800a8", dpdk-socket-mem="2048,2048", pmd-cpu-mask="500050"}
(In reply to zenghui.shi from comment #2) > The config of --socket-mem looks odd '20482048', but I didn't configure or > change it manually, is it auto-generated? > > [root@overcloud-compute-0 log]# cat /etc/sysconfig/openvswitch > OPTIONS="" > DPDK_OPTIONS = "-l 4,6,20,22 -n 4 --socket-mem 20482048 -w 0000:06:00.1" > DPDK_OPTIONS is not used from ovs2.6 onwards. other_config:dpdk-socket-mem is only used. There some mismatch in the puppet configuration because of changed parameter format, but you can ignore it as it is not used.
Created attachment 1416255 [details] audit-2018-4-2.log
type=USER_AVC msg=audit(1522670657.993:1217): pid=1282 uid=81 auid=4294967295 ses=4294967295 subj=system_u:system_r:system_dbusd_t:s0-s0:c0.c1023 msg='avc: denied { send_msg } for msgtype=method_call interface=org.freedesktop.login1.Manager member=Inhibit dest=org.freedesktop.login1 spid=242890 tpid=1293 scontext=system_u:system_r:virtlogd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:systemd_logind_t:s0 tclass=dbus exe=2F7573722F62696E2F646275732D6461656D6F6E3B3561633165373430202864656C6574656429 sauid=81 hostname=? addr=? terminal=?' This looks like: https://bugzilla.redhat.com/show_bug.cgi?id=1561729
Next update for OSP10 should have the fix for that USER_AVC (for now), which will be resolved in a different way by libvirt at a later time.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2102