Description of problem:
openvswitch crashes while restarting the openvswitch service

Version-Release number of selected component (if applicable):
[root@dell-per730-18 ~]# rpm -qa | grep openv
openvswitch-selinux-extra-policy-1.0-12.el8fdp.noarch
openvswitch2.11-test-2.11.0-9.el8fdp.noarch
openvswitch2.11-2.11.0-9.el8fdp.x86_64
python3-openvswitch2.11-2.11.0-9.el8fdp.x86_64
[root@dell-per730-18 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.0 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.0"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.0:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.0
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.0"
[root@dell-per730-18 ~]# rpm -qa | grep dpdk
dpdk-tools-18.11-4.el8.x86_64
dpdk-18.11-4.el8.x86_64

How reproducible:
sometimes

Steps to Reproduce:
:: [ 06:04:52 ] :: [ BEGIN ] :: Running 'modprobe openvswitch'
:: [ 06:04:52 ] :: [ PASS ] :: Command 'modprobe openvswitch' (Expected 0, got 0)
:: [ 06:04:52 ] :: [ BEGIN ] :: Running 'systemctl stop openvswitch'
:: [ 06:04:52 ] :: [ PASS ] :: Command 'systemctl stop openvswitch' (Expected 0, got 0)
:: [ 06:04:52 ] :: [ BEGIN ] :: Running 'sleep 3'
:: [ 06:04:55 ] :: [ PASS ] :: Command 'sleep 3' (Expected 0, got 0)
:: [ 06:04:55 ] :: [ BEGIN ] :: Running 'systemctl start openvswitch'
:: [ 06:04:56 ] :: [ PASS ] :: Command 'systemctl start openvswitch' (Expected 0, got 0)
:: [ 06:04:56 ] :: [ BEGIN ] :: Running 'sleep 3'
:: [ 06:04:59 ] :: [ PASS ] :: Command 'sleep 3' (Expected 0, got 0)
:: [ 06:04:59 ] :: [ BEGIN ] :: Running 'ovs-vsctl --if-exists del-br ovsbr0'
:: [ 06:04:59 ] :: [ PASS ] :: Command 'ovs-vsctl --if-exists del-br ovsbr0' (Expected 0, got 0)
:: [ 06:04:59 ] :: [ BEGIN ] :: Running 'sleep 5'
:: [ 06:05:04 ] :: [ PASS ] :: Command 'sleep 5' (Expected 0, got 0)
:: [ 06:05:04 ] :: [ BEGIN ] :: Running 'ovs-vsctl set Open_vSwitch . other_config={}'
:: [ 06:05:04 ] :: [ PASS ] :: Command 'ovs-vsctl set Open_vSwitch . other_config={}' (Expected 0, got 0)
:: [ 06:05:04 ] :: [ BEGIN ] :: Running 'ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true'
:: [ 06:05:04 ] :: [ PASS ] :: Command 'ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true' (Expected 0, got 0)
:: [ 06:05:04 ] :: [ BEGIN ] :: Running 'ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096'
:: [ 06:05:04 ] :: [ PASS ] :: Command 'ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096' (Expected 0, got 0)
:: [ 06:05:04 ] :: [ BEGIN ] :: Running 'ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x55555554'
2019-05-17T10:05:37Z|00002|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-17T10:05:37Z|00003|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T02:39:48Z|00004|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T02:39:48Z|00005|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T02:54:35Z|00006|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T02:54:35Z|00007|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T03:02:22Z|00008|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T03:02:22Z|00009|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T03:02:40Z|00010|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T03:02:40Z|00011|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T03:03:52Z|00012|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T03:03:52Z|00013|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T03:05:18Z|00014|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T03:05:18Z|00015|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-20T03:05:49Z|00016|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-20T03:05:49Z|00017|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)

log info:
May 19 23:05:21 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Service RestartSec=100ms expired, scheduling restart.
May 19 23:05:21 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 44.
May 19 23:05:21 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Forwarding Unit.
May 19 23:05:21 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Forwarding Unit...
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[11562]: ovs|00018|dpdk|ERR|EAL: Cannot obtain physical addresses: Success. Only vfio will function.
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[11516]: Starting ovs-vswitchd EAL: FATAL: Cannot init memory
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[11562]: ovs|00019|dpdk|ERR|EAL: Cannot init memory
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[11516]: 2019-05-20T03:05:48Z|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[11516]: ovs-vswitchd: Cannot init EAL (Cannot allocate memory)
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[11562]: ovs|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Started Process Core Dump (PID 16324/UID 0).
May 19 23:05:48 dell-per730-18.rhts.eng.pek2.redhat.com systemd-coredump[16325]: Process 11562 (ovs-vswitchd) of user 0 dumped core.

Stack trace of thread 11562:
#0 0x00007fba45e4a93f raise (libc.so.6)
#1 0x00007fba45e34c95 abort (libc.so.6)
#2 0x0000561839c0dff4 ovs_abort_valist (ovs-vswitchd)
#3 0x0000561839c0e09a ovs_abort (ovs-vswitchd)
#4 0x0000561839c36994 dpdk_init (ovs-vswitchd)
#5 0x0000561839ae6565 bridge_run (ovs-vswitchd)
#6 0x00005618399328ad main (ovs-vswitchd)
#7 0x00007fba45e36813 __libc_start_main (libc.so.6)
#8 0x000056183993383e _start (ovs-vswitchd)

Stack trace of thread 11568:
#0 0x00007fba45f0fd97 epoll_wait (libc.so.6)
#1 0x0000561839948dd4 eal_intr_thread_main (ovs-vswitchd)
#2 0x00007fba46a9e2de start_thread (libpthread.so.0)
#3 0x00007fba45f0fa63 __clone (libc.so.6)

Stack trace of thread 11569:
#0 0x00007fba46aa8a17 recvmsg (libpthread.so.0)
#1 0x0000561839952937 mp_handle (ovs-vswitchd)
#2 0x00007fba46a9e2de start_thread (libpthread.so.0)
#3 0x00007fba45f0fa63 __clone (libc.so.6)

May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopping Open vSwitch Database Unit...
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[8928]: ovs|00016|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[8928]: ovs|00017|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[16453]: [42B blob data]
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Database Unit.
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Database Unit...
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[16497]: [35B blob data]
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[16565]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=7.16.1
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[16570]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.11.0 "external-ids:system-id=\"7a65e75b-43a8-4e83-b7>
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[16497]: [49B blob data]
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[16497]: [44B blob data]
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Started Open vSwitch Database Unit.
May 19 23:05:49 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[16581]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=dell-per730-18.rhts.eng.pek2.redhat.com
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[11561]: ovs|00002|daemon_unix|ERR|fork child died before signaling startup (killed (Aborted), core dumped)
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[11561]: ovs|00003|daemon_unix|EMER|could not detach from foreground session
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[11516]: ovs-vswitchd: could not detach from foreground session
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[11516]: [13B blob data]
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Control process exited, code=exited status=1
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Failed with result 'exit-code'.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Failed to start Open vSwitch Forwarding Unit.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Dependency failed for Open vSwitch.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: openvswitch.service: Job openvswitch.service/start failed with result 'dependency'.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Service RestartSec=100ms expired, scheduling restart.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 45.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Forwarding Unit.
May 19 23:05:55 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Forwarding Unit...

Actual results:
crash

Expected results:
works fine

Additional info:
[root@dell-per730-18 bond]# cat /proc/meminfo
MemTotal: 65670332 kB
MemFree: 32871824 kB
MemAvailable: 35802732 kB
Buffers: 5264 kB
Cached: 3538392 kB
SwapCached: 0 kB
Active: 630940 kB
Inactive: 2978896 kB
Active(anon): 245608 kB
Inactive(anon): 69592 kB
Active(file): 385332 kB
Inactive(file): 2909304 kB
Unevictable: 56160 kB
Mlocked: 56164 kB
SwapTotal: 29298684 kB
SwapFree: 29298684 kB
Dirty: 36 kB
Writeback: 0 kB
AnonPages: 122424 kB
Mapped: 187252 kB
Shmem: 216396 kB
Slab: 685592 kB
SReclaimable: 200712 kB
SUnreclaim: 484880 kB
KernelStack: 9072 kB
PageTables: 466576 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 49550936 kB
Committed_AS: 1184432 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 4096 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 24
HugePages_Free: 24
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 25165824 kB
DirectMap4k: 1391344 kB
DirectMap2M: 14241792 kB
DirectMap1G: 53477376 kB

[root@dell-per730-18 bond]# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-80.el8.x86_64 root=/dev/mapper/rhel_dell--per730--18-root ro kpti ksdevice=bootif crashkernel=auto resume=/dev/mapper/rhel_dell--per730--18-swap rd.lvm.lv=rhel_dell-per730-18/root rd.lvm.lv=rhel_dell-per730-18/swap console=ttyS0,115200n81 skew_tick=1 nohz_full=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 rcu_nocbs=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 tuned.non_isolcpus=aaaaaaab intel_pstate=disable nosoftlockup nohz=on default_hugepagesz=1G hugepagesz=1G hugepages=24 intel_iommu=on iommu=pt modprobe.blacklist=qedi modprobe.blacklist=qedf modprobe.blacklist=qedr skew_tick=1 nohz=on nohz_full=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 rcu_nocbs=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30 tuned.non_isolcpus=aaaaaaab intel_pstate=disable nosoftlockup
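As a rough cross-check of the /proc/meminfo numbers, the following Python sketch compares the free hugepage memory with the total requested through other_config:dpdk-socket-mem=4096,4096. The helper names are hypothetical and the sample text only repeats three fields from the output in this comment. It sums over both NUMA nodes and does not look at the per-node hugepage split, which is an assumption that may not hold on a real system.

```python
import re

# Three fields copied from the /proc/meminfo output in this comment.
MEMINFO_SAMPLE = """\
HugePages_Total:      24
HugePages_Free:       24
Hugepagesize:    1048576 kB
"""


def hugepage_free_mb(meminfo_text):
    """Return free hugepage memory in MB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    # Hugepagesize is reported in kB, so divide by 1024 to get MB.
    return fields["HugePages_Free"] * fields["Hugepagesize"] // 1024


def socket_mem_total_mb(socket_mem):
    """Sum a comma-separated dpdk-socket-mem value (MB per NUMA node)."""
    return sum(int(part) for part in socket_mem.split(","))


free_mb = hugepage_free_mb(MEMINFO_SAMPLE)
requested_mb = socket_mem_total_mb("4096,4096")
print(free_mb, requested_mb, free_mb >= requested_mb)
```

With 24 free 1G pages against 8192 MB requested, a plain shortage of hugepages does not look like the cause of "Cannot allocate memory"; the changelog entry later in this report attributes the fix to IOVA backports.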
For to do 1:
[root@dell-per730-18 ~]# for id in 0000:01:00.0 0000:01:00.1 0000:02:00.0 0000:02:00.1; do lspci -n -s $id; done
01:00.0 0200: 14e4:165f
01:00.1 0200: 14e4:165f
02:00.0 0200: 14e4:165f
02:00.1 0200: 14e4:165f

**************************************************************
For to do 2:
[root@dell-per730-18 ~]# modprobe openvswitch
[root@dell-per730-18 ~]# systemctl stop openvswitch
[root@dell-per730-18 ~]# sleep 3
[root@dell-per730-18 ~]# systemctl start openvswitch
[root@dell-per730-18 ~]# sleep 3
[root@dell-per730-18 ~]# ovs-vsctl --if-exists del-br ovsbr0
[root@dell-per730-18 ~]# sleep 5
[root@dell-per730-18 ~]# ovs-vsctl set Open_vSwitch . other_config={}
[root@dell-per730-18 ~]# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="-b 0000:04:00.0 -b 0000:04:00.1"
[root@dell-per730-18 ~]# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
[root@dell-per730-18 ~]# ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096
[root@dell-per730-18 ~]# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x55555554
2019-05-28T03:18:42Z|00002|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:18:42Z|00003|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:19:55Z|00004|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:19:55Z|00005|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:01Z|00006|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:01Z|00007|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:04Z|00008|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:04Z|00009|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:07Z|00010|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:07Z|00011|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:38Z|00012|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:38Z|00013|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:41Z|00014|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:41Z|00015|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:20:48Z|00016|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:20:48Z|00017|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2019-05-28T03:21:29Z|00018|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2019-05-28T03:21:29Z|00019|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)

Log as below:
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Failed to start Open vSwitch Forwarding Unit.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Dependency failed for Open vSwitch.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: openvswitch.service: Job openvswitch.service/start failed with result 'dependency'.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Service RestartSec=100ms expired, scheduling restart.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 10.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Forwarding Unit.
May 27 23:21:32 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Forwarding Unit...
May 27 23:21:58 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17718]: ovs|00017|dpdk|ERR|EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
May 27 23:21:58 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17718]: ovs|00018|dpdk|ERR|EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.
May 27 23:21:59 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17676]: Starting ovs-vswitchd EAL: FATAL: Cannot init memory
May 27 23:21:59 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17718]: ovs|00019|dpdk|ERR|EAL: Cannot init memory
May 27 23:21:59 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17676]: 2019-05-28T03:21:59Z|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 27 23:21:59 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17676]: ovs-vswitchd: Cannot init EAL (Cannot allocate memory)
May 27 23:21:59 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17718]: ovs|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17717]: ovs|00002|daemon_unix|ERR|fork child died before signaling startup (killed (Aborted))
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17717]: ovs|00003|daemon_unix|EMER|could not detach from foreground session
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17676]: ovs-vswitchd: could not detach from foreground session
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17676]: [13B blob data]
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Control process exited, code=exited status=1
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Failed with result 'exit-code'.
May 27 23:22:05 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Failed to start Open vSwitch Forwarding Unit.
May 27 23:22:06 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Service RestartSec=100ms expired, scheduling restart.
May 27 23:22:06 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 11.
May 27 23:22:06 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Forwarding Unit.
May 27 23:22:06 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Forwarding Unit...
May 27 23:22:28 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vsctl[16252]: ovs|00020|fatal_signal|WARN|terminating with signal 2 (Interrupt)
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17781]: ovs|00017|dpdk|ERR|EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17781]: ovs|00018|dpdk|ERR|EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17739]: Starting ovs-vswitchd EAL: FATAL: Cannot init memory
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17739]: 2019-05-28T03:22:32Z|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17739]: ovs-vswitchd: Cannot init EAL (Cannot allocate memory)
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17781]: ovs|00019|dpdk|ERR|EAL: Cannot init memory
May 27 23:22:32 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17781]: ovs|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17780]: ovs|00002|daemon_unix|ERR|fork child died before signaling startup (killed (Aborted))
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17780]: ovs|00003|daemon_unix|EMER|could not detach from foreground session
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17739]: ovs-vswitchd: could not detach from foreground session
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17739]: [13B blob data]
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Control process exited, code=exited status=1
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Failed with result 'exit-code'.
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Failed to start Open vSwitch Forwarding Unit.
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Service RestartSec=100ms expired, scheduling restart.
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 12.
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Stopped Open vSwitch Forwarding Unit.
May 27 23:22:39 dell-per730-18.rhts.eng.pek2.redhat.com systemd[1]: Starting Open vSwitch Forwarding Unit...
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17836]: ovs|00017|dpdk|ERR|EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17836]: ovs|00018|dpdk|ERR|EAL: Cannot obtain physical addresses: Permission denied. Only vfio will function.
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17794]: Starting ovs-vswitchd EAL: FATAL: Cannot init memory
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17794]: 2019-05-28T03:23:06Z|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-ctl[17794]: ovs-vswitchd: Cannot init EAL (Cannot allocate memory)
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17836]: ovs|00019|dpdk|ERR|EAL: Cannot init memory
May 27 23:23:06 dell-per730-18.rhts.eng.pek2.redhat.com ovs-vswitchd[17836]: ovs|00020|dpdk|EMER|Unable to initialize DPDK: Cannot allocate memory
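The pmd-cpu-mask=0x55555554 value set in the reproduction steps is a hexadecimal bitmask in which bit N selects CPU N. A minimal Python sketch to decode it follows; the helper name is hypothetical and not part of any OVS tooling.

```python
def cpus_in_mask(mask):
    """Return the CPU ids whose bits are set in a hexadecimal mask string."""
    value = int(mask, 16)  # int() accepts the "0x" prefix when base is 16
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


pmd_cpus = cpus_in_mask("0x55555554")
print(pmd_cpus)
```

The decoded list is the even CPUs 2 through 30, which is the same set as the nohz_full and rcu_nocbs lists on the kernel command line of this host.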
Created attachment 1575471: ovs log file
Created attachment 1576457: openvswitch log with iova mode change
Created attachment 1576484: openvswitch log with iova mode change and bind dpdk
Could you please try the following test packages: http://brew-task-repos.usersys.redhat.com/repos/scratch/dmarchan/openvswitch2.11/2.11.0/20.el8fdn.bz1711739/
*** Bug 1672849 has been marked as a duplicate of this bug. ***
When testing the fixes (once pushed to FDP), please also run a non-regression test on Mellanox NICs (more info in bz1672849).
*** Bug 1736517 has been marked as a duplicate of this bug. ***
*** Bug 1737713 has been marked as a duplicate of this bug. ***
(In reply to David Marchand from comment #22)
> Please, could you have a try with the following test packages:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/dmarchan/openvswitch2.11/2.11.0/20.el8fdn.bz1711739/

wget http://brew-task-repos.usersys.redhat.com/repos/scratch/dmarchan/openvswitch2.11/2.11.0/20.el8fdn.bz1711739/x86_64/openvswitch2.11-2.11.0-20.el8fdn.bz1711739.src.rpm
rpm -ivh openvswitch2.11-2.11.0-20.el8fdn.bz1711739.src.rpm
yum -y install libcap-ng-devel
yum -y install libmnl-devel
yum install numactl-devel
yum install openssl-devel
yum -y install python3-devel
yum -y install python3-sphinx
yum -y install numactl-devel unbound-devel
rpmbuild -ba SPECS/openvswitch2.11.spec
rpm -ivh openvswitch2.11-2.11.0-20.el8.bz1711739.x86_64.rpm
rpm -ivh python3-openvswitch2.11-2.11.0-20.el8.bz1711739.x86_64.rpm
modprobe openvswitch
systemctl stop openvswitch
sleep 3
systemctl start openvswitch
sleep 3
ovs-vsctl --if-exists del-br ovsbr0
sleep 5
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,4096
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x55555554

15.899Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2019-08-20T04:13:15.926Z|00002|ovs_numa|INFO|Discovered 16 CPU cores on NUMA node 0
2019-08-20T04:13:15.926Z|00003|ovs_numa|INFO|Discovered 16 CPU cores on NUMA node 1
2019-08-20T04:13:15.926Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes and 32 CPU cores
2019-08-20T04:13:15.927Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2019-08-20T04:13:15.927Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2019-08-20T04:13:15.929Z|00007|dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable
2019-08-20T04:13:15.936Z|00008|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.11.0
2019-08-20T04:13:23.979Z|00009|dpdk|INFO|Using DPDK 18.11.2
2019-08-20T04:13:23.979Z|00010|dpdk|INFO|DPDK Enabled - initializing...
2019-08-20T04:13:23.979Z|00011|dpdk|INFO|No vhost-sock-dir provided - defaulting to /var/run/openvswitch
2019-08-20T04:13:23.979Z|00012|dpdk|INFO|IOMMU support for vhost-user-client disabled.
2019-08-20T04:13:23.979Z|00013|dpdk|INFO|Per port memory for DPDK devices disabled.
2019-08-20T04:13:23.979Z|00014|dpdk|INFO|EAL ARGS: ovs-vswitchd --socket-mem 1024,1024 --socket-limit 1024,1024 -l 0.
2019-08-20T04:13:23.985Z|00015|dpdk|INFO|EAL: Detected 32 lcore(s)
2019-08-20T04:13:23.985Z|00016|dpdk|INFO|EAL: Detected 2 NUMA nodes
2019-08-20T04:13:23.987Z|00017|dpdk|INFO|EAL: Multi-process socket /var/run/openvswitch/dpdk/rte/mp_socket
2019-08-20T04:13:24.018Z|00018|dpdk|INFO|EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
2019-08-20T04:13:24.018Z|00019|dpdk|INFO|EAL: Selected IOVA mode 'VA'
2019-08-20T04:13:24.019Z|00020|dpdk|INFO|EAL: Probing VFIO support...
2019-08-20T04:13:24.019Z|00021|dpdk|INFO|EAL: VFIO support initialized
2019-08-20T04:13:50.985Z|00022|dpdk|INFO|EAL: PCI device 0000:04:00.0 on NUMA socket 0
2019-08-20T04:13:50.985Z|00023|dpdk|INFO|EAL: probe driver: 15b3:1017 net_mlx5
2019-08-20T04:13:51.042Z|00024|dpdk|INFO|EAL: PCI device 0000:04:00.1 on NUMA socket 0
2019-08-20T04:13:51.042Z|00025|dpdk|INFO|EAL: probe driver: 15b3:1017 net_mlx5
2019-08-20T04:13:51.091Z|00026|dpdk|INFO|EAL: PCI device 0000:05:00.0 on NUMA socket 0
2019-08-20T04:13:51.091Z|00027|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe
2019-08-20T04:13:51.091Z|00028|dpdk|INFO|EAL: PCI device 0000:05:00.1 on NUMA socket 0
2019-08-20T04:13:51.091Z|00029|dpdk|INFO|EAL: probe driver: 8086:10fb net_ixgbe
2019-08-20T04:13:51.091Z|00030|dpdk|INFO|EAL: PCI device 0000:07:00.0 on NUMA socket 0
2019-08-20T04:13:51.091Z|00031|dpdk|INFO|EAL: probe driver: 8086:1583 net_i40e
2019-08-20T04:13:51.091Z|00032|dpdk|INFO|EAL: PCI device 0000:07:00.1 on NUMA socket 0
2019-08-20T04:13:51.091Z|00033|dpdk|INFO|EAL: probe driver: 8086:1583 net_i40e
2019-08-20T04:13:51.091Z|00034|dpdk|INFO|EAL: PCI device 0000:83:00.0 on NUMA socket 1
2019-08-20T04:13:51.091Z|00035|dpdk|INFO|EAL: probe driver: 8086:158b net_i40e
2019-08-20T04:13:51.091Z|00036|dpdk|INFO|EAL: PCI device 0000:83:00.1 on NUMA socket 1
2019-08-20T04:13:51.091Z|00037|dpdk|INFO|EAL: probe driver: 8086:158b net_i40e
2019-08-20T04:13:51.091Z|00038|dpdk|INFO|EAL: PCI device 0000:86:00.0 on NUMA socket 1
2019-08-20T04:13:51.091Z|00039|dpdk|INFO|EAL: probe driver: 8086:1572 net_i40e
2019-08-20T04:13:51.091Z|00040|dpdk|INFO|EAL: PCI device 0000:86:00.1 on NUMA socket 1
2019-08-20T04:13:51.091Z|00041|dpdk|INFO|EAL: probe driver: 8086:1572 net_i40e
2019-08-20T04:13:51.092Z|00042|dpdk|INFO|DPDK Enabled - initialized
2019-08-20T04:13:51.092Z|00043|timeval|WARN|Unreasonably long 27113ms poll interval (7ms user, 26858ms system)
2019-08-20T04:13:51.092Z|00044|timeval|WARN|faults: 69208264 minor, 0 major
2019-08-20T04:13:51.092Z|00045|timeval|WARN|context switches: 96 voluntary, 21 involuntary
2019-08-20T04:27:45.109Z|00045|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=9ebe8ac7:
2019-08-20T04:27:45.109Z|00046|coverage|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0000/sec total: 1
2019-08-20T04:27:45.109Z|00047|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0000/sec total: 9
2019-08-20T04:27:45.109Z|00048|coverage|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.0000/sec total: 14
2019-08-20T04:27:45.109Z|00049|coverage|INFO|hmap_expand 0.0/sec 0.000/sec 0.0000/sec total: 405
2019-08-20T04:27:45.109Z|00050|coverage|INFO|txn_unchanged 0.0/sec 0.000/sec 0.0000/sec total: 3
2019-08-20T04:27:45.109Z|00051|coverage|INFO|poll_create_node 0.0/sec 0.000/sec 0.0000/sec total: 53
2019-08-20T04:27:45.109Z|00052|coverage|INFO|poll_zero_timeout 0.0/sec 0.000/sec 0.0000/sec total: 1
2019-08-20T04:27:45.109Z|00053|coverage|INFO|seq_change 0.0/sec 0.000/sec 0.0000/sec total: 65
2019-08-20T04:27:45.109Z|00054|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 1
2019-08-20T04:27:45.109Z|00055|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1
2019-08-20T04:27:45.109Z|00056|coverage|INFO|util_xalloc 0.0/sec 0.000/sec 0.0000/sec total: 7204
2019-08-20T04:27:45.109Z|00057|coverage|INFO|netlink_received 0.0/sec 0.000/sec 0.0000/sec total: 60
2019-08-20T04:27:45.109Z|00058|coverage|INFO|netlink_recv_jumbo 0.0/sec 0.000/sec 0.0000/sec total: 7
2019-08-20T04:27:45.109Z|00059|coverage|INFO|netlink_sent 0.0/sec 0.000/sec 0.0000/sec total: 54
2019-08-20T04:27:45.109Z|00060|coverage|INFO|92 events never hit
2019-08-20T04:27:45.109Z|00061|poll_loop|INFO|wakeup due to [POLLIN] on fd 16 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (98% CPU usage)
2019-08-20T04:27:45.109Z|00062|memory|INFO|58124 kB peak resident set size after 29.6 seconds
2019-08-20T04:27:45.282Z|00063|poll_loop|INFO|wakeup due to [POLLIN] on fd 43 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (98% CPU usage)
2019-08-20T04:27:45.282Z|00064|poll_loop|INFO|wakeup due to [POLLIN] on fd 18 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (98% CPU usage)

With the test packages, ovs-vswitchd starts and DPDK initializes successfully, so the issue is fixed.
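For repeated verification runs, the check done by eye on the ovs-vswitchd.log output could be scripted. The Python sketch below assumes the pipe-separated layout seen in this log (timestamp, sequence number, module, level, message); the function names are hypothetical and the sample lines are copied from the log in this comment.

```python
# Sample lines copied from the ovs-vswitchd.log output in this comment.
VLOG_SAMPLE = """\
2019-08-20T04:13:24.018Z|00018|dpdk|INFO|EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
2019-08-20T04:13:24.018Z|00019|dpdk|INFO|EAL: Selected IOVA mode 'VA'
2019-08-20T04:13:51.092Z|00042|dpdk|INFO|DPDK Enabled - initialized
"""


def parse_vlog(line):
    """Split one log line into its five pipe-separated fields."""
    timestamp, seq, module, level, message = line.split("|", 4)
    return {"timestamp": timestamp, "seq": seq, "module": module,
            "level": level, "message": message}


def dpdk_initialized(log_text):
    """Return True if the dpdk module logged a successful initialization."""
    for line in log_text.splitlines():
        if line.count("|") < 4:
            continue  # not a log record in the expected layout
        record = parse_vlog(line)
        if (record["module"] == "dpdk"
                and record["message"] == "DPDK Enabled - initialized"):
            return True
    return False


print(dpdk_initialized(VLOG_SAMPLE))
```

In the failing runs earlier in this report the dpdk module logs EMER "Unable to initialize DPDK: Cannot allocate memory" instead, so the same function would return False for those logs.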
* Tue Aug 06 2019 David Marchand <david.marchand> - 2.11.0-20
- Renumbered dpdk patches
- Backport IOVA fixes (#1711739)
Comment to synch with JIRA
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2940