Description of problem:

openstack kilo with all updates
centos7.1
openvswitch 2.3.1 with centos7 kernel 3.10.0-123.20.1.el7.x86_64

I had a working openstack infrastructure. I can instantiate a VM, it fetches metadata correctly, and a private IP is correctly associated with the instance. I can

# ip netns exec qrouter-97be4b64-71d0-4443-84d0-ec8cfb9f94a4 ping 192.168.1.3
PING 192.168.1.3 (192.168.1.3) 56(84) bytes of data.
64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.387 ms
64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.426 ms

After upgrading all openstack nodes, including the network node (neutron agents) and compute node (nova with kvm), to kernel 3.10.0-229.11.1.el7.x86_64, with no changes in the configuration whatsoever:

# ip netns exec qrouter-97be4b64-71d0-4443-84d0-ec8cfb9f94a4 ping 192.168.1.3
PING 192.168.1.3 (192.168.1.3) 56(84) bytes of data.
From 192.168.1.254 icmp_seq=1 Destination Host Unreachable
From 192.168.1.254 icmp_seq=2 Destination Host Unreachable

I have fetched openvswitch 2.4.0 and compiled it against the latest kernel, as well as compiled the openvswitch kernel module, still with no success.

I have taken a deep look into all logs (with debug mode), and I have performed route and tcpdump checks in the net namespace, and couldn't determine what the exact problem was.

ping reaches the private router

# ip netns exec qrouter-43dbd8f3-b36c-4f43-be95-91e3d92cc86a ping 192.168.1.254
PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
64 bytes from 192.168.1.254: icmp_seq=1 ttl=64 time=0.067 ms
64 bytes from 192.168.1.254: icmp_seq=2 ttl=64 time=0.025 ms

but it seems the problem is that br-int does not pass any packets to the instance. Doing

# ip netns exec qrouter-43dbd8f3-b36c-4f43-be95-91e3d92cc86a ping 192.168.1.2

I see

# ip netns exec qrouter-43dbd8f3-b36c-4f43-be95-91e3d92cc86a tcpdump -i qr-871497f4-1b -vvv host 192.168.1.2
tcpdump: listening on qr-871497f4-1b, link-type EN10MB (Ethernet), capture size 65535 bytes
13:19:27.627954 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.2 tell nimbus-net01.ncg.ingrid.pt, length 28
13:19:28.629674 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.2 tell nimbus-net01.ncg.ingrid.pt, length 28

there are requests, but no replies

For kernel 3.10.0-123.20.1.el7.x86_64

# modinfo openvswitch
filename:       /lib/modules/3.10.0-123.20.1.el7.x86_64/kernel/net/openvswitch/openvswitch.ko
license:        GPL
description:    Open vSwitch switching datapath
srcversion:     1241855A733802E089FD201
depends:        libcrc32c,vxlan,gre
intree:         Y
vermagic:       3.10.0-123.20.1.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        18:2E:BB:09:CD:40:C9:4C:A0:C3:CE:4E:E3:F7:1D:F5:20:B4:DA:80
sig_hashalgo:   sha256

# modinfo gre
filename:       /lib/modules/3.10.0-123.20.1.el7.x86_64/kernel/net/ipv4/gre.ko
license:        GPL
author:         D. Kozlov (xeb)
description:    GRE over IPv4 demultiplexer driver
srcversion:     976DD3A723FD7DBEA067264
depends:
intree:         Y
vermagic:       3.10.0-123.20.1.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        18:2E:BB:09:CD:40:C9:4C:A0:C3:CE:4E:E3:F7:1D:F5:20:B4:DA:80
sig_hashalgo:   sha256

For kernel 3.10.0-229.11.1.el7.x86_64

# modinfo openvswitch
filename:       /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/openvswitch/openvswitch.ko
license:        GPL
description:    Open vSwitch switching datapath
rhelversion:    7.1
srcversion:     FFFD428FDB7B6580B22B985
depends:        libcrc32c,vxlan,gre
intree:         Y
vermagic:       3.10.0-229.11.1.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        99:7D:A0:E2:1A:70:E7:B6:13:42:3A:B6:22:65:07:4A:78:60:35:4C
sig_hashalgo:   sha256

# modinfo gre
filename:       /lib/modules/3.10.0-229.11.1.el7.x86_64/kernel/net/ipv4/gre.ko
license:        GPL
author:         D. Kozlov (xeb)
description:    GRE over IPv4 demultiplexer driver
rhelversion:    7.1
srcversion:     4F8C563CCD7AC190E40FEE6
depends:
intree:         Y
vermagic:       3.10.0-229.11.1.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        99:7D:A0:E2:1A:70:E7:B6:13:42:3A:B6:22:65:07:4A:78:60:35:4C
sig_hashalgo:   sha256

If requested, I can post ovs-vsctl show, ifconfig and any logs.

Version-Release number of selected component (if applicable):

How reproducible:
In principle it should be reproducible always with the following steps

Steps to Reproduce:
1. fully working openstack kilo in centos 7.1 where network and compute node are at kernel 3.10.0-123.20.1.el7.x86_64
2. instantiate a VM and check the network (private net) to it. do not destroy this VM
3. upgrade network and compute node to kernel 3.10.0-229.11.1.el7.x86_64, startup the same VM, and check the network

Actual results:

Expected results:

Additional info:
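Since the report mentions both the in-tree openvswitch module and a separately compiled one, it may help to confirm which module the running kernel has actually loaded. The following is only a minimal sketch: `compare_srcversion` and `check_module` are hypothetical helper names, and it assumes the module exposes `/sys/module/<name>/srcversion` and that `modinfo -F srcversion` resolves to the module file the running kernel would load.

```shell
#!/bin/sh
# Sketch only: compare the srcversion of a loaded kernel module with the
# srcversion that modinfo reports for the on-disk module of the running kernel.
# Both function names are hypothetical and do not come from the report.

compare_srcversion() {
    # $1 = srcversion of the loaded module, $2 = srcversion reported by modinfo
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

check_module() {
    # $1 = module name, for example openvswitch or gre
    loaded=$(cat "/sys/module/$1/srcversion" 2>/dev/null)
    ondisk=$(modinfo -F srcversion "$1" 2>/dev/null)
    echo "$1: loaded=$loaded ondisk=$ondisk $(compare_srcversion "$loaded" "$ondisk")"
}
```

Running `check_module openvswitch` and `check_module gre` on the network and compute nodes after a reboot would print one line per module. A mismatch would only suggest that the loaded module is not the one modinfo describes; it would not by itself explain the missing ARP replies.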
What happens if you just boot with the previous kernel? Does it work? Thanks
Hi Flavio,

My latest tests were exactly that: I just booted the previous kernel, and the tunnels and all networking to the VM came up successfully.

Best,
Mario
Hi Mario,

I've checked with OVS QE and RHOS QE and OVS works with that kernel, so I am surprised that changing the kernel breaks something. So, I will need additional information. Could you please provide the output of the following commands after reproducing the issue?

# rpm -qi openvswitch
# rpm -V openvswitch
# ovs-vsctl show
# iptables -L -nv
# iptables -t nat -L -nv
# ovs-ofctl dump-flows br-ex
# ovs-ofctl dump-flows br-int
# ip netns exec qrouter-<UID> ip addr list
# ip netns exec qrouter-<UID> ip link list
# ip netns exec qrouter-<UID> iptables -L -nv
# ip netns exec qrouter-<UID> iptables -t nat -L -nv
# dmesg
# /var/log/openvswitch/*log
# systemctl

Thanks!
I forgot one command:

# plotnetcfg

Thanks
fbl
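The commands requested in the two comments above could be gathered in one pass with a small wrapper. This is a rough sketch, not something posted in the thread: `run_cmd`, `collect_all` and the `OUTDIR` variable are hypothetical names, the router namespace has to be passed in by whoever runs it, and it assumes a root shell on the network node.

```shell
#!/bin/sh
# Sketch only: save the output of each requested diagnostic command to its own
# file. run_cmd, collect_all and OUTDIR are hypothetical names.
: "${OUTDIR:=./ovs-diag}"

run_cmd() {
    # $1 = output file name (without extension); remaining args = command to run
    name="$1"; shift
    mkdir -p "$OUTDIR"
    # Record the command line first, then stdout and stderr; keep going on failure.
    { echo "# $*"; "$@" 2>&1; } > "$OUTDIR/$name.txt" || true
}

collect_all() {
    # $1 = router namespace name as shown by "ip netns list"
    ns="$1"
    run_cmd rpm-qi            rpm -qi openvswitch
    run_cmd rpm-verify        rpm -V openvswitch
    run_cmd ovs-vsctl-show    ovs-vsctl show
    run_cmd iptables-filter   iptables -L -nv
    run_cmd iptables-nat      iptables -t nat -L -nv
    run_cmd flows-br-ex       ovs-ofctl dump-flows br-ex
    run_cmd flows-br-int      ovs-ofctl dump-flows br-int
    run_cmd ns-addr           ip netns exec "$ns" ip addr list
    run_cmd ns-link           ip netns exec "$ns" ip link list
    run_cmd ns-iptables       ip netns exec "$ns" iptables -L -nv
    run_cmd ns-iptables-nat   ip netns exec "$ns" iptables -t nat -L -nv
    run_cmd dmesg             dmesg
    run_cmd systemctl         systemctl
    run_cmd plotnetcfg        plotnetcfg
    # The Open vSwitch logs are files rather than commands, so copy them.
    cp /var/log/openvswitch/*log "$OUTDIR"/ 2>/dev/null || true
}
```

After sourcing the file, `collect_all qrouter-<UID>` (with the real namespace name substituted) would leave one text file per command under `./ovs-diag`, which can then be attached to the bug.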
Hi Flavio,

Sorry for the late reply, I was on vacation.

In the meantime I have upgraded to the rpms from 24 August. I don't know at the moment if that was the problem, but it disappeared.

I managed to successfully instantiate an instance with both the net and comp nodes on the latest kernel, and the network works properly with both a private IP and a public IP.

Apologies for this, and thanks for the help. You can close the bug (or non-bug, as I see it now).

Best,
Mario
Hi Mario, Ok, unfortunately I couldn't reproduce the issue yet, so I can't dig deeper. I will close this bug with insufficient data and you're free to re-open if you see the issue again. Thanks!