Description of problem:

The problem manifests as hosts not getting DHCP responses and coming up with no networking. Upon further examination on the compute node, we can see the VLAN tags are not set up to forward the packets coming back from the dhcp-agent correctly.

For example, a failing host had the following network in its libvirt.xml

---
<interface type="bridge">
  <mac address="fa:16:3e:8d:89:ad"/>
  <model type="virtio"/>
  <source bridge="br-int"/>
  <target dev="tapdb931e65-a2"/>
  <virtualport type="openvswitch">
    <parameters interfaceid="db931e65-a204-481a-8328-c2fc61fdb637"/>
  </virtualport>
</interface>
---

We know the dhcp-agent is running on "host03". Watching the outgoing GRE traffic from the compute node, we see the host's DHCP request broadcast across all tunnels, and the response from host03

---
20:03:08.467913 IP host13.internal.oslab.priv > host03.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467925 IP host13.internal.oslab.priv > host14.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467931 IP host13.internal.oslab.priv > host15.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467936 IP host13.internal.oslab.priv > host10.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467940 IP host13.internal.oslab.priv > host07.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467944 IP host13.internal.oslab.priv > host12.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467947 IP host13.internal.oslab.priv > host08.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467952 IP host13.internal.oslab.priv > host09.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467955 IP host13.internal.oslab.priv > host11.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.468176 IP host03.internal.oslab.priv > host13.internal.oslab.priv: GREv0, key=0x3, length 381: IP host03.internal.oslab.priv.bootps > 192.168.0.18.bootpc: BOOTP/DHCP, Reply, length 331
---

From this we can see it is using GRE tunnel id 0x3.
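(For reference, a capture like the one above is plain tcpdump on the tunnel interface; this is a sketch, and the interface name is an assumption.)

---
# Watch GRE-encapsulated traffic leaving/entering the compute node; tcpdump
# decodes the inner BOOTP/DHCP exchange and shows the tunnel key.
# IP protocol 47 is GRE; eth0 (the interface carrying the tunnel mesh) is an
# assumption -- substitute the real one.
tcpdump -n -i eth0 ip proto 47
---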
On the br-tun bridge of the compute node, we can see that the gre-1 port is connected to host03 (192.168.0.3), and that "gre-1" is port 10 on the switch

---
[root@host13 instances]# ovs-vsctl show
bb0ea84f-73a9-4d74-92f7-baf1c63a87ec
    Bridge br-tun
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, local_ip="192.168.0.13", out_key=flow, remote_ip="192.168.0.3"}

[root@host13 instances]# ovs-ofctl dump-ports-desc br-tun
OFPST_PORT_DESC reply (xid=0x2):
 10(gre-1): addr:b2:b6:6d:1a:6c:dd
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
---

Therefore, looking at the flow rules for the switch, we hit table 0 and match the following (in_port=10)

---
cookie=0x0, duration=763213.137s, table=0, n_packets=34572132, n_bytes=47713844512, idle_age=17, hard_age=65534, priority=1,in_port=10 actions=resubmit(,2)
---

which passes us to table 2

---
cookie=0x0, duration=32861.845s, table=2, n_packets=2983474, n_bytes=4133767077, idle_age=0, priority=1,tun_id=0x3 actions=mod_vlan_vid:232,resubmit(,10)
---

which adds the VLAN 232 tag and passes it to table 10

---
cookie=0x0, duration=766148.086s, table=10, n_packets=34715673, n_bytes=47730852993, idle_age=0, hard_age=65534, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
---

which saves the source MAC for future reference and sends the packet out port 1 (the internal br-tun->br-int patch, and theoretically on to the VM).

The problem is that, looking at the ports on br-int, our VM's interface is attached to a port tagged with VLAN 138

---
[root@host13 instances]# ovs-vsctl show
bb0ea84f-73a9-4d74-92f7-baf1c63a87ec
    Bridge br-int
        Port "tapdb931e65-a2"
            tag: 138
            Interface "tapdb931e65-a2"
---

The end result is that packets do not make it back to the VM.

It's a bit hard to determine what went wrong here. Looking at the OVS database logs we see

---
[root@host13 instances]# ovsdb-tool show-log | grep '02-11 12:07'
record 8502: 2014-02-11 12:07:29 "ovs-vsctl: ovs-vsctl --timeout=5 -- --may-exist add-port br-int tapdb931e65-a2 -- set Interface tapdb931e65-a2 "external-ids:attached-mac=\"FA:16:3E:8D:89:AD\"" -- set Interface tapdb931e65-a2 "external-ids:iface-id=\"db931e65-a204-481a-8328-c2fc61fdb637\"" -- set Interface tapdb931e65-a2 "external-ids:vm-id=\"79c1b1f3-6100-4b3c-8aec-30ebb823b43e\"" -- set Interface tapdb931e65-a2 external-ids:iface-status=active"
record 8503: 2014-02-11 12:07:29
record 8504: 2014-02-11 12:07:34 "ovs-vsctl: /usr/bin/ovs-vsctl --timeout=2 set Port tapdb931e65-a2 tag=232"
record 8505: 2014-02-11 12:07:34
record 8506: 2014-02-11 12:07:36 "ovs-vsctl: /usr/bin/ovs-vsctl --timeout=2 set Port tapdb931e65-a2 tag=138"
record 8507: 2014-02-11 12:07:36
---

which is interesting -- it set the right tag for the port and then, two seconds later, set the incorrect tag.

Looking at the output rules for the switch, I can see matches for both VLAN 232 and VLAN 138

---
cookie=0x0, duration=41413.007s, table=21, n_packets=107, n_bytes=21224, idle_age=474, priority=1,dl_vlan=138 actions=strip_vlan,set_tunnel:0x3,output:10,output:7,output:8,output:2,output:3,output:9,output:4,output:5,output:6
cookie=0x0, duration=41411.844s, table=21, n_packets=1106, n_bytes=79016, idle_age=9444, priority=1,dl_vlan=232 actions=strip_vlan,set_tunnel:0x3,output:8,output:7,output:10,output:2,output:3,output:9,output:4,output:5,output:6
---

This seems to explain why the packets get out, but can't come back in.
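A quick way to spot this mismatch on an affected node is to compare the local VLAN that br-tun rewrites the tunnel id to against the tag on the VM's br-int port (a diagnostic sketch, using the names from this report):

---
# Which local VLAN does br-tun map tunnel id 0x3 to?  (shows mod_vlan_vid:232 here)
ovs-ofctl dump-flows br-tun table=2 | grep 'tun_id=0x3'

# Which VLAN is the VM's port on br-int actually tagged with?  (shows 138 here)
ovs-vsctl get Port tapdb931e65-a2 tag
---

If the two numbers disagree, return traffic is being pushed onto a VLAN the VM's port will never see.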
Created attachment 862100 [details] openvswitch-agent log files from around the time of the interface setup
Created attachment 862101 [details] neutron server log file
I have attached log files from the openvswitch-agent on the compute node around the time of the problem, and the neutron server log files from the same time.

From the OVS plugin we see the VLAN switch happening

---
2014-02-11 12:07:34.163 7548 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Port db931e65-a204-481a-8328-c2fc61fdb637 updated. Details: {u'admin_state_up': True, u'network_id': u'20803109-5bea-4638-8b02-bcdbace3f0b2', u'segmentation_id': 3, u'physical_network': None, u'device': u'db931e65-a204-481a-8328-c2fc61fdb637', u'port_id': u'db931e65-a204-481a-8328-c2fc61fdb637', u'network_type': u'gre'}
2014-02-11 12:07:34.164 7548 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tapdb931e65-a2', 'tag=232'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
2014-02-11 12:07:35.835 653 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Port db931e65-a204-481a-8328-c2fc61fdb637 updated. Details: {u'admin_state_up': True, u'network_id': u'20803109-5bea-4638-8b02-bcdbace3f0b2', u'segmentation_id': 3, u'physical_network': None, u'device': u'db931e65-a204-481a-8328-c2fc61fdb637', u'port_id': u'db931e65-a204-481a-8328-c2fc61fdb637', u'network_type': u'gre'}
2014-02-11 12:07:35.836 653 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tapdb931e65-a2', 'tag=138'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
---

What is also interesting is two almost simultaneous calls to

---
2014-02-11 11:11:32.793 653 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Assigning 138 as local vlan for net-id=20803109-5bea-4638-8b02-bcdbace3f0b2
2014-02-11 11:11:33.971 7548 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Assigning 232 as local vlan for net-id=20803109-5bea-4638-8b02-bcdbace3f0b2
---

These calls to provision_local_vlan() [1] would have set up the output rules described above (but note this was well before these ports got configured). This explains why traffic from VLANs 138 and 232 can get *out*, since both rules are there. According to the logs, neither VLAN was reclaimed (reclaim_local_vlan() [1]) at any point.

[1] https://github.com/openstack/neutron/blob/master/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py
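The duplicated provisioning is easy to spot in the agent log once you know to look for it; something like the following (a sketch; field positions assume the default log format shown above) prints one line per (pid, vlan, net-id) combination:

---
# Each "Assigning N as local vlan for net-id=..." line is a call to
# provision_local_vlan().  The same net-id being given two different VLANs,
# by two different pids, is the smoking gun.
grep 'as local vlan for net-id' /var/log/neutron/openvswitch-agent.log | \
    awk '{print "pid="$3, "vlan="$8, $NF}' | sort -u
---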
A fresh set of eyes looking at this made the logs jump out at me

---
2014-02-11 11:11:32.793 653 INFO
2014-02-11 11:11:33.971 7548 INFO
---

That's the same message from two different pids. Sure enough, there are two agents running

---
[root@host13 neutron]# ps -aef | grep openvswitc[h]-agent
neutron 653 1 1 Feb02 ? 03:19:08 /usr/bin/python /usr/bin/neutron-openvswitch-agent --log-file /var/log/neutron/openvswitch-agent.log --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
neutron 7548 1 1 Jan20 ? 07:24:19 /usr/bin/python /usr/bin/neutron-openvswitch-agent --log-file /var/log/neutron/openvswitch-agent.log --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
---

This pretty much explains everything!
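For anyone chasing similar symptoms, the check is trivial (a diagnostic sketch, nothing official):

---
# There should be exactly one of these per compute node; the bracket trick
# stops grep from matching its own process.
ps -ef | grep -c '[n]eutron-openvswitch-agent'
---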
Ian, do you know how to reproduce this double-agent behavior? I tried:

1) restarting: ok
2) stopping: ok, it's gone
3) starting twice: only one started
I agree the init scripts look OK.

The second process had been running for about a month -- I think usually the agents would duplicate their actions sequentially, so things mostly appeared to work. My bot starts lots of VMs, so I hit the race condition where the two agents interleaved and ended up with invalid state.

There had been a lot of package upgrades, etc. in the meantime, so we can't really tell how it got into this state ... it certainly could be an operator issue. Usually with a daemon that binds to a port it's harder to end up in this situation.

I'm not very familiar with the RPC portion of the agent, but I wonder if there is some way of figuring out via queue names whether there is already an agent listening on this host, and refusing to start a second one?
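Until something like that exists, a crude guard at the init-script level would be file locking around the agent invocation -- a sketch only, with a made-up lock path and a trimmed command line:

---
# Refuse to start a second agent on this host: flock(1) holds the lock for
# the lifetime of the agent process, so a second invocation fails with a
# non-zero exit instead of silently starting a duplicate.
exec flock -n /var/lock/subsys/neutron-openvswitch-agent.lock \
    /usr/bin/neutron-openvswitch-agent \
        --config-file /etc/neutron/neutron.conf \
        --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
---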
(In reply to Ian Wienand from comment #8)
> There had been a lot of package upgrades, etc. in the meantime, so we
> can't really tell how it got into this state ... it certainly could be an
> operator issue.

Yes, it could have been anything -- upgrades, operator error, ... The pid-file mechanism with file locking should have been enough to avoid two agents; maybe an older version of the init script was different?

> Usually with a daemon that binds to a port it's harder to end up in this
> situation.

Totally true.

> I'm not very familiar with the RPC portion of the agent, but I wonder if
> there is some way of figuring out via queue names whether there is already
> an agent listening on this host, and refusing to start a second one?

It looks like a good RFE for Juno; Icehouse is in feature freeze now. Anyway, before adding more complexity to neutron, it would be good to get the init script reviewed by a few more eyes to see if we're missing something.
The origin of this problem was several agents of the same type started on the same host (which conflicted with each other). It was actually fixed in bz#1076994.

*** This bug has been marked as a duplicate of bug 1076994 ***