Description of problem:

The problem manifests as hosts not getting DHCP responses and coming up with no networking. Upon further examination on the compute node, we can see the VLAN tags are not set up to forward the packets coming back from the dhcp-agent correctly.

For example, a failing host had the following network in its libvirt.xml

---
<interface type="bridge">
  <mac address="fa:16:3e:8d:89:ad"/>
  <model type="virtio"/>
  <source bridge="br-int"/>
  <target dev="tapdb931e65-a2"/>
  <virtualport type="openvswitch">
    <parameters interfaceid="db931e65-a204-481a-8328-c2fc61fdb637"/>
  </virtualport>
</interface>
---

We know the dhcp-agent is running on "host03". Watching the outgoing GRE traffic from the compute node, we see the host's DHCP request broadcast across all tunnels, and the response from host03

---
20:03:08.467913 IP host13.internal.oslab.priv > host03.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467925 IP host13.internal.oslab.priv > host14.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467931 IP host13.internal.oslab.priv > host15.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467936 IP host13.internal.oslab.priv > host10.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467940 IP host13.internal.oslab.priv > host07.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467944 IP host13.internal.oslab.priv > host12.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467947 IP host13.internal.oslab.priv > host08.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467952 IP host13.internal.oslab.priv > host09.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.467955 IP host13.internal.oslab.priv > host11.internal.oslab.priv: GREv0, key=0x3, length 360: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:8d:89:ad (oui Unknown), length 310
20:03:08.468176 IP host03.internal.oslab.priv > host13.internal.oslab.priv: GREv0, key=0x3, length 381: IP host03.internal.oslab.priv.bootps > 192.168.0.18.bootpc: BOOTP/DHCP, Reply, length 331
---

From this we can see it is using GRE tunnel id 0x3.
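(For reference, a capture like the one above is plain tcpdump on the tunnel interface; this is a sketch, and the interface name is an assumption.)

---
# Watch GRE-encapsulated traffic leaving/entering the compute node; tcpdump
# decodes the inner BOOTP/DHCP exchange and shows the tunnel key.
# IP protocol 47 is GRE; eth0 (the interface carrying the tunnel mesh) is an
# assumption -- substitute the real one.
tcpdump -n -i eth0 ip proto 47
---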
On the br-tun bridge of the compute node, we can see that the gre-1 port is connected to host03 (192.168.0.3), and that "gre-1" is port 10 on the switch

---
[root@host13 instances]# ovs-vsctl show
bb0ea84f-73a9-4d74-92f7-baf1c63a87ec
    Bridge br-tun
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, local_ip="192.168.0.13", out_key=flow, remote_ip="192.168.0.3"}

[root@host13 instances]# ovs-ofctl dump-ports-desc br-tun
OFPST_PORT_DESC reply (xid=0x2):
 10(gre-1): addr:b2:b6:6d:1a:6c:dd
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
---

Therefore, looking at the flow rules for the switch, we hit table 0 and match the following (in_port=10)

---
cookie=0x0, duration=763213.137s, table=0, n_packets=34572132, n_bytes=47713844512, idle_age=17, hard_age=65534, priority=1,in_port=10 actions=resubmit(,2)
---

which passes us to table 2

---
cookie=0x0, duration=32861.845s, table=2, n_packets=2983474, n_bytes=4133767077, idle_age=0, priority=1,tun_id=0x3 actions=mod_vlan_vid:232,resubmit(,10)
---

which adds the VLAN 232 tag and passes it to table 10

---
cookie=0x0, duration=766148.086s, table=10, n_packets=34715673, n_bytes=47730852993, idle_age=0, hard_age=65534, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
---

which saves the source MAC for future reference and sends the packet out port 1 (the internal br-tun->br-int patch, and theoretically on to the VM).

The problem is that, looking at the ports on br-int, our VM's interface is attached to a port tagged with VLAN 138

---
[root@host13 instances]# ovs-vsctl show
bb0ea84f-73a9-4d74-92f7-baf1c63a87ec
    Bridge br-int
        Port "tapdb931e65-a2"
            tag: 138
            Interface "tapdb931e65-a2"
---

The end result is that packets do not make it back to the VM.

It's a bit hard to determine what went wrong here. Looking at the OVS database logs we see

---
[root@host13 instances]# ovsdb-tool show-log | grep '02-11 12:07'
record 8502: 2014-02-11 12:07:29 "ovs-vsctl: ovs-vsctl --timeout=5 -- --may-exist add-port br-int tapdb931e65-a2 -- set Interface tapdb931e65-a2 "external-ids:attached-mac=\"FA:16:3E:8D:89:AD\"" -- set Interface tapdb931e65-a2 "external-ids:iface-id=\"db931e65-a204-481a-8328-c2fc61fdb637\"" -- set Interface tapdb931e65-a2 "external-ids:vm-id=\"79c1b1f3-6100-4b3c-8aec-30ebb823b43e\"" -- set Interface tapdb931e65-a2 external-ids:iface-status=active"
record 8503: 2014-02-11 12:07:29
record 8504: 2014-02-11 12:07:34 "ovs-vsctl: /usr/bin/ovs-vsctl --timeout=2 set Port tapdb931e65-a2 tag=232"
record 8505: 2014-02-11 12:07:34
record 8506: 2014-02-11 12:07:36 "ovs-vsctl: /usr/bin/ovs-vsctl --timeout=2 set Port tapdb931e65-a2 tag=138"
record 8507: 2014-02-11 12:07:36
---

which is interesting -- it set the right tag for the port and then, two seconds later, set the incorrect tag.

Looking at the output rules for the switch, I can see matches for both VLAN 232 and VLAN 138

---
cookie=0x0, duration=41413.007s, table=21, n_packets=107, n_bytes=21224, idle_age=474, priority=1,dl_vlan=138 actions=strip_vlan,set_tunnel:0x3,output:10,output:7,output:8,output:2,output:3,output:9,output:4,output:5,output:6
cookie=0x0, duration=41411.844s, table=21, n_packets=1106, n_bytes=79016, idle_age=9444, priority=1,dl_vlan=232 actions=strip_vlan,set_tunnel:0x3,output:8,output:7,output:10,output:2,output:3,output:9,output:4,output:5,output:6
---

This seems to explain why the packets get out, but can't come back in.
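A quick way to spot this mismatch on an affected node is to compare the local VLAN that br-tun rewrites the tunnel id to against the tag on the VM's br-int port (a diagnostic sketch, using the names from this report):

---
# Which local VLAN does br-tun map tunnel id 0x3 to?  (shows mod_vlan_vid:232 here)
ovs-ofctl dump-flows br-tun table=2 | grep 'tun_id=0x3'

# Which VLAN is the VM's port on br-int actually tagged with?  (shows 138 here)
ovs-vsctl get Port tapdb931e65-a2 tag
---

If the two numbers disagree, return traffic is being pushed onto a VLAN the VM's port will never see.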
Created attachment 862100 [details] openvswitch-agent log files from around the time of the interface setup
Created attachment 862101 [details] neutron server log file
I have attached log files from the openvswitch-agent on the compute node around the time of the problem, and the neutron server log files from the same time.

From the OVS plugin we see the VLAN switch happening

---
2014-02-11 12:07:34.163 7548 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Port db931e65-a204-481a-8328-c2fc61fdb637 updated. Details: {u'admin_state_up': True, u'network_id': u'20803109-5bea-4638-8b02-bcdbace3f0b2', u'segmentation_id': 3, u'physical_network': None, u'device': u'db931e65-a204-481a-8328-c2fc61fdb637', u'port_id': u'db931e65-a204-481a-8328-c2fc61fdb637', u'network_type': u'gre'}
2014-02-11 12:07:34.164 7548 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tapdb931e65-a2', 'tag=232'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
2014-02-11 12:07:35.835 653 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Port db931e65-a204-481a-8328-c2fc61fdb637 updated. Details: {u'admin_state_up': True, u'network_id': u'20803109-5bea-4638-8b02-bcdbace3f0b2', u'segmentation_id': 3, u'physical_network': None, u'device': u'db931e65-a204-481a-8328-c2fc61fdb637', u'port_id': u'db931e65-a204-481a-8328-c2fc61fdb637', u'network_type': u'gre'}
2014-02-11 12:07:35.836 653 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tapdb931e65-a2', 'tag=138'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
---

What is also interesting is two almost simultaneous calls to

---
2014-02-11 11:11:32.793 653 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Assigning 138 as local vlan for net-id=20803109-5bea-4638-8b02-bcdbace3f0b2
2014-02-11 11:11:33.971 7548 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Assigning 232 as local vlan for net-id=20803109-5bea-4638-8b02-bcdbace3f0b2
---

These calls to provision_local_vlan() [1] would have set up the output rules described above (but note this was well before these ports got configured). This explains why traffic from VLANs 138 and 232 can get *out*, since both rules are there. According to the logs, neither VLAN was reclaimed (reclaim_local_vlan() [1]) at any point.

[1] https://github.com/openstack/neutron/blob/master/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py
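The duplicated provisioning is easy to spot in the agent log once you know to look for it; something like the following (a sketch; field positions assume the default log format shown above) prints one line per (pid, vlan, net-id) combination:

---
# Each "Assigning N as local vlan for net-id=..." line is a call to
# provision_local_vlan().  The same net-id being given two different VLANs,
# by two different pids, is the smoking gun.
grep 'as local vlan for net-id' /var/log/neutron/openvswitch-agent.log | \
    awk '{print "pid="$3, "vlan="$8, $NF}' | sort -u
---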
A fresh set of eyes looking at this made the logs jump out at me

---
2014-02-11 11:11:32.793 653 INFO
2014-02-11 11:11:33.971 7548 INFO
---

That's the same message from two different pids. Sure enough, there are two agents running

---
[root@host13 neutron]# ps -aef | grep openvswitc[h]-agent
neutron 653 1 1 Feb02 ? 03:19:08 /usr/bin/python /usr/bin/neutron-openvswitch-agent --log-file /var/log/neutron/openvswitch-agent.log --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
neutron 7548 1 1 Jan20 ? 07:24:19 /usr/bin/python /usr/bin/neutron-openvswitch-agent --log-file /var/log/neutron/openvswitch-agent.log --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
---

This pretty much explains everything!
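For anyone chasing similar symptoms, the check is trivial (a diagnostic sketch, nothing official):

---
# There should be exactly one of these per compute node; the bracket trick
# stops grep from matching its own process.
ps -ef | grep -c '[n]eutron-openvswitch-agent'
---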
Ian, do you know how to reproduce this double-agent behavior? I tried:

1) restarting: ok
2) stopping: ok, it's gone
3) starting twice: only one started
I agree the init scripts look OK.

The second process had been running for about a month -- I think usually the agents would duplicate their actions sequentially, so things mostly appeared to work. My bot starts lots of VMs, so I hit the race condition where the two agents interleaved and ended up with invalid state.

There had been a lot of package upgrades, etc. in the meantime, so we can't really tell how it got into this state ... it certainly could be an operator issue. Usually with a daemon that binds to a port it's harder to end up in this situation.

I'm not very familiar with the RPC portion of the agent, but I wonder if there is some way of figuring out via queue names whether there is already an agent listening on this host, and refusing to start a second one?
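Until something like that exists, a crude guard at the init-script level would be file locking around the agent invocation -- a sketch only, with a made-up lock path and a trimmed command line:

---
# Refuse to start a second agent on this host: flock(1) holds the lock for
# the lifetime of the agent process, so a second invocation fails with a
# non-zero exit instead of silently starting a duplicate.
exec flock -n /var/lock/subsys/neutron-openvswitch-agent.lock \
    /usr/bin/neutron-openvswitch-agent \
        --config-file /etc/neutron/neutron.conf \
        --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
---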
(In reply to Ian Wienand from comment #8)
> There had been a lot of package upgrades, etc. in the meantime, so we
> can't really tell how it got into this state ... it certainly could be an
> operator issue.

Yes, it could have been anything -- upgrades, operator error, ... The pid-file mechanism with file locking should have been enough to avoid two agents; maybe an older version of the init script was different?

> Usually with a daemon that binds to a port it's harder to end up in this
> situation.

Totally true.

> I'm not very familiar with the RPC portion of the agent, but I wonder if
> there is some way of figuring out via queue names whether there is already
> an agent listening on this host, and refusing to start a second one?

It looks like a good RFE for Juno; Icehouse is in feature freeze now. Anyway, before adding more complexity to neutron, it would be good to get the init script reviewed by a few more eyes to see if we're missing something.
The origin of this problem was several agents of the same type started on the same host (which conflicted with each other). It was actually fixed in bz#1076994.

*** This bug has been marked as a duplicate of bug 1076994 ***