Bug 1473763 - openstack-neutron: after rebooting compute, br-isolated doesn't have any flows.
Status: ON_QA
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 12.0 (Pike)
Assigned To: Jakub Libosvar
QA Contact: Marian Krcmarik
Keywords: AutomationBlocker, Triaged
Duplicates: 1469751 1475764
Depends On:
Blocks:
 
Reported: 2017-07-21 10:52 EDT by Alexander Chuzhoy
Modified: 2017-09-27 12:00 EDT
CC List: 14 users

See Also:
Fixed In Version: os-net-config-7.2.1-0.20170825174722.77fe592.el7ost
Type: Bug


Attachments
neutron-openvswitch-agent log (114.58 KB, text/plain)
2017-07-21 10:57 EDT, Alexander Chuzhoy
neutron-openvswitch-agent log after enabling debug (577.52 KB, text/plain)
2017-07-21 11:08 EDT, Alexander Chuzhoy


External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1712517  None      None    None     2017-08-23 09:47 EDT
OpenStack gerrit  495919   None      None    None     2017-08-21 12:30 EDT
OpenStack gerrit  496707   None      None    None     2017-08-23 09:24 EDT

Description Alexander Chuzhoy 2017-07-21 10:52:21 EDT
openstack-neutron: after rebooting compute, br-isolated doesn't have any flows.

Environment:
python-neutron-lib-1.7.0-0.20170529134801.0ee4f4a.el7ost.noarch
openstack-neutron-ml2-11.0.0-0.20170710190714.a51271d.el7ost.noarch
openstack-neutron-openvswitch-11.0.0-0.20170710190714.a51271d.el7ost.noarch
openstack-neutron-11.0.0-0.20170710190714.a51271d.el7ost.noarch
python-neutronclient-6.3.0-0.20170601203754.ba535c6.el7ost.noarch
python-neutron-11.0.0-0.20170710190714.a51271d.el7ost.noarch
python-neutron-lbaas-11.0.0-0.20170706121958.el7ost.noarch
openstack-neutron-metering-agent-11.0.0-0.20170710190714.a51271d.el7ost.noarch
puppet-neutron-11.2.0-0.20170704143150.62f49da.el7ost.noarch
openstack-neutron-l2gw-agent-10.1.0-0.20170706052213.25133e6.el7ost.noarch
openstack-neutron-sriov-nic-agent-11.0.0-0.20170710190714.a51271d.el7ost.noarch
openstack-neutron-lbaas-11.0.0-0.20170706121958.el7ost.noarch
openstack-neutron-linuxbridge-11.0.0-0.20170710190714.a51271d.el7ost.noarch
openstack-neutron-common-11.0.0-0.20170710190714.a51271d.el7ost.noarch

openstack-tripleo-heat-templates-7.0.0-0.20170710191337.el7ost.noarch
instack-undercloud-7.1.1-0.20170710151630.el7ost.noarch
openstack-puppet-modules-10.0.0-0.20170315222135.0333c73.el7.1.noarch


Steps to reproduce:
1. Deploy OSP12.
2. Reboot the compute node.
3. From the rebooted compute node, try to ping other machines in the same setup via their IPs on networks residing on the br-isolated bridge.

Result:

()[root@overcloud-compute-0 /]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# HEAT_HOSTS_START - Do not edit manually within this section!
10.0.0.108  overcloud.localdomain
192.168.24.15  overcloud.ctlplane.localdomain
172.17.1.17  overcloud.internalapi.localdomain
172.17.3.11  overcloud.storage.localdomain
172.17.4.10  overcloud.storagemgmt.localdomain
172.17.1.22 overcloud-controller-0.localdomain overcloud-controller-0
10.0.0.103 overcloud-controller-0.external.localdomain overcloud-controller-0.external
172.17.1.22 overcloud-controller-0.internalapi.localdomain overcloud-controller-0.internalapi
172.17.3.10 overcloud-controller-0.storage.localdomain overcloud-controller-0.storage
172.17.4.11 overcloud-controller-0.storagemgmt.localdomain overcloud-controller-0.storagemgmt
172.17.2.16 overcloud-controller-0.tenant.localdomain overcloud-controller-0.tenant
192.168.24.8 overcloud-controller-0.management.localdomain overcloud-controller-0.management
192.168.24.8 overcloud-controller-0.ctlplane.localdomain overcloud-controller-0.ctlplane
172.17.1.12 overcloud-controller-1.localdomain overcloud-controller-1
10.0.0.105 overcloud-controller-1.external.localdomain overcloud-controller-1.external
172.17.1.12 overcloud-controller-1.internalapi.localdomain overcloud-controller-1.internalapi
172.17.3.20 overcloud-controller-1.storage.localdomain overcloud-controller-1.storage
172.17.4.18 overcloud-controller-1.storagemgmt.localdomain overcloud-controller-1.storagemgmt
172.17.2.20 overcloud-controller-1.tenant.localdomain overcloud-controller-1.tenant
192.168.24.7 overcloud-controller-1.management.localdomain overcloud-controller-1.management
192.168.24.7 overcloud-controller-1.ctlplane.localdomain overcloud-controller-1.ctlplane
172.17.1.10 overcloud-controller-2.localdomain overcloud-controller-2
10.0.0.104 overcloud-controller-2.external.localdomain overcloud-controller-2.external
172.17.1.10 overcloud-controller-2.internalapi.localdomain overcloud-controller-2.internalapi
172.17.3.15 overcloud-controller-2.storage.localdomain overcloud-controller-2.storage
172.17.4.19 overcloud-controller-2.storagemgmt.localdomain overcloud-controller-2.storagemgmt
172.17.2.12 overcloud-controller-2.tenant.localdomain overcloud-controller-2.tenant
192.168.24.10 overcloud-controller-2.management.localdomain overcloud-controller-2.management
192.168.24.10 overcloud-controller-2.ctlplane.localdomain overcloud-controller-2.ctlplane

172.17.1.19 overcloud-compute-0.localdomain overcloud-compute-0
192.168.24.9 overcloud-compute-0.external.localdomain overcloud-compute-0.external
172.17.1.19 overcloud-compute-0.internalapi.localdomain overcloud-compute-0.internalapi
172.17.3.21 overcloud-compute-0.storage.localdomain overcloud-compute-0.storage
192.168.24.9 overcloud-compute-0.storagemgmt.localdomain overcloud-compute-0.storagemgmt
172.17.2.11 overcloud-compute-0.tenant.localdomain overcloud-compute-0.tenant
192.168.24.9 overcloud-compute-0.management.localdomain overcloud-compute-0.management
192.168.24.9 overcloud-compute-0.ctlplane.localdomain overcloud-compute-0.ctlplane
172.17.1.11 overcloud-compute-1.localdomain overcloud-compute-1
192.168.24.12 overcloud-compute-1.external.localdomain overcloud-compute-1.external
172.17.1.11 overcloud-compute-1.internalapi.localdomain overcloud-compute-1.internalapi
172.17.3.17 overcloud-compute-1.storage.localdomain overcloud-compute-1.storage
192.168.24.12 overcloud-compute-1.storagemgmt.localdomain overcloud-compute-1.storagemgmt
172.17.2.18 overcloud-compute-1.tenant.localdomain overcloud-compute-1.tenant
192.168.24.12 overcloud-compute-1.management.localdomain overcloud-compute-1.management
192.168.24.12 overcloud-compute-1.ctlplane.localdomain overcloud-compute-1.ctlplane




# HEAT_HOSTS_END

()[root@overcloud-compute-0 /]# ping 172.17.1.12
PING 172.17.1.12 (172.17.1.12) 56(84) bytes of data.
^C
--- 172.17.1.12 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

()[root@overcloud-compute-0 /]# ping 172.17.4.18   
PING 172.17.4.18 (172.17.4.18) 56(84) bytes of data.
^C
--- 172.17.4.18 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms




Expected result:
The pings should succeed.


Workaround:
Running "sudo ovs-ofctl add-flow br-isolated priority=0,actions=normal" on the affected compute node fixes the problem.
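
The workaround can be wrapped in a small check so that the catch-all flow is added only when the bridge really has no flows. The sketch below is an illustration only, not the fix that was merged: the helper names count_flows and restore_normal_flow are hypothetical, and it assumes ovs-ofctl is on PATH and the caller has root privileges. It counts flow entries by looking for lines that carry an "actions=" field in the "ovs-ofctl dump-flows" output.

```python
import subprocess

BRIDGE = "br-isolated"  # bridge named in this report


def count_flows(dump_output: str) -> int:
    """Count flow entries in `ovs-ofctl dump-flows` text output.

    Each flow entry is printed on its own line and carries an
    `actions=` field; the reply header line does not.
    """
    return sum(1 for line in dump_output.splitlines() if "actions=" in line)


def restore_normal_flow(bridge: str = BRIDGE) -> bool:
    """Add the catch-all NORMAL flow if the bridge has no flows.

    Returns True when a flow was added, False when flows already existed.
    """
    dump = subprocess.run(
        ["ovs-ofctl", "dump-flows", bridge],
        check=True, capture_output=True, text=True,
    ).stdout
    if count_flows(dump) > 0:
        return False
    # Same command as the manual workaround (hypothetical wrapper around it).
    subprocess.run(
        ["ovs-ofctl", "add-flow", bridge, "priority=0,actions=normal"],
        check=True,
    )
    return True
```

Calling restore_normal_flow() on the affected compute node after a reboot would leave existing flows alone and only add the priority=0 flow when the table is empty.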





More info:
[root@overcloud-compute-0 ~]# docker ps
CONTAINER ID        IMAGE                                                                               COMMAND             CREATED             STATUS              PORTS               NAMES
22be6672c1b1        192.168.24.1:8787/rhosp12/openstack-neutron-openvswitch-agent-docker:2017-07-13.2   "kolla_start"       10 hours ago        Up 20 minutes                           neutron_ovs_agent
c116a66fe481        192.168.24.1:8787/rhosp12/openstack-ceilometer-compute-docker:2017-07-13.2          "kolla_start"       10 hours ago        Up 20 minutes                           ceilometer_agent_compute
235fb393b375        192.168.24.1:8787/rhosp12/openstack-nova-compute-docker:2017-07-13.2                "kolla_start"       10 hours ago        Up 20 minutes                           nova_compute

[root@overcloud-compute-0 ~]# docker logs neutron_ovs_agent
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/neutron/plugins/ml2/openvswitch_agent.ini
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/neutron/plugins/ml2/openvswitch_agent.ini to /etc/neutron/plugins/ml2/openvswitch_agent.ini
INFO:__main__:Deleting /etc/neutron/neutron.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/neutron/neutron.conf to /etc/neutron/neutron.conf
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/neutron
INFO:__main__:Setting permission for /var/log/neutron/neutron-openvswitch-agent.log
Running command: '/usr/bin/neutron-openvswitch-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --config-dir /etc/neutron/conf.d/common'
Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.


Command used to launch the neutron_ovs_agent container:
/bin/docker run --name neutron_ovs_agent --pid=host --env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS --volume=/etc/hosts:/etc/hosts:ro --volume=/etc/localtime:/etc/localtime:ro --volume=/etc/puppet:/etc/puppet:ro --volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume=/dev/log:/dev/log --volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro --volume=/var/lib/kolla/config_files/neutron_ovs_agent.json:/var/lib/kolla/config_files/config.json:ro --volume=/var/lib/config-data/puppet-generated/neutron/:/var/lib/kolla/config_files/src:ro --volume=/lib/modules:/lib/modules:ro --volume=/run:/run --volume=/var/log/containers/neutron:/var/log/neutron --net=host --privileged=true 192.168.24.1:8787/rhosp12/openstack-neutron-openvswitch-agent-docker:2017-07-13.2



[root@overcloud-compute-0 ~]# cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/openvswitch_agent.ini 

[ovs]
bridge_mappings=datacentre:br-ex,tenant:br-isolated
integration_bridge=br-int
tunnel_bridge=br-tun
local_ip=172.17.2.11

[agent]
l2_population=False
arp_responder=False
enable_distributed_routing=False
drop_flows_on_start=False
extensions=qos
tunnel_types=vxlan
vxlan_udp_port=4789

[securitygroup]
firewall_driver=iptables_hybrid
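
The bridge_mappings value above is a comma-separated list of physical_network:bridge pairs; because br-isolated appears there, the OVS agent treats it as a physical bridge. A minimal sketch of reading that option with Python's standard configparser follows. The function name read_bridge_mappings is hypothetical, and this is not how neutron itself parses the option.

```python
import configparser


def read_bridge_mappings(ini_text: str) -> dict:
    """Return {physical_network: bridge} from the [ovs] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("ovs", "bridge_mappings", fallback="")
    mappings = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty value or a trailing comma
        physnet, bridge = pair.split(":", 1)
        mappings[physnet.strip()] = bridge.strip()
    return mappings


# Abridged copy of the [ovs] section shown in this report.
SAMPLE = """
[ovs]
bridge_mappings=datacentre:br-ex,tenant:br-isolated
integration_bridge=br-int
"""

print(read_bridge_mappings(SAMPLE))
# prints {'datacentre': 'br-ex', 'tenant': 'br-isolated'}
```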



[root@overcloud-compute-0 ~]# cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf |grep -v "^#\|^$"
[DEFAULT]
auth_strategy=keystone
core_plugin=ml2
service_plugins=router,qos,trunk
dns_domain=openstacklocal
allow_overlapping_ips=True
host=overcloud-compute-0.redhat.local
global_physnet_mtu=1500
dhcp_agents_per_network=3
log_dir=/var/log/neutron
transport_url=rabbit://guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-0.internalapi.localdomain:5672,guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-1.internalapi.localdomain:5672,guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-2.internalapi.localdomain:5672/?ssl=0
rpc_backend=rabbit
control_exchange=neutron
[agent]
root_helper=sudo neutron-rootwrap /etc/neutron/rootwrap.conf
[cors]
[cors.subdomain]
[database]
[keystone_authtoken]
[matchmaker_redis]
[nova]
[oslo_concurrency]
lock_path=$state_path/lock
[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
transport_url=rabbit://guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-0.internalapi.localdomain:5672,guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-1.internalapi.localdomain:5672,guest:B863eK6PqfMmfEeWNdavFCRQj@overcloud-controller-2.internalapi.localdomain:5672/?ssl=0
[oslo_messaging_rabbit]
ssl=False
rabbit_port=5672
rabbit_userid=guest
rabbit_password=B863eK6PqfMmfEeWNdavFCRQj
heartbeat_timeout_threshold=60
[oslo_messaging_zmq]
[oslo_middleware]
[oslo_policy]
[quotas]
[ssl]




()[root@overcloud-compute-0 /]# ovs-vsctl show
93ba1521-7227-4014-a0ed-040507897121
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-ex
            Interface br-ex
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
    Bridge br-isolated
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: standalone
        Port "vlan30"
            tag: 30
            Interface "vlan30"
                type: internal
        Port "vlan20"
            tag: 20
            Interface "vlan20"
                type: internal
        Port "vlan50"
            tag: 50
            Interface "vlan50"
                type: internal
        Port "eth1"
            Interface "eth1"
        Port br-isolated
            Interface br-isolated
                type: internal
        Port phy-br-isolated
            Interface phy-br-isolated
                type: patch
                options: {peer=int-br-isolated}
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "vxlan-ac110214"
            Interface "vxlan-ac110214"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.11", out_key=flow, remote_ip="172.17.2.20"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-ac110210"
            Interface "vxlan-ac110210"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.11", out_key=flow, remote_ip="172.17.2.16"}
        Port "vxlan-ac11020c"
            Interface "vxlan-ac11020c"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.11", out_key=flow, remote_ip="172.17.2.12"}
        Port br-tun
            Interface br-tun
                type: internal
        Port "vxlan-ac110212"
            Interface "vxlan-ac110212"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.11", out_key=flow, remote_ip="172.17.2.18"}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port int-br-isolated
            Interface int-br-isolated
                type: patch
                options: {peer=phy-br-isolated}
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.7.0"
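
In the output above, br-isolated is the only bridge with fail_mode: standalone; the others are secure. To compare this across nodes or reboots, the text can be reduced to a bridge-to-fail_mode map. The following is a rough sketch with a hypothetical helper name, fail_modes; it relies only on the indentation-based layout shown above and is not an OVS API ("ovs-vsctl get-fail-mode BRIDGE" answers the same question for a single bridge).

```python
def fail_modes(show_output: str) -> dict:
    """Map each bridge name to its fail_mode (None if not listed)."""
    modes = {}
    current = None
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Bridge "):
            # Bridge names may or may not be quoted in the output.
            current = stripped[len("Bridge "):].strip('"')
            modes[current] = None
        elif stripped.startswith("fail_mode:") and current is not None:
            modes[current] = stripped.split(":", 1)[1].strip()
    return modes


# Abridged excerpt of the `ovs-vsctl show` output in this report.
EXCERPT = """\
    Bridge br-ex
        fail_mode: secure
    Bridge br-isolated
        fail_mode: standalone
"""

print(fail_modes(EXCERPT))
# prints {'br-ex': 'secure', 'br-isolated': 'standalone'}
```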
Comment 1 Alexander Chuzhoy 2017-07-21 10:57 EDT
Created attachment 1302482
neutron-openvswitch-agent log
Comment 2 Alexander Chuzhoy 2017-07-21 11:08 EDT
Created attachment 1302486
neutron-openvswitch-agent log after enabling debug
Comment 3 Alexander Chuzhoy 2017-07-21 11:18:57 EDT
The issue is intermittent; it does not reproduce after every reboot.
Comment 4 Jakub Libosvar 2017-08-17 10:00:35 EDT
Bug 1475764 reports that this issue can be reproduced outside of a container, so I'm marking this as a blocker and bumping its importance.
Comment 5 Jakub Libosvar 2017-08-17 10:00:48 EDT
*** Bug 1475764 has been marked as a duplicate of this bug. ***
Comment 12 Assaf Muller 2017-08-25 09:17:43 EDT
The fix has merged to upstream stable/pike and should show up in the next promotion.
Comment 14 Assaf Muller 2017-08-28 09:54:51 EDT
*** Bug 1469751 has been marked as a duplicate of this bug. ***
