On a freshly installed OSP13 with:
- 3 controllers & 3 compute nodes
- containerized HA deployment
- no network isolation, 1 NIC only

On the compute nodes, the neutron_ovs_agent container is in "Restarting" status. Logs show the following error:

> 2018-03-15 03:52:53.643 254626 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_ryuapp [-] Agent main thread died of an exception: Exception: Could not retrieve schema from tcp:127.0.0.1:6640: Connection refused

No service listens on port 6640 on the compute nodes.

On the undercloud, `openstack stack failures list` returns no error.

This has been reproduced on deployments with 1 compute & 1 controller and with 3 computes & 3 controllers.

Steps to Reproduce:
1. Deploy OSP13 with containerized services, without network isolation and with a single NIC

Actual results:
the neutron_ovs_agent fails to reach ovsdb-server

Expected results:
the neutron_ovs_agent succeeds in reaching ovsdb-server

Additional info:
=======
Docker containers on the compute node:

[root@overcloud-novacompute-2 ~]# docker ps
CONTAINER ID   IMAGE                                                                                              COMMAND         CREATED        STATUS                       PORTS   NAMES
39f98b1db2ca   docker-registry.engineering.redhat.com/rhosp13/openstack-neutron-openvswitch-agent:2018-03-02.2   "kolla_start"   35 hours ago   Restarting (1) 6 hours ago           neutron_ovs_agent
3bce00a06d26   docker-registry.engineering.redhat.com/rhosp13/openstack-cron:2018-03-02.2                        "kolla_start"   35 hours ago   Up 35 hours                          logrotate_crond
aa4dbd562191   docker-registry.engineering.redhat.com/rhosp13/openstack-nova-compute:2018-03-02.2                "kolla_start"   35 hours ago   Up 35 hours                          nova_migration_target
3ac733b05a57   docker-registry.engineering.redhat.com/rhosp13/openstack-ceilometer-compute:2018-03-02.2          "kolla_start"   35 hours ago   Up 35 hours                          ceilometer_agent_compute
b134175bf647   docker-registry.engineering.redhat.com/rhosp13/openstack-nova-compute:2018-03-02.2                "kolla_start"   35 hours ago   Up 35 hours (healthy)                nova_compute
96cb93ecc0e4   docker-registry.engineering.redhat.com/rhosp13/openstack-iscsid:2018-03-02.2                      "kolla_start"   35 hours ago   Up 35 hours                          iscsid
a538a6c37cc5   docker-registry.engineering.redhat.com/rhosp13/openstack-nova-libvirt:2018-03-02.2                "kolla_start"   35 hours ago   Up 35 hours                          nova_libvirt
b52a34ff71c2   docker-registry.engineering.redhat.com/rhosp13/openstack-nova-libvirt:2018-03-02.2                "kolla_start"   35 hours ago   Up 35 hours                          nova_virtlogd

=========
Environment files used to deploy the overcloud:
- templates/00-node-info.yaml
- /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml
- templates/docker_registry.yaml
- /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml

==========
00-node-info.yaml (undercloud)

[stack@undercloud ~]$ cat templates/00-node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 3
  NtpServer: '172.16.0.1'
  NeutronNetworkType: 'vxlan,vlan'
  NeutronTunnelTypes: 'vxlan'
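For reference, these are the kinds of checks that show the symptom on an affected compute node (illustrative commands only; their output is not captured in this report):

# nothing listens on the ovsdb-server manager port
[root@overcloud-novacompute-2 ~]# ss -lntp | grep 6640

# check which manager targets ovsdb-server is configured with
[root@overcloud-novacompute-2 ~]# ovs-vsctl get-manager

# check the state of Open vSwitch on the host
[root@overcloud-novacompute-2 ~]# systemctl status openvswitch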
The same deployment with OSP12 has a working neutron-openvswitch-agent service.
Can you please generate an sosreport on any node with this failure?
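(If it helps, a sosreport can usually be generated on a RHEL 7 overcloud node along these lines; `--batch` just skips the interactive prompts. These commands are illustrative.)

[root@overcloud-novacompute-2 ~]# yum install -y sos
[root@overcloud-novacompute-2 ~]# sosreport --batch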
Hi, the sosreport from one compute node is available at the following URL:
http://file.rdu.redhat.com/~fcharlie/share/sosreport-overcloud-novacompute-2-20180319202056.tar.xz

Quoting sosreport: "The checksum is: 060b6e61f1e79cacfdfee551843f2a72"

I'll keep the lab up this time; it is reachable from the Red Hat VPN. I just need an SSH key to grant access to the undercloud.
I logged in to the machines and saw that the OVS service was unable to start. Disabling SELinux resolved the issue. Arie reported the SELinux problem in the bug this one duplicates; it already has a fix, which should be in the next puddle.

*** This bug has been marked as a duplicate of bug 1554964 ***
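For anyone who wants to confirm the same root cause on their own nodes before the fixed puddle lands, the usual checks look something like this (illustrative commands; switching SELinux to permissive is only a diagnostic step, not the fix):

# look for AVC denials involving Open vSwitch
[root@overcloud-novacompute-2 ~]# ausearch -m avc -ts recent | grep -i ovs

# temporarily set SELinux to permissive to confirm the diagnosis
[root@overcloud-novacompute-2 ~]# setenforce 0

# restart OVS and the agent container, then check that the container stays up
[root@overcloud-novacompute-2 ~]# systemctl restart openvswitch
[root@overcloud-novacompute-2 ~]# docker restart neutron_ovs_agent
[root@overcloud-novacompute-2 ~]# docker ps | grep neutron_ovs_agent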