Description of problem:
neutron_ovs_agent container present on controller and compute nodes is in an unhealthy state.

[root@controller-0 ~]# docker ps | grep neutron_ovs_agent
37961fdd3cf5 192.0.60.1:8787/rh-osbs/rhosp13-openstack-neutron-openvswitch-agent:20200303.1 "dumb-init --singl..." 24 hours ago Up 24 hours (unhealthy) neutron_ovs_agent

This was observed on DPDK enabled setups (both on ComputeOvsDpdkSriov and ComputeHCIOvsDpdk roles), but it might be unrelated to roles.

When running the health check script:
[root@computehciovsdpdk-0 ~]# docker exec -it neutron_ovs_agent /openstack/healthcheck
There is no neutron-openvsw process with opened RabbitMQ ports (5671,5672) running in the container

()[neutron@controller-0 /]$ ss -ntp | grep -E ":($ports).*,pid=($pids)."
ESTAB 0 0 10.10.116.107:59460 10.10.116.105:5672 users:(("/usr/bin/python",pid=138071,fd=24))
ESTAB 0 0 10.10.116.107:59168 10.10.116.105:5672 users:(("/usr/bin/python",pid=137524,fd=7))
ESTAB 0 0 10.10.116.107:45170 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=23))
ESTAB 0 0 10.10.116.107:35534 10.10.116.107:5672 users:(("neutron-server:",pid=136363,fd=15))
ESTAB 0 0 10.10.116.107:35670 10.10.116.102:3306 users:(("neutron-server:",pid=136373,fd=13))
ESTAB 0 0 10.10.116.107:48994 10.10.116.102:3306 users:(("neutron-server:",pid=136363,fd=18))
ESTAB 0 0 10.10.116.107:59454 10.10.116.105:5672 users:(("/usr/bin/python",pid=138071,fd=21))
ESTAB 0 0 10.10.116.107:32812 10.10.116.105:5672 users:(("neutron-server:",pid=136364,fd=14))
ESTAB 0 0 10.10.116.107:44740 10.10.116.106:5672 users:(("/usr/bin/python",pid=136996,fd=8))
ESTAB 0 0 10.10.116.107:35933 10.10.116.102:3306 users:(("neutron-server:",pid=136364,fd=13))
ESTAB 0 0 10.10.116.107:41165 10.10.116.102:3306 users:(("neutron-server:",pid=136373,fd=15))
ESTAB 0 0 10.10.116.107:35532 10.10.116.107:5672 users:(("neutron-server:",pid=136360,fd=15))
ESTAB 0 0 10.10.116.107:35486 10.10.116.107:5672 users:(("neutron-server:",pid=136360,fd=13))
ESTAB 0 0 10.10.116.107:46152 10.10.116.105:5672 users:(("neutron-server:",pid=136343,fd=16))
ESTAB 0 0 127.0.0.1:60492 127.0.0.1:6640 users:(("/usr/bin/python",pid=136996,fd=18))
ESTAB 0 0 127.0.0.1:33872 127.0.0.1:6640 users:(("ovsdb-client",pid=138757,fd=3))
ESTAB 0 0 10.10.116.107:51860 10.10.116.105:5672 users:(("neutron-server:",pid=136373,fd=14))
ESTAB 0 0 10.10.116.107:36390 10.10.116.107:5672 users:(("neutron-server:",pid=136362,fd=17))
ESTAB 0 0 10.10.116.107:52079 10.10.116.102:3306 users:(("neutron-server:",pid=136360,fd=18))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:44272 users:(("/usr/bin/python",pid=138071,fd=18))
ESTAB 0 0 10.10.116.107:36324 10.10.116.107:5672 users:(("/usr/bin/python",pid=138071,fd=15))
ESTAB 0 0 127.0.0.1:33232 127.0.0.1:6640 users:(("/usr/bin/python",pid=137524,fd=19))
ESTAB 0 0 10.10.116.107:36410 10.10.116.107:5672 users:(("/usr/bin/python",pid=138071,fd=25))
ESTAB 0 0 10.10.116.107:45192 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=27))
ESTAB 0 0 10.10.116.107:34399 10.10.116.102:3306 users:(("neutron-server:",pid=136362,fd=18))
ESTAB 0 0 10.10.116.107:44718 10.10.116.106:5672 users:(("/usr/bin/python",pid=136996,fd=6))
ESTAB 0 0 10.10.116.107:59072 10.10.116.105:5672 users:(("/usr/bin/python",pid=137140,fd=8))
ESTAB 0 0 10.10.116.107:60080 10.10.116.106:5672 users:(("neutron-server:",pid=136343,fd=14))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:44216 users:(("/usr/bin/python",pid=138071,fd=12))
ESTAB 0 0 10.10.116.107:59000 10.10.116.105:5672 users:(("/usr/bin/python",pid=136996,fd=5))
ESTAB 0 0 10.10.116.107:35456 10.10.116.107:5672 users:(("neutron-server:",pid=136360,fd=12))
ESTAB 0 0 10.10.116.107:50232 10.10.116.102:3306 users:(("neutron-server:",pid=136363,fd=16))
ESTAB 0 0 10.10.116.107:36104 10.10.116.107:5672 users:(("/usr/bin/python",pid=137524,fd=6))
ESTAB 0 0 10.10.116.107:36038 10.10.116.107:5672 users:(("/usr/bin/python",pid=137140,fd=9))
ESTAB 0 0 10.10.116.107:45098 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=16))
ESTAB 0 0 10.10.116.107:45160 10.10.116.106:5672 users:(("neutron-server:",pid=136360,fd=17))
ESTAB 0 0 10.10.116.107:44280 10.10.116.106:5672 users:(("neutron-server:",pid=136363,fd=14))
ESTAB 8 0 10.10.116.107:45758 10.10.116.105:5672 users:(("neutron-server:",pid=136348,fd=16))
ESTAB 0 0 10.10.116.107:50377 10.10.116.102:3306 users:(("neutron-server:",pid=136360,fd=16))
ESTAB 0 0 10.10.116.107:45210 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=29))
ESTAB 0 0 10.10.116.107:44284 10.10.116.106:5672 users:(("neutron-server:",pid=136360,fd=14))
ESTAB 0 0 10.10.116.107:44924 10.10.116.106:5672 users:(("/usr/bin/python",pid=137524,fd=9))
ESTAB 0 0 10.10.116.107:35466 10.10.116.107:5672 users:(("neutron-server:",pid=136363,fd=12))
ESTAB 0 0 10.10.116.107:38342 10.10.116.106:5672 users:(("neutron-server:",pid=136345,fd=16))
ESTAB 8 0 10.10.116.107:50910 10.10.116.107:5672 users:(("neutron-server:",pid=136348,fd=14))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:44206 users:(("/usr/bin/python",pid=138071,fd=11))
ESTAB 0 0 10.10.116.107:36416 10.10.116.107:5672 users:(("/usr/bin/python",pid=138071,fd=26))
ESTAB 0 0 127.0.0.1:33862 127.0.0.1:6640 users:(("ovsdb-client",pid=138755,fd=3))
ESTAB 0 0 10.10.116.107:57766 10.10.116.107:5672 users:(("neutron-server:",pid=136345,fd=15))
ESTAB 0 0 10.10.116.107:44860 10.10.116.106:5672 users:(("/usr/bin/python",pid=137524,fd=5))
ESTAB 0 0 10.10.116.107:48206 10.10.116.102:3306 users:(("neutron-server:",pid=136362,fd=16))
ESTAB 0 0 127.0.0.1:33684 127.0.0.1:6640 users:(("/usr/bin/python",pid=138071,fd=8))
ESTAB 0 0 10.10.116.107:58546 10.10.116.105:5672 users:(("neutron-server:",pid=136362,fd=13))
ESTAB 0 0 10.10.116.107:45116 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=17))
ESTAB 0 0 10.10.116.107:44720 10.10.116.106:5672 users:(("/usr/bin/python",pid=136996,fd=7))
ESTAB 0 0 10.10.116.107:58516 10.10.116.105:5672 users:(("neutron-server:",pid=136362,fd=12))
ESTAB 0 0 10.10.116.107:59372 10.10.116.105:5672 users:(("/usr/bin/python",pid=138071,fd=14))
ESTAB 0 0 10.10.116.107:35510 10.10.116.107:5672 users:(("neutron-server:",pid=136362,fd=14))
ESTAB 0 0 10.10.116.107:32821 10.10.116.102:3306 users:(("neutron-server:",pid=136373,fd=16))
ESTAB 0 0 10.10.116.107:44160 10.10.116.106:5672 users:(("neutron-server:",pid=136373,fd=6),("neutron-server:",pid=136364,fd=6),("neutron-server:",pid=136363,fd=6),("neutron-server:",pid=136362,fd=6),("neutron-server:",pid=136360,fd=6),("neutron-server:",pid=136348,fd=6),("neutron-server:",pid=136345,fd=6),("neutron-server:",pid=136343,fd=6),("/usr/bin/python",pid=135432,fd=6))
ESTAB 0 0 10.10.116.107:59488 10.10.116.105:5672 users:(("/usr/bin/python",pid=138071,fd=28))
ESTAB 0 0 10.10.116.107:45072 10.10.116.106:5672 users:(("/usr/bin/python",pid=138071,fd=13))
ESTAB 0 0 10.10.116.107:44226 10.10.116.106:5672 users:(("neutron-server:",pid=136364,fd=12))
ESTAB 0 0 10.10.116.107:45186 10.10.116.106:5672 users:(("neutron-server:",pid=136363,fd=17))
ESTAB 0 0 10.10.116.107:35530 10.10.116.107:5672 users:(("neutron-server:",pid=136362,fd=15))
ESTAB 0 0 10.10.116.107:58548 10.10.116.105:5672 users:(("neutron-server:",pid=136363,fd=13))
ESTAB 0 0 10.10.116.107:44232 10.10.116.106:5672 users:(("neutron-server:",pid=136373,fd=12))

Output from 13z10 where the container is healthy:
()[root@controller-0 /]# ss -ntp | grep -E ":($ports).*,pid=($pids),"
ESTAB 0 0 10.10.131.106:39016 10.10.131.112:5672 users:(("neutron-openvsw",pid=784124,fd=26))
ESTAB 0 0 10.10.131.106:39138 10.10.131.112:5672 users:(("neutron-openvsw",pid=784124,fd=29))
ESTAB 0 0 10.10.131.106:34084 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=14))
ESTAB 0 0 10.10.131.106:34600 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=22))
ESTAB 0 0 10.10.131.106:34202 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=15))
ESTAB 0 0 10.10.131.106:34836 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=28))
ESTAB 0 0 10.10.131.106:34336 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=16))
ESTAB 0 0 10.10.131.106:34582 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=18))
ESTAB 0 0 10.10.131.106:34598 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=24))
ESTAB 0 0 10.10.131.106:34462 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=17))
ESTAB 0 0 127.0.0.1:33528 127.0.0.1:6640 users:(("neutron-openvsw",pid=784124,fd=8))
ESTAB 0 0 10.10.131.106:34840 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=30))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:34810 users:(("neutron-openvsw",pid=784124,fd=12))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:34808 users:(("neutron-openvsw",pid=784124,fd=11))
ESTAB 0 0 10.10.131.106:34714 10.10.131.100:5672 users:(("neutron-openvsw",pid=784124,fd=25))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:34812 users:(("neutron-openvsw",pid=784124,fd=13))
ESTAB 0 0 10.10.131.106:39134 10.10.131.112:5672 users:(("neutron-openvsw",pid=784124,fd=27))
ESTAB 0 0 127.0.0.1:6633 127.0.0.1:34806 users:(("neutron-openvsw",pid=784124,fd=9))

Version-Release number of selected component (if applicable):
container: rh-osbs/rhosp13-openstack-neutron-openvswitch-agent:20200303.1
puddle: 2020-03-10.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy an overcloud
2. Log into controller and compute nodes and view neutron_ovs_agent container

Actual results:
Container is unhealthy while there are no errors or warnings in logs

Expected results:
Container is healthy if there are no errors or warnings in logs

Additional info:
Will provide SOS report in comment
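One visible difference between the two ss outputs above is the process name reported for the RabbitMQ connections: "/usr/bin/python" on the unhealthy node and "neutron-openvsw" on the healthy one. The sketch below is a possible way to check that offline against captured lines. It is not the code in healthcheck/common.sh; the helper name count_owned is hypothetical, and whether the name mismatch is the actual cause of the unhealthy status is an assumption that has not been verified here.

```shell
#!/bin/sh
# count_owned is a hypothetical helper, not part of healthcheck/common.sh.
# It reads ss-style lines on stdin and counts those whose peer port is in
# $2 (a '|'-separated list) and whose owning process is reported as $1.
count_owned() {
    grep -cE ":($2) +users:\(\(\"$1\",pid="
}

# One line copied from each ss output in this report.
unhealthy='ESTAB 0 0 10.10.116.107:59460 10.10.116.105:5672 users:(("/usr/bin/python",pid=138071,fd=24))'
healthy='ESTAB 0 0 10.10.131.106:39016 10.10.131.112:5672 users:(("neutron-openvsw",pid=784124,fd=26))'

# Connections to RabbitMQ ports owned by a process named neutron-openvsw.
echo "healthy node:   $(printf '%s\n' "$healthy" | count_owned neutron-openvsw '5671|5672')"
echo "unhealthy node: $(printf '%s\n' "$unhealthy" | count_owned neutron-openvsw '5671|5672')"

# The same connection on the unhealthy node, looked up under the name ss reports.
echo "unhealthy node, as /usr/bin/python: $(printf '%s\n' "$unhealthy" | count_owned /usr/bin/python '5671|5672')"
```

Feeding the full captured outputs through the helper instead of the single sample lines would give per-node counts for each process name.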
I was debugging the issue on Vadim's cluster before raising this BZ. I think this issue is related to https://bugs.launchpad.net/tripleo/+bug/1821856, specifically the fix in https://review.opendev.org/#/c/648027/6/healthcheck/common.sh@23, though I haven't analyzed how this regression was introduced. Looping in Tengu for his comments.
Hello, This is probably a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1813758 Care to confirm?
Yes it is, marking as duplicate. *** This bug has been marked as a duplicate of bug 1813758 ***