Bug 1754416

Summary: ovn-metadata-agent unable to start haproxy
Product: Red Hat OpenStack
Reporter: Attila Fazekas <afazekas>
Component: puppet-tripleo
Assignee: Cédric Jeanneret <cjeanner>
Status: CLOSED ERRATA
QA Contact: Emilien Macchi <emacchi>
Severity: high
Docs Contact:
Priority: high
Version: 16.0 (Train)
CC: apevec, bperkins, cjeanner, emacchi, jjoyce, jschluet, lhh, majopela, mmethot, ravsingh, sclewis, scohen, slinaber, tvignaud
Target Milestone: beta
Keywords: Triaged
Target Release: 16.0 (Train on RHEL 8.1)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: puppet-tripleo-11.2.1-0.20190925093429.ebe6f1a.el8ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-02-06 14:42:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Attila Fazekas 2019-09-23 08:11:40 UTC
Description of problem:
The first issue is exit code 127; the 2nd attempt fails with an already allocated ID.
Exit code 127 typically means a command was not found in the PATH, so maybe something is missing from the container.
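
As a quick illustration of the exit code 127 convention mentioned above, here is a minimal sketch that runs a command line through `sh` and returns the exit status. The helper name `exit_code_of` is hypothetical and not part of neutron or podman. It only demonstrates the shell convention; it does not show what produced the 127 in this bug.

```python
import subprocess


def exit_code_of(command):
    """Run a command line through sh and return the shell's exit status."""
    # capture_output keeps the shell's "not found" message off the terminal.
    result = subprocess.run(["sh", "-c", command], capture_output=True)
    return result.returncode


if __name__ == "__main__":
    # A command that does not exist in the PATH makes the shell exit with 127.
    print(exit_code_of("definitely_not_a_real_command_xyz"))
```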

compute-0: /var/log/containers/neutron/ovn-metadata-agent.log:
3fc3557e0dbd:neutron-haproxy-ovnmeta-70e80585-aa52-4459-bacb-e34072a2eee2:Created'
+ echo 'Starting a new child container neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3'
+ nsenter --net=/run/netns/ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3 --preserve-credentials -m -t 1 podman run --detach --log-driver json-file --log-opt path=/var/log/containers/stdouts/neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3.log -v /var/lib/config-data/puppet-generated/neutron/etc/neutron:/etc/neutron:ro -v /run/netns:/run/netns:shared -v /var/lib/neutron:/var/lib/neutron:z,shared -v /dev/log:/dev/log --net host --pid host --privileged -u root --name neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3 192.168.24.1:8787/rhosp16/openstack-neutron-metadata-agent-ovn:20190920.1 /bin/bash -c 'HAPROXY="$(if [ -f /usr/sbin/haproxy-systemd-wrapper ]; then echo "/usr/sbin/haproxy -Ds"; else echo "/usr/sbin/haproxy -Ws"; fi)"; exec $HAPROXY -f /var/lib/neutron/ovn-metadata-proxy/2c74164c-43a2-439f-a8ec-711b10a570e3.conf'
[conmon:e]: No such log driver json-file
Error: write child: broken pipe
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event Traceback (most recent call last):
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/event.py", line 143, in notify_loop
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     match.run(event, row, updates)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 93, in run
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.agent.update_datapath(str(row.datapath.uuid))
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 303, in update_datapath
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.provision_datapath(datapath)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 417, in provision_datapath
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.conf, bind_address=METADATA_DEFAULT_IP, network_id=datapath)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/driver.py", line 200, in spawn_monitored_metadata_proxy
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     pm.enable()
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/external_process.py", line 89, in enable
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     run_as_root=self.run_as_root)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/ip_lib.py", line 713, in execute
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     run_as_root=run_as_root)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 147, in execute
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     returncode=returncode)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event neutron_lib.exceptions.ProcessExecutionError: Exit code: 127; Stdin: ; Stdout: Starting a new child container neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event ; Stderr: + export DOCKER_HOST=
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event + DOCKER_HOST=
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event + ARGS='-f /var/lib/neutron/ovn-metadata-proxy/2c74164c-43a2-439f-a8ec-711b10a570e3.conf'


The per-proxy log files were searched for with the pattern:
/var/log/containers/stdouts/neutron-haproxy-ovnmeta-.*.log

Version-Release number of selected component (if applicable):
python3-networking-ovn-7.0.0-0.20190919195659.1073411.el8ost.noarch
haproxy-1.8.15-5.el8.x86_64

RHOS_TRUNK-16.0-RHEL-8-20190920.n.2
How reproducible:
always

Steps to Reproduce:
1. Install a minimal Geneve setup
2. Run tempest
3. SSH fails

Actual results:
All tempest connectivity tests fail.

Expected results:
All tempest connectivity tests pass.

Additional info:
Live system not observed yet; this BZ is based on logs from a test run.

Comment 2 Attila Fazekas 2019-09-23 09:53:20 UTC
--log-driver json-file  # is not accepted by podman >= 1.4.2

Comment 3 Cédric Jeanneret 2019-09-23 11:59:36 UTC
Hello,

So the main issue is:
podman 1.4.1 dropped the "json-file" log-driver, making the whole thing crash.
It was added back for podman 1.4.3, as an alias to k8s-file. I'll push a patch shortly that will ensure we're using the right driver name.
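
To make the version window concrete, here is a minimal sketch of a check for whether a given podman version accepts "json-file". The version bounds come only from this comment (dropped in 1.4.1, restored in 1.4.3) and are an assumption, not something verified against podman release notes. The function names are hypothetical. Always passing "k8s-file" avoids the need for such a check.

```python
import re

# Assumption taken from this bug thread: "json-file" is rejected from
# podman 1.4.1 up to, but not including, 1.4.3.
FIRST_BROKEN = (1, 4, 1)
FIRST_FIXED = (1, 4, 3)


def parse_podman_version(text):
    """Extract (major, minor, patch) from text such as 'podman version 1.4.2'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def accepts_json_file(version):
    """Return False for versions inside the assumed broken window."""
    return not (FIRST_BROKEN <= version < FIRST_FIXED)


if __name__ == "__main__":
    version = parse_podman_version("podman version 1.4.2")
    print(version, accepts_json_file(version))
```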

Cheers,

C.

Comment 4 Attila Fazekas 2019-09-23 12:49:10 UTC
Changing /var/lib/neutron/ovn_metadata_haproxy_wrapper (json-file -> k8s-file) on the compute nodes
and restarting the 192.168.24.1:8787/rhosp16/openstack-neutron-metadata-agent-ovn:20190920.1 container works around the issue.
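
The file edit described above could be scripted roughly as follows. This is a sketch only: the helper name `switch_log_driver` is hypothetical, and it assumes the wrapper contains the literal string `--log-driver json-file` as seen in the podman command in the agent log. Restarting the container is not covered.

```python
from pathlib import Path

# Literal option strings, taken from the podman command shown in the agent log.
OLD_OPTION = "--log-driver json-file"
NEW_OPTION = "--log-driver k8s-file"


def switch_log_driver(wrapper_path):
    """Replace json-file with k8s-file in the wrapper; return True if changed."""
    path = Path(wrapper_path)
    original = path.read_text()
    updated = original.replace(OLD_OPTION, NEW_OPTION)
    if updated == original:
        # Nothing to do: the option is absent or already switched.
        return False
    path.write_text(updated)
    return True


if __name__ == "__main__":
    import sys

    # Pass the wrapper path explicitly, for example
    # /var/lib/neutron/ovn_metadata_haproxy_wrapper on a compute node.
    print(switch_log_driver(sys.argv[1]))
```

The function is idempotent, so running it a second time on an already patched wrapper leaves the file untouched and returns False.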

Comment 5 Cédric Jeanneret 2019-09-25 09:04:10 UTC
Upstream merged at last.

I think I've seen downstream cherry-picking the patch in 16-trunk right? So I think we're clean... ?

Comment 6 Emilien Macchi 2019-09-26 17:41:10 UTC
I tested it today and it works now:

podman inspect neutron-haproxy-ovnmeta-6fdefc04-35b9-482f-a38b-b53f21327b9f

(snip)

        "HostConfig": {
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "k8s-file",
                "Config": {}
            }
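
The same check can be done programmatically. The sketch below reads the log driver from `podman inspect` JSON output, assuming the output is a JSON array with one object per container and the `HostConfig.LogConfig.Type` layout shown in the snippet above. The helper name `log_driver_from_inspect` is hypothetical.

```python
import json


def log_driver_from_inspect(inspect_json):
    """Return HostConfig.LogConfig.Type from podman inspect JSON text."""
    data = json.loads(inspect_json)
    # Assumed: inspect output is a list with one entry per inspected container.
    entry = data[0] if isinstance(data, list) else data
    return entry["HostConfig"]["LogConfig"]["Type"]


if __name__ == "__main__":
    sample = json.dumps(
        [{"HostConfig": {"ContainerIDFile": "",
                         "LogConfig": {"Type": "k8s-file", "Config": {}}}}]
    )
    print(log_driver_from_inspect(sample))
```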

Comment 10 Marc Methot 2019-11-05 16:14:56 UTC
This also impacts RHOSP15, as the latest podman package on RHEL 8 is 1.4.2-5, which also lacks the link to k8s-file.

Cheers,
MM

Comment 14 errata-xmlrpc 2020-02-06 14:42:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283