Bug 1754416 - ovn-metadata-agent unable to start haproxy
Summary: ovn-metadata-agent unable to start haproxy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Cédric Jeanneret
QA Contact: Emilien Macchi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-23 08:11 UTC by Attila Fazekas
Modified: 2023-03-24 15:30 UTC (History)

Fixed In Version: puppet-tripleo-11.2.1-0.20190925093429.ebe6f1a.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-06 14:42:04 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1844856 0 None None None 2019-09-23 12:05:12 UTC
OpenStack gerrit 683937 0 'None' MERGED Update log-driver value for podman 2021-02-03 14:13:38 UTC
Red Hat Knowledge Base (Solution) 4553331 0 None None None 2019-11-05 16:14:55 UTC
Red Hat Product Errata RHEA-2020:0283 0 None None None 2020-02-06 14:42:30 UTC

Description Attila Fazekas 2019-09-23 08:11:40 UTC
Description of problem:
The first issue is exit code 127; the second attempt fails with an already-allocated ID.
127 typically means something is not found in the path, so maybe something is missing from the container.
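
To illustrate the shell convention mentioned above (this is only a sketch of the general rule, not a reproduction of this bug): a POSIX shell returns 127 when the command it was asked to run cannot be found. The helper name exit_code_of and the made-up command name are assumptions for the example.

```python
import subprocess


def exit_code_of(command: str) -> int:
    """Run a command line through /bin/sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # A command name that should not exist on any system (hypothetical).
    print(exit_code_of("definitely-not-a-real-command-xyz"))
    # A command that exists and succeeds.
    print(exit_code_of("true"))
```

An exit code of 127 therefore narrows the search to "something could not be executed", but it does not say which component failed; the stderr lines in the log below are what identify it.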

compute-0: /var/log/containers/neutron/ovn-metadata-agent.log:
3fc3557e0dbd:neutron-haproxy-ovnmeta-70e80585-aa52-4459-bacb-e34072a2eee2:Created'
+ echo 'Starting a new child container neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3'
+ nsenter --net=/run/netns/ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3 --preserve-credentials -m -t 1 podman run --detach --log-driver json-file --log-opt path=/var/log/containers/stdouts/neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3.log -v /var/lib/config-data/puppet-generated/neutron/etc/neutron:/etc/neutron:ro -v /run/netns:/run/netns:shared -v /var/lib/neutron:/var/lib/neutron:z,shared -v /dev/log:/dev/log --net host --pid host --privileged -u root --name neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3 192.168.24.1:8787/rhosp16/openstack-neutron-metadata-agent-ovn:20190920.1 /bin/bash -c 'HAPROXY="$(if [ -f /usr/sbin/haproxy-systemd-wrapper ]; then echo "/usr/sbin/haproxy -Ds"; else echo "/usr/sbin/haproxy -Ws"; fi)"; exec $HAPROXY -f /var/lib/neutron/ovn-metadata-proxy/2c74164c-43a2-439f-a8ec-711b10a570e3.conf'
[conmon:e]: No such log driver json-file
Error: write child: broken pipe
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event Traceback (most recent call last):
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/event.py", line 143, in notify_loop
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     match.run(event, row, updates)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 93, in run
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.agent.update_datapath(str(row.datapath.uuid))
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 303, in update_datapath
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.provision_datapath(datapath)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/agent.py", line 417, in provision_datapath
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     self.conf, bind_address=METADATA_DEFAULT_IP, network_id=datapath)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/metadata/driver.py", line 200, in spawn_monitored_metadata_proxy
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     pm.enable()
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/external_process.py", line 89, in enable
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     run_as_root=self.run_as_root)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/ip_lib.py", line 713, in execute
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     run_as_root=run_as_root)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 147, in execute
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event     returncode=returncode)
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event neutron_lib.exceptions.ProcessExecutionError: Exit code: 127; Stdin: ; Stdout: Starting a new child container neutron-haproxy-ovnmeta-2c74164c-43a2-439f-a8ec-711b10a570e3
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event ; Stderr: + export DOCKER_HOST=
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event + DOCKER_HOST=
2019-09-20 19:14:22.047 26603 ERROR ovsdbapp.event + ARGS='-f /var/lib/neutron/ovn-metadata-proxy/2c74164c-43a2-439f-a8ec-711b10a570e3.conf'


Per-proxy log file pattern:
/var/log/containers/stdouts/neutron-haproxy-ovnmeta-.*.log

Version-Release number of selected component (if applicable):
python3-networking-ovn-7.0.0-0.20190919195659.1073411.el8ost.noarch
haproxy-1.8.15-5.el8.x86_64

RHOS_TRUNK-16.0-RHEL-8-20190920.n.2
How reproducible:
always

Steps to Reproduce:
1. Install a minimal Geneve setup
2. Run Tempest
3. SSH fails

Actual results:
All Tempest connectivity tests fail.

Expected results:
All Tempest connectivity tests pass.

Additional info:
A live system has not been observed yet; this BZ is based on logs from a test run.

Comment 2 Attila Fazekas 2019-09-23 09:53:20 UTC
--log-driver json-file is rejected by podman >= 1.4.2.

Comment 3 Cédric Jeanneret 2019-09-23 11:59:36 UTC
Hello,

So the main issue is:
podman 1.4.1 dropped the "json-file" log-driver, making the whole thing crash.
It was added back for podman 1.4.3, as an alias to k8s-file. I'll push a patch shortly that will ensure we're using the right driver name.

Cheers,

C.
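
As a rough sketch of the version window described in this comment (json-file dropped in podman 1.4.1, restored in 1.4.3 as an alias to k8s-file), the check below picks a driver name from a version string. The function names are hypothetical, the version boundaries are taken only from the comment above, and distribution builds may differ.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.4.2' or '1.4.2-5' into a comparable tuple such as (1, 4, 2)."""
    upstream = text.split("-")[0]  # drop a package release suffix like '-5'
    return tuple(int(part) for part in upstream.split("."))


def accepts_json_file(podman_version: str) -> bool:
    """Assumption from the comment: json-file is missing in [1.4.1, 1.4.3)."""
    version = parse_version(podman_version)
    return not ((1, 4, 1) <= version < (1, 4, 3))


def log_driver_for(podman_version: str) -> str:
    """Return a --log-driver value the given podman version should accept."""
    return "json-file" if accepts_json_file(podman_version) else "k8s-file"


if __name__ == "__main__":
    for candidate in ("1.4.0", "1.4.1", "1.4.2-5", "1.4.3"):
        print(candidate, log_driver_for(candidate))
```

Using k8s-file unconditionally avoids the version check altogether, which matches the workaround in the next comment.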

Comment 4 Attila Fazekas 2019-09-23 12:49:10 UTC
Changing json-file to k8s-file in /var/lib/neutron/ovn_metadata_haproxy_wrapper on the compute nodes
and restarting the 192.168.24.1:8787/rhosp16/openstack-neutron-metadata-agent-ovn:20190920.1 container works around the issue.
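
The edit described above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the wrapper is assumed to pass the driver as "--log-driver json-file" (as seen in the logged podman command), and the container still has to be restarted separately.

```python
import re


def switch_log_driver(script_text: str, old: str = "json-file",
                      new: str = "k8s-file") -> str:
    """Replace the value passed to --log-driver, leaving other text alone."""
    pattern = re.compile(r"(--log-driver[ =]+)" + re.escape(old) + r"\b")
    return pattern.sub(lambda match: match.group(1) + new, script_text)


def patch_wrapper(path: str = "/var/lib/neutron/ovn_metadata_haproxy_wrapper") -> bool:
    """Rewrite the wrapper in place; return True if the file was changed."""
    with open(path) as handle:
        original = handle.read()
    updated = switch_log_driver(original)
    if updated == original:
        return False
    with open(path, "w") as handle:
        handle.write(updated)
    return True


if __name__ == "__main__":
    sample = "podman run --detach --log-driver json-file --log-opt path=/tmp/x.log"
    print(switch_log_driver(sample))
```

Matching on the full "--log-driver json-file" option rather than the bare string keeps unrelated occurrences of "json-file" in the script untouched.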

Comment 5 Cédric Jeanneret 2019-09-25 09:04:10 UTC
Upstream merged at last.

I think I've seen downstream cherry-pick the patch into 16-trunk, right? So I think we're clean?

Comment 6 Emilien Macchi 2019-09-26 17:41:10 UTC
I tested it today and it works now:

podman inspect neutron-haproxy-ovnmeta-6fdefc04-35b9-482f-a38b-b53f21327b9f

(snip)

        "HostConfig": {
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "k8s-file",
                "Config": {}
            }
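
The same check can be done programmatically by reading the log driver out of the inspect output. This is a minimal sketch: the function name is hypothetical, the HostConfig/LogConfig/Type path is taken from the snippet above, and the output is assumed to be a JSON array of container objects (a single object is also handled).

```python
import json


def log_driver_type(inspect_output: str) -> str:
    """Extract HostConfig.LogConfig.Type from `podman inspect` JSON output."""
    data = json.loads(inspect_output)
    container = data[0] if isinstance(data, list) else data
    return container["HostConfig"]["LogConfig"]["Type"]


if __name__ == "__main__":
    # Trimmed sample shaped like the snippet in this comment.
    sample = json.dumps([
        {
            "HostConfig": {
                "ContainerIDFile": "",
                "LogConfig": {"Type": "k8s-file", "Config": {}},
            }
        }
    ])
    print(log_driver_type(sample))
```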

Comment 10 Marc Methot 2019-11-05 16:14:56 UTC
This also impacts RHOSP 15, as the latest podman package on RHEL 8 is 1.4.2-5, which also lacks the link to k8s-file.

Cheers,
MM

Comment 14 errata-xmlrpc 2020-02-06 14:42:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283

