Bug 1936955

Summary: Octavia 16.1 CI jobs failing with driver-agent error
Product: Red Hat OpenStack
Reporter: Bruna Bonguardo <bbonguar>
Component: openstack-tripleo-heat-templates
Assignee: Gregory Thiemonge <gthiemon>
Status: CLOSED ERRATA
QA Contact: Bruna Bonguardo <bbonguar>
Severity: urgent
Priority: urgent
Version: 16.1 (Train)
CC: bhaley, chrisbro, dhill, ffernand, gthiemon, igallagh, ihrachys, lpeer, majopela, mburns, rlondhe, scohen, spower
Target Milestone: z4
Keywords: AutomationBlocker, Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210104205664.el8ost.2
Doc Type: No Doc Update
Last Closed: 2021-03-17 15:36:53 UTC
Type: Bug

Comment 1 Gregory Thiemonge 2021-03-09 14:31:50 UTC
driver-agent.log:

2021-03-06 05:29:54.912 7 INFO octavia.common.config [-] Logging enabled!
2021-03-06 05:29:54.912 7 INFO octavia.common.config [-] /usr/bin/octavia-driver-agent version 5.0.3
2021-03-06 05:29:54.912 7 DEBUG octavia.common.config [-] command line: /usr/bin/octavia-driver-agent --config-file /usr/share/octavia/octavia-dist.conf --config-file /etc/octavia/octavia.conf --log-file /var/log/octavia/driver-agent.log --config-dir /etc/octavia/conf.d/common setup_logging /usr/lib/python3.6/site-packages/octavia/common/config.py:823
2021-03-06 05:29:55.047 7 INFO octavia.cmd.driver_agent [-] Driver agent status listener process starts:
2021-03-06 05:29:55.050 7 INFO octavia.cmd.driver_agent [-] Driver agent statistics listener process starts:
2021-03-06 05:29:55.055 7 INFO octavia.cmd.driver_agent [-] Driver agent get listener process starts:
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent [-] status_listener raised exception: [Errno 13] Permission denied. Restarting status_listener.: PermissionError: [Errno 13] Permission denied
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent Traceback (most recent call last):
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent   File "/usr/lib/python3.6/site-packages/octavia/cmd/driver_agent.py", line 65, in _process_wrapper
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent     function(exit_event)
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent   File "/usr/lib/python3.6/site-packages/octavia/api/drivers/driver_agent/driver_listener.py", line 118, in status_listener
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent     StatusRequestHandler)
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent   File "/usr/lib64/python3.6/socketserver.py", line 456, in __init__
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent     self.server_bind()
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent   File "/usr/lib64/python3.6/socketserver.py", line 470, in server_bind
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent     self.socket.bind(self.server_address)
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent PermissionError: [Errno 13] Permission denied
2021-03-06 05:29:55.053 26 ERROR octavia.cmd.driver_agent 


Similar to 1879849 (16.2 bug)
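
The traceback above ends in socket.bind() called from socketserver, which suggests that the status listener could not create its socket file. The following is a minimal sketch of that failure mode, assuming the listener binds a Unix domain socket. The log does not show the real socket path or the Octavia server class, so the directory names and the status.sock file name below are hypothetical. When run as root the second bind normally succeeds, because root bypasses directory permissions.

```python
import errno
import os
import socketserver
import tempfile


def try_bind(sock_path):
    """Bind a Unix stream server at sock_path.

    Returns None on success, or the errno of the OSError raised by bind().
    """
    try:
        server = socketserver.UnixStreamServer(
            sock_path, socketserver.BaseRequestHandler
        )
    except OSError as exc:
        return exc.errno
    server.server_close()
    os.unlink(sock_path)  # remove the socket file created by bind()
    return None


def demo():
    """Try to bind in a writable directory and in a read-only one."""
    results = {}
    with tempfile.TemporaryDirectory() as base:
        writable = os.path.join(base, "writable")
        readonly = os.path.join(base, "readonly")
        os.mkdir(writable)
        os.mkdir(readonly)
        os.chmod(writable, 0o755)
        os.chmod(readonly, 0o555)  # no write permission: bind() cannot create the file
        results["writable"] = try_bind(os.path.join(writable, "status.sock"))
        results["readonly"] = try_bind(os.path.join(readonly, "status.sock"))
        os.chmod(readonly, 0o755)  # restore permissions so cleanup can proceed
    return results


if __name__ == "__main__":
    for name, err in demo().items():
        print(name, "ok" if err is None else errno.errorcode.get(err, err))
```

For an unprivileged user, the read-only case reports EACCES, which is the same errno 13 seen in driver-agent.log.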

Comment 4 Gregory Thiemonge 2021-03-10 16:32:53 UTC
A standard OSP 16.1 deployment is not affected (phase2 job, manual deployment): Octavia works correctly. The issue occurs only in the DFG jobs.

The DFG jobs patch the overcloud images before deploying: the "--images-update yes" option is passed to "infrared tripleo-undercloud".
This extra step adds some custom yum repositories to the overcloud, so the package versions can differ from those of a standard deployment (such as the phase2 job).

In our case, it seems that a new podman release, which fixed some security issues (1.6.4-19.module+el8.2.0+10175+e12b0910 vs 1.6.4-17.module+el8.2.0+9347+d4fa9cbb), introduced new constraints that break the driver-agent container's volumes.

The fix already exists in 16.2 and needs to be backported to 16.1: https://bugzilla.redhat.com/show_bug.cgi?id=1879849
Because the issue occurs only when installing podman from a custom repo, I believe that the backport can wait until 16.1.5.
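
To tell which of the two podman builds quoted above is the newer one, the version-release strings can be compared. This is a simplified sketch, not rpm's actual comparison algorithm: it assumes a purely numeric dotted version and looks only at the leading number of the release field. The function names are illustrative.

```python
import re


def parse_version_release(package_version):
    """Split 'version-release' into ((major, minor, ...), leading release number)."""
    version, _, release = package_version.partition("-")
    version_key = tuple(int(part) for part in version.split("."))
    match = re.match(r"\d+", release)
    build = int(match.group()) if match else 0
    return version_key, build


def newer(first, second):
    """Return whichever string sorts higher under the simplified key."""
    if parse_version_release(first) >= parse_version_release(second):
        return first
    return second


if __name__ == "__main__":
    a = "1.6.4-19.module+el8.2.0+10175+e12b0910"
    b = "1.6.4-17.module+el8.2.0+9347+d4fa9cbb"
    print(newer(a, b))
```

Both builds share version 1.6.4, so only the release number (19 vs 17) distinguishes them.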

Comment 5 Gregory Thiemonge 2021-03-10 16:52:38 UTC
*** Bug 1937448 has been marked as a duplicate of this bug. ***

Comment 6 Brian Haley 2021-03-11 15:28:21 UTC
Package built

Comment 15 errata-xmlrpc 2021-03-17 15:36:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817

Comment 16 Gregory Thiemonge 2021-03-29 13:22:43 UTC
*** Bug 1869908 has been marked as a duplicate of this bug. ***