Bug 1768511 - ovirt-ha-broker "sometimes" fails to load on RHEL8 due to a permission error on a systemd defined RuntimeDirectory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Broker
Version: 2.3.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.4.0
Target Release: 2.4.0
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1795672
Blocks: 1726988
Reported: 2019-11-04 16:03 UTC by Simone Tiraboschi
Modified: 2020-05-28 09:36 UTC (History)

Fixed In Version: ovirt-hosted-engine-ha-2.4.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 20:03:23 UTC
oVirt Team: Integration
Embargoed:
sbonazzo: ovirt-4.4?



Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-ansible-hosted-engine-setup pull 262 0 'None' closed ovirt-ha-broker first start workaround 2020-11-23 19:12:23 UTC
oVirt gerrit 104349 0 'None' MERGED Other python3 fixes 2021-02-10 17:49:58 UTC

Description Simone Tiraboschi 2019-11-04 16:03:09 UTC
Description of problem:
ovirt-ha-broker "sometimes" fails to load on RHEL8 due to a permission error.

On daemon start, ovirt-ha-broker tries to bind a Unix domain socket at /var/run/ovirt-hosted-engine-ha/broker.socket, but the bind is not always successful and can fail with a '[Errno 13] Permission denied' exception.

/var/run/ovirt-hosted-engine-ha is created by systemd because of a RuntimeDirectory directive in the service unit:

User=vdsm
Group=kvm
RuntimeDirectory=ovirt-hosted-engine-ha
RuntimeDirectoryMode=0755

User, Group and RuntimeDirectoryMode should be enough to make the directory writable by the ovirt-ha-broker process running as the vdsm user, but this is not always true on RHEL8.
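
To narrow down a failure like this, it may help to record who owns the runtime directory and whether the current process can create files in it. With mode 0755 only the owning user can write, so a directory that ends up owned by anyone other than vdsm would produce exactly this error. The root cause is not established in this report, so the following is only a minimal diagnostic sketch; the helper name describe_runtime_dir is hypothetical and is not part of ovirt-hosted-engine-ha.

```python
import os
import stat


def describe_runtime_dir(path):
    """Report ownership, mode, and writability of a directory.

    Hypothetical diagnostic helper, not part of ovirt-hosted-engine-ha.
    """
    st = os.stat(path)
    return {
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        # Permission bits only, e.g. 0o755 for RuntimeDirectoryMode=0755.
        "mode": stat.S_IMODE(st.st_mode),
        # Creating a socket file needs write and search permission
        # on the containing directory.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

Run as the vdsm user, describe_runtime_dir("/var/run/ovirt-hosted-engine-ha") would be expected to report the vdsm uid as owner and writable set to True; any other result at the moment of failure would point at how the directory was created.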

Version-Release number of selected component (if applicable):


How reproducible:
Intermittent (not on every deployment)

Steps to Reproduce:
1. Try to deploy hosted-engine
2. Check /var/log/ovirt-hosted-engine-ha/broker.log for "[Errno 13] Permission denied"

Actual results:
Listener::ERROR::2019-11-04 16:39:30,230::broker::68::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: [Errno 13] Permission denied
Listener::ERROR::2019-11-04 16:39:30,233::broker::70::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 65, in run
    self._listener = self._get_listener()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 155, in _get_listener
    self._status_broker_instance)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 54, in __init__
    self._server = unixrpc.UnixXmlRpcServer(constants.BROKER_SOCKET_FILE)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 41, in __init__
    request_handler)
  File "/usr/lib64/python3.6/socketserver.py", line 456, in __init__
    self.server_bind()
  File "/usr/lib64/python3.6/socketserver.py", line 470, in server_bind
    self.socket.bind(self.server_address)
PermissionError: [Errno 13] Permission denied

Listener::ERROR::2019-11-04 16:39:30,234::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Trying to restart the broker
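
The traceback shows the failure coming from socket.bind() on the Unix socket path, after which the broker logs that it will try to restart. As an illustration only, the sketch below retries the bind a few times when it raises PermissionError. The function name bind_unix_socket and its attempts and delay parameters are assumptions made for this example; this is not the broker's actual code, nor the workaround referenced in the Links section.

```python
import socket
import time


def bind_unix_socket(path, attempts=3, delay=0.5):
    """Bind a Unix domain socket at `path`, retrying on PermissionError.

    Hypothetical sketch, not the ovirt-ha-broker implementation.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.bind(path)
            return sock
        except PermissionError as exc:
            # Same error as in the traceback above: [Errno 13].
            sock.close()
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

A retry only hides the symptom if the directory permissions are corrected between attempts; if the directory stays unwritable, the last PermissionError is re-raised unchanged.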


Expected results:
ovirt-ha-broker always starts correctly.

Additional info:
Not reproducible on RHEL7.

Comment 1 Yedidyah Bar David 2020-01-02 10:53:45 UTC
Simone, did you file a bug against EL/systemd about this?

Sandro: It seems like Simone deliberately left the bug on POST, hoping to revert the temporary workaround when a real solution is available, but then you moved the bug to MODIFIED, so we need some other means to track it.

Comment 2 Nikolai Sednev 2020-04-01 15:17:01 UTC
Not seen on RHEL8.2; it worked for me on a fresh, clean environment, on which I successfully deployed HE 4.4 on NFS.

Tested on host with these components:
rhvm-appliance.x86_64 2:4.4-20200326.0.el8ev
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Engine:
ovirt-engine-setup-base-4.4.0-0.26.master.el8ev.noarch
ovirt-engine-4.4.0-0.26.master.el8ev.noarch
openvswitch2.11-2.11.0-48.el8fdp.x86_64
Linux 4.18.0-192.el8.x86_64 #1 SMP Tue Mar 24 14:06:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)

Comment 3 Sandro Bonazzola 2020-05-20 20:03:23 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 4 Simone Tiraboschi 2020-05-28 09:36:24 UTC
(In reply to Yedidyah Bar David from comment #1)
> Simone, did you file a bug against EL/systemd about this?

No, not yet.

