Bug 1768511

Summary: ovirt-ha-broker "sometimes" fails to load on RHEL8 due to a permission error on a systemd defined RuntimeDirectory
Product: [oVirt] ovirt-hosted-engine-ha Reporter: Simone Tiraboschi <stirabos>
Component: Broker Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE QA Contact: Nikolai Sednev <nsednev>
Severity: high Docs Contact:
Priority: high    
Version: 2.3.5 CC: bugs, didi
Target Milestone: ovirt-4.4.0 Flags: sbonazzo: ovirt-4.4?
Target Release: 2.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-hosted-engine-ha-2.4.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-20 20:03:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1795672    
Bug Blocks: 1726988    

Description Simone Tiraboschi 2019-11-04 16:03:09 UTC
Description of problem:
ovirt-ha-broker "sometimes" fails to load on RHEL8 due to a permission error.

On daemon start, ovirt-ha-broker tries to bind a Unix domain socket at /var/run/ovirt-hosted-engine-ha/broker.socket, but this is not always successful and can fail with a '[Errno 13] Permission denied' exception.
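
A minimal Python sketch of the failing step, assuming only what the traceback in this report shows: the broker binds an AF_UNIX socket to a path inside its runtime directory. The helper name try_bind is hypothetical and is not part of ovirt-hosted-engine-ha. It returns the errno instead of raising, so a return value of 13 (EACCES) would correspond to the failure described here.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Try to bind a Unix domain socket at `path`.

    Returns None on success, or the errno of the OSError on failure
    (errno.EACCES, i.e. 13, is the 'Permission denied' case).
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()
    # bind() leaves a socket file behind; remove it so the check is repeatable.
    os.unlink(path)
    return None


if __name__ == "__main__":
    # A temporary directory stands in for /var/run/ovirt-hosted-engine-ha.
    workdir = tempfile.mkdtemp()
    print(try_bind(os.path.join(workdir, "broker.socket")))
```

Running the same function as the vdsm user against a path under /var/run/ovirt-hosted-engine-ha should show whether the bind itself is being refused, independently of the broker code.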

/var/run/ovirt-hosted-engine-ha is created by systemd because of a RuntimeDirectory directive in the unit file:

User=vdsm
Group=kvm
RuntimeDirectory=ovirt-hosted-engine-ha
RuntimeDirectoryMode=0755

User, Group and RuntimeDirectoryMode should be enough to make the directory writable by the ovirt-ha-broker process running as the vdsm user, but this is not always true on RHEL8.
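
To see what systemd actually created when the failure occurs, the ownership and mode of the runtime directory can be compared with the User=, Group= and RuntimeDirectoryMode= settings above. The sketch below is a diagnostic aid only. describe_dir and owner_can_write are hypothetical helper names, and the check covers only classic mode bits, which may not be the whole story on RHEL8.

```python
import os
import stat
import tempfile


def describe_dir(path):
    """Return (uid, gid, permission bits) for `path`."""
    st = os.stat(path)
    return st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode)


def owner_can_write(path, uid):
    """True if `path` is owned by `uid` and has owner write+execute bits set.

    Only classic mode bits are checked; ACLs, SELinux and similar
    mechanisms are not considered.
    """
    owner, _gid, mode = describe_dir(path)
    needed = stat.S_IWUSR | stat.S_IXUSR
    return owner == uid and (mode & needed) == needed


if __name__ == "__main__":
    # A temporary directory stands in for /var/run/ovirt-hosted-engine-ha.
    demo = tempfile.mkdtemp()
    os.chmod(demo, 0o755)  # same mode as RuntimeDirectoryMode=0755
    print(describe_dir(demo), owner_can_write(demo, os.geteuid()))
```

On an affected host, calling describe_dir("/var/run/ovirt-hosted-engine-ha") right after a failed start and comparing the result with the uid of vdsm and the gid of kvm would show whether the directory was created with the expected owner.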

Version-Release number of selected component (if applicable):


How reproducible:
not systematic

Steps to Reproduce:
1. try to deploy hosted-engine
2. check /var/log/ovirt-hosted-engine-ha/broker.log for "[Errno 13] Permission denied"
3.
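
Step 2 can be scripted. The following is a small sketch, assuming the log is plain text with one entry per line as in the excerpt in this report; find_permission_errors is a hypothetical helper name, and the sample lines are copied from the "Actual results" section.

```python
import io

MARKER = "[Errno 13] Permission denied"


def find_permission_errors(lines):
    """Return the lines that contain the 'Permission denied' marker."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


if __name__ == "__main__":
    # Two lines taken from the broker.log excerpt in this report.
    sample = io.StringIO(
        "Listener::ERROR::2019-11-04 16:39:30,230::broker::68::"
        "ovirt_hosted_engine_ha.broker.broker.Broker::(run) "
        "Failed initializing the broker: [Errno 13] Permission denied\n"
        "Listener::ERROR::2019-11-04 16:39:30,234::broker::71::"
        "ovirt_hosted_engine_ha.broker.broker.Broker::(run) "
        "Trying to restart the broker\n"
    )
    for hit in find_permission_errors(sample):
        print(hit)
```

On a host, the same function can be given an open file object for /var/log/ovirt-hosted-engine-ha/broker.log instead of the in-memory sample.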

Actual results:
Listener::ERROR::2019-11-04 16:39:30,230::broker::68::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: [Errno 13] Permission denied
Listener::ERROR::2019-11-04 16:39:30,233::broker::70::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 65, in run
    self._listener = self._get_listener()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 155, in _get_listener
    self._status_broker_instance)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 54, in __init__
    self._server = unixrpc.UnixXmlRpcServer(constants.BROKER_SOCKET_FILE)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 41, in __init__
    request_handler)
  File "/usr/lib64/python3.6/socketserver.py", line 456, in __init__
    self.server_bind()
  File "/usr/lib64/python3.6/socketserver.py", line 470, in server_bind
    self.socket.bind(self.server_address)
PermissionError: [Errno 13] Permission denied

Listener::ERROR::2019-11-04 16:39:30,234::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Trying to restart the broker


Expected results:
ovirt-ha-broker always starts correctly

Additional info:
Not reproducible on RHEL7

Comment 1 Yedidyah Bar David 2020-01-02 10:53:45 UTC
Simone, did you file a bug against EL/systemd about this?

Sandro: It seems like Simone deliberately left the bug on POST, hoping to revert the temporary workaround when a real solution is available, but then you moved the bug to MODIFIED, so we need some other means to track it.

Comment 2 Nikolai Sednev 2020-04-01 15:17:01 UTC
Not seen on RHEL8.2. It worked for me on a fresh, clean environment, on which I successfully deployed HE 4.4 on NFS.

Tested on host with these components:
rhvm-appliance.x86_64 2:4.4-20200326.0.el8ev
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Engine:
ovirt-engine-setup-base-4.4.0-0.26.master.el8ev.noarch
ovirt-engine-4.4.0-0.26.master.el8ev.noarch
openvswitch2.11-2.11.0-48.el8fdp.x86_64
Linux 4.18.0-192.el8.x86_64 #1 SMP Tue Mar 24 14:06:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)

Comment 3 Sandro Bonazzola 2020-05-20 20:03:23 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 4 Simone Tiraboschi 2020-05-28 09:36:24 UTC
(In reply to Yedidyah Bar David from comment #1)
> Simone, did you file a bug against EL/systemd about this?

No, not yet.