Bug 1571994 - ovirt-image-daemon fails to start due to permissions on ovirt-image-daemon log file causing host deployment to fail
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-imageio-daemon
Version: 4.1.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.3
Target Release: ---
Assignee: Daniel Erez
QA Contact: Ying Cui
URL:
Whiteboard:
Depends On:
Blocks: 1574384
Reported: 2018-04-25 22:09 UTC by amashah
Modified: 2020-08-03 15:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1574384 (view as bug list)
Environment:
Last Closed: 2018-05-15 17:57:01 UTC
oVirt Team: Storage
Target Upstream Version:
lsvaty: testing_plan_complete-


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1559029 None CLOSED nodectl motd breaks host-deploy 2019-09-05 09:37:50 UTC
Red Hat Knowledge Base (Article) 3424031 None None None 2018-04-25 22:09:40 UTC
Red Hat Knowledge Base (Solution) 3424031 None None None 2018-04-27 22:25:31 UTC
Red Hat Product Errata RHBA-2018:1520 None None None 2018-05-15 17:57:34 UTC
oVirt gerrit 90699 master MERGED daemon: spec - set daemon.log permissions 2020-06-25 09:53:37 UTC
oVirt gerrit 90879 ovirt-4.1 ABANDONED daemon: spec - set daemon.log permissions 2020-06-25 09:53:37 UTC

Internal Links: 1559029

Description amashah 2018-04-25 22:09:41 UTC
Description of problem:

During a new deployment using RHV-H 4.1.10 image 20180315 (based on RHEL 7.5), the HE deploy fails because ovirt-imageio-daemon cannot be started:

2018-04-24 16:09:29 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-UltQreVLyp/pythonlib/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/tmp/ovirt-UltQreVLyp/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 183, in _start
    self.services.state('ovirt-imageio-daemon', True)
  File "/tmp/ovirt-UltQreVLyp/otopi-plugins/otopi/services/systemd.py", line 141, in state
    service=name,
RuntimeError: Failed to start service 'ovirt-imageio-daemon'

After the failure, manually restarting ovirt-imageio-daemon produces the following entries in the messages log:

Apr 23 16:01:13 rhv-host ovirt-imageio-daemon: stream = open(self.baseFilename, self.mode)
Apr 23 16:01:13 rhv-host ovirt-imageio-daemon: IOError: [Errno 13] Permission denied: '/var/log/ovirt-imageio-daemon/daemon.log'
Apr 23 16:01:13 rhv-host systemd: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE
Apr 23 16:01:13 rhv-host systemd: Failed to start oVirt ImageIO Daemon.
Apr 23 16:01:13 rhv-host systemd: Unit ovirt-imageio-daemon.service entered failed state.
Apr 23 16:01:13 rhv-host systemd: ovirt-imageio-daemon.service failed
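
The Errno 13 above is what ordinary POSIX mode bits would produce for a non-root process opening a root:root file with mode 0644 for writing. A minimal Python sketch of that check follows. The helper name can_write and the numeric IDs (36 for vdsm and kvm) are illustrative assumptions, not values taken from this report, and the sketch ignores ACLs, capabilities and SELinux:

```python
import stat


def can_write(mode, file_uid, file_gid, uid, gids):
    """Simplified POSIX write check based on mode bits only."""
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Assumed numeric IDs for the vdsm user and kvm group (illustration only).
VDSM_UID, KVM_GID = 36, 36

# daemon.log owned by root:root with mode 0644: the daemon user cannot write.
print(can_write(0o644, 0, 0, VDSM_UID, {KVM_GID}))

# daemon.log owned by vdsm:kvm with mode 0644: the owner can write.
print(can_write(0o644, VDSM_UID, KVM_GID, VDSM_UID, {KVM_GID}))
```

With these inputs the first call prints False and the second prints True, which matches the failure seen before the chown and the expected state after it.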


Permission issue on daemon.log:

[root@rhel-li-3 ~]# ls -laZ /var/log/ovirt-imageio-daemon
drwx------. vdsm kvm  system_u:object_r:var_log_t:s0   .
drwxr-xr-x. root root system_u:object_r:var_log_t:s0   ..
-rw-r--r--. root root system_u:object_r:var_log_t:s0   daemon.log

The workaround is to change the owner of the daemon.log file:
   chown vdsm.kvm /var/log/ovirt-imageio-daemon/daemon.log
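
To check several hosts for this condition, the ownership test can be scripted. The following is a rough Python sketch using only the standard library. The function names owner_and_group and needs_chown are hypothetical, and the expected owner vdsm:kvm is taken from the "Expected results" of this report:

```python
import grp
import os
import pwd


def owner_and_group(path):
    """Return (user, group) names for path, falling back to numeric IDs."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return user, group


def needs_chown(path, expected=("vdsm", "kvm")):
    """True when the current owner and group differ from the expected pair."""
    return owner_and_group(path) != tuple(expected)


if __name__ == "__main__":
    log = "/var/log/ovirt-imageio-daemon/daemon.log"
    if os.path.exists(log):
        print(log, owner_and_group(log), "needs chown:", needs_chown(log))
```

The script only reports; it deliberately does not change ownership itself.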


Version-Release number of selected component (if applicable):
RHV-H 4.1.10 

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

root:root permissions on daemon.log

Expected results:

vdsm:kvm permissions on daemon.log


Additional info:
Somewhat related, but different:
KB Article: https://access.redhat.com/solutions/2930491
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1425502

Comment 1 Sandro Bonazzola 2018-04-26 06:58:48 UTC
Yuval, is this due to the issue we had in appliance about wrong permissions?

Comment 4 Yuval Turgeman 2018-04-26 08:14:55 UTC
Sandro, probably not - the ovirt-imageio-daemon is installed in RHVH and it looks like the packaging of this rpm is wrong (unless a chown is expected somewhere):

[root@node-6740 ovirt_imageio_daemon]# rpm -q --dump ovirt-imageio-daemon|grep "daemon.log "
/var/log/ovirt-imageio-daemon/daemon.log 0 1521368643 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644 root root 0 0 0 X
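
For reference, the fields of an `rpm --dump` line are, per the rpm man page: path, size, mtime, digest, mode, owner, group, isconfig, isdoc, rdev, symlink. A small Python sketch that splits the line above into named fields is shown here; the names DumpEntry and parse_dump_line are made up for this illustration. It also shows that the recorded digest is the SHA-256 of empty input, which is consistent with the recorded size of 0:

```python
import hashlib
from collections import namedtuple

DumpEntry = namedtuple(
    "DumpEntry",
    "path size mtime digest mode owner group isconfig isdoc rdev symlink",
)


def parse_dump_line(line):
    """Split one `rpm --dump` line into its 11 fields."""
    # Split from the right so a path containing spaces stays intact.
    return DumpEntry(*line.rstrip("\n").rsplit(" ", 10))


line = (
    "/var/log/ovirt-imageio-daemon/daemon.log 0 1521368643 "
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 "
    "0100644 root root 0 0 0 X"
)
entry = parse_dump_line(line)

# Owner, group and permission bits as recorded in the package metadata.
print(entry.owner, entry.group, oct(int(entry.mode, 8) & 0o7777))

# The digest matches SHA-256 of empty input, i.e. an empty file.
print(entry.digest == hashlib.sha256(b"").hexdigest())
```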

Comment 5 Daniel Erez 2018-04-26 11:32:16 UTC
(In reply to Yuval Turgeman from comment #4)
> Sandro, probably not - the ovirt-imageio-daemon is installed in RHVH and
> it looks like the packaging of this rpm is wrong (unless a chown is expected
> somewhere):
> 
> [root@node-6740 ovirt_imageio_daemon]# rpm -q --dump
> ovirt-imageio-daemon|grep "daemon.log "
> /var/log/ovirt-imageio-daemon/daemon.log 0 1521368643
> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644
> root root 0 0 0 X

The ownership of the daemon.log file should be vdsm:kvm. It is set in 
ovirt-imageio-daemon.spec.in: "%dir %attr(755, vdsm, kvm) %{logdir}".
How is the daemon installed in an HE environment? Is there any difference from a regular host deployment? Which permissions does the appliance get? Perhaps something changed in HE deployment recently (I'm not aware of such an issue in previous builds).

Comment 6 Yuval Turgeman 2018-04-26 12:05:26 UTC
I'm not sure which version of ovirt-imageio-daemon we are talking about, but `rpm -q --dump` queries the rpm db itself.
I checked the latest version as well:

[yturgema@piggie ~/Downloads]$ rpm -qp --dump ovirt-imageio-daemon-1.3.1-0.el7ev.noarch.rpm|grep "daemon.log "
/var/log/ovirt-imageio-daemon/daemon.log 0 1523781570 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644 root root 0 0 0 X

As for the spec, I think the problem is this:

touch %{buildroot}%{logdir}/daemon.log

since it's created as root

Try to change the %ghost line for daemon.log to something like:
%ghost %attr(644, vdsm, kvm) %{logdir}/daemon.log*

Comment 7 Yaniv Kaul 2018-04-26 12:10:36 UTC
Daniel, can you take a look?

Comment 8 Daniel Erez 2018-04-26 12:48:04 UTC
(In reply to Yuval Turgeman from comment #6)
> I'm not sure which version of ovirt-imageio-daemon we are talking about, but
> `rpm -q --dump` queries the rpm db itself.
> I checked the latest version as well:
> 
> [yturgema@piggie ~/Downloads]$ rpm -qp --dump
> ovirt-imageio-daemon-1.3.1-0.el7ev.noarch.rpm|grep "daemon.log "
> /var/log/ovirt-imageio-daemon/daemon.log 0 1523781570
> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644
> root root 0 0 0 X
> 
> As for the spec, I think the problem is this:
> 
> touch %{buildroot}%{logdir}/daemon.log
> 
> since it's created as root
> 
> Try to change the %ghost line for daemon.log to something like:
> %ghost %attr(644, vdsm, kvm) %{logdir}/daemon.log*

Added - https://gerrit.ovirt.org/#/c/90699/

Seems fine now:

$ rpm -qp --dump ovirt-imageio-daemon-1.3.2-0.fc26.noarch.rpm | grep "daemon.log "

/var/log/ovirt-imageio-daemon/daemon.log 0 1524746121 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644 vdsm kvm 0 0 0 X

Comment 9 Daniel Erez 2018-04-26 13:32:45 UTC
(In reply to Daniel Erez from comment #8)
> (In reply to Yuval Turgeman from comment #6)
> > I'm not sure which version of ovirt-imageio-daemon we are talking about, but
> > `rpm -q --dump` queries the rpm db itself.
> > I checked the latest version as well:
> > 
> > [yturgema@piggie ~/Downloads]$ rpm -qp --dump
> > ovirt-imageio-daemon-1.3.1-0.el7ev.noarch.rpm|grep "daemon.log "
> > /var/log/ovirt-imageio-daemon/daemon.log 0 1523781570
> > e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644
> > root root 0 0 0 X
> > 
> > As for the spec, I think the problem is this:
> > 
> > touch %{buildroot}%{logdir}/daemon.log
> > 
> > since it's created as root
> > 
> > Try to change the %ghost line for daemon.log to something like:
> > %ghost %attr(644, vdsm, kvm) %{logdir}/daemon.log*
> 
> Added - https://gerrit.ovirt.org/#/c/90699/
> 
> Seems fine now:
> 
> $ rpm -qp --dump ovirt-imageio-daemon-1.3.2-0.fc26.noarch.rpm | grep
> "daemon.log "
> 
> /var/log/ovirt-imageio-daemon/daemon.log 0 1524746121
> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0100644
> vdsm kvm 0 0 0 X

@Simone - would it be good enough for fixing the HE deployment?

@Amar - is this issue reproducible only on a specific version? Can you please try it in latest 4.2 build?

Comment 10 Yuval Turgeman 2018-04-26 13:35:18 UTC
It's probably not enough: since the file is %ghost, the change would just fix `rpm --verify`. What versions are you running exactly?

Comment 12 Simone Tiraboschi 2018-04-26 16:06:57 UTC
(In reply to Daniel Erez from comment #9)
> @Simone - would it be good enough for fixing the HE deployment?

Yes, I think so

Comment 16 Daniel Erez 2018-04-29 12:53:37 UTC
(In reply to Simone Tiraboschi from comment #12)
> (In reply to Daniel Erez from comment #9)
> > @Simone - would it be good enough for fixing the HE deployment?
> 
> Yes, I think so

I'll release a new build of imageio then, so we'll see if it gets reproduced.

Comment 23 Nikolai Sednev 2018-05-01 12:00:35 UTC
What are the reproduction steps?
Is this RHVH-specific?

Comment 26 Nikolai Sednev 2018-05-02 08:41:15 UTC
Not reproduced on RHEL systems:
alma03 ~]# ls -laZ /var/log/ovirt-imageio-daemon
drwxr-xr-x. vdsm kvm  system_u:object_r:var_log_t:s0   .
drwxr-xr-x. root root system_u:object_r:var_log_t:s0   ..
-rw-r--r--. vdsm kvm  system_u:object_r:var_log_t:s0   daemon.log

Deployments on all storage types except FC were successful for both the vintage and Node 0 flows using the CLI.

Works for me on these components:
ovirt-engine-4.2.3.3-0.1.el7.noarch
rhvm-appliance-4.2-20180427.0.el7.noarch
ovirt-hosted-engine-setup-2.2.19-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Please retest on RHVH, as this turns out to be an RHVH-specific bug.

Comment 27 Yuval Turgeman 2018-05-02 09:21:15 UTC
You can reproduce this on RHEL as follows:

1. Make sure ovirt-imageio-daemon service is stopped and /var/log/ovirt-imageio-daemon/daemon.log doesn't exist.
2. systemctl start ovirt-imageio-daemon

3. Make sure ownership for /var/log/ovirt-imageio-daemon/daemon.log is correct:

[root@node-7140 ~]# ls -l /var/log/ovirt-imageio-daemon/daemon.log
-rw-r--r--. 1 vdsm kvm 1210 Apr 30 18:29 /var/log/ovirt-imageio-daemon/daemon.log

4. rpm -V ovirt-imageio-daemon will show wrong user/group:

[root@node-7140 ~]# rpm -V ovirt-imageio-daemon
.....UG..  g /var/log/ovirt-imageio-daemon/daemon.log

5. Ask rpm to fix the files according to its packaging:

[root@node-7140 ~]# rpm --setugids ovirt-imageio-daemon

6. Check again

[root@node-7140 ~]# ls -l /var/log/ovirt-imageio-daemon/daemon.log
-rw-r--r--. 1 root root 1362 May  2 09:15 /var/log/ovirt-imageio-daemon/daemon.log
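
The `rpm -V` line in step 4 can be decoded mechanically. In the sketch below, the flag letters (S, M, 5, D, L, U, G, T, P) and the file-type markers (c, d, g, l, r) follow the rpm man page; the function name parse_verify_line is hypothetical, and lines reporting a missing file use a different layout that this sketch does not handle:

```python
# Meaning of each flag character in the 9-character `rpm -V` result string.
FLAGS = {
    "S": "size", "M": "mode", "5": "digest", "D": "device",
    "L": "symlink", "U": "user", "G": "group", "T": "mtime",
    "P": "capabilities",
}
# Optional single-letter marker describing the file's packaging attribute.
MARKERS = {"c": "config", "d": "doc", "g": "ghost", "l": "license", "r": "readme"}


def parse_verify_line(line):
    """Return (failed_checks, marker, path) for one `rpm -V` output line."""
    flags, rest = line[:9], line[9:].strip()
    marker = None
    if len(rest) > 2 and rest[1] == " " and rest[0] in MARKERS:
        marker, rest = MARKERS[rest[0]], rest[2:].lstrip()
    failed = [FLAGS[ch] for ch in flags if ch in FLAGS]
    return failed, marker, rest


failed, marker, path = parse_verify_line(
    ".....UG..  g /var/log/ovirt-imageio-daemon/daemon.log"
)
print(failed, marker, path)
```

For the line from step 4 this reports that the user and group checks failed on a %ghost file, which is why `rpm --setugids` in step 5 resets the file to the packaged root:root ownership.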

Comment 28 Nikolai Sednev 2018-05-02 10:17:11 UTC
Reproducing it this way will probably work, but on RHEL the file is created with the correct UID and GID, so deployment succeeds. This is therefore solely an RHVH-specific issue.

Comment 29 Yuval Turgeman 2018-05-02 10:54:24 UTC
(In reply to Nikolai Sednev from comment #28)
> Reproducing it this way will probably work, but on RHEL the file is created
> with the correct UID and GID, so deployment succeeds. This is therefore
> solely an RHVH-specific issue.

Right, unless a RHEL-H user runs "rpm --setugids" for some reason... Anyway, I just thought it was important to document the steps :)

Comment 30 Daniel Erez 2018-05-02 14:49:38 UTC
Released the fix in ovirt-imageio-daemon-1.3.1.1.

Comment 34 Allon Mureinik 2018-05-07 12:33:18 UTC
Daniel, can we respin the errata to get this included in the latest 4.2.3 compose?

Comment 35 Daniel Erez 2018-05-07 12:35:33 UTC
(In reply to Allon Mureinik from comment #34)
> Daniel, can we respin the errata to get this included in the latest 4.2.3
> compose?

Yes, already added imageio-1.3.1.2 to 4.2.3.

Comment 36 Allon Mureinik 2018-05-07 12:50:22 UTC
hmm... The ET is already on REL_PREP, moving to ON_QA.

Comment 37 Ying Cui 2018-05-08 10:18:09 UTC
VERIFIED the bug on build redhat-virtualization-host-4.2-20180507.0 (ovirt-imageio-daemon-1.3.1.2-0.el7ev.noarch)

Pre-check: Reproduced the issue by upgrading the released build redhat-virtualization-host-4.1-20180314.0 to redhat-virtualization-host-4.1-20180410.1. After the upgrade, the ownership of /var/log/ovirt-imageio-daemon/daemon.log was incorrect, as shown below, and host deployment failed.

# ls -al /var/log/ovirt-imageio-daemon/daemon.log
-rw-r--r--. 1 root root 302 May  8 09:50 /var/log/ovirt-imageio-daemon/daemon.log

Steps to verify this bug:
1. Installed the version redhat-virtualization-host-4.1-20180314.0
2. Made sure ovirt-imageio-daemon had run, so that daemon.log was generated before upgrading (started the daemon manually, or via the host deployment process, and then stopped it).
3. Checked that the ownership of /var/log/ovirt-imageio-daemon/daemon.log was vdsm:kvm.
4. Upgraded successfully to redhat-virtualization-host-4.2-20180507.0 via yum update.
5. Checked the ownership of /var/log/ovirt-imageio-daemon/daemon.log; it is still correct (vdsm:kvm) after upgrading.
# ls -al /var/log/ovirt-imageio-daemon/daemon.log 
-rw-r--r--. 1 vdsm kvm 1096 May  8 13:28 /var/log/ovirt-imageio-daemon/daemon.log
6. Host deployment on this environment was successful on NFS storage.

# hosted-engine --vm-status
--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ibm-x3650m5-06.lab.eng.pek2.redhat.com
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 2cd86815
local_conf_timestamp               : 10967
Host timestamp                     : 10967
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=10967 (Tue May  8 14:22:43 2018)
	host-id=1
	score=3400
	vm_conf_refresh_time=10967 (Tue May  8 14:22:44 2018)
	conf_on_shared_storage=True
	maintenance=False
	state=EngineUp
	stopped=False

Comment 40 errata-xmlrpc 2018-05-15 17:57:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1520

Comment 41 Franta Kust 2019-05-16 13:07:42 UTC
BZ<2>Jira Resync

Comment 42 Daniel Gur 2019-08-28 13:14:11 UTC
sync2jira

Comment 43 Daniel Gur 2019-08-28 13:18:28 UTC
sync2jira

