Bug 1436172

Summary: systemd-tmpfiles-clean.service fails on RHV-H
Product: Red Hat Enterprise Virtualization Manager
Reporter: Roman Hodain <rhodain>
Component: ovirt-node-ng
Assignee: Ryan Barry <rbarry>
Status: CLOSED ERRATA
QA Contact: Yihui Zhao <yzhao>
Severity: high
Docs Contact:
Priority: high
Version: 4.0.7
CC: coli, cshao, dfediuck, dguo, dougsland, eheftman, hachen, huzhao, jiawu, mgoldboi, michen, qiyuan, rbarry, rhodain, sbonazzo, virt-bugs, weiwang, yaniwang, ycui, yzhao
Target Milestone: ovirt-4.1.1-1
Flags: rbarry: needinfo-
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: imgbased-0.9.20-0.1.el7ev
Doc Type: Bug Fix
Doc Text:
Previously, imgbased did not add groups that were present on the new image but not the old one. As a result, systemd-tmpfiles-clean.service failed. In this release, imgbased adds groups from new layers and systemd-tmpfiles-clean.service is functioning properly.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-20 19:05:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Roman Hodain 2017-03-27 11:52:27 UTC
Description of problem:
    Service systemd-tmpfiles-clean.service fails during the boot of the RHV-H node.

Version-Release number of selected component (if applicable):
    rhvh-4.0-0.20170307.0

How reproducible:
    100%

Steps to Reproduce:
    1) Boot rhvh-4.0-0.20170307.0
    2) systemctl status systemd-tmpfiles-clean.service

Actual results:
systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2017-03-27 11:44:17 GMT; 6min ago
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)
  Process: 9987 ExecStart=/usr/bin/systemd-tmpfiles --clean (code=exited, status=1/FAILURE)
 Main PID: 9987 (code=exited, status=1/FAILURE)

Mar 27 11:44:17 vmware-172.gsslab.brq.redhat.com systemd[1]: Starting Cleanup of Temporary Directories...
Mar 27 11:44:17 vmware-172.gsslab.brq.redhat.com systemd-tmpfiles[9987]: [/usr/lib/tmpfiles.d/screen.conf:2] Unknown group 'screen'.

Expected results:
Package screen is installed on the node, but the screen group, which is supposed to be created during package installation, is not created on the host.
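
To see which tmpfiles.d entries name a group the host does not know, a check along the following lines may help. This is only a sketch: the function names are hypothetical, the sample line is modelled on the error message above rather than copied from the real screen.conf, and splitting on whitespace ignores quoted paths.

```python
import grp


def groups_in_tmpfiles(text):
    """Yield (line_number, group) for each tmpfiles.d entry that names a group."""
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Columns: type, path, mode, user, group, age, argument
        fields = stripped.split()
        if len(fields) >= 5 and fields[4] != "-":
            yield number, fields[4]


def unknown_groups(text, known):
    """Return (line_number, group) pairs whose group name is not in `known`."""
    return [
        (number, group)
        for number, group in groups_in_tmpfiles(text)
        # Numeric GIDs do not need an /etc/group entry, so skip them.
        if not group.isdigit() and group not in known
    ]


def system_group_names():
    """Names of all groups the running host can resolve."""
    return {entry.gr_name for entry in grp.getgrall()}


# Illustrative input only; not the verified contents of screen.conf.
sample = "# screen needs a runtime directory\nd /run/screen 0775 root screen\n"
print(unknown_groups(sample, {"root"}))
```

On a real host, the text of each file under /usr/lib/tmpfiles.d could be passed together with system_group_names() in place of the hard-coded set.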

Comment 1 Ryan Barry 2017-03-27 16:44:14 UTC
The cause of this is likely that we copy /etc/group over from the old layer.

New entries in /etc/group which are created (such as 'screen') are not added.

Unfortunately, it's not sufficient to simply grab /etc/group from the new layer, or even to do a simple diff and add lines with "+", since GIDs may change between images.

Instead, we'll need to:

actually parse it
find keys which are not present in the old layer (group name is a reasonable key)
find out whether the GID it wants is already claimed
determine a new one if it is
then add it

The same should be performed for /etc/passwd (with a matching GID, if we added a new one), and a line added to /etc/shadow, which can be a straight diff.
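
The group-merging steps listed above could be sketched roughly as follows. This is not the imgbased implementation: the function names, the default GID range (201-999), and the sample entries are assumptions made for illustration, and lines that are not plain four-field group entries are not handled.

```python
def parse_group(text):
    """Parse /etc/group-style text into (name, password, gid, members) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, password, gid, members = line.split(":", 3)
        entries.append((name, password, int(gid), members))
    return entries


def merge_groups(old_text, new_text, gid_min=201, gid_max=999):
    """Append groups found only in new_text to the entries from old_text.

    Returns (merged_text, remapped), where remapped maps a group name to
    (wanted_gid, assigned_gid) for every group that had to be renumbered.
    The GID range is an assumed default, not a value taken from imgbased.
    """
    merged = parse_group(old_text)
    known_names = {entry[0] for entry in merged}
    used_gids = {entry[2] for entry in merged}
    remapped = {}

    for name, password, gid, members in parse_group(new_text):
        if name in known_names:
            continue  # group name is the key; the old layer's entry is kept
        if gid in used_gids:
            # The wanted GID is already claimed, so pick the first free one.
            free = next(
                (g for g in range(gid_min, gid_max + 1) if g not in used_gids),
                None,
            )
            if free is None:
                raise ValueError("no free GID for group %r" % name)
            remapped[name] = (gid, free)
            gid = free
        merged.append((name, password, gid, members))
        known_names.add(name)
        used_gids.add(gid)

    lines = ["%s:%s:%d:%s" % entry for entry in merged]
    return "\n".join(lines) + "\n", remapped


# Hypothetical layers: GID 84 is already taken on the old layer.
old_layer = "root:x:0:\nlocal:x:84:\n"
new_layer = "root:x:0:\nscreen:x:84:\n"
merged_text, remapped = merge_groups(old_layer, new_layer)
print(merged_text, remapped)
```

The remapped dictionary is returned so that a matching /etc/passwd pass could rewrite the primary GID of any user whose group was renumbered.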

Comment 4 Ryan Barry 2017-03-31 13:20:23 UTC
imgbased did not previously add groups that were present on the new image but not on the old one.

To reproduce:

Install a version of RHVH which does not include screen (4.0.5, for example).
Upgrade to 4.0.7/4.1

The service will fail.

Comment 5 Yihui Zhao 2017-04-01 05:57:16 UTC
Can reproduce the issue.

Test versions: rhvh-4.0-0.20161116.0+1 (does not include the screen package), upgraded to rhvh-4.0-0.20170307.0+1 (does not include the screen package), then upgraded to rhvh-4.1-0.20170323.0+1 (includes the screen package).

Test steps:
1. Install rhvh-4.0-0.20161116.0+1 via ISO
2. Check service "systemctl status systemd-tmpfiles-clean.service"
3. Upgrade to rhvh-4.0-0.20170307.0+1
4. Check service "systemctl status systemd-tmpfiles-clean.service"
5. Upgrade to rhvh-4.1-0.20170323.0+1
6. Check service "systemctl status systemd-tmpfiles-clean.service"
7. Add host to engine 
8. Check service "systemctl status systemd-tmpfiles-clean.service"

Result:

1. After steps 2, 4, and 6:

[root@dell-per730-35 ~]# systemctl status systemd-tmpfiles-clean.service
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)



2. After step 8:

[root@dell-per730-35 ~]# systemctl status systemd-tmpfiles-clean.service
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2017-04-01 13:40:55 CST; 8min ago
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)
 Main PID: 37731 (code=exited, status=1/FAILURE)

Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Starting Cleanup of Temporary Directories...
Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd-tmpfiles[37731]: [/usr/lib/tmpfiles.d/screen.conf:2] Unknown group 'screen'.
Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: systemd-tmpfiles-clean.service: main process exited, code=exited, status=1/FAILURE
Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Failed to start Cleanup of Temporary Directories.
Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Unit systemd-tmpfiles-clean.service entered failed state.
Apr 01 13:40:55 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: systemd-tmpfiles-clean.service failed.

Comment 6 Ryan Barry 2017-04-03 14:05:56 UTC
The fix for this is the same as rhbz#1435887, so it will end up in 4.1.1-1

Comment 9 Yihui Zhao 2017-04-05 07:24:00 UTC
Test version:
[root@dell-per730-35 ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170403.0+1
[root@dell-per730-35 ~]# imgbase layout
rhvh-4.0-0.20161116.0
 +- rhvh-4.0-0.20161116.0+1
rhvh-4.1-0.20170403.0
 +- rhvh-4.1-0.20170403.0+1

imgbased-0.9.20-0.1.el7ev.noarch

Steps:
1. Install rhvh-4.0-0.20161116.0+1
2. Check service "systemctl status systemd-tmpfiles-clean.service"
3. Upgrade to rhvh-4.1-0.20170403.0+1
4. Add host to engine
5. Check service "systemctl status systemd-tmpfiles-clean.service"

Results:
After step 2:

[root@dell-per730-35 ~]# systemctl status systemd-tmpfiles-clean.service
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)


After step 5:

[root@dell-per730-35 ~]# systemctl status systemd-tmpfiles-clean.service
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)


So, the bug is fixed. Change status to verified.

Comment 10 Emma Heftman 2017-04-09 13:58:45 UTC
Hi Ryan. Can you please set the requires_doc_text flag, and add doc text if required. Thanks.

Comment 11 errata-xmlrpc 2017-04-20 19:05:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1114