Bug 1589968

Summary: 0014897: filesystem-3.2-25.el7.x86_64 won't install on LXC/docker containers due to /sys being read-only.
Product: Red Hat Enterprise Linux 7
Reporter: James Lavoy <jalavoy>
Component: filesystem
Assignee: Pavel Zhukov <pzhukov>
Status: CLOSED WONTFIX
QA Contact: qe-baseos-daemons
Severity: low
Priority: unspecified
Version: 7.5
CC: alex, bganta, bluems, connarpierce, fkrska, hasuzuki, jmatthew, silvaesilva, thozza, voidmaterial
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2019-12-16 16:09:14 UTC
Type: Bug
Bug Blocks: 1780662
Attachments:
Description        Flags
example            none
strace of error    none

Description James Lavoy 2018-06-11 18:35:02 UTC
Created attachment 1450175 [details]
example

Description of problem:
Because /sys is a read-only mount inside the LXC container, the filesystem package's attempt to chown /sys fails, causing a cpio failure.

Running transaction
  Updating : filesystem-3.2-25.el7.x86_64 1/2
Error unpacking rpm package filesystem-3.2-25.el7.x86_64
error: unpacking of archive failed on file /sys: cpio: chown
  Verifying : filesystem-3.2-25.el7.x86_64 1/2
filesystem-3.2-21.el7.x86_64 was supposed to be removed but is not!
  Verifying : filesystem-3.2-21.el7.x86_64 


Version-Release number of selected component (if applicable):
3.2-25.el7


How reproducible:
Always

Steps to Reproduce:
yum update filesystem -y

or

 yum install --downloadonly filesystem && rpm -Uvh /var/cache/yum/x86_64/7/base/packages/filesystem-3.2-25.el7.x86_64.rpm --force

Actual results:
error: unpacking of archive failed on file /sys: cpio: chown failed - Read-only file system
error: filesystem-3.2-25.el7.x86_64: install failed

Expected results:
Preparing...                          ################################# [100%]
Updating / installing...
   1:filesystem-3.2-25.el7            ################################# [ 50%]
Cleaning up / removing...
   2:filesystem-3.2-21.el7            ################################# [100%]

Additional info:
This is likely a problem on Fedora as well, although I have not tested it.

I've also made a bug report with CentOS (where I saw the problem) here: https://bugs.centos.org/view.php?id=14897

Comment 2 Ondrej Vasik 2018-06-11 20:38:38 UTC
Not much I can do about it from filesystem spec file... it is not the first report of this issue, but I haven't seen any proposal of solution yet. Actually, filesystem has /sys with 555 permissions, so readonly as well... if the permissions are the same in the docker, then chown should not be done by rpm. If the permissions or owner/group differ, then it can be a problem when updating.

Comment 3 James Lavoy 2018-06-11 20:44:51 UTC
Is there a reason we don't just remove the %attr macro and replace it with a chown and chmod that cannot cause a failure?

That would still give the desired effect when possible without breaking when not.

Comment 4 James Lavoy 2018-06-11 20:58:22 UTC
Ahh I see,

Removing the %files definition for /sys would cause /sys to be unlinked on update (if it weren't readonly).

Hmm, I don't have a good solution for this either I guess.

Comment 5 Ondrej Vasik 2018-06-12 07:37:05 UTC
Only way I'm aware of is to %ghost the /sys dir in %files and do some lua magic in pretrans scriptlet - however, it is a bit fragile and I broke filesystem (and buildroot for everyone) by such lua scriptlets in the past...

Comment 6 James Lavoy 2018-06-12 13:53:16 UTC
Well I leave it up to you. But I definitely think it needs to be fixed at some point.

Comment 7 alex 2018-12-12 21:08:10 UTC
Is there any workaround for this?

Comment 8 bluems 2019-03-13 05:34:44 UTC
HP DL360 G5 has this bug.

Comment 9 Martin Osvald 🛹 2019-06-19 13:21:47 UTC
Hello James,

I am the new maintainer of the filesystem package and came across this while going through the existing BZs. I hope you are still able and willing to help me with this.

I would point out Ondrej's statement in comment 2:

~~~
... if the permissions are the same in the docker, then chown should not be done by rpm. ...
~~~

which is true unless the lstat() check at line #1394 fails for some unexpected reason:

rpm-4.11.3/lib/fsm.c:
~~~
1389 static int fsmChown(const char *path, uid_t uid, gid_t gid)
1390 {
1391     int rc = chown(path, uid, gid);
1392     if (rc < 0) {
1393         struct stat st;
1394         if (lstat(path, &st) == 0 && st.st_uid == uid && st.st_gid == gid)
1395             rc = 0;
1396     }
1397     if (_fsm_debug)
1398         rpmlog(RPMLOG_DEBUG, " %8s (%s, %d, %d) %s\n", __func__,
1399                path, (int)uid, (int)gid,
1400                (rc < 0 ? strerror(errno) : ""));
1401     if (rc < 0) rc = CPIOERR_CHOWN_FAILED;
1402     return rc;
1403 }
~~~

The above shows that if chown() fails at line #1391 (as it surely did for you), line #1395 should normally suppress the error when st_uid and st_gid already match root:root, which I expect to be the case for you. So I wonder why the check at line #1394 failed, which then led to CPIOERR_CHOWN_FAILED being returned at line #1401.

To debug this further, could you please reproduce the problem by running the commands below, and then attach the resulting file 'rpm-strace.txt' and the output of the ls command to the BZ?

~~~
# strace -qfTttvys 4096 -o rpm-strace.txt rpm -Uvh filesystem-3.2-25.el7.x86_64.rpm
# ls -ld / /sys
~~~

The resulting file from the strace command should shed some light on why it is failing for you even though it shouldn't.

In case of any problems with the strace command, please, let me know.

Thank you!
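
Once rpm-strace.txt exists, the two syscalls that matter here can be pulled out of it with a small filter. This is only a sketch: the function name `sys_ownership_calls` is made up for illustration, and it assumes the trace uses strace's usual one-call-per-line output.

```shell
# Hypothetical helper (not part of rpm or strace): print the chown() and
# lstat() calls made on /sys from a strace log, in the order they occurred.
sys_ownership_calls() {
    trace_file=$1
    # \( matches the literal parenthesis that follows the syscall name;
    # the closing quote after /sys excludes paths below /sys.
    grep -E '(chown|lstat)\("/sys"' "$trace_file"
}
```

It would be called as `sys_ownership_calls rpm-strace.txt`.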

Comment 10 James Lavoy 2019-06-19 14:43:07 UTC
Hey Martin, thanks for taking a look.

The box I originally saw this on doesn't exist anymore, so I made a brand new container to test this on. Note that this is an unprivileged container, and it shows the same behavior.

> [root@filesystem ~]# strace -qfTttvys 4096 -o rpm-strace.txt rpm -Uvh filesystem-3.2-25.el7.x86_64.rpm
> Preparing...                          ################################# [100%]
> Updating / installing...
>    1:filesystem-3.2-25.el7            ################################# [ 50%]
> error: unpacking of archive failed on file /sys: cpio: chown failed - Operation not permitted
> error: filesystem-3.2-25.el7.x86_64: install failed
> error: filesystem-3.2-21.el7.x86_64: erase skipped
> [root@filesystem ~]#


> [root@filesystem ~]# ls -ld / /sys
> dr-xr-xr-x 18 root  root  23 Jun 19 14:36 /
> dr-xr-xr-x 13 65534 65534  0 Jun 19 14:36 /sys

strace attached

Comment 11 James Lavoy 2019-06-19 14:43:35 UTC
Created attachment 1582270 [details]
strace of error

Comment 12 James Lavoy 2019-06-19 14:47:42 UTC
Interestingly, though not surprisingly, this seems to work fine on a privileged container, where /sys is seen by the container as being owned by root.

> [root@filesystem ~]# strace -qfTttvys 4096 -o rpm-strace-privileged.txt rpm -Uvh filesystem-3.2-25.el7.x86_64.rpm
> Preparing...                          ################################# [100%]
> Updating / installing...
>    1:filesystem-3.2-25.el7            ################################# [ 50%]
> Cleaning up / removing...
>    2:filesystem-3.2-21.el7            ################################# [100%]
> [root@filesystem ~]# ls -ld / /sys
> dr-xr-xr-x 18 root root 23 Jun 19 14:46 /
> dr-xr-xr-x 13 root root  0 Jun 19 14:36 /sys

Comment 13 Martin Osvald 🛹 2019-06-20 06:38:33 UTC
Thank you for providing the strace output!

This is a follow-up to the investigation in comment 9...

That's strange. According to strace, chown() fails as expected, but lstat() succeeds and returns an unexpected uid and gid of 65534 instead of 0 for the /sys directory:

~~~
332   14:36:51.644994 chown("/sys", 0, 0) = -1 EPERM (Operation not permitted) <0.000020>
332   14:36:51.645077 lstat("/sys", {st_dev=makedev(0, 131), st_ino=1, st_mode=S_IFDIR|0555, st_nlink=13, st_uid=65534, st_gid=65534, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2019/06/19-14:36:26.408510672, st_mtime=2019/06/19-14:36:26.408510672, st_ctime=2019/06/19-14:36:26.408510672}) = 0 <0.000019>
~~~

so rc is never reset to 0 and fsmChown() fails with CPIOERR_CHOWN_FAILED.

I am no expert in containers, so I don't know right now why /sys has uid/gid 65534 in unprivileged containers or whether that is right/expected, but it breaks the intended logic in fsmChown() and the spec file's expectation that /sys is always owned by 0/root.

unprivileged container:

> [root@filesystem ~]# ls -ld / /sys
> dr-xr-xr-x 18 root  root  23 Jun 19 14:36 /
> dr-xr-xr-x 13 65534 65534  0 Jun 19 14:36 /sys

vs privileged:

> [root@filesystem ~]# ls -ld / /sys
> dr-xr-xr-x 18 root root 23 Jun 19 14:46 /
> dr-xr-xr-x 13 root root  0 Jun 19 14:36 /sys


I will try to investigate what we could/should do in this situation/scenario and let you know...
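
The decision fsmChown() makes can be mimicked from the shell to check a path before running the update. This is a minimal sketch, assuming GNU coreutils `stat`; the helper name `chown_or_already_owned` is hypothetical and not something rpm ships. Note that, like rpm, it really does attempt the chown.

```shell
# Hypothetical helper: mirrors the fallback in fsmChown().
# Returns 0 if chown succeeds, or if it fails but the path already has
# the wanted uid:gid; returns non-zero otherwise.
chown_or_already_owned() {
    path=$1 want_uid=$2 want_gid=$3
    if chown "$want_uid:$want_gid" "$path" 2>/dev/null; then
        return 0
    fi
    # chown failed: tolerate it only when the current ownership matches.
    # stat without -L does not follow symlinks, similar to lstat().
    have=$(stat -c '%u:%g' "$path" 2>/dev/null) || return 1
    [ "$have" = "$want_uid:$want_gid" ]
}
```

In the unprivileged container above, `chown_or_already_owned /sys 0 0` would be expected to return non-zero: chown fails with EPERM and /sys reports 65534:65534 rather than 0:0.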

Comment 14 James Lavoy 2019-06-20 14:53:49 UTC
Just in case you aren't aware of the difference between privileged and unprivileged containers, this sums it up pretty well: https://linuxcontainers.org/lxc/getting-started/#creating-unprivileged-containers-as-a-user

I believe this is expected behavior; I just don't know how we'd work around it (or if we can).
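
For background on where 65534 comes from: inside a user namespace, ids that have no entry in /proc/self/uid_map are reported as the kernel's overflow id, which defaults to 65534. /sys presumably belongs to host root, and host root is normally not mapped into an unprivileged container. The sketch below illustrates that lookup. `uid_seen_inside` is a hypothetical helper, and it assumes the second column of the map holds host-side ids, which is the case for a container created directly from the host.

```shell
# Hypothetical helper: given a file in uid_map format
# (columns: first-id-inside, first-id-outside, count), print the uid a
# process inside the namespace would see for a given host uid.
# Unmapped ids fall back to the overflow id (65534 unless overridden).
uid_seen_inside() {
    map_file=$1 host_uid=$2 overflow=${3:-65534}
    awk -v id="$host_uid" -v fallback="$overflow" '
        id >= $2 && id < $2 + $3 { print $1 + (id - $2); found = 1; exit }
        END { if (!found) print fallback }
    ' "$map_file"
}
```

With a map line such as `0 100000 65536`, host uid 100000 appears as 0 inside the container, while host uid 0 has no mapping and appears as 65534, matching the `ls -ld /sys` output above.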

Comment 15 Balakrishna 2019-09-13 17:58:15 UTC
I also faced the same issue. In my case it happened because a mounted filesystem was 100% full, which prevents the filesystem package from being updated. I had mounted the latest ISO on /media, which shows as 100% used and was causing the problem. After unmounting /media I was able to complete the update. Alternatively, you can free up space on the partition that is 100% full.

/dev/loop0                      iso9660   4.2G  4.2G     0 100% /media

Comment 18 Tomáš Hozza 2019-12-16 16:09:06 UTC
Red Hat Enterprise Linux version 7 entered the Maintenance Support 1 Phase in August 2019. In this phase only qualified Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released as they become available. Other errata advisories may be delivered as appropriate.

This bug has been reviewed by Support and Engineering representatives and does not meet the inclusion criteria for Maintenance Support 1 Phase.

For more information about Red Hat Enterprise Linux Lifecycle, please see https://access.redhat.com/support/policy/updates/errata/

Comment 19 RHEL Program Management 2019-12-16 16:09:14 UTC
Development Management has reviewed and declined this request. You may appeal this decision by using your Red Hat support channels, who will make certain the issue receives the proper prioritization with product and development management.

https://www.redhat.com/support/process/production/#howto

Comment 20 silvaesilva 2020-12-07 21:34:17 UTC
Not trying to necro this thread, but this problem still exists for Red Hat 7 AND 8.

For LXC, this workaround, from a Proxmox thread, allows the upgrade to proceed:

echo "%_netsharedpath /sys:/proc" >> /etc/rpm/macros.dist; yum -y update
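
`%_netsharedpath` is an rpm macro holding a colon-separated list of paths that rpm treats as shared mounts and does not install files into, which is presumably why the update then leaves /sys and /proc alone. Because the one-liner above appends a new line every time it is run, a guarded version may be preferable. A sketch, with a hypothetical function name and the macros file path taken from the command above:

```shell
# Hypothetical helper: add the %_netsharedpath line to an rpm macros file
# only if an identical line is not already present.
add_netsharedpath() {
    macros_file=${1:-/etc/rpm/macros.dist}
    line='%_netsharedpath /sys:/proc'
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF -- "$line" "$macros_file" 2>/dev/null \
        || printf '%s\n' "$line" >> "$macros_file"
}
```

Running `add_netsharedpath` followed by `yum -y update` should then be equivalent to the workaround, without accumulating duplicate macro lines.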

Comment 21 Pavel Zhukov 2020-12-08 11:37:43 UTC
(In reply to silvaesilva from comment #20)
> Not trying to necro this thread but this problem still exists for RedHat 7
> AND 8.
> 
> For LXC, this workaround, from proxmox thread, allows the upgrade to proceed:
> 
> echo "%_netsharedpath /sys:/proc" >> /etc/rpm/macros.dist; yum -y update

The issue has been fixed in Fedora rawhide and will eventually land in the next major release of Red Hat Enterprise Linux.