Description of problem:

This problem was previously reported and fixed in
https://bugzilla.redhat.com/show_bug.cgi?id=1869030
but the patch for that bug was then backed out, because it was thought to be relevant only in the context of systemd-nspawn. In practice it appears to hit much more widely.

This glibc change has broken Fedora rawhide when running inside docker containers. This has broken libvirt CI running on GitLab CI, which uses docker.

Downgrading from glibc-2.32.9000-16.fc34.x86_64.rpm back to glibc-2.32.9000-15.fc34.x86_64.rpm, which has the workaround, fixes execution inside docker, but that's not viable for users to do manually every time.

A patch has been sent to Docker to allow faccessat2, but it is pretty recent and so is not widely deployed at this time:
https://github.com/moby/moby/pull/41353/files

I think that glibc needs to keep the workaround from bug 1869030 a good while longer yet, to allow time for fixed docker to become widely available.

I suspect the more general root cause lies in 'runc', which returns EPERM instead of ENOSYS when filtering syscalls:
https://github.com/opencontainers/runc/issues/2151

Version-Release number of selected component (if applicable):
glibc-2.32.9000-16.fc34.x86_64.rpm

How reproducible:
I've seen it in docker under GitLab CI, but not in podman running locally; I suspect that's because my podman setup isn't doing syscall filtering.
*** Bug 1899913 has been marked as a duplicate of this bug. ***
The issue can be demonstrated on a Fedora 33 host with docker from moby-engine as packaged in Fedora 33, see bug 1897493.
This is bad. I think I'm hitting the same issue. R (in a rawhide docker image) stopped working after updating glibc to glibc-2.32.9000-16.fc34. I see:

$ R
ERROR: R_HOME ('/usr/lib64/R') not found
glibc is basically sitting between the kernel and the cloud. I've brought the discussion to what I think are the appropriate forums:

https://lore.kernel.org/linux-api/87lfer2c0b.fsf@oldenburg2.str.redhat.com/
https://groups.google.com/a/opencontainers.org/g/dev/c/8Phfq3VBxtw

I've also posted a glibc upstream patch to show what it would look like:

https://sourceware.org/pipermail/libc-alpha/2020-November/119955.html

Personally, I find it difficult to support such an approach technically, and I would like to see some reassurance from kernel developers that this is okay.
Can we get the previous workaround re-applied to rawhide as a stopgap until the upstream discussions reach some conclusion about the right long-term fix? This broken faccessat() is quite disruptive to people using rawhide in containers.
The workaround has been categorically rejected by kernel developers and glibc developers alike. Work is under way to address this in runc and potentially libseccomp.
Here is another case that fails for the same reason:

FROM registry.fedoraproject.org/fedora:rawhide
RUN echo "echo test" > test.sh
RUN chmod +x test.sh
RUN ls -l test.sh
RUN test -x test.sh
I really appreciate that the fix is on the way. I just want to point out again that runc and libseccomp are components that are often in the scope of infrastructure operators, so it will take a (long) time until they get updated. There are so many people involved that it simply will take a lot of time: the developer providing the fix, upstream review and hopefully merge, backports to the stable versions used by all the major distributions out there, package distribution, and finally the operator who accepts the new version and deploys it.

Some projects already had to disable their CI jobs for building and testing on current Fedora releases because of this issue.

Any chance to get something like a "special" package for use in containers, or to provide "working" Fedora 34 container images?
(In reply to Florian Bezdeka from comment #15)
> Some projects already had to disable their CI jobs for building and testing
> on current Fedora releases because of this issue.

Only Fedora Rawhide is impacted, and the goals of Rawhide are different from the goals of a stable release.

Please review the Fedora Rawhide goals here:
https://fedoraproject.org/wiki/Releases/Rawhide#Goals

"To identify and fix issues with packages before they reach a stable release of Fedora."

> Any chance to get something like a "special" package for the use in
> containers or provide "working" Fedora 34 container images?

Fedora 34 does not release until May 20th 2021:
https://fedorapeople.org/groups/schedule/f-34/f-34-key-tasks.html

My opinion is that we have some time to work on a solution that integrates the best possible fixes from upstream.

Thank you for your comments.
(In reply to Carlos O'Donell from comment #16)
> Only Fedora Rawhide is impacted, and the goals of Rawhide are different from
> the goals of a stable release.
>
> Please review the Fedora Rawhide goals here:
> https://fedoraproject.org/wiki/Releases/Rawhide#Goals
>
> "To identify and fix issues with packages before they reach a stable release
> of Fedora."

I understand that point of view. I will now give the point of view of one of those open source projects running Fedora-rawhide images in the CI: in the CGAL project (https://www.cgal.org/ and https://github.com/CGAL/cgal), we want to identify and fix issues when our software library is compiled with the compilers and system libraries of Fedora Rawhide, so that our software is always ready to run under Fedora XY as soon as it is released. When the `glibc` or the kernel of Rawhide have an issue with `runc`, we can no longer test our software with it.

That is probably what Florian wanted to point out in comment #15.
I totally agree, we run CI against fedora:rawhide to catch compiler and library problems early and currently this isn't possible anymore.
Same here. With Travis and GitHub Actions, a workaround is to not restrict the container [1]:

sudo docker create --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
    --name mobydick registry.fedoraproject.org/fedora:rawhide \
    /tmp/BOUT-dev/.travis_fedora.sh mpich

[1] https://github.com/boutproject/BOUT-dev/blob/next/.travis_fedora.sh#L24
(In reply to david08741 from comment #21)
> Same here. With Travis and GitHub Actions, a workaround is to not restrict
> the container [1]
>
> sudo docker create --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
> --name mobydick registry.fedoraproject.org/fedora:rawhide \
> /tmp/BOUT-dev/.travis_fedora.sh mpich
>
> [1] https://github.com/boutproject/BOUT-dev/blob/next/.travis_fedora.sh#L24

Or alternatively pass an updated seccomp default profile that includes faccessat2? I see that Moby updated their default profile about 4 months ago to include faccessat2 under SCMP_ACT_ALLOW:
https://github.com/moby/moby/blob/master/profiles/seccomp/default.json#L97
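For anyone who wants to try the profile route on an older Docker, here is a minimal sketch in Python of adding faccessat2 to a seccomp profile. It assumes the JSON layout used by Moby's default.json (a top-level "syscalls" list whose entries carry "names" and "action"). The helper name allow_syscall is made up for illustration, and whether the patched profile actually helps probably depends on the host's runc and libseccomp recognising the syscall name.

```python
import copy


def allow_syscall(profile: dict, name: str) -> dict:
    """Return a copy of a Docker-style seccomp profile that allows `name`.

    Hypothetical helper: assumes the Moby default.json layout, where
    profile["syscalls"] is a list of rules with "names" and "action" keys.
    """
    patched = copy.deepcopy(profile)  # leave the caller's profile untouched
    rules = patched.setdefault("syscalls", [])
    for rule in rules:
        # Rules with args/includes/excludes only apply conditionally,
        # so they do not count as "already allowed".
        unconditional = not (
            rule.get("args") or rule.get("includes") or rule.get("excludes")
        )
        if (
            rule.get("action") == "SCMP_ACT_ALLOW"
            and unconditional
            and name in rule.get("names", [])
        ):
            return patched  # already allowed unconditionally, nothing to add
    rules.append({"names": [name], "action": "SCMP_ACT_ALLOW"})
    return patched
```

To use it, load the profile with json.load(), call allow_syscall(profile, "faccessat2"), write the result back out with json.dump(), and pass that file to docker with --security-opt seccomp=<path to the patched file>.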
I am using the "container:" keyword in GitHub Actions to run on rawhide; is there a workaround for that, too?
Another alternative is to just explicitly downgrade glibc in your rawhide containers. This is viable as a short-term hack, as long as the new glibc doesn't introduce a new symbol that apps pick up a dependency on, which hasn't been a problem in this rawhide cycle so far.

This is how we've temporarily worked around this problem in libvirt, for example:
https://gitlab.com/libvirt/libvirt-appdev-guide-python/-/commit/93837ef20164a46469e495cfe7bd887e59828bdb
(In reply to Christoph Junghans from comment #20)
> I totally agree, we run CI against fedora:rawhide to catch compiler and
> library problems early and currently this isn't possible anymore.

Those are great reasons to use fedora:rawhide. Thank you for using it!

Unfortunately your infrastructure providers have limited your access to kernel functionality and you can no longer run fedora:rawhide.

We will continue to track this situation and raise the issue with affected upstreams. We will track this closely as Fedora Rawhide approaches release as Fedora 34.

(In reply to Laurent Rineau from comment #17)
> I understand that point of view. I will now give the point of view of one of
> those open source projects running Fedora-rawhide images in the CI: in the
> CGAL project (https://www.cgal.org/ and https://github.com/CGAL/cgal), we
> want to identify and fix issues when our software library is compiled with
> the compilers and system libraries of Fedora Rawhide, so that our software
> is always ready to run under Fedora XY as soon as it is released. When the
> `glibc` or the kernel of Rawhide have an issue with `runc`, we can no longer
> test our software with it. That is probably what Florian wanted to point out
> in comment #15.

Please reach out to your infrastructure providers and ask them to update their seccomp filters. This has been done already by systemd for systemd-nspawn to support Fedora and Fedora COPR builders. Upstream for moby looks updated with faccessat2. Upstream updates for runc, docker, and others are still in progress (last I checked) to fix this "once and for all" so the problem doesn't keep happening. Otherwise this will happen again and again until the infrastructure is updated to correctly manage and mediate access to new kernel functionality.
The main place it needs changing is in libseccomp, and the fix is part of the 2.4.4 release [1] onwards. No distribution traditionally used for CI workers ships it. The closest is Ubuntu 20.04 at 2.3.3, but it still means manual poking just to get the fedora:rawhide image working correctly. It's hard to swallow that you really expect every "infrastructure provider" to happily jump in and backport a newer libseccomp to every server used for CI.

[1] https://github.com/seccomp/libseccomp/commit/b3206ad5645dceda89538ea8acc984078ab697ab
*** Bug 1906575 has been marked as a duplicate of this bug. ***
*** Bug 1910208 has been marked as a duplicate of this bug. ***
*** Bug 1914984 has been marked as a duplicate of this bug. ***
This issue is seriously blocking testing of Fedora rawhide and ELN kernels, as containers are heavily utilized in the CKI process.
(In reply to Veronika Kabatova from comment #30)
> This issue is seriously blocking testing of Fedora rawhide and ELN kernels,
> as containers are heavily utilized in the CKI process.

Please talk to your container runtime vendor to fix this. Depending on what you use, bug 1908281 may be what you are after. Unfortunately, there has not been any feedback on that bug.
Fedora as container runtime vendor shipping moby-engine-19.03.13-1.ce.git4484c46.fc33.x86_64 (with libseccomp-2.5.0-3.fc33.x86_64) in Fedora 33 manifests the problem.
RHEL 8 currently ships libseccomp < 2.4.4. So iiuc, won't any container runtime running on RHEL8 that uses the system libseccomp show this behavior?
(In reply to Michael Hofmann from comment #33)
> RHEL 8 currently ships libseccomp < 2.4.4. So iiuc, won't any container
> runtime running on RHEL8 that uses the system libseccomp show this behavior?

I do not know. There is no technical requirement for a container runtime to use libseccomp, or the system version of that library. I filed bug 1908281 after verifying that a libseccomp update fixed the issue for a particular container runtime. Each runtime is probably different and likely needs a different investigation.
(In reply to Jan Pazdziora from comment #32)
> Fedora as container runtime vendor shipping
> moby-engine-19.03.13-1.ce.git4484c46.fc33.x86_64 (with
> libseccomp-2.5.0-3.fc33.x86_64) in Fedora 33 manifests the problem.

Would you please file a bug against moby-engine? Thanks.
Actually, the same version of Docker (docker-ce-20.10.2-3) works on Fedora 33, but not in RHEL 8.2. I mean, with the same container engine version, running Fedora Rawhide shows this issue on RHEL 8.2 but not on Fedora 33.
(In reply to Christoph Junghans from comment #23)
> I am using the "container:" keyword in GitHub Actions to run on rawhide; is
> there a workaround for that, too?

Yes, you can supply arbitrary docker options:
https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idcontaineroptions

container:
  image: docker.io/foo:bar
  options: --security-opt=...

(you can also use --privileged)
(In reply to Martin Pitt from comment #37)
> (In reply to Christoph Junghans from comment #23)
> > I am using the "container:" keyword in GitHub Actions to run on rawhide; is
> > there a workaround for that, too?
>
> Yes, you can supply arbitrary docker options:
> https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idcontaineroptions
>
> container:
>   image: docker.io/foo:bar
>   options: --security-opt=...
>
> (you can also use --privileged)

I figured that out yesterday, too, but thanks for mentioning it here!
This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle. Changing version to 34.
FYI, I've done more investigation into the situation with GitLab CI.

The normal GitLab CI job environment is *fine* with faccessat2(): we can see that it correctly returns ENOSYS and glibc does the fallback.

Only when using docker:dind (docker-in-docker) do we see faccessat2() returning EPERM. This is not GitLab's fault; rather, the problem is in the current "docker:dind" image, which has a version of "runc" that lacks the fix in https://github.com/opencontainers/runc/pull/2750.

It appears that docker:dind is updated reasonably frequently with newer runc, so hopefully this should resolve itself in the not too distant future, at which point I think common uses of GitLab CI will be unaffected by this problem.

FYI, my repo pipeline showing the different scenarios, with only dind failing, is
https://gitlab.com/berrange/scratch/-/pipelines/253835120
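To make the ENOSYS versus EPERM difference concrete, here is a simplified model in Python of the fallback described above. It is a sketch of the behaviour as I understand it, not glibc's actual code; the function names and the two callables are hypothetical stand-ins for the new syscall and the older emulation path.

```python
import errno


def faccessat_with_fallback(call_faccessat2, emulate_with_faccessat):
    """Simplified model of the fallback, not glibc's actual code.

    Both arguments are hypothetical callables standing in for the new
    syscall and for the older emulation path.
    """
    try:
        return call_faccessat2()
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # "No such syscall": fall back to the older code path.
            return emulate_with_faccessat()
        # Any other error, including EPERM from a seccomp filter, is
        # taken as the real answer and reported to the caller.
        raise


def blocked_with(err):
    """Build a fake syscall that always fails with the given errno."""
    def fake():
        raise OSError(err, "faccessat2 blocked")
    return fake
```

With blocked_with(errno.ENOSYS) the emulation result is returned, which matches the normal GitLab CI case; with blocked_with(errno.EPERM) the error propagates, which is why checks such as "test -x" report failure under docker:dind.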
*** Bug 1931616 has been marked as a duplicate of this bug. ***
(In reply to david08741 from comment #21)
> Same here. With Travis and GitHub Actions, a workaround is to not restrict
> the container [1]
>
> sudo docker create --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
> --name mobydick registry.fedoraproject.org/fedora:rawhide \
> /tmp/BOUT-dev/.travis_fedora.sh mpich
>
> [1] https://github.com/boutproject/BOUT-dev/blob/next/.travis_fedora.sh#L24

I have a question about the temporary workaround. Which of the following two sets of options is better to add to `docker run`?

* `--cap-add=SYS_PTRACE --security-opt seccomp=unconfined`
* `--security-opt seccomp=unconfined`

I tested this issue on my repository with a small reproducer. Both sets of options work.

https://bugzilla.redhat.com/show_bug.cgi?id=1931616
https://github.com/junaruga/fedora-test-command-test
FYI, the docker:dind images now have runc 1.0.0-rc93, and thus correctly report ENOSYS rather than EPERM, triggering the normal glibc back-compat code.

I've verified that rawhide now works correctly in GitLab CI when using the latest published docker.io/library/docker:dind available today (image hash 6e82c575b16f).

So from the POV of my initial bug report description, this BZ can be closed wontfix, unless glibc maintainers want to keep it open for other reasons.
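For checking which of the two errors a given environment reports, a small probe along these lines may help. This is a sketch under assumptions: it assumes Linux, that faccessat2 is syscall number 439 (true on the ABIs I am aware of), and that calling libc's syscall() through ctypes is acceptable; the function names are made up for illustration.

```python
import ctypes
import errno
import os

SYS_FACCESSAT2 = 439  # assumption: faccessat2's number on current Linux ABIs
AT_FDCWD = -100       # "relative to the current directory", as in <fcntl.h>


def probe_faccessat2(path: bytes = b"/", mode: int = os.X_OK) -> int:
    """Call faccessat2 directly; return 0 on success, else the errno."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(SYS_FACCESSAT2),
        ctypes.c_long(AT_FDCWD),
        ctypes.c_char_p(path),
        ctypes.c_long(mode),
        ctypes.c_long(0),
    )
    return 0 if ret == 0 else ctypes.get_errno()


def classify(err: int) -> str:
    """Map the probe result to what it means for the glibc fallback."""
    if err == 0:
        return "faccessat2 works"
    if err == errno.ENOSYS:
        return "ENOSYS: not available, glibc falls back"
    if err == errno.EPERM:
        return "EPERM: likely blocked by a seccomp filter, no fallback"
    return "other error: " + errno.errorcode.get(err, str(err))
```

Running print(classify(probe_faccessat2())) inside the container should distinguish the fixed and unfixed runtimes described in this bug.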
Docker on GitHub Actions' ubuntu-20.04 runners still needs --security-opt seccomp=unconfined to run containers based on Fedora rawhide images.
One unfortunate effect of this, at least with moby-engine-19.03.13-1.ce.git4484c46.fc33.x86_64, is that installing systemd at build time no longer creates the /var/log/journal directory.

Consider this Dockerfile:

FROM registry.fedoraproject.org/fedora:rawhide
RUN dnf install -y systemd && dnf clean all
RUN rpm -q --scripts systemd | grep mkdir.*/var/log/journal
RUN test -w /var ; echo $?
RUN ls -la /var/log/journal

With registry.fedoraproject.org/fedora:rawhide the result is

Step 3/5 : RUN rpm -q --scripts systemd | grep mkdir.*/var/log/journal
 ---> Running in ab72a62e123b
[ -w /var ] && mkdir -p /var/log/journal
Removing intermediate container ab72a62e123b
 ---> f877a0a81251
Step 4/5 : RUN test -w /var ; echo $?
 ---> Running in addf68cf91a5
1
Removing intermediate container addf68cf91a5
 ---> b42f1dae86f7
Step 5/5 : RUN ls -la /var/log/journal
 ---> Running in d95e4987869c
ls: cannot access '/var/log/journal': No such file or directory
The command '/bin/sh -c ls -la /var/log/journal' returned a non-zero code: 2

When the FROM line is changed to registry.fedoraproject.org/fedora:33, the result is

Step 3/5 : RUN rpm -q --scripts systemd | grep mkdir.*/var/log/journal
 ---> Running in 37517ff4d5be
mkdir -p /var/log/journal
Removing intermediate container 37517ff4d5be
 ---> 583554d3e461
Step 4/5 : RUN test -w /var ; echo $?
 ---> Running in fc8f0fb2358a
0
Removing intermediate container fc8f0fb2358a
 ---> c9ebf96aa400
Step 5/5 : RUN ls -la /var/log/journal
 ---> Running in 9a3ff5cc9df4
total 12
drwxr-sr-x. 2 root systemd-journal 4096 Mar 20 05:51 .
drwxr-xr-x. 1 root root 4096 Mar 20 05:51 ..
Removing intermediate container 9a3ff5cc9df4
 ---> c2e311de80ea
Successfully built c2e311de80ea

The difference seems to stem from the test -w /var check.

Sadly, it's not possible to specify seccomp=unconfined for build.
I just ran into this issue when testing with fresh Fedora 34 container images on older distros:

# podman run -it --rm fedora:34 bash -c '[ -r / ] && echo readable'
<nothing>

After updating to the latest libseccomp, it seems to be fixed on Fedora 32:

# rpm -q libseccomp
libseccomp-2.5.0-3.fc32.x86_64
# podman run -it --rm fedora:34 bash -c '[ -r / ] && echo readable'
readable

But there does not seem to be a fix for centos-8. The latest package is libseccomp-2.4.3-1.el8.x86_64.

Any suggestions? Or expectations for how long the seccomp=unconfined workaround will need to be used?
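As a rough aid for comparing `rpm -q libseccomp` output against the 2.4.4 release mentioned earlier in this bug, here is a small Python sketch. It is only a heuristic: the function name is made up, it assumes the usual name-version-release.arch package string, and it cannot account for downstream backports or for runtimes that do not use the system libseccomp.

```python
def meets_libseccomp_minimum(nvr: str, minimum=(2, 4, 4)) -> bool:
    """Heuristic: compare the version field of an `rpm -q libseccomp` line.

    Hypothetical helper; assumes a name-version-release.arch string and
    ignores downstream backports.
    """
    # Split from the right so the package name itself may contain dashes.
    _name, version, _release = nvr.rsplit("-", 2)
    parts = tuple(int(p) for p in version.split("."))
    return parts >= minimum
```

For the two package strings in the comment above, this returns True for libseccomp-2.5.0-3.fc32.x86_64 and False for libseccomp-2.4.3-1.el8.x86_64.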
I'd just like to add that this actually breaks all autotools-based builds because 'test -x /' returns 1 inside fedora:34-based containers and autotools use that to check if they are running inside a sane environment. Also, it happens with 'buildah' too, of course.
> I'd just like to add that this actually breaks all autotools-based builds because 'test -x /' returns 1 inside fedora:34-based containers and autotools use that to check if they are running inside a sane environment.

IOW, all autotools-based builds inside fedora:34-based containers running on RHEL 7 are currently broken. This is very unfortunate.
(In reply to Vratislav Podzimek from comment #48)
> > I'd just like to add that this actually breaks all autotools-based builds because 'test -x /' returns 1 inside fedora:34-based containers and autotools use that to check if they are running inside a sane environment.
>
> IOW, all autotools-based builds inside fedora:34-based containers running on
> RHEL 7 are currently broken. This is very unfortunate.

We have determined this is a container runtime bug. I have been told that container runtimes have been fixed in their latest versions. If the container runtime of your choice does not work, you need to report this to the container runtime vendor. Thanks.
It's not only RHEL 7 that is affected. There are many more systems affected. All combinations that match

- HOST does not have the libseccomp/docker/... fixes installed
- CONTAINER uses most recent glibc versions

are affected.

Recent glibc versions used inside containers now have a dependency on the HOST: the fixes for docker/libseccomp/... MUST be deployed. Otherwise such containers are no longer usable.

I guess it will never happen that all container runtimes / hosts out there will be updated.

We already had to drop Fedora support for some of our packages. We are no longer able to build. We are waiting for the HOST distribution to deliver the fixes, which means waiting for the next stable release in our case.
I'm not sure if I understand correctly, so shall we push to get an updated libseccomp to the host distro (e.g. centos-8) or rather to get the Fedora 34 container images fixed?
(In reply to Petr Šplíchal from comment #51)
> I'm not sure if I understand correctly, so shall we push to get an
> updated libseccomp to the host distro (e.g. centos-8) or rather to
> get the Fedora 34 container images fixed?

It really depends on the container run-time. I believe non-distribution run-times use the runc upstream kludge and no longer need libseccomp updates to return ENOSYS for newer system calls. podman probably still needs libseccomp updates. But CentOS Stream 8 should already have them.

In any case, if you find a broken container host, you need to first report it to the container run-time, I think.
> In any case, if you find a broken container host, you need to first report it to the container run-time, I think.

Well, feel free to clone this for RHEL 7, request a rebase of libseccomp or backport the change needed as a downstream patch, get all the ACKs and get it fixed. :) I mean, building things on/for Fedora 34 inside of a container running on RHEL 7 is not such an uncommon use case. And the way I understand it, more and more containers will not be running on RHEL 7. That's quite unfortunate. I'm afraid you guys have much better options for getting the change into RHEL 7 than me as a person using the developer subscription. (But I do understand your position and remember it well. ;))
(In reply to Vratislav Podzimek from comment #53)
> > In any case, if you find a broken container host, you need to first report it to the container run-time, I think.
>
> Well, feel free to clone this for RHEL 7, request a rebase of libseccomp or
> backport the change needed as a downstream patch, get all the ACKs and get
> it fixed. :) I mean, building things on/for Fedora 34 inside of a container
> running on RHEL 7 is not such an uncommon use case. And the way I understand it,
> more and more containers will not be running on RHEL 7. That's quite
> unfortunate. I'm afraid you guys have much better options for getting the
> change into RHEL 7 than me as a person using the developer subscription.
> (But I do understand your position and remember it well. ;))

That Red Hat Enterprise Linux 7 bug is bug 1908281. It was concluded that the change is not necessary.

If you encounter a problem, you need to file a report with the container engine vendor. libseccomp is not a container engine (although some container engines use it).
(In reply to Florian Weimer from comment #54)
> (In reply to Vratislav Podzimek from comment #53)
> > > In any case, if you find a broken container host, you need to first report it to the container run-time, I think.
> >
> > Well, feel free to clone this for RHEL 7, request a rebase of libseccomp or
> > backport the change needed as a downstream patch, get all the ACKs and get
> > it fixed. :) I mean, building things on/for Fedora 34 inside of a container
> > running on RHEL 7 is not such an uncommon use case. And the way I understand it,
> > more and more containers will not be running on RHEL 7. That's quite
> > unfortunate. I'm afraid you guys have much better options for getting the
> > change into RHEL 7 than me as a person using the developer subscription.
> > (But I do understand your position and remember it well. ;))
>
> That Red Hat Enterprise Linux 7 bug is bug 1908281. It was concluded that
> the change is not necessary.
>
> If you encounter a problem, you need to file a report with the container
> engine vendor. libseccomp is not a container engine (although some container
> engines use it).

I can help you to report this if you can identify the container engine in question.

Given that this is not a glibc bug and not a kernel bug, but caused by the container engine handling system calls incorrectly, I'm closing this bug report.
> I can help you to report this if you can identify the container engine in question.

I'm sorry, but I don't know what "the container engine" is. This is a trivial reproducer of the problem (on a RHEL 7 machine):

$ c=$(buildah from fedora:34)
$ buildah run $c /bin/bash
# test -x /
# echo $?
1
# exit

So I'm just using buildah. I guess that means the same would happen with podman. I thought this was happening with anything using runc, including docker, but maybe I'm wrong?

Thanks for your help @fweimer !
(In reply to Florian Weimer from comment #55)
>
> I can help you to report this if you can identify the container engine in
> question.

With the Travis CI docker installation, it is necessary to use

docker run --security-opt=seccomp:unconfined

to fix the failing FreeIPA Fedora 34-based container, going from
https://travis-ci.com/github/adelton/freeipa-container/jobs/502814468
to
https://travis-ci.com/github/adelton/freeipa-container/jobs/503038000

The Build system information shown in the jobs is

docker version
Client:
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.13.8
 Git commit:        afacb8b7f0
 Built:             Tue Jun 23 22:27:11 2020
 OS/Arch:           linux/arm64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       afacb8b7f0
  Built:            Thu Jun 18 08:26:54 2020
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.3.3-0ubuntu2
  GitCommit:
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:
 docker-init:
  Version:          0.18.0
  GitCommit:
(In reply to Vratislav Podzimek from comment #56)
> > I can help you to report this if you can identify the container engine in question.
>
> I'm sorry, but I don't know what "the container engine" is. This is a
> trivial reproducer of the problem (on a RHEL 7 machine):
>
> $ c=$(buildah from fedora:34)
> $ buildah run $c /bin/bash
> # test -x /
> # echo $?
> 1
> # exit
>
> So I'm just using buildah. I guess that means the same would happen with
> podman. I thought this was happening with anything using runc, including
> docker, but maybe I'm wrong?

Thanks for reporting this here. I am not familiar with the internal workings of these tools; I believe there are several variants with differing dependencies.

I'd appreciate it if you could file this as a problem report here (no subscription required):

https://bugzilla.redhat.com/enter_bug.cgi?product=Red%20Hat%20Enterprise%20Linux%207&version=7.9&component=buildah

Then the relevant team can look into it and decide if and how to fix this for Red Hat Enterprise Linux 7. (There is already a bug requesting a libseccomp update, but without reviewing the buildah internals, it's impossible to tell whether a libseccomp update by itself would actually help.)
> I'd appreciate if you could file this as a problem report here (no subscription required):

Done: https://bugzilla.redhat.com/show_bug.cgi?id=1962080