Bug 1795574 - Many base docker images fail with code 132 (SIGILL) and cannot even run bash
Summary: Many base docker images fail with code 132 (SIGILL) and cannot even run bash
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.7
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jindrich Novy
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-01-28 11:01 UTC by Selim Arikan
Modified: 2020-06-09 22:21 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-09 22:21:50 UTC
Target Upstream Version:
Embargoed:


Attachments
ABRT log of the failure (43.28 KB, application/gzip)
2020-01-28 11:01 UTC, Selim Arikan

Description Selim Arikan 2020-01-28 11:01:29 UTC
Created attachment 1655979
ABRT log of the failure

Description of problem:

Hi,

We have a really weird bug that is plaguing the system. One day, a container crashed for an unrelated reason and we could never start it again. When we stopped other containers, they could not be started again either.
--> Containers that were already running kept working, but once stopped they could not be started again.
We see the error message "docker exited with code 132" when we try to run a container or open a bash shell in it.
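
For readers unfamiliar with the number: by shell convention, an exit status above 128 usually means the process was killed by signal (status - 128), so 132 corresponds to signal 4, which is SIGILL on Linux. A minimal sketch of that decoding follows; the helper name `describe_exit_code` is hypothetical and not part of Docker or of this report.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit status into a readable description.

    Assumes the common convention that a status above 128 encodes
    "terminated by signal (status - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            return f"exit status {code} (unknown signal {signum})"
        return f"killed by {name} (signal {signum})"
    return f"exited normally with status {code}"


print(describe_exit_code(132))
```

On a Linux host this reports SIGILL for 132, which matches the "invalid opcode" messages described in this report.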

When we investigated, we found SIGILL (invalid opcode) messages pointing at libc-2.28.so.
--> Before the problem appeared, the containers had not been touched, updated or restarted, no system updates had been made, and the system had not been restarted.

We suspected the libc version might be to blame, so to isolate the problem we tried several base Docker images. Some worked and some did not.
Images that fail with the same SIGILL message:
- debian-buster-slim (glibc 2.28)
- fedora:31 (glibc 2.30)
- Ubuntu:19.10 (glibc 2.30)

Images that we can run with Docker and open a bash shell in:
- CentOS7 (glibc 2.17)
- CentOS8 (glibc 2.28)
- Ubuntu:18.04 (glibc 2.27)
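
The comparison above could be repeated in one pass with a small script. This is only a sketch, not what the reporter ran: it assumes the `docker` CLI is on the PATH, and the image tags in `IMAGES` are my guess at the Docker Hub spellings of the images listed above. The helper names are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable, List

# Assumed Docker Hub tags for the images named in the report.
IMAGES = [
    "debian:buster-slim", "fedora:31", "ubuntu:19.10",
    "centos:7", "centos:8", "ubuntu:18.04",
]


def docker_exit_code(image: str) -> int:
    """Run a no-op bash command in a throwaway container; return the exit status."""
    result = subprocess.run(
        ["docker", "run", "--rm", image, "/bin/bash", "-c", "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def sweep_images(
    images: Iterable[str],
    run: Callable[[str], int] = docker_exit_code,
) -> Dict[str, int]:
    """Map each image to the exit status returned by `run`."""
    return {image: run(image) for image in images}


def failing(results: Dict[str, int]) -> List[str]:
    """Return the images whose run ended with a non-zero status."""
    return [image for image, code in results.items() if code != 0]
```

Calling `failing(sweep_images(IMAGES))` on the affected host would list the images that do not start. The `run` parameter exists only so the logic can be exercised without Docker.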

Later, we tried restarting and reinstalling Docker, restarting the system, and updating the system, but nothing has worked so far.

We also have a second machine with *identical* hardware and software, and it functions normally. The images that fail on the buggy system run without any problem on the identical machine.
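
The report does not say how the two machines were compared. One possible check (an assumption on my part, not something the reporter describes) is to diff the CPU feature flags each kernel reports in /proc/cpuinfo, since SIGILL is raised when a process executes an instruction the CPU does not accept. The sketch below works on the text of two cpuinfo dumps; the function names are hypothetical.

```python
from typing import List, Set, Tuple


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flag_diff(good: str, bad: str) -> Tuple[List[str], List[str]]:
    """Return (flags only on the good host, flags only on the bad host)."""
    good_flags, bad_flags = cpu_flags(good), cpu_flags(bad)
    return sorted(good_flags - bad_flags), sorted(bad_flags - good_flags)
```

With the contents of /proc/cpuinfo saved from each host, `flag_diff(good_text, bad_text)` would show whether the two kernels expose the same instruction set extensions. An empty result on both sides would point away from this explanation.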


Version-Release number of selected component (if applicable):
Docker CE version 19.03.5, build 633a0ea

How reproducible:
Always reproducible on the failing machine with certain containers.
--> Not reproducible on the identical working machine.

Steps to Reproduce:
1. Get official Fedora 31 Docker image https://hub.docker.com/_/fedora (or debian-buster-slim from official repo)
2. Try to run it: "docker run -it <image id> /bin/bash"
3. echo $?
--> 132

Actual results:
The command returns immediately to the host shell.

Expected results:
A bash prompt inside the container should appear.


Additional info:
CPU: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
ABRT log folder is attached.

Comment 2 Selim Arikan 2020-01-28 11:03:36 UTC
The containers that failed first and can no longer be restarted also use debian-buster-slim as their base image.

Comment 3 Tom Sweeney 2020-01-28 15:55:01 UTC
Selim,

I'm not clear on which version of Docker you're using on the machine with the failures. Is it Docker CE version 19.03.5, build 633a0ea? If so, that's not supported by Red Hat AFAIK and you'd have to report the issue to upstream Docker.

Dan, thoughts?

Comment 4 Daniel Walsh 2020-01-28 18:45:46 UTC
This looks like you are using an upstream version of docker?

Comment 5 Selim Arikan 2020-01-29 13:28:58 UTC
Hello,

We are using the official stable Docker repository without any modifications. 
Specifically, the package is this one: https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-19.03.5-3.el7.x86_64.rpm

Other CentOS packages are compatible with their corresponding RHEL versions. As far as I know, this package should work on RHEL as well since it is the stable and unmodified build.

Comment 6 Selim Arikan 2020-02-04 09:10:15 UTC
-- Update -- 

Hi,

We have now also lost the second system (identical to the first in hardware and software configuration) to the same problem.
Both systems are now unusable and show the message "SIGILL, 132" when we try to run the aforementioned containers.

Comment 7 Tom Sweeney 2020-06-09 22:21:50 UTC
We have no plans to ship another version of Docker at this time. RHEL7 is in its final support stages, where only security fixes will be released. Customers should move to Podman, which is available starting in RHEL 7.6.

