Bug 1878023 - The job pod in crash due to " `GLIBC_2.28' not found" [NEEDINFO]
Summary: The job pod in crash due to " `GLIBC_2.28' not found"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Release
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Luke Meyer
QA Contact: Jian Zhang
URL:
Whiteboard:
Duplicates: 1878038 1878150 1880648
Depends On:
Blocks:
Reported: 2020-09-11 05:51 UTC by Jian Zhang
Modified: 2020-10-27 16:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:40:07 UTC
Target Upstream Version:
dsanzmor: needinfo-
jiazha: needinfo? (lmeyer)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:40:23 UTC

Description Jian Zhang 2020-09-11 05:51:24 UTC
Description of problem:
OLM operators cannot be subscribed to. The unpack job fails with the following error:
[root@preserve-olm-env data]# oc logs 6e82d79cb934dfdbcf869491849f15d230e06c5a41622a69a9097657614rrbj
opm: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by opm)

Version-Release number of selected component (if applicable):
Cluster version is 4.6.0-0.nightly-2020-09-10-031249

How reproducible:
always

Steps to Reproduce:
1. Install OCP 4.6
2. Subscribe to an OLM operator, for example, descheduler-operator
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE                             NAME                                PACKAGE                             SOURCE            CHANNEL
openshift-kube-descheduler-operator   cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get sub -n openshift-kube-descheduler-operator
NAME                                PACKAGE                             SOURCE            CHANNEL
cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
NAME            CSV                                                    APPROVAL    APPROVED
install-xhn84   clusterkubedescheduleroperator.4.6.0-202009041839.p0   Automatic   true

3. The descheduler-operator failed to install.

Actual results:
The unpack job pod crashes (CrashLoopBackOff).

[root@preserve-olm-env data]# oc get pods
NAME                                                              READY   STATUS             RESTARTS   AGE
6e82d79cb934dfdbcf869491849f15d230e06c5a41622a69a9097657614rrbj   0/1     CrashLoopBackOff   1          13s
[root@preserve-olm-env data]# oc logs 6e82d79cb934dfdbcf869491849f15d230e06c5a41622a69a9097657614rrbj
opm: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by opm)

Expected results:
The job pod completes successfully, and OLM operators can be installed.

Additional info:

Comment 1 Evan Cordell 2020-09-11 12:16:47 UTC
operator-registry is built in ubi8, but the final image is using ubi7, so it's missing important libs.

This should be rectified once the ubi8 switch is completed, but we're working with ART to make the problem go away sooner for operator-registry.
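
For context, RHEL 7 / ubi7 ships glibc 2.17 and RHEL 8 / ubi8 ships glibc 2.28, which is why a binary linked on ubi8 can fail on a ubi7 base. One way to confirm such a mismatch is to compare the highest glibc symbol version a binary requires with the version its runtime image provides. The following is a minimal sketch, not something taken from this bug: `max_glibc_version` is a hypothetical helper name, it relies on GNU `sort -V`, and the usage lines assume binutils' `objdump` is installed and that `./opm` is a local copy of the binary.

```shell
# Hypothetical helper: read text on stdin (for example `objdump -T` output)
# and print the highest GLIBC_x.y symbol version it mentions.
max_glibc_version() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -V | tail -n 1
}

# Usage (assumptions: binutils is installed, ./opm is the binary to inspect):
#   objdump -T ./opm | max_glibc_version   # highest version the binary needs
#   ldd --version | head -n 1              # glibc version the image provides
```

If the first number is higher than the second, the dynamic loader refuses to start the binary with a `version ... not found` error like the one in the description.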

Comment 2 Evan Cordell 2020-09-11 12:49:26 UTC
*** Bug 1878038 has been marked as a duplicate of this bug. ***

Comment 3 Kevin Rizza 2020-09-11 15:04:55 UTC
*** Bug 1878150 has been marked as a duplicate of this bug. ***

Comment 4 Christian Hernandez 2020-09-11 19:49:46 UTC
(Working on the Windows Container Operator.)

I also ran into this bug.

```
$ oc get pods -n windows-machine-config-operator 
No resources found in windows-machine-config-operator namespace.
[ec2-user@bastion ~]$ oc get jobs -A
NAMESPACE               NAME                                                              COMPLETIONS   DURATION   AGE
openshift-marketplace   fb33deb65897ef4d8ca1240957d82e451f00dc5e73c567858a22ca6be81b78c   0/1           40s        40s

$ oc get pods -n openshift-marketplace 
NAME                                                              READY   STATUS    RESTARTS   AGE
certified-operators-g75bg                                         1/1     Running   0          14m
certified-operators-ngvg9                                         1/1     Running   0          142m
community-operators-5b2g7                                         1/1     Running   0          142m
community-operators-fc942                                         1/1     Running   0          14m
fb33deb65897ef4d8ca1240957d82e451f00dc5e73c567858a22ca6be8slqww   0/1     Error     2          46s
marketplace-operator-7dbcf67c74-n9gj9                             1/1     Running   0          142m
redhat-marketplace-mzrwh                                          1/1     Running   0          142m
redhat-marketplace-ndr6c                                          1/1     Running   0          14m
redhat-operators-7cnx9                                            1/1     Running   0          142m
redhat-operators-dc7lf                                            1/1     Running   0          14m
wmco-kjhw4                                                        1/1     Running   0          71s

$ oc logs -n openshift-marketplace fb33deb65897ef4d8ca1240957d82e451f00dc5e73c567858a22ca6be8slqww
opm: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by opm)
```

Comment 5 David Sanz 2020-09-14 08:25:00 UTC
This is also affecting installation tests

09-14 10:10:29  + oc adm release extract --insecure -a ./.secrets --command oc --from registry.svc.ci.openshift.org/ocp/release:4.6
09-14 10:10:33  [Pipeline] sh
09-14 10:10:34  + ./oc adm release extract --insecure -a ./.secrets --command openshift-install --from registry.svc.ci.openshift.org/ocp/release:4.6
09-14 10:10:34  ./oc: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by ./oc)

Comment 7 Vijay Avuthu 2020-09-14 09:30:18 UTC
We too hit this issue and it blocks the deployments as we use the latest nightly builds in our automation.

version: openshift-client-linux-4.6.0-0.nightly-2020-09-12-230035

 ./oc version
./oc: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by ./oc)

Comment 8 Johnny Liu 2020-09-14 09:47:48 UTC
Let us move this to the oc component.

Comment 9 Maciej Szulik 2020-09-14 11:44:47 UTC
All OpenShift-related repositories are being migrated to ubi8, which means you need RHEL 8 as well.
I'm not sure what the plan is for publishing RHEL 7 compatible binaries, given the glibc version mismatch.
I'm sending this over to ART team, since they own that part of the build process.
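
To check ahead of time whether a host can run a binary that needs glibc 2.28, the host's glibc version can be compared with the required one. This is a sketch only: `version_ge` is a hypothetical helper, it relies on GNU `sort -V`, and `2.28` is taken from the error message in this bug.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a glibc-based host; `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.17".
#   host="$(getconf GNU_LIBC_VERSION | awk '{print $2}')"
#   version_ge "$host" 2.28 || echo "host glibc $host is older than 2.28"
```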

Comment 10 Luke Meyer 2020-09-14 13:48:55 UTC
operator-registry was built in rhel8 builder and based on a rhel7 base by accident. That was the last "successful" build, which is failing at runtime as you see.

The rhel 8 migration is more or less complete (chasing down a few bugs) so you shouldn't see this in other images. For operator-registry there was a complication in moving to golang 1.15 and a different one in moving back to golang 1.14, so this has been kind of blocked. However I think I can get a new build to unblock people while we get that sorted out.

Comment 11 Luke Meyer 2020-09-14 13:55:04 UTC
openshift-clients and openshift-hyperkube RPMs will continue to build in parallel on rhel7 and rhel8, and I assume cri-o etc as well (ART will keep the buildroots at parity to enable this). I guess the client/installer builds in the images will be RHEL 8 only? I'm not sure how that will impact customers.

Comment 12 Francesco Romani 2020-09-14 14:07:36 UTC
performance-addon-operator is affected too. Additionally, does anyone have a link to the reason why we don't build fully static binaries? I've been looking for the rationale with little success so far. Thanks!

Comment 13 Yaakov Selkowitz 2020-09-14 14:34:31 UTC
Looks like it's going to have to be 1.14 for now due to https://github.com/golang/go/issues/40974.  Fixes to get 1.14 building again, and the darwin cross build to complete, are currently in progress.

Comment 14 Luke Meyer 2020-09-14 14:52:04 UTC
operator-registry-container-v4.6.0-202009141408.p0 should solve it temporarily (while we work out further builds). Please test once we get a new nightly.

Not sure what we're going to do about downloads of binary oc from mirror and console - as compiled, they will only work on rhel 8.

Comment 15 Evan Cordell 2020-09-14 16:35:13 UTC
> Not sure what we're going to do about downloads of binary oc from mirror and console - as compiled, they will only work on rhel 8.

Is static linking an option for oc? It would solve this problem, though I recall there being concern with statically linking go (has its own TLS implementation when statically linking, so reasoning about openssl CVEs goes away).
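
For reference, a pure-Go build with cgo disabled normally yields a statically linked binary that does not depend on the host glibc. The sketch below only classifies the output of the `file` utility. `linkage_of` is a hypothetical helper, and the build command in the usage comment (including the `./cmd/oc` path) is an assumption, not the way ART builds `oc`. Whether `oc` builds and behaves correctly without cgo is not established in this bug.

```shell
# Hypothetical helper: read `file` output on stdin and print the linkage type.
linkage_of() {
  local desc
  desc="$(cat)"
  case "$desc" in
    *"dynamically linked"*) echo dynamic ;;
    *"statically linked"*)  echo static ;;
    *)                      echo unknown ;;
  esac
}

# Usage (the build path is an assumption):
#   CGO_ENABLED=0 go build -o oc-static ./cmd/oc
#   file oc-static | linkage_of
```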

Comment 16 Johnny Liu 2020-09-15 04:48:44 UTC
> Not sure what we're going to do about downloads of binary oc from mirror and console - as compiled, they will only work on rhel 8.

From now on, is oc limited to working only on RHEL 8? Generally oc should be cross-platform and should run on RHEL 7, RHEL 8, and macOS, right?

Comment 17 Yaakov Selkowitz 2020-09-15 20:05:42 UTC
We appear to have the same issue with two different binaries, can we please not conflate them?

comment 0 shows that opm was missing symbols, which should be fixed in the latest builds per comment 14.  Unless opm is also meant to be extracted from and run outside of the operator-registry container?

oc does have the same problem, and yes it needs to be run anywhere.  This is being worked through bug 1879081.

Comment 23 Song Lai 2020-09-17 00:59:53 UTC
I ran into this bug too. Which build has the fix?

Comment 24 Sabina Aledort 2020-09-17 06:53:20 UTC
We appear to have the same issue with performance addon operator in 4.6.0-0.nightly-2020-09-12-230035

[root@cnfd1-installer ~]# oc get pod -A | grep mark
openshift-marketplace                              ab5a8123c2ca40a0b692ac7bb782b161c3e6e78f8174e37b96951adae1pnmtm           0/1     PodInitializing   0          5s
openshift-marketplace                              marketplace-operator-7c4b4b9db6-rblbj                                     1/1     Running           0          90m
openshift-marketplace                              performance-addon-operator-g8j4h                                          1/1     Running           0          27s
[root@cnfd1-installer ~]# oc get pod -A | grep mark
openshift-marketplace                              ab5a8123c2ca40a0b692ac7bb782b161c3e6e78f8174e37b96951adae1pnmtm           0/1     Error       1          7s
openshift-marketplace                              marketplace-operator-7c4b4b9db6-rblbj                                     1/1     Running     0          90m
openshift-marketplace                              performance-addon-operator-g8j4h                                          1/1     Running     0          29s

[root@cnfd1-installer ~]# oc logs -n openshift-marketplace ab5a8123c2ca40a0b692ac7bb782b161c3e6e78f8174e37b96951adae1pnmtm
opm: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by opm)

Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       79s                default-scheduler  Successfully assigned openshift-marketplace/ab5a8123c2ca40a0b692ac7bb782b161c3e6e78f8174e37b96951adae1pnmtm to cnfd1-worker-0.fci1.kni.lab.eng.bos.redhat.com
  Normal   AddedInterface  77s                multus             Add eth0 [10.132.2.24/23]
  Normal   Pulled          77s                kubelet            Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:de79112f0715264c036ab9877b0d6a34b37f2a1f676705c280d5e3737089742b" already present on machine
  Normal   Created         77s                kubelet            Created container util
  Normal   Started         77s                kubelet            Started container util
  Normal   Pulling         76s                kubelet            Pulling image "registry-proxy.engineering.redhat.com/rh-osbs/openshift4-performance-addon-operator-bundle-registry-container-rhel8@sha256:0ea08d8073eeb144536d3f5949fac6bb2d7a14fbc3c198e5021b69edf7f279ed"
  Normal   Pulled          75s                kubelet            Successfully pulled image "registry-proxy.engineering.redhat.com/rh-osbs/openshift4-performance-addon-operator-bundle-registry-container-rhel8@sha256:0ea08d8073eeb144536d3f5949fac6bb2d7a14fbc3c198e5021b69edf7f279ed" in 1.018237774s
  Normal   Created         75s                kubelet            Created container pull
  Normal   Started         75s                kubelet            Started container pull
  Normal   Pulled          34s (x4 over 74s)  kubelet            Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:232e321e42279d5660b64bb7010b239948f7f59e6b3309c86102392724b8f3d6" already present on machine
  Normal   Created         34s (x4 over 74s)  kubelet            Created container extract
  Normal   Started         34s (x4 over 74s)  kubelet            Started container extract
  Warning  BackOff         19s (x6 over 72s)  kubelet            Back-off restarting failed container
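
The failing unpack pod can be picked out of a namespaced `oc get pods` listing by its STATUS column. The following is a minimal sketch, assuming the default column layout shown in this bug (NAME, READY, STATUS, RESTARTS, AGE); `failing_pods` is a hypothetical helper name. It also reports pods that are still starting (for example PodInitializing), so the result is a list of candidates rather than confirmed failures.

```shell
# Hypothetical helper: read `oc get pods -n <namespace>` output on stdin and
# print the names of pods whose STATUS is neither Running nor Completed.
failing_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (does not apply to `oc get pods -A`, which adds a NAMESPACE column):
#   oc get pods -n openshift-marketplace | failing_pods |
#     while read -r pod; do oc logs -n openshift-marketplace "$pod"; done
```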

Comment 25 Matt Mahoney 2020-09-17 18:18:01 UTC
Note that with the latest CVP Kiali IIB build this issue no longer exists.

Comment 27 Song Lai 2020-09-21 07:49:53 UTC
It's been working for me since 4.6.0-0.nightly-2020-09-18-071428.

Comment 28 Tomas Smetana 2020-09-22 07:29:32 UTC
*** Bug 1880648 has been marked as a duplicate of this bug. ***

Comment 31 Jian Zhang 2020-09-23 01:11:31 UTC
Hi Vikas,

I know it has been fixed in the latest payload (see below). But, as I mentioned in comment 26, it's not easy for users to find which build contains the fix PR. It would be better if you could paste the fix PR link here, or point out the exact build that has it, rather than saying `a build from last week`. Also, per the workflow, each bug should have the fix PR linked.
Anyway, I'll verify it first this time; please link the fix PR later. Thanks for your understanding!

[root@preserve-olm-env data]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-21-230455   True        False         22h     Cluster version is 4.6.0-0.nightly-2020-09-21-230455

[root@preserve-olm-env client]# oc get sub -A
NAMESPACE           NAME              PACKAGE           SOURCE                CHANNEL
default             etcd              etcd              community-operators   singlenamespace-alpha
openshift-logging   cluster-logging   cluster-logging   qe-app-registry       4.6
[root@preserve-olm-env client]# oc get csv -n default
NAME                                   DISPLAY           VERSION                 REPLACES              PHASE
clusterlogging.4.6.0-202009192030.p0   Cluster Logging   4.6.0-202009192030.p0                         Succeeded
etcdoperator.v0.9.4                    etcd              0.9.4                   etcdoperator.v0.9.2   Succeeded

[root@preserve-olm-env client]# oc project
Using project "openshift-marketplace" on server "https://api.jiazha22.qe.devcluster.openshift.com:6443".
[root@preserve-olm-env client]# oc get job
NAME                                                              COMPLETIONS   DURATION   AGE
1c6fae33b9bbf0971af0c7e8278465a1f35be67b478e79c299e01436876089c   1/1           7s         3h8m
cc8c72b382bd1ac5ca006964332b9f198745bc52b7851422b254d4dcc55d8c0   1/1           93s        3h21m
[root@preserve-olm-env client]# oc get pods
NAME                                                              READY   STATUS              RESTARTS   AGE
1c6fae33b9bbf0971af0c7e8278465a1f35be67b478e79c299e0143687dp5nw   0/1     Completed           0          3h8m
cc8c72b382bd1ac5ca006964332b9f198745bc52b7851422b254d4dcc569p2l   0/1     Completed           0          3h21m

Comment 34 errata-xmlrpc 2020-10-27 16:40:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

