Bug 1926558 - [release-4.4] OCP installs fail on ppc64le/s390x in the bootstrap phase
Summary: [release-4.4] OCP installs fail on ppc64le/s390x in the bootstrap phase
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Multi-Arch
Version: 4.4
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Prashanth Sundararaman
QA Contact: Barry Donahue
URL:
Whiteboard:
Depends On: 1926564
Blocks:
Reported: 2021-02-09 03:14 UTC by Prashanth Sundararaman
Modified: 2021-03-09 13:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1926564 (view as bug list)
Environment:
Last Closed: 2021-03-09 13:54:17 UTC
Target Upstream Version:
Embargoed:


Attachments
package diff for s390x (47.82 KB, text/plain)
2021-02-09 03:34 UTC, Prashanth Sundararaman
no flags
package diff for ppc64le (54.06 KB, text/plain)
2021-02-09 03:34 UTC, Prashanth Sundararaman
no flags


Links
System: Github
ID: openshift installer pull 4628
Private: 0
Priority: None
Status: closed
Summary: Bug 1926558: [release-4.4]: rhcos.json: Update bootimages for ppc64le/s390x to fix kubelet error
Last Updated: 2021-02-16 15:19:00 UTC

Description Prashanth Sundararaman 2021-02-09 03:14:21 UTC
Description of problem:
CI recently showed that 4.4 builds are failing in the bootstrap phase.
A manual install attempt suggests the cause is the kubelet service failing to start, with this error log:

Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: I0208 22:02:17.946044    2888 server.go:540] No cloud provider specified: "" from the config file: ""
Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: W0208 22:02:17.946082    2888 server.go:563] standalone mode, no API client
Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: F0208 22:02:17.946123    2888 server.go:273] failed to run Kubelet: No authentication method configured
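The fatal entry is easy to miss among the info and warning lines. A minimal Python sketch for pulling only the fatal entries out of a journal dump, assuming the klog header layout seen in the lines above (severity letter, MMDD, time, thread ID, file:line); the function name `fatal_messages` is hypothetical:

```python
import re

# Assumed klog header: <severity><MMDD> <HH:MM:SS.uuuuuu> <thread id> <file:line>] <message>
KLOG_RE = re.compile(
    r"(?P<severity>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<source>[^\s\]]+:\d+)\] (?P<message>.*)$"
)


def fatal_messages(journal_lines):
    """Return (source, message) pairs for klog entries with severity F."""
    found = []
    for line in journal_lines:
        match = KLOG_RE.search(line)
        if match and match.group("severity") == "F":
            found.append((match.group("source"), match.group("message")))
    return found


# The three journal lines quoted in this report.
sample = [
    'Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: I0208 22:02:17.946044    2888 server.go:540] No cloud provider specified: "" from the config file: ""',
    'Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: W0208 22:02:17.946082    2888 server.go:563] standalone mode, no API client',
    'Feb 08 22:02:17 psundara-ocp-ptj44-bootstrap hyperkube[2888]: F0208 22:02:17.946123    2888 server.go:273] failed to run Kubelet: No authentication method configured',
]

for source, message in fatal_messages(sample):
    print(f"{source}: {message}")
```

Only the `server.go:273` entry is reported for the sample above, which matches the kubelet failure described here.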


A recent change in the installer code seems to have indirectly caused this issue: https://github.com/openshift/installer/commit/2e69b437cf204b0f01e0ba1d99097200b75ccb44

Looks like the hyperkube package for ppc64le/s390x was old because the bootimage was old.

This change works fine for x86 because the x86 RHCOS bootimages were bumped for a CVE: https://github.com/openshift/installer/pull/3985. The ppc64le/s390x images were not bumped at that time.


The bootimages need to be bumped for ppc64le/s390x to fix this issue.
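
How stale a bootimage is can be estimated from its build ID. A minimal Python sketch, assuming (this layout is not stated in this report) that an RHCOS build ID such as `44.81.202004252140-0` has the form `<OCP stream>.<RHEL version>.<YYYYMMDDHHMM>-<respin>`; `BuildID` and `parse_build_id` are hypothetical names:

```python
from datetime import datetime
from typing import NamedTuple


class BuildID(NamedTuple):
    stream: str      # e.g. "44", assumed to mean OCP 4.4
    rhel: str        # e.g. "81", assumed to mean RHEL 8.1
    built: datetime  # build timestamp embedded in the ID
    respin: int      # trailing counter after the dash


def parse_build_id(build_id):
    """Split a build ID of the assumed form into its components."""
    stream, rhel, rest = build_id.split(".")
    stamp, respin = rest.split("-")
    return BuildID(stream, rhel, datetime.strptime(stamp, "%Y%m%d%H%M"), int(respin))


# s390x build IDs that appear in the package diff of this report.
old = parse_build_id("44.81.202004252140-0")
new = parse_build_id("44.82.202011122341-0")

age_gap = (new.built - old.built).days
print(f"old bootimage built {old.built:%Y-%m-%d}, candidate built {new.built:%Y-%m-%d}")
print(f"gap between builds: {age_gap} days")
```

A check like this could be run per architecture to spot bootimages that lag far behind the others, under the same assumption about the ID layout.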

Comment 1 Prashanth Sundararaman 2021-02-09 03:16:45 UTC
Going to update the bootimages to:

https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.4-s390x/44.82.202011122341-0/s390x/meta.json for s390x
https://releases-art-rhcos.svc.ci.openshift.org/art/storage/releases/rhcos-4.4-ppc64le/44.82.202012010439-0/ppc64le/meta.json for ppc64le

These are the closest builds to x86 I could find, and this is the hyperkube package diff:

"openshift-hyperkube": {
    "rhcos-4.4-s390x/44.81.202004252140-0": "openshift-hyperkube-4.4.0-202004241258.git.1.bfb96d1.el8.s390x",
    "rhcos-4.4-s390x/44.82.202011122341-0": "openshift-hyperkube-4.4.0-202011122017.p0.git.0.149ca32.el8.s390x"
},

"openshift-hyperkube": {
    "rhcos-4.4-ppc64le/44.81.202004252240-0": "openshift-hyperkube-4.4.0-202004241258.git.1.bfb96d1.el8.ppc64le",
    "rhcos-4.4-ppc64le/44.82.202012010439-0": "openshift-hyperkube-4.4.0-202011130111.p0.git.0.4861dfa.el8.ppc64le"
},
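
A diff in this shape can be produced from per-build package lists. A minimal Python sketch, assuming each build's packages are already available as a `{package name: full package string}` mapping (how they are extracted from the build metadata is not shown here); `package_diff` is a hypothetical helper and `example-unchanged-pkg` is a made-up entry for illustration only:

```python
def package_diff(builds):
    """Map each package whose version differs across builds to {build: version}.

    builds: {build name: {package name: full package string}}
    """
    names = set()
    for packages in builds.values():
        names.update(packages)

    diff = {}
    for name in sorted(names):
        # A package missing from a build shows up as None for that build.
        versions = {build: packages.get(name) for build, packages in builds.items()}
        if len(set(versions.values())) > 1:
            diff[name] = versions
    return diff


builds = {
    "rhcos-4.4-s390x/44.81.202004252140-0": {
        "openshift-hyperkube": "openshift-hyperkube-4.4.0-202004241258.git.1.bfb96d1.el8.s390x",
        "example-unchanged-pkg": "example-unchanged-pkg-1.0-1.el8.s390x",  # hypothetical
    },
    "rhcos-4.4-s390x/44.82.202011122341-0": {
        "openshift-hyperkube": "openshift-hyperkube-4.4.0-202011122017.p0.git.0.149ca32.el8.s390x",
        "example-unchanged-pkg": "example-unchanged-pkg-1.0-1.el8.s390x",  # hypothetical
    },
}

for name, versions in package_diff(builds).items():
    print(name)
    for build, version in versions.items():
        print(f"    {build}: {version}")
```

Packages that are identical in every build are left out, so only entries such as `openshift-hyperkube` remain.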

Comment 2 Prashanth Sundararaman 2021-02-09 03:34:26 UTC
Created attachment 1755828 [details]
package diff for s390x

Comment 3 Prashanth Sundararaman 2021-02-09 03:34:56 UTC
Created attachment 1755829 [details]
package diff for ppc64le

Comment 7 Prashanth Sundararaman 2021-02-12 14:10:05 UTC
CI jobs for 4.4 are passing again.

