Bug 1984086
Summary: | [4.8] Installation with multipath parameters in parmfile fails (DNS resolution missing) | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jonathan Lebon <jlebon> |
Component: | RHCOS | Assignee: | Jonathan Lebon <jlebon> |
Status: | CLOSED ERRATA | QA Contact: | Douglas Slavens <dslavens> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.8 | CC: | alklein, bdonahue, bgilbert, danili, dornelas, dslavens, hhei, jlebon, jligon, jschinta, madeel, miabbott, mrussell, ndubrovs, nstielau, pod, psundara, sniemann, sorth, wolfgang.voesch |
Target Milestone: | --- | ||
Target Release: | 4.8.z | ||
Hardware: | s390x | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1974411 | Environment: | |
Last Closed: | 2022-06-30 16:35:30 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1974411, 2006965 | ||
Bug Blocks: |
Description
Jonathan Lebon
2021-07-20 15:45:31 UTC
Setting reviewed-in-sprint as we are waiting for OpenShift to pick up the RHCOS PR.

Hi Jonathan, will the bug fix take additional time to land? If so, I'd like to set the "reviewed-in-sprint" flag to "+" to indicate that we have examined this bug during the sprint.

Adding "reviewed-in-sprint" as this bug is a clone of 1974411 and the other bug is currently at POST.

Hi Jonathan, I noticed that BZ 1974411 is currently ON_QA. Does that affect the status of this bug? If not, I would like to set the "reviewed-in-sprint" flag if this bug will not be resolved before the end of the current sprint (September 4th).

This bug will move to MODIFIED as soon as https://github.com/openshift/installer/pull/5171 lands, which may happen this week.

Hi Jonathan, do you think this bug will continue to be open in the upcoming sprint? If so, I'd like to add the "reviewed-in-sprint" flag. Also, should this bug be re-assigned to the RHCOS component since Jonathan is from the RHCOS team?

This is still just waiting on a bootimage bump.

(In reply to Dan Li from comment #6)
> Also, should
> this bug be re-assigned to the RHCOS component since Jonathan is from the
> RHCOS team?

I'm OK with this if you'd like. The original issue was specifically hit on s390x, and while the fixes aren't architecture specific, it'd be good to have the bug verified on s390x once this goes to ON_QA.

Re-assigning to the RHCOS component. However, we are keeping Doug as the default QA contact; once this bug reaches ON_QA, we can assign it to someone from s390x QA to validate.

The fix for this bug has landed in a bootimage bump, as tracked in bug 1982001 (now in status MODIFIED). Moving this bug to MODIFIED.

It turns out that the fix did not properly land in the bootimage bump tracked by bug 1982001, so this bug has been moved to the bootimage bump tracked by bug 2006965. To prevent a recurrence of this problem, the CoreOS team has changed its process so bootimage-dependent bugs are verified _before_ the bootimage bump is PRed.
As a result, this bug is in need of verification. Please check that the problem is fixed in a current RHCOS 4.8 build on s390x, and then set the Bugzilla `Verified` field to `Tested`. Setting status to ON_QA for verification.

The fix for this bug will not be delivered to customers until it lands in an updated bootimage. That process is tracked in bug 2006965, which has status ASSIGNED. Moving this bug back to POST. Note that the bootimage bump process now requires pre-landing verification of dependent bugs while they're still in POST.

Yes, this will be the version that gets tested. I provided the link to Muhammad, who will conduct the testing.

I still couldn't find which RHCOS/OpenShift version has the fix for this BZ at the provided link: https://releases-rhcos-art.cloud.privileged.psi.redhat.com/?stream=releases/rhcos-4.8-s390x
It is also not clear whether the fix for this BZ has already landed in an RHCOS image.

@miabbott I can't read private comments if you are writing any...

@madeel The gist of comment 20 is that any recent RHCOS 4.8 build should have the fix, so please try to verify with a current build from the ART build browser.

Verified: the missing DNS resolution is not reproducible using 4.8.17 / 48.84.202110152102-0.

However, the node is not able to boot with multipath enabled in the kernel arguments (this was not seen on 4.9):

rd.multipath=default coreos.inst.install_dev=/dev/mapper/mpatha

The coreos.inst.install_dev karg value did not carry to coreos-installer:

coreos-installer-service: coreos-installer install /dev/sda --ignition-url http://bastion.ocp-m1314001.lnxne.boe:8080/ignition/master.ign --insecure-ignition --append-karg zfcp.allow_lun_scan=0 --append-karg cio_ignore=all,!condev --append-karg rd.znet=qeth,0.0.1003,0.0.1004,0.0.1005,layer2=1 --append-karg rd.zfcp=0.0.8002,0x500507630a0350a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8002,0x500507630a1b50a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8102,0x500507630a0b50a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8102,0x500507630a1350a4,0x400140ED00000000 --firstboot-args rd.neednet=1 ip=10.13.114.3::10.13.114.1:255.255.255.0::enc1003:none nameserver=10.13.114.1
coreos-installer-service: Installing Red Hat Enterprise Linux CoreOS 48.84.202110152102-0 (Ootpa) s390x (512-byte sectors)
sda: sda3 sda4
[   16.005498] sda: sda3 sda4
[   17.002226] coreos-installer-service: Read disk 123.7 MiB/3.4 GiB (3%)
...output omitted...
[   47.957533] coreos-installer-service: Read disk 3.4 GiB/3.4 GiB (100%)
[   49.318579] GPT:Primary header thinks Alt. header is not at the end of the disk.
[   49.318586] GPT:7219199 != 251658239
[   49.318588] GPT:Alternate GPT header not at the end of the disk.
[   49.318589] GPT:7219199 != 251658239
[   49.318590] GPT: Use GNU Parted to correct GPT errors.
[   49.318599] sda: sda3 sda4
[   49.544607] coreos-installer-service: Error: found multiple devices on /dev/sda with label "boot"
[   49.544768] coreos-installer-service: Resetting partition table
[   49.565539] sda:
[   49.790196] coreos-installer-service: Error: install failed
[FAILED] Failed to start CoreOS Installer.
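
For anyone re-checking a claim like the one above, the comparison between the `coreos.inst.install_dev` karg and the device actually passed to `coreos-installer install` can be done mechanically from a captured console log. The following is a minimal Python sketch, not part of coreos-installer: the function names `parse_kargs`, `installer_target`, and `target_matches_karg` are hypothetical, the parser assumes kernel arguments contain no quoted values with spaces, and it assumes the target device is the token immediately after `coreos-installer install`, as in the log line above.

```python
from typing import Dict, List, Optional


def parse_kargs(cmdline: str) -> Dict[str, List[str]]:
    """Split a kernel command line into key -> list of values.

    Repeated keys (for example several rd.zfcp= entries) keep every value.
    Bare flags without '=' map to an empty string.
    Assumption: no quoted values containing spaces.
    """
    kargs: Dict[str, List[str]] = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        kargs.setdefault(key, []).append(value)
    return kargs


def installer_target(log_line: str) -> Optional[str]:
    """Return the token that follows 'coreos-installer install', if any."""
    tokens = log_line.split()
    for i in range(len(tokens) - 2):
        if tokens[i] == "coreos-installer" and tokens[i + 1] == "install":
            candidate = tokens[i + 2]
            # An option here means no positional device was logged first.
            return None if candidate.startswith("-") else candidate
    return None


def target_matches_karg(cmdline: str, log_line: str) -> Optional[bool]:
    """Compare coreos.inst.install_dev with the logged installer target.

    Returns None when either side cannot be determined.
    """
    values = parse_kargs(cmdline).get("coreos.inst.install_dev")
    target = installer_target(log_line)
    if not values or target is None:
        return None
    # If the karg is given more than once, this sketch uses the last value.
    return values[-1] == target


if __name__ == "__main__":
    kargs = "rd.multipath=default coreos.inst.install_dev=/dev/mapper/mpatha"
    logged = (
        "coreos-installer-service: coreos-installer install /dev/sda "
        "--insecure-ignition --append-karg zfcp.allow_lun_scan=0"
    )
    print(target_matches_karg(kargs, logged))  # prints False
```

A `False` result only says that the logged target differs from the karg in the string that was checked; it does not say which parmfile was actually booted, so the kargs should be taken from the same boot as the log.
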

Sorry, my mistake: the statement above that the coreos.inst.install_dev karg value did not carry to coreos-installer is wrong. However, the following error can be seen with 4.8:

coreos-installer-service: coreos-installer install /dev/mapper/mpatha --ignition-url http://bastion.ocp-m1314001.lnxne.boe:8080/ignition/master.ign --insecure-ignition --append-karg zfcp.allow_lun_scan=0 --append-karg cio_ignore=all,!condev --append-karg rd.znet=qeth,0.0.1003,0.0.1004,0.0.1005,layer2=1 --append-karg rd.zfcp=0.0.8002,0x500507630a0350a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8002,0x500507630a1b50a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8102,0x500507630a0b50a4,0x400140ED00000000 --append-karg rd.zfcp=0.0.8102,0x500507630a1350a4,0x400140ED00000000 --firstboot-args rd.neednet=1 ip=10.13.114.3::10.13.114.1:255.255.255.0::enc1003:none nameserver=10.13.114.1
coreos-installer-service: Installing Red Hat Enterprise Linux CoreOS 48.84.202110152102-0 (Ootpa) s390x (512-byte sectors)
sd 0:0:0:1089290241: alua: port group 00 state A preferred supports tolusnA
[   18.583546] sd 0:0:0:1089290241: alua: port group 00 state A preferred supports tolusnA
[   18.626007] coreos-installer-service: device-mapper: resume ioctl on mpatha3 failed: Invalid argument
[   18.626131] coreos-installer-service: resume failed on mpatha3
[   18.964908] coreos-installer-service: Error: getting partition table for /dev/mapper/mpatha
[   18.965080] coreos-installer-service: Caused by:
[   18.965125] coreos-installer-service: "kpartx" "-u" "-n" "/dev/dm-0" failed with exit code: 1
[FAILED] Failed to start CoreOS Installer.
See 'systemctl status coreos-installer.service' for details.
[DEPEND] Dependency failed for CoreOS Installer Target.

Marking pre-verified per comment 23. The issue in comment 24 appears to be a separate problem.

The fix for this bug has landed in a bootimage bump, as tracked in bug 2006965 (now in status MODIFIED). Moving this bug to MODIFIED.

@madeel would you help to check whether the bug is fixed with the latest build in https://releases-rhcos-art.cloud.privileged.psi.redhat.com/?stream=releases/rhcos-4.8-s390x (and whether the issue in comment #24 can be reproduced)? Thanks!

@hhei I verified the defect, and with release 4.8.45 the installation finished successfully (the issue reported in comment 24 no longer shows up).

Thanks @Amadeus for your verification. Changing status to VERIFIED based on comment #31.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.45 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5167