Bug 1974411 - Installation with multipath parameters in parmfile fails (DNS resolution missing)
Summary: Installation with multipath parameters in parmfile fails (DNS resolution missing)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Multi-Arch
Version: 4.8
Hardware: s390x
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: madeel
QA Contact: Douglas Slavens
URL:
Whiteboard:
Depends On: 1981999
Blocks: 1984086
 
Reported: 2021-06-21 15:37 UTC by Stefan Orth
Modified: 2021-10-18 17:36 UTC
CC List: 12 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Cloned To: 1984086
Environment:
Last Closed: 2021-10-18 17:35:54 UTC
Target Upstream Version:
Embargoed:
Flags: madeel: needinfo-


Attachments: error snapshot (attachment 1792825, see comment 1)


Links
System ID Private Priority Status Summary Last Updated
Github coreos coreos-installer pull 565 0 None open Add `coreos-installer install --http-retries` 2021-06-24 19:06:13 UTC
Github coreos fedora-coreos-config pull 1119 0 None open [rhcos-4.8] coreos-propagate-multipath-conf: various minor tweaks 2021-07-19 15:46:03 UTC
Red Hat Issue Tracker MULTIARCH-1356 0 None Backlog Installation with multipath parameters in parmfile fails (DNS resolution missing) 2021-06-21 15:41:01 UTC
Red Hat Product Errata RHSA-2021:3759 0 None None None 2021-10-18 17:36:15 UTC

Description Stefan Orth 2021-06-21 15:37:17 UTC
Description of problem:

An installation with multipath parameters in the parmfile:

rd.multipath=default
coreos.inst.install_dev=/dev/mapper/mpatha

combined with hostnames (rather than IP addresses) in the parmfile fails.

The installation ends in an emergency shell. The network (IP) is configured, but name resolution is not working. Pinging another system by IP address works.

The same installation (with the multipath parameters) works if IP addresses are specified instead of hostnames in the parmfile.

It also works with hostnames in the parmfile if rd.multipath=default is removed and sda is used instead of /dev/mapper/mpatha.

It looks like the multipath parameters break the setup of name resolution during installation. Not sure whether it should exist at that point, but there is no /etc/resolv.conf in the booted Linux (emergency shell).
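
(For illustration, a few quick checks from the emergency shell that show the symptom; this is a sketch, with the nameserver IP taken from later comments and the hostname purely hypothetical:)

```
# Sketch: confirming the symptom from the emergency shell
cat /etc/resolv.conf              # missing in the failing case
ping -c 1 172.18.0.1              # reaching another system by IP works
getent hosts bastion.example.com  # resolving any hostname fails (hypothetical name)
```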


Version-Release number of selected component (if applicable):

oc version
Client Version: 4.8.0-0.nightly-s390x-2021-06-18-055818
Server Version: 4.8.0-0.nightly-s390x-2021-06-18-055818
Kubernetes Version: v1.21.0-rc.0+120883f


How reproducible:

Install a node with the multipath parameters and hostnames in the parmfile.


Steps to Reproduce:
1. Add rd.multipath=default and coreos.inst.install_dev=/dev/mapper/mpatha to the parmfile.
2. Use hostnames (not IP addresses) in the parmfile, e.g. for the Ignition URL.
3. Boot the node to start the installation.

Actual results:

Installation ends in an emergency shell.

Expected results:

Installation process works.

Additional info:

Comment 1 Prashanth Sundararaman 2021-06-21 23:09:41 UTC
Created attachment 1792825 [details]
error snapshot

Comment 2 Prashanth Sundararaman 2021-06-21 23:14:08 UTC
Hi Jonathan,

Is this a possible regression caused by https://github.com/coreos/fedora-coreos-config/pull/1011 ?

As the original description says, if the Ignition URL is configured with a hostname, coreos-installer errors out. If it is configured with an IP address, it works.

Thanks
Prashanth

Comment 3 Dan Li 2021-06-22 14:25:50 UTC
Setting "Blocker-" after discussing with the team. Based on these reasons:
1. configuring multipath as a day 2 operation still works
2. specifying ip address instead of hostname works
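
(A minimal sketch of the day-2 enablement mentioned in point 1, based on the documented CoreOS multipath procedure; verify against the current OpenShift docs before relying on it:)

```
# Sketch: enabling multipath as a day-2 operation on an installed node
sudo rpm-ostree kargs \
  --append rd.multipath=default \
  --append root=/dev/disk/by-label/dm-mpath-root
sudo systemctl reboot
```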

Comment 4 Jonathan Lebon 2021-06-22 15:02:14 UTC
Hmm, I'm not sure how this could be multipath related.
It looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1967483, except in the initrd.

Full logs from the initrd would be helpful, esp. NetworkManager.

Comment 5 Prashanth Sundararaman 2021-06-22 19:57:13 UTC
Funnily enough, coreos-livepxe-rootfs.service succeeds, so it is able to resolve the hostname there, but not when running coreos-installer.
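
(To compare the two fetches on a failed boot, something like this should show the difference; the coreos-installer unit name is assumed from the log prefix seen later in this bug:)

```
# Sketch: comparing the successful and failing fetches in the journal
journalctl -b -u coreos-livepxe-rootfs.service --no-pager  # rootfs fetch: hostname resolved
journalctl -b -u coreos-installer.service --no-pager       # installer fetch: DNS failure (unit name assumed)
journalctl -b -u NetworkManager --no-pager                 # NetworkManager logs, as requested in comment 4
```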

Comment 6 Jonathan Lebon 2021-06-22 20:05:19 UTC
(In reply to Jonathan Lebon from comment #4)
> Hmm, I'm not sure how this could be multipath related.
> It looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1967483,
> except in the initrd.

Sorry, this is incorrect. This BZ matches rhbz#1967483 in that respect as well, since coreos-installer.service runs in the real root.
(I'm so used to "emergency shell" referring to the initrd emergency shell that my brain jumped to that. :) )

Comment 7 Nikita Dubrovskii (IBM) 2021-06-23 12:50:43 UTC
Today I did some testing of a custom rhcos-4.8 build (with https://github.com/coreos/coreos-installer/pull/564); the Ignition config gets downloaded from github.com and the system works without any DNS issues.
Here is the cmdline:
```
Kernel command line: rd.neednet=1 dfltcc=off random.trust_cpu=on rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 console=ttysclp0 ip=172.18.142.3::172.18.0.1:255.254.0.0:coreos:encbdf0:off nameserver=172.18.0.1 coreos.inst=yes coreos.inst.insecure=yes coreos.inst.ignition_url=https://raw.githubusercontent.com/nikita-dubrovskii/s390x-ignition-configs/master/ignition.ign coreos.live.rootfs_url=http://172.18.10.243/rhcos-48.84.202106231130-0-live-rootfs.s390x.img zfcp.allow_lun_scan=0 cio_ignore=all,!condev rd.zfcp=0.0.1903,0x500507630910d435,0x408240d100000000 rd.zfcp=0.0.1943,0x500507630914d435,0x408240d100000000 coreos.inst.install_dev=sda coreos.inst.mpath=yes
```

Using another zVM/Linux guest as the HTTP server for the Ignition config also works (http://m1314001.lnxne.boe:8080/ignition/ignition.ign).
But using http://bastion.ocp-m1314001.lnxne.boe:8080/ignition/ignition.ign doesn't work,
so I guess there is something wrong with the bastion node's config (note that the same m1314001 host is used as the HTTP server).

Comment 8 Jonathan Lebon 2021-06-24 19:06:15 UTC
> Using another zVM/Linux guest as the HTTP server for the Ignition config also works (http://m1314001.lnxne.boe:8080/ignition/ignition.ign).
> But using http://bastion.ocp-m1314001.lnxne.boe:8080/ignition/ignition.ign doesn't work,
> so I guess there is something wrong with the bastion node's config (note that the same m1314001 host is used as the HTTP server).

That's interesting, thanks for the tests. I did some interactive debugging via screenshare with @madeel on this and indeed we saw the install pass without multipath enabled, and fail with it enabled.

I'm still not sure how multipath can affect DNS resolution, unless it simply makes an existing race easier to trigger. If that's the case, then it might be helped by https://github.com/coreos/coreos-installer/pull/565. I've made a scratch build with that patch:

http://brew-task-repos.usersys.redhat.com/repos/scratch/jlebon/coreos-installer/0.9.0/7.pr565.rhaos4.8.el8/s390x/

I've re-hosted the RPMs in a public space in case you don't have VPN access:

https://jlebon.fedorapeople.org/coreos-installer-0.9.0-7.pr565.rhaos4.8.el8.s390x.rpm
https://jlebon.fedorapeople.org/coreos-installer-bootinfra-0.9.0-7.pr565.rhaos4.8.el8.s390x.rpm

Developers with access to an s390x machine who can reproduce this bug should be able to build an RHCOS image with those RPMs and test that.
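
For testing, an invocation with the new flag would look roughly like this (the flag name is taken from the PR title and the retry count is made up; exact syntax may differ from what eventually merges):

```
# Sketch: coreos-installer with the retry flag from PR 565 (hypothetical values)
coreos-installer install /dev/mapper/mpatha \
  --http-retries 10 \
  --ignition-url http://bastion.ocp-m1314001.lnxne.boe:8080/ignition/ignition.ign
```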

Comment 9 Nikita Dubrovskii (IBM) 2021-06-25 08:00:07 UTC
OK, looks like I've found what's wrong here:

1) with `coreos.inst.install_dev=/dev/mapper/mpatha rd.multipath=default` and `hostname.com/ignition.conf` in the parm file:
coreos-installer cannot fetch the Ignition config (DNS), but first coreos tries to propagate multipath.conf to /sysroot, so we end up with a failure:

```
coreos-propagate-multipath-conf[926]: cp: cannot create regular file '/sysroot/etc/multipath.conf': Read-only file system 
systemd[1]: coreos-propagate-multipath-conf.service: Main process exited, code=exited, status=1/FAILURE 
...

systemd[1]: Reached target Emergency Mode. 
```

2) with `coreos.inst.install_dev=/dev/mapper/mpatha rd.multipath=default` and `1.2.3.4/ignition.conf` in the parm file:
coreos-installer can fetch the Ignition config (no DNS needed), but fails with `kpartx` (propagation of multipath.conf to /sysroot also failed):

```
coreos-propagate-multipath-conf[926]: cp: cannot create regular file '/sysroot/etc/multipath.conf': Read-only file system 
...
systemd[1]: Reached target Emergency Mode. 
...

[   23.522376] coreos-installer-service[1859]: device-mapper: resume ioctl on mpatha4  failed: Invalid argument 
[   23.522453] coreos-installer-service[1859]: resume failed on mpatha4 
[   23.811211] coreos-installer-service[1859]: Error: getting partition table for /dev/mapper/mpatha 
[   23.811374] coreos-installer-service[1859]: Caused by: 
[   23.811395] coreos-installer-service[1859]:     "kpartx" "-u" "-n" "/dev/dm-0" failed with exit code: 1 
Failed to start CoreOS Installer. 
```

If we look at /etc/resolv.conf without multipath, we have a valid config:
```
search lnxne.boe  
nameserver 172.18.0.1
```

But with `rd.multipath=default` it's empty; systemd has already failed by then, so to me this does not look like a DNS issue.


Installing this way also makes little sense: during firstboot coreos starts without multipath, so I don't see any reason to install coreos with `rd.multipath=default` right now.

I would consider this not a bug, or at least not a DNS bug.

Comment 10 Jonathan Lebon 2021-06-28 16:10:41 UTC
(In reply to Nikita Dubrovskii (IBM) from comment #9)
> OK, looks like I've found what's wrong here:
> 
> 1) with `coreos.inst.install_dev=/dev/mapper/mpatha rd.multipath=default`
> and `hostname.com/ignition.conf` in the parm file:

What is that karg? Do you mean `ip=...`? Can you show the full parmfile you used?

> coreos-installer cannot fetch the Ignition config (DNS), but first coreos tries to
> propagate multipath.conf to /sysroot, so we end up with a failure:
> 
> ```
> coreos-propagate-multipath-conf[926]: cp: cannot create regular file
> '/sysroot/etc/multipath.conf': Read-only file system 
> systemd[1]: coreos-propagate-multipath-conf.service: Main process exited,
> code=exited, status=1/FAILURE 
> ...
> 
> systemd[1]: Reached target Emergency Mode. 
> ```

Ouch, good catch. So we continue on to the real root even if the service failed.

> 2) with `coreos.inst.install_dev=/dev/mapper/mpatha rd.multipath=default`
> and `1.2.3.4/ignition.conf` in the parm file:
> coreos-installer can fetch the Ignition config (no DNS needed), but fails with `kpartx`
> (propagation of multipath.conf to /sysroot also failed):
> 
> ```
> coreos-propagate-multipath-conf[926]: cp: cannot create regular file
> '/sysroot/etc/multipath.conf': Read-only file system 
> ...
> systemd[1]: Reached target Emergency Mode. 
> ...
> 
> [   23.522376] coreos-installer-service[1859]: device-mapper: resume ioctl
> on mpatha4  failed: Invalid argument 
> [   23.522453] coreos-installer-service[1859]: resume failed on mpatha4 
> [   23.811211] coreos-installer-service[1859]: Error: getting partition
> table for /dev/mapper/mpatha 
> [   23.811374] coreos-installer-service[1859]: Caused by: 
> [   23.811395] coreos-installer-service[1859]:     "kpartx" "-u" "-n"
> "/dev/dm-0" failed with exit code: 1 
> Failed to start CoreOS Installer. 
> ```
> 
> If we look at /etc/resolv.conf without multipath, we have a valid config:
> ```
> search lnxne.boe  
> nameserver 172.18.0.1
> ```
> 
> But with `rd.multipath=default` it's empty; systemd has already failed by then, so
> to me this does not look like a DNS issue.

OK, so I think there are two issues here:
1. `coreos-propagate-multipath-conf.service` doesn't have

```
OnFailure=emergency.target
OnFailureJobMode=isolate
```

2. We have no ordering between `coreos-propagate-multipath-conf.service` and `sysroot-etc.mount`.
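
A sketch of the kind of drop-in that would address both issues (illustrative only; the authoritative change is in the PR filed below):

```
# Illustrative drop-in for coreos-propagate-multipath-conf.service -- not the actual fix
[Unit]
# issue 1: escalate to the emergency shell instead of silently continuing
OnFailure=emergency.target
OnFailureJobMode=isolate
# issue 2: order the copy after /sysroot/etc is mounted writable
After=sysroot-etc.mount
```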

<time passes>

Filed: https://github.com/coreos/fedora-coreos-config/pull/1077

Can you try that out?

> Installing this way also makes little sense: during firstboot coreos starts
> without multipath, so I don't see any reason to install coreos with
> `rd.multipath=default` right now.

It's valid to turn on multipath at installation time so that coreos-installer can copy the content on top of the multipath target (for the same reasons as https://github.com/coreos/fedora-coreos-config/pull/1011). coreos-installer should support this already (see e.g. https://github.com/coreos/coreos-installer/pull/499), but if we hit issues with kpartx there, let's work on fixing them.

Comment 11 Dan Li 2021-06-28 18:49:30 UTC
Hi Muhammad, do you think this bug will be resolved before the end of this sprint (July 3rd)? If not, can we set "Reviewed-in-Sprint"?

Comment 12 madeel 2021-06-29 07:45:13 UTC
Hi Dan, the root cause is still not clear, so please set the reviewed flag.

Comment 13 Nikita Dubrovskii (IBM) 2021-06-29 08:45:21 UTC
(In reply to Jonathan Lebon from comment #10)
> (In reply to Nikita Dubrovskii (IBM) from comment #9)
> > OK, looks like I've found what's wrong here:
> > 
> > 1) with `coreos.inst.install_dev=/dev/mapper/mpatha rd.multipath=default`
> > and `hostname.com/ignition.conf` in the parm file:
> 
> What is that karg? Do you mean `ip=...`? Can you show the full parmfile you
> used?

No, it's not an IP there but a hostname:
```
ip=172.18.142.3::172.18.0.1:255.254.0.0:coreos:encbdf0:off nameserver=172.18.0.1 coreos.inst=yes coreos.inst.ignition_url=http://m1314001.lnxne.boe:8080/ignition/ignition.ign
```

> OK, so I think there are two issues here:
> 1. `coreos-propagate-multipath-conf.service` doesn't have
> 
> ```
> OnFailure=emergency.target
> OnFailureJobMode=isolate
> ```
> 2. We have no ordering between `coreos-propagate-multipath-conf.service` and
> `sysroot-etc.mount`.
> 
> <time passes>
> 
> Filed: https://github.com/coreos/fedora-coreos-config/pull/1077
> 
> Can you try that out?

Did that; it works as expected:
- the system can be installed using DNS (```coreos.inst.ignition_url=http://m1314001.lnxne.boe:8080/ignition/ignition.ign```)
- the system can be installed using an IP (```coreos.inst.ignition_url=http://172.18.10.243/ignition.ign```)

> with kpartx there, let's work on fixing them.

Here is the PR for the kpartx issue:
https://github.com/coreos/coreos-installer/pull/566
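
(For manual reproduction, the failing call taken from the installer logs in comment 9 was:)

```
# The kpartx invocation that failed inside coreos-installer (from the logs above)
kpartx -u -n /dev/dm-0
```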

Comment 14 Dan Li 2021-07-19 14:38:56 UTC
Hi Muhammad, do you think this bug will move past ON_QA by the end of this Sprint? If not, can we add "reviewed-in-sprint" flag?

Comment 15 madeel 2021-07-19 15:02:34 UTC
Hi Jonathan, do you know when the fix (https://github.com/coreos/fedora-coreos-config/pull/1077) will be picked up by RHCOS?

Comment 16 Jonathan Lebon 2021-07-19 15:46:06 UTC
Will try to get it in the next 4.8 bootimage bump.

Comment 17 Jonathan Lebon 2021-07-20 15:50:53 UTC
The latest RHCOS 4.9 build should have the necessary patches for this bug, so it should be ready to be verified. Muhammad, can you verify that it's fixed?

Comment 18 Jonathan Lebon 2021-07-20 18:18:32 UTC
Sorry for the confusion on this. It has to stay in POST until the 4.9 bootimage bump PR gets merged.

Comment 19 Dan Li 2021-07-21 14:11:18 UTC
Setting reviewed-in-sprint as we are waiting for OpenShift to pick up the RHCOS PR

Comment 20 Dan Li 2021-08-10 16:53:09 UTC
Hi Muhammad, do you think this bug will reach ON_QA by the end of this sprint (August 14th)? If not, can we add "reviewed-in-sprint" flag?

Comment 21 madeel 2021-08-11 07:09:57 UTC
Hi Dan, the fix has landed in 4.9, though it still needs to be tested, so you can add the "reviewed-in-sprint" flag.

Comment 22 Stefan Orth 2021-08-11 09:34:43 UTC
I successfully installed two nodes with rd.multipath=default in the parmfile:

rd.neednet=1 rd.multipath=default console=ttysclp0   coreos.inst.install_dev=/dev/mapper/mpatha coreos.live.rootfs_url=http://bistro.lnxne.boe/redhat/alkl/rhcos/nightly/PE/rhcos-49.84.202108041448-0/rhcos-49.84.202108041448-0-live-rootfs.s390x.img coreos.inst.ignition_url=http://bastion.m3558001.lnxne.boe:8080/ignition/worker.ign ip=10.107.1.52::10.107.1.51:255.255.255.0::ence383:none nameserver=10.107.1.51 zfcp.allow_lun_scan=0 cio_ignore=all,!condev rd.znet=qeth,0.0.e383,0.0.e384,0.0.e385,layer2=1 rd.zfcp=0.0.1c42,0x5001738030290140,0x0002000000000000 rd.zfcp=0.0.1c02,0x5001738030290140,0x0002000000000000 rd.zfcp=0.0.1c42,0x5001738030290151,0x0002000000000000 rd.zfcp=0.0.1c02,0x5001738030290151,0x0002000000000000


On the worker node (without the Day-2 operation):
-----------------------------------------

[core@bootstrap-0 ~]$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 112.2G  0 disk 
|-sda3   8:3    0   384M  0 part /boot
`-sda4   8:4    0 111.8G  0 part 
sdb      8:16   0 112.2G  0 disk 
|-sdb3   8:19   0   384M  0 part 
`-sdb4   8:20   0 111.8G  0 part 
sdc      8:32   0 112.2G  0 disk 
|-sdc3   8:35   0   384M  0 part 
`-sdc4   8:36   0 111.8G  0 part /sysroot
sdd      8:48   0 112.2G  0 disk 
|-sdd3   8:51   0   384M  0 part 
`-sdd4   8:52   0 111.8G  0 part 
[core@bootstrap-0 ~]$ cat /proc/cmdline 
random.trust_cpu=on ignition.platform.id=metal $ignition_firstboot ostree=/ostree/boot.0/rhcos/98ec6ea1fe3b3ff8df599883fdae7041851b624a27b00bb70a3f8488616b6e93/0 zfcp.allow_lun_scan=0 cio_ignore=all,!condev rd.znet=qeth,0.0.e383,0.0.e384,0.0.e385,layer2=1 rd.zfcp=0.0.1c42,0x5001738030290140,0x0002000000000000 rd.zfcp=0.0.1c02,0x5001738030290140,0x0002000000000000 rd.zfcp=0.0.1c42,0x5001738030290151,0x0002000000000000 rd.zfcp=0.0.1c02,0x5001738030290151,0x0002000000000000 root=UUID=7c1ea61a-5c64-4a97-807f-f8f55f949faa rw rootflags=prjquota 


[core@bootstrap-0 ~]$ lszdev
TYPE         ID                                              ON   PERS  NAMES
zfcp-host    0.0.1c02                                        yes  no    
zfcp-host    0.0.1c42                                        yes  no    
zfcp-lun     0.0.1c02:0x5001738030290140:0x0002000000000000  yes  no    sdd sg3
zfcp-lun     0.0.1c02:0x5001738030290151:0x0002000000000000  yes  no    sdc sg2
zfcp-lun     0.0.1c42:0x5001738030290140:0x0002000000000000  yes  no    sda sg0
zfcp-lun     0.0.1c42:0x5001738030290151:0x0002000000000000  yes  no    sdb sg1
qeth         0.0.e383:0.0.e384:0.0.e385                      yes  no    ence383
generic-ccw  0.0.0009                                        yes  no 

----------------------------------------------------------------------------------

BUT: on one system I had to run the installation 3 times, and on the other 2 times, before it succeeded.

The failed installations stop in the emergency shell, but DNS / hostname / IP / FCP all look good:

lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT  
loop0         7:0    0   7.9G  0 loop  /run/ephemeral  
loop1         7:1    0 771.9M  0 loop  /sysroot  
sda           8:0    0 112.2G  0 disk    
`-mpatha    253:0    0 112.2G  0 mpath   
  |-mpatha3 253:1    0   384M  0 part    
  `-mpatha4 253:2    0 111.8G  0 part    
sdb           8:16   0 112.2G  0 disk    
`-mpatha    253:0    0 112.2G  0 mpath   
  |-mpatha3 253:1    0   384M  0 part    
  `-mpatha4 253:2    0 111.8G  0 part    
sdc           8:32   0 112.2G  0 disk    
`-mpatha    253:0    0 112.2G  0 mpath   
  |-mpatha3 253:1    0   384M  0 part    
  `-mpatha4 253:2    0 111.8G  0 part    
sdd           8:48   0 112.2G  0 disk    
`-mpatha    253:0    0 112.2G  0 mpath   
  |-mpatha3 253:1    0   384M  0 part    
  `-mpatha4 253:2    0 111.8G  0 part    
bash-4.4# lszdev
TYPE         ID                                              ON   PERS  NAMES
zfcp-host    0.0.1c02                                        yes  no
zfcp-host    0.0.1c42                                        yes  no
zfcp-lun     0.0.1c02:0x5001738030290140:0x0002000000000000  yes  no    sdd sg3
zfcp-lun     0.0.1c02:0x5001738030290151:0x0002000000000000  yes  no    sdc sg2
zfcp-lun     0.0.1c42:0x5001738030290140:0x0002000000000000  yes  no    sdb sg1
zfcp-lun     0.0.1c42:0x5001738030290151:0x0002000000000000  yes  no    sda sg0
qeth         0.0.e383:0.0.e384:0.0.e385                      yes  no    ence383
generic-ccw  0.0.0009                                        yes  no


ping -c 4 bastion.m3558001.lnxne.boe
PING bastion.m3558001.lnxne.boe (172.18.160.1) 56(84) bytes of data.  
64 bytes from 172.18.160.1 (172.18.160.1): icmp_seq=1 ttl=64 time=0.175 ms  
64 bytes from 172.18.160.1 (172.18.160.1): icmp_seq=2 ttl=64 time=0.203 ms  
64 bytes from 172.18.160.1 (172.18.160.1): icmp_seq=3 ttl=64 time=0.184 ms  
64 bytes from 172.18.160.1 (172.18.160.1): icmp_seq=4 ttl=64 time=0.207 ms  
  
--- bastion.m3558001.lnxne.boe ping statistics ---  
4 packets transmitted, 4 received, 0% packet loss, time 3090ms  
rtt min/avg/max/mdev = 0.175/0.192/0.207/0.016 ms  
bash-4.4# 

For some reason, the image was not downloaded.

@madeel mentioned that this could be related to https://bugzilla.redhat.com/show_bug.cgi?id=1991928. On my system, there is only one NIC configured.

Comment 23 Stefan Orth 2021-08-11 10:14:55 UTC
After the Day-2 operation:

Last login: Wed Aug 11 09:32:37 2021 from 10.107.1.51
[core@bootstrap-0 ~]$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda           8:0    0 112.2G  0 disk  
`-mpatha    253:0    0 112.2G  0 mpath 
  |-mpatha3 253:1    0   384M  0 part  /boot
  `-mpatha4 253:2    0 111.8G  0 part  /sysroot
sdb           8:16   0 112.2G  0 disk  
`-mpatha    253:0    0 112.2G  0 mpath 
  |-mpatha3 253:1    0   384M  0 part  /boot
  `-mpatha4 253:2    0 111.8G  0 part  /sysroot
sdc           8:32   0 112.2G  0 disk  
`-mpatha    253:0    0 112.2G  0 mpath 
  |-mpatha3 253:1    0   384M  0 part  /boot
  `-mpatha4 253:2    0 111.8G  0 part  /sysroot
sdd           8:48   0 112.2G  0 disk  
`-mpatha    253:0    0 112.2G  0 mpath 
  |-mpatha3 253:1    0   384M  0 part  /boot
  `-mpatha4 253:2    0 111.8G  0 part  /sysroot
[core@bootstrap-0 ~]$ sudo multipath -ll
mpatha (20017380030290193) dm-0 IBM,2810XIV
size=112G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 1:0:0:2 sdc 8:32 active ready running
  |- 1:0:1:2 sdd 8:48 active ready running
  |- 0:0:0:2 sda 8:0  active ready running
  `- 0:0:1:2 sdb 8:16 active ready running

Comment 24 Dan Li 2021-08-11 14:16:45 UTC
The Z team tested the specific error described in the initial comment, and it has been fixed; however, other bugs have come up that we think may be related to BZ 1991928 (https://bugzilla.redhat.com/show_bug.cgi?id=1991928), so we are tracking those in the other bug. Moving this to VERIFIED for now.

Comment 25 RHCOS Bug Bot 2021-08-11 21:59:21 UTC
The fix for this bug will not be delivered to customers until it lands in an updated bootimage.  That process is tracked in bug 1981999, which is in state ASSIGNED.  Moving this bug back to POST.

Comment 26 madeel 2021-08-17 08:32:41 UTC
Setting needinfo- as there are other networking-related patches coming with the tracker bug 1981999. We will wait for the bootimage bump.

Comment 27 Micah Abbott 2021-08-27 13:49:50 UTC
The bootimage bump is merged; moving to MODIFIED.

Comment 30 RHCOS Bug Bot 2021-09-02 16:36:31 UTC
The fix for this bug will not be delivered to customers until it lands in an updated bootimage.  That process is tracked in bug 1981999, which is in state ASSIGNED.  Moving this bug back to POST.

Comment 31 Dan Li 2021-09-06 15:25:13 UTC
Adding "reviewed-in-sprint" as the bug will not be resolved before the end of this sprint.

Comment 32 Dan Li 2021-09-20 18:20:16 UTC
Hi Muhammad, do you think this bug is still waiting for the PRs to merge? If so, we may want to add "reviewed-in-sprint"

Comment 33 madeel 2021-09-21 06:35:13 UTC
Hi Dan, we can no longer reproduce the problem mentioned in this BZ. We can close this.

Comment 34 Dan Li 2021-09-21 11:04:32 UTC
Closing per Muhammad's Comment 33

Comment 35 RHCOS Bug Bot 2021-09-21 11:05:17 UTC
The fix for this bug will not be delivered to customers until it lands in an updated bootimage.  That process is tracked in bug 1981999, which is in state POST.  Moving this bug back to POST.

Comment 36 Dan Li 2021-09-21 11:19:41 UTC
The bot is linked to the bootimage bug, so this cannot be closed yet. Adding "reviewed-in-sprint".

Comment 37 RHCOS Bug Bot 2021-09-22 18:37:26 UTC
The fix for this bug has landed in a bootimage bump, as tracked in bug 1981999 (now in status MODIFIED).  Moving this bug to MODIFIED.

Comment 40 Douglas Slavens 2021-10-04 20:24:18 UTC
This has been verified.

Comment 41 Dan Li 2021-10-04 20:27:03 UTC
I believe Doug's Comment 40 refers back to Comment 33 regarding the fact that we can no longer reproduce the problem; however, we were unable to close this bug as it is linked to BZ 1981999 (which is VERIFIED).

Comment 43 errata-xmlrpc 2021-10-18 17:35:54 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

