Bug 1942152

Summary: Make grub2 more robust against Open Firmware storage race condition causing system boot failures
Product: Red Hat Enterprise Linux 8 Reporter: sgardner
Component: grub2 Assignee: Bootloader engineering team <bootloader-eng-team>
Status: CLOSED ERRATA QA Contact: Petr Janda <pjanda>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.4 CC: borgan, bugproxy, dhorak, diegodo, fmartine, mgandhi, pjanda, rmetrich, sbarcomb, sgardner
Target Milestone: beta Keywords: OtherQA, Triaged, ZStream
Target Release: 8.5   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: grub2-2.02-103.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1961265 (view as bug list) Environment:
Last Closed: 2021-11-09 19:53:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1862632    
Bug Blocks: 1916117, 1961265    
Attachments:
Description Flags
Retry open and read on failure none

Description sgardner 2021-03-23 18:37:16 UTC
Description of problem:
IBM is providing a grub patch for PPC systems. This patch allows grub to retry I/O and connection requests to Fibre Channel (FC) devices when the device returns a timeout.
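
For illustration only, the retry behaviour described above could look roughly like the sketch below. This is not the attached patch: the names MAX_RETRIES, open_fn, read_fn, open_with_retry, read_with_retry and flaky_open are all hypothetical, and the retry limit is an assumed value, since this report does not state one. The callbacks stand in for whatever firmware open and read calls grub actually uses.

```c
#include <stddef.h>

/* Assumed retry budget; the real limit is not stated in this report. */
#define MAX_RETRIES 5

/* Hypothetical stand-ins for the firmware open and read calls. */
typedef int (*open_fn)(const char *path, int *handle_out);  /* 0 on success */
typedef long (*read_fn)(int handle, void *buf, size_t len); /* bytes read, <0 on failure */

/* Try to open the device up to MAX_RETRIES times before giving up. */
int open_with_retry(open_fn do_open, const char *path, int *handle_out)
{
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        if (do_open(path, handle_out) == 0)
            return 0;
    }
    return -1; /* every attempt failed */
}

/* Try to read from the device up to MAX_RETRIES times before giving up. */
long read_with_retry(read_fn do_read, int handle, void *buf, size_t len)
{
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        long n = do_read(handle, buf, len);
        if (n >= 0)
            return n;
    }
    return -1; /* every attempt failed */
}

/* Simulated busy device: fails failures_left times, then succeeds. */
int failures_left = 0;

int flaky_open(const char *path, int *handle_out)
{
    (void)path;
    if (failures_left > 0) {
        failures_left--;
        return -1; /* simulated timeout */
    }
    *handle_out = 42; /* arbitrary handle value for the simulation */
    return 0;
}
```

With failures_left set to 3, open_with_retry(flaky_open, ...) succeeds on the fourth attempt; with more failures than MAX_RETRIES it returns -1, which corresponds to the boot failure reported here. A real implementation would probably also bound the retries by elapsed time rather than by count alone.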



Actual results:
Systems fail to boot with storage related errors.


Expected results:
Systems will not fail to boot.


Additional info:
Adding IBM team to this BZ.  As this is affecting some very large customers, there is a heavy push to have this backported into RHEL8.

Comment 1 IBM Bug Proxy 2021-04-07 13:00:57 UTC
------- Comment From diegodo.com 2021-04-07 09:00 EDT-------
Hello Redhat,

this is the patch that I'm willing to send upstream.

Please, let me know your thoughts.

This patch is on top of patch: Avoiding many unecessary open close

Comment 2 IBM Bug Proxy 2021-04-07 13:01:08 UTC
Created attachment 1769869 [details]
Retry open and read on failure

Comment 3 Hanns-Joachim Uhl 2021-04-07 13:54:02 UTC
(In reply to IBM Bug Proxy from comment #1)
> ------- Comment From diegodo.com 2021-04-07 09:00 EDT-------
...
> 
> This patch is on top of patch: Avoiding many unecessary open close
.
... which was integrated into RHEL8.4 through
LTC bug 187174 - RH1862632- RHEL8.3 Beta - ISST-LTE:PowerVM: Fleetwood:raylp83: LPAR installed on a namespace of Kona NVME card takes long time to boot  ...

Comment 4 IBM Bug Proxy 2021-05-11 13:51:38 UTC
------- Comment From fnovak.com 2021-05-11 09:48 EDT-------
Where does this stand?  No update in month+

Comment 5 Javier Martinez Canillas 2021-05-12 11:29:37 UTC
(In reply to IBM Bug Proxy from comment #4)
> ------- Comment From fnovak.com 2021-05-11 09:48 EDT-------
> Where does this stand?  No update in month+


Sorry, we were busy with other tasks. How far should this go for z-stream ?

Comment 6 sgardner 2021-05-12 12:38:57 UTC
We don't currently have any customers on RHEL8 hitting this race condition, but I don't want to wait until 8.5 to release this.  I think 8.3.z and 8.4.z is sufficient.

Comment 7 Javier Martinez Canillas 2021-05-12 12:47:09 UTC
(In reply to sgardner from comment #6)
> We don't currently have any customers on RHEL8 hitting this race condition,
> but I don't want to wait until 8.5 to release this.  I think 8.3.z and 8.4.z
> is sufficient.

Thanks for the info. Yes, I set the target release to 8.5 because we first need
to land the fix there before backporting to z-stream. As far as I know, 8.3.z is
not an EUS release and went EOL after the 8.4 release, so we only need to fix this in 8.4.z?

Comment 8 Petr Janda 2021-05-17 08:25:57 UTC
I suppose IBM is able to test it once the patch is merged, so I am providing qa_ack.

Comment 13 Brock Organ 2021-06-10 19:03:06 UTC
early access to packages (as requested):

http://people.redhat.com/~borgan/.8.5/grub2-ppc64le-2.02-103.el8.ppc64le/

Comment 15 Brock Organ 2021-07-07 14:46:45 UTC
added grub2-ppc64le-modules.noarch.rpm to the package list:

http://people.redhat.com/~borgan/.8.5/grub2-ppc64le-2.02-103.el8.ppc64le/

Comment 16 IBM Bug Proxy 2021-08-02 13:32:23 UTC
------- Comment From diegodo.com 2021-08-02 09:28 EDT-------
Hello Redhat,

the provided packages are working as expected.

Please make it available.

Let me know if something is missing from our side.

Thanks

Comment 24 errata-xmlrpc 2021-11-09 19:53:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (grub2 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4466