Bug 1942148

Summary: Make grub2 more robust against Open Firmware storage race condition causing system boot failures [rhel-7.9.z]
Product: Red Hat Enterprise Linux 7
Reporter: sgardner
Component: grub2
Assignee: Jan Hlavac <jhlavac>
Status: CLOSED ERRATA
QA Contact: Petr Janda <pjanda>
Severity: high
Docs Contact:
Priority: high
Version: 7.9
CC: bootloader-eng-team, borgan, bugproxy, diegodo, fmartine, jhlavac, jreznik, pjanda, rmetrich, sbarcomb, sgardner
Target Milestone: rc
Keywords: OtherQA, Triaged, ZStream
Target Release: 7.9
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: grub2-2.02-0.87.el7_9.7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-10-12 15:27:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1857216
Attachments:
Description Flags
Retry open and read on failure none
Patch avoiding many unnecessary open/close during the boot none

Description sgardner 2021-03-23 18:28:10 UTC
Description of problem:
IBM is providing a grub patch for PPC systems. This patch allows grub to retry I/O and connection requests to Fibre Channel (FC) devices when the device returns a timeout.
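
A minimal sketch of the retry idea, for illustration only. It is not the attached patch and does not use GRUB's or Open Firmware's real interfaces: the names `io_status`, `io_op_fn`, `retry_on_timeout`, `flaky_device`, and `flaky_open` are all hypothetical, and the retry limit is an arbitrary assumption. It only shows the general pattern of repeating an open or read call while the device reports a timeout, and giving up after a bounded number of attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; GRUB's real error codes differ. */
enum io_status { IO_OK = 0, IO_TIMEOUT = -1, IO_ERROR = -2 };

/* Hypothetical callback standing in for a firmware open or read call. */
typedef enum io_status (*io_op_fn)(void *ctx);

/*
 * Call op(ctx) until it stops timing out or max_attempts is used up.
 * Only IO_TIMEOUT is retried; any other result is returned at once.
 * If attempts_out is non-NULL, it receives the number of calls made.
 */
enum io_status
retry_on_timeout(io_op_fn op, void *ctx, int max_attempts, int *attempts_out)
{
    enum io_status status = IO_TIMEOUT;
    int attempts = 0;

    assert(op != NULL && max_attempts > 0);

    while (attempts < max_attempts) {
        status = op(ctx);
        attempts++;
        if (status != IO_TIMEOUT)
            break;
        /* A real bootloader would likely pause here before retrying. */
    }

    if (attempts_out != NULL)
        *attempts_out = attempts;
    return status;
}

/* Simulated device: times out a fixed number of times, then succeeds. */
struct flaky_device {
    int timeouts_left;
};

enum io_status
flaky_open(void *ctx)
{
    struct flaky_device *dev = ctx;

    if (dev->timeouts_left > 0) {
        dev->timeouts_left--;
        return IO_TIMEOUT;
    }
    return IO_OK;
}
```

With a simulated device that times out twice, `retry_on_timeout(flaky_open, &dev, 5, &attempts)` returns `IO_OK` after three attempts. A device that never recovers still fails, but only after the attempt limit is reached rather than on the first timeout.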



Actual results:
Systems fail to boot with storage related errors.


Expected results:
Systems will not fail to boot.


Additional info:
Adding the IBM team to this BZ. As this is affecting some very large customers, there is a heavy push to have this backported into RHEL7.9.

Comment 4 IBM Bug Proxy 2021-04-01 13:41:12 UTC
Created attachment 1768256
Retry open and read on failure


------- Comment on attachment From diegodo.com 2021-04-01 09:33 EDT-------


Hello Red Hat,

This is the patch that I would like to send upstream.

Please let me know your thoughts.

This patch is on top of the "avoiding many unnecessary open/close" patch, which I'll attach here as well (I don't know if it is already applied to RHEL7.9).

Comment 5 IBM Bug Proxy 2021-04-01 13:41:13 UTC
Created attachment 1768257
Patch avoiding many unnecessary open/close during the boot

Comment 6 Petr Janda 2021-05-17 08:56:01 UTC
I expect IBM will verify it, providing qa_ack.

Comment 7 Javier Martinez Canillas 2021-05-17 14:14:32 UTC
(In reply to IBM Bug Proxy from comment #4)
> Created attachment 1768256 [details]
> Retry open and read on failure
> 
> 
> ------- Comment on attachment From diegodo.com 2021-04-01 09:33
> EDT-------
> 
> 
> Hello Redhat,
> 
> this is the patch that I'm willing to send upstream.
> 
> Please, let me know your thoughts.
> 
> This patch is on top of Avoiding many unecessary open close that I'll attach
> here as well (I don't know if it is already applied to RHEL7.9)

Yes, the latter was included in build grub2-2.02-0.87.el7_9.6.

Comment 8 Hanns-Joachim Uhl 2021-05-19 09:38:12 UTC
(In reply to Javier Martinez Canillas from comment #7)
...
> 
> Yes, the latter was included in build grub2-2.02-0.87.el7_9.6.
.
Hello Red Hat / Javier,
... can you please provide us the updated grub2 rpm (for ppc64le ..) for our early testing ...?
Please advise ...
Thanks in advance for your support.

Comment 9 sgardner 2021-06-03 15:54:25 UTC
To clarify, in case there is any confusion: the "avoiding many unnecessary open/close during the boot" patch has already been backported into RHEL7.9. This BZ is ONLY for backporting the "Retry open and read on failure" code from attachment https://bugzilla.redhat.com/attachment.cgi?id=1768256.


Comment 11 IBM Bug Proxy 2021-06-10 19:10:42 UTC
------- Comment From janani.com 2021-06-10 15:00 EDT-------
Thank you Brock

Comment 16 Brock Organ 2021-06-13 03:13:05 UTC
(In reply to Brock Organ from comment #10)
> (In reply to Hanns-Joachim Uhl from comment #8)
> > (In reply to Javier Martinez Canillas from comment #7)
> > ...
> > > 
> > > Yes, the latter was included in build grub2-2.02-0.87.el7_9.6.
> > .
> > Hello Red Hat / Javier,
> > ... can you please provide us the updated grub2 rpm (for ppc64le ..) for our
> > early testing ...?
> > Please advise ...
> > Thanks in advance for your support.
> 
> early access to packages:
> 
> http://people.redhat.com/~borgan/.8.5/grub2-2.02-0.87.el7_9.6.ppc64le/


Hi Team,

Steven has corrected my package list. Here is the right set of new packages to test; sorry for the miscommunication:

http://people.redhat.com/~borgan/.8.5/grub2-2.02-0.87.el7_9.7.ppc64le/

Comment 17 IBM Bug Proxy 2021-07-22 14:41:37 UTC
------- Comment From diegodo.com 2021-07-22 10:36 EDT-------
(In reply to comment #15)
> (In reply to Brock Organ from comment #10)
> > (In reply to Hanns-Joachim Uhl from comment #8)
> > > (In reply to Javier Martinez Canillas from comment #7)
> > > ...
> > > >
> > > > Yes, the latter was included in build grub2-2.02-0.87.el7_9.6.
> > > .
> > > Hello Red Hat / Javier,
> > > ... can you please provide us the updated grub2 rpm (for ppc64le ..) for our
> > > early testing ...?
> > > Please advise ...
> > > Thanks in advance for your support.
> >
> > early access to packages:
> >
> > http://people.redhat.com/~borgan/.8.5/grub2-2.02-0.87.el7_9.6.ppc64le/
> Hi Team,
> Steven has corrected my package list, here is the right set of new packages
> to test, sorry for the miscommunication:
> http://people.redhat.com/~borgan/.8.5/grub2-2.02-0.87.el7_9.7.ppc64le/

Hi Red Hat,

Just for my better understanding: what is the next step here?
Is the package already available to customers?

Thanks

Comment 18 IBM Bug Proxy 2021-08-02 13:32:15 UTC
------- Comment From diegodo.com 2021-08-02 09:27 EDT-------
Hello Red Hat,

The provided packages are working as expected.

Please make them available.

Let me know if something is missing from our side.

Thanks

Comment 20 Petr Janda 2021-08-17 06:29:04 UTC
Hello

I consider it verified by the customer.

Petr

Comment 31 errata-xmlrpc 2021-10-12 15:27:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (grub2 bug fix and enhancement update), and
where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3794