Bug 1942152 - Make grub2 more robust against Open Firmware storage race condition causing system boot failures
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: grub2
Version: 8.4
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: beta
Target Release: 8.5
Assignee: Bootloader engineering team
QA Contact: Petr Janda
URL:
Whiteboard:
Depends On: 1862632
Blocks: 1916117 1961265
Reported: 2021-03-23 18:37 UTC by sgardner
Modified: 2021-11-10 09:43 UTC

Fixed In Version: grub2-2.02-103.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1961265
Environment:
Last Closed: 2021-11-09 19:53:59 UTC
Type: Bug
Target Upstream Version:


Attachments
Retry open and read on failure (3.33 KB, patch)
2021-04-07 13:01 UTC, IBM Bug Proxy


Links
System | ID | Private | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 192172 | 0 | None | None | None | 2021-03-24 19:55:42 UTC
Red Hat Product Errata | RHBA-2021:4466 | 0 | None | None | None | 2021-11-09 19:54:15 UTC

Description sgardner 2021-03-23 18:37:16 UTC
Description of problem:
IBM is providing a grub patch for PPC systems. The patch allows grub to retry I/O and connection requests to Fibre Channel (FC) devices when a device returns a timeout.
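
The retry idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the names io_status, io_op_fn, retry_on_timeout, flaky_dev and flaky_open are hypothetical and are not grub or Open Firmware APIs. A real implementation would probably also wait between attempts, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; grub uses its own error types. */
enum io_status { IO_OK = 0, IO_TIMEOUT, IO_FATAL };

/* Hypothetical callback type standing in for an open or read request. */
typedef enum io_status (*io_op_fn)(void *ctx);

/*
 * Call op(ctx) until it succeeds, fails with a non-timeout error, or
 * max_attempts calls have been made. Only timeouts are retried.
 * If attempts_out is non-NULL it receives the number of calls made.
 */
static enum io_status
retry_on_timeout(io_op_fn op, void *ctx, unsigned max_attempts,
                 unsigned *attempts_out)
{
    enum io_status st = IO_TIMEOUT;
    unsigned n = 0;

    assert(op != NULL && max_attempts > 0);

    while (n < max_attempts) {
        st = op(ctx);
        n++;
        if (st != IO_TIMEOUT)
            break; /* success or a hard error: stop retrying */
    }
    if (attempts_out != NULL)
        *attempts_out = n;
    return st;
}

/* Simulated device for illustration: times out failures_left times, then succeeds. */
struct flaky_dev { unsigned failures_left; };

static enum io_status
flaky_open(void *ctx)
{
    struct flaky_dev *dev = ctx;

    if (dev->failures_left > 0) {
        dev->failures_left--;
        return IO_TIMEOUT;
    }
    return IO_OK;
}
```

With a simulated device that times out twice, retry_on_timeout(flaky_open, &dev, 5, &n) returns IO_OK after three calls. If the device keeps timing out, the last IO_TIMEOUT is returned once the attempt limit is reached.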



Actual results:
Systems fail to boot with storage related errors.


Expected results:
Systems will not fail to boot.


Additional info:
Adding IBM team to this BZ.  As this is affecting some very large customers, there is a heavy push to have this backported into RHEL8.

Comment 1 IBM Bug Proxy 2021-04-07 13:00:57 UTC
------- Comment From diegodo.com 2021-04-07 09:00 EDT-------
Hello Redhat,

this is the patch that I'm willing to send upstream.

Please, let me know your thoughts.

This patch is on top of patch: Avoiding many unecessary open close

Comment 2 IBM Bug Proxy 2021-04-07 13:01:08 UTC
Created attachment 1769869
Retry open and read on failure

Comment 3 Hanns-Joachim Uhl 2021-04-07 13:54:02 UTC
(In reply to IBM Bug Proxy from comment #1)
> ------- Comment From diegodo.com 2021-04-07 09:00 EDT-------
...
> 
> This patch is on top of patch: Avoiding many unecessary open close
.
... which was integrated into RHEL8.4 through
LTC bug 187174 - RH1862632- RHEL8.3 Beta - ISST-LTE:PowerVM: Fleetwood:raylp83: LPAR installed on a namespace of Kona NVME card takes long time to boot  ...

Comment 4 IBM Bug Proxy 2021-05-11 13:51:38 UTC
------- Comment From fnovak.com 2021-05-11 09:48 EDT-------
Where does this stand?  No update in month+

Comment 5 Javier Martinez Canillas 2021-05-12 11:29:37 UTC
(In reply to IBM Bug Proxy from comment #4)
> ------- Comment From fnovak.com 2021-05-11 09:48 EDT-------
> Where does this stand?  No update in month+


Sorry, we were busy with other tasks. How far should this go for z-stream?

Comment 6 sgardner 2021-05-12 12:38:57 UTC
We don't currently have any customers on RHEL8 hitting this race condition, but I don't want to wait until 8.5 to release this.  I think 8.3.z and 8.4.z is sufficient.

Comment 7 Javier Martinez Canillas 2021-05-12 12:47:09 UTC
(In reply to sgardner from comment #6)
> We don't currently have any customers on RHEL8 hitting this race condition,
> but I don't want to wait until 8.5 to release this.  I think 8.3.z and 8.4.z
> is sufficient.

Thanks for the info. Yes, I set it to 8.5 because we first need to land the fix there
before backporting to z-stream. As far as I know, 8.3.z is not an EUS release
and went EOL after the 8.4 release, so we only need to fix this in 8.4.z?

Comment 8 Petr Janda 2021-05-17 08:25:57 UTC
I suppose IBM will be able to test it once the patch is merged, so I am providing qa_ack.

Comment 13 Brock Organ 2021-06-10 19:03:06 UTC
early access to packages (as requested):

http://people.redhat.com/~borgan/.8.5/grub2-ppc64le-2.02-103.el8.ppc64le/

Comment 15 Brock Organ 2021-07-07 14:46:45 UTC
added grub2-ppc64le-modules.noarch.rpm to the package list:

http://people.redhat.com/~borgan/.8.5/grub2-ppc64le-2.02-103.el8.ppc64le/

Comment 16 IBM Bug Proxy 2021-08-02 13:32:23 UTC
------- Comment From diegodo.com 2021-08-02 09:28 EDT-------
Hello Redhat,

the provided packages are working as expected.

Please make it available.

Let me know if something is missing from our side.

Thanks

Comment 24 errata-xmlrpc 2021-11-09 19:53:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (grub2 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4466

