Red Hat Bugzilla – Bug 1496836
[RH 7.5 bug] Request for upstream commit 3664847d95e6 to be merged into RHEL 7.5/7.4
Last modified: 2018-04-10 18:16:51 EDT
Description of problem:
Request for upstream commit 3664847d95e6 to be merged into RHEL 7.5/7.4. This patch was recently added to latest -rc by Linus under commit 12fcf66e74b1.

md/raid5: fix a race condition in stripe batch
https://github.com/torvalds/linux/commit/3664847d95e60a9a943858b7800f8484669740fc

There have been reports of RHEL 7 systems crashing due to this bug in large environments when using Lustre. Testing with the mentioned patch indicates that it fixes the issue.

Version-Release number of selected component (if applicable):
kernel-3.10.0-693.2.2.el7

Actual results:
System crash

Expected results:
No crash
A test kernel is available to verify the fix:
http://people.redhat.com/ncroxon/rhel7/.rhel75_Oct6/kernel-3.10.0-726.el7.test.x86_64.rpm

Please test as soon as possible and give feedback on your results.

-Nigel
Hi Nigel,

While others might be able to test, we had to put our new hardware into production very recently, so while I was able to test last week, we're no longer in a position to test. :/

We have servers connected to multiple JBODs and using mdraid. I applied the patch to 7.4 myself and it's working fine now. We've been using the patch for more than a month now, maybe even 6 weeks. We had about 1 or 2 kernel panics per week without the patch (in 7.3 and 7.4). The tiny patch fixes a race condition that is well explained by the MD kernel maintainer. I hope you'll still consider it for RH 7.4/7.5.

Sorry for not being more helpful right now, but I prefer to be honest with you. :)

Thanks for your help!
Stephane
I will be including commit id 3664847d95e60a9a943858b7800f8484669740fc into RHEL7.5. Are you ok with closing this issue? -Nigel
Great news, thanks Nigel! Is there any plan to integrate this patch into a RHEL7.4 kernel update? Thanks, Stephane
Hello Stephane, Yes, we will backport this to a RHEL7.4 kernel update. -Nigel
Thanks much. And yes, I'm ok with closing this issue now. Best, Stephane
This was fixed in RHEL7.5 with the following commit ID by me.

commit 4f386c203da33195fa5ac0379aebec7ab1948e2e
Author: Nigel Croxon <ncroxon@redhat.com>
Date: Thu Nov 2 19:14:22 2017 -0400

-Nigel
Hello, I did a code review and verified that the patches were applied correctly on kernel-3.10.0-820.el7, so I will move this bug to "VERIFIED".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1062