Bug 1602151

Summary: General Protection Fault unlocking UDS callback mutex
Product: Red Hat Enterprise Linux 7 Reporter: Thomas Jaskiewicz <tjaskiew>
Component: kmod-kvdo Assignee: Thomas Jaskiewicz <tjaskiew>
Status: CLOSED ERRATA QA Contact: Jakub Krysl <jkrysl>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.6 CC: awalsh, jkrysl, limershe, mgandhi, ryan.p.norwood
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 6.1.1.117 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 09:39:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Thomas Jaskiewicz 2018-07-17 21:36:30 UTC
Description of problem:

In our nightly testing, we have seen an occasional general protection fault related to the UDS callback code path.  Examining the source code, we found a code path that was doing a bad thing and fixed it in bug 1533260.  After that fix, we still see a similar error happening in the same code path.

We believe there is a race condition in some synchronization code that came from the user mode implementation of UDS.  Since this code is a reimplementation of the Linux kernel wait_for_condition mechanism, we have chosen to change the code to just use the existing kernel mechanism.
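
For readers unfamiliar with the pattern, here is a minimal user-space sketch of a condition wait, written with POSIX threads.  It is not the UDS or kvdo code, which this report does not show; the names callback_state, callback_thread and wait_for_callback are hypothetical.  It only illustrates the shape such a mechanism must have: the condition is tested in a loop under the mutex, and the shared state is torn down only after the signalling thread can no longer touch it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state shared between a waiter and a callback thread. */
struct callback_state {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            done;    /* the condition being waited for */
    int             result;
};

/* Callback side: publish the result, then wake the waiter. */
static void *callback_thread(void *arg)
{
    struct callback_state *state = arg;

    pthread_mutex_lock(&state->mutex);
    state->result = 42;
    state->done = true;
    pthread_cond_signal(&state->cond);
    pthread_mutex_unlock(&state->mutex);
    return NULL;
}

/* Waiter side: block until the callback has run and return its result. */
static int wait_for_callback(void)
{
    struct callback_state state = { .done = false, .result = 0 };
    pthread_t thread;
    int result;
    int rc;

    pthread_mutex_init(&state.mutex, NULL);
    pthread_cond_init(&state.cond, NULL);

    rc = pthread_create(&thread, NULL, callback_thread, &state);
    assert(rc == 0);

    pthread_mutex_lock(&state.mutex);
    /* Re-test the condition after every wakeup; a wakeup alone proves nothing. */
    while (!state.done)
        pthread_cond_wait(&state.cond, &state.mutex);
    result = state.result;
    pthread_mutex_unlock(&state.mutex);

    /*
     * Join before destroying, so the callback thread cannot still be
     * inside pthread_mutex_unlock() when the mutex goes away.
     */
    pthread_join(thread, NULL);
    pthread_cond_destroy(&state.cond);
    pthread_mutex_destroy(&state.mutex);
    return result;
}
```

Calling wait_for_callback() from main() returns 42 once the callback thread has run.  In kernel code the same shape is available from the existing wait-queue and completion primitives (wait_event()/wake_up(), wait_for_completion()/complete()); which of these the fix adopted is not stated in this report.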


Version-Release number of selected component (if applicable):


How reproducible:

Very difficult.  We have a test that does 1024 start/stop cycles of a single VDO device.  Its purpose is to ensure that VDO instance numbers behave properly, but it was the first test to see the general protection fault (after 500 cycles).  We run this test 2 or 3 times a night, and have seen this failure 5 times in the last six months.

SanityOnly is the reasonable test plan.


Steps to Reproduce:
1.
2.
3.

Actual results:

A general protection fault in the kvdo%u:callbackW kernel process.

Expected results:

No general protection fault.

Additional info:

Comment 3 Jakub Krysl 2018-08-30 15:00:35 UTC
Sanity testing on kmod-kvdo-6.1.1.120-2.el7 passed.

Comment 5 errata-xmlrpc 2018-10-30 09:39:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094

Comment 6 Thomas Jaskiewicz 2019-08-20 19:35:24 UTC
*** Bug 1693695 has been marked as a duplicate of this bug. ***