Bug 1278920

Summary: deadlock when removing snapshot of root LV from lvremove failing to mlock() itself into memory
Product: Red Hat Enterprise Linux 6
Reporter: David Jeffery <djeffery>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Changing Logical Volumes (RHEL6)
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: agk, cluster-qe, cmarthal, cww, dwysocha, fhirtz, heinzm, jbrassow, msnitzer, pm-eus, prajnoha, prockai, rbednar, zkabelac
Version: 6.7
Keywords: Regression, ZStream
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.140-3.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1279983 (view as bug list)
Environment:
Last Closed: 2016-05-11 01:18:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1279983

Description David Jeffery 2015-11-06 18:32:38 UTC
A customer periodically takes a snapshot of the logical volume containing the root filesystem.  Several RHEL 6.7 systems have hung and had all applications become unresponsive when using lvremove to remove the snapshot.

A vmcore was captured, and lvremove was found to be stuck on a page fault.  At the time the page fault was triggered, lvremove had already suspended the DM devices for the root logical volume and its snapshot.  lvremove was deadlocked: the page fault needed data from the root filesystem, but the root filesystem could not be read until lvremove finished its operations and resumed the root logical volume.


The issue appears to be a regression starting with the 2.02.118 releases of lvm2 for RHEL 6.7: lvremove did not have any of its memory mlocked into physical memory.  When run under strace, lvremove was observed passing a length of 0 to every mlock() call:

...
mlock(0x7fc06c125000, 0)                = 0
mlock(0x7fc06c33b000, 0)                = 0
mlock(0x7fc06c33c000, 0)                = 0
...

Older versions that were tested, including lvm2-2.02.111-2.el6_6.6, did not show this behavior.  With RHEL 6.6 and earlier versions of lvm2, mlock() was instead passed the proper length for each region to be locked into memory:

...
mlock(0x400000, 1044480)                = 0
mlock(0x6fe000, 49152)                  = 0
mlock(0x70a000, 98304)                  = 0
...

The bug appears to stem from a change to _maps_line() in lib/mm/memlock.c related to the valgrind defines, specifically this code shortly before mlock() is called:

#ifdef HAVE_VALGRIND
        /*
         * Valgrind is continually eating memory while executing code
         * so we need to deactivate check of locked memory size
         */
#ifndef VALGRIND_POOL
        if (RUNNING_ON_VALGRIND)
#endif
                sz -= sz; /* = 0, but avoids getting warning about dead assigment */

#endif

With HAVE_VALGRIND defined, and VALGRIND_POOL now also defined from an option passed to ./configure in lvm2.spec, the "sz -= sz;" line is always executed and sets a size of 0.  This 0 size is then passed to mlock(), defeating the locking entirely.  With lvremove not locked into memory, it can page fault in the middle of its critical section, deadlocking itself and hanging anything else that needs the root filesystem.


Version-Release number of selected component (if applicable):
lvm2-2.02.118-3.el6_7.2.x86_64

How reproducible:
The deadlock is highly intermittent, since it requires a page fault to occur at a critical time.

Steps to Reproduce:
1.  Create a snapshot of a logical volume
2.  Run "lvremove" under strace to remove the snapshot.
3.  strace data will show mlock() calls with a length parameter of 0 when the bug occurs.

Actual results:
lvremove can deadlock when removing a snapshot for a logical volume containing the root filesystem.

Expected results:
lvremove should remove the snapshot without risk of a deadlock.

Comment 2 Zdenek Kabelac 2015-11-06 19:07:53 UTC
I'm quite confused about what this BZ is about.

Running lvm2 code within valgrind MUST NOT mlock any memory.

That is why the locking size is reduced to 0 - this is 'expected' and 'wanted'.
Using 0 is not 'breaking' mlock - it disables mlock.

So passing 0 is not a problem - it's the intended behaviour for an lvm2 binary executed from valgrind.

Comment 3 Alasdair Kergon 2015-11-06 19:48:33 UTC
The spec file should not have set that option.

Comment 4 Zdenek Kabelac 2015-11-06 20:01:50 UTC
--enable-valgrind-pool somehow slipped into the build.

This option must not appear in the final build: its current implementation eats memory (even in the critical section) and it is not protected by runtime detection.

Comment 6 Alasdair Kergon 2015-11-10 03:22:06 UTC
To be clear, this is a straightforward rebuild with a corrected spec file; there is no code change.  The "Steps to reproduce" strace output from the original description no longer showing zero-length mlock() calls will be sufficient to show the problem has gone away.

Comment 7 Alasdair Kergon 2015-11-10 03:24:35 UTC
A temporary workaround of setting activation/use_mlockall=1 in lvm.conf has been provided, but this is not ideal: it uses more memory and can be slower, and it should be reverted once the fixed package is available.

Comment 15 Roman Bednář 2015-11-24 07:50:48 UTC
Marking as verified.


lvremove strace output:


old version: lvm2-2.02.118-3.el6_7.3

...
mlock(0x7fc530929000, 0)                = 0
mlock(0x7fc53092a000, 0)                = 0
mlock(0x7fc530b41000, 0)                = 0
...



new version: lvm2-2.02.118-3.el6_7.4

...
mlock(0x7f0a7a024000, 4096)             = 0
mlock(0x7f0a7a025000, 94208)            = 0
mlock(0x7f0a7a23c000, 4096)             = 0
...

Comment 17 Roman Bednář 2016-02-04 14:43:55 UTC
Marking as verified.

Tested on:

2.6.32-610.el6.x86_64

lvm2-2.02.140-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
lvm2-libs-2.02.140-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
lvm2-cluster-2.02.140-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
udev-147-2.69.el6    BUILT: Thu Jan 28 15:41:45 CET 2016
device-mapper-1.02.114-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
device-mapper-libs-1.02.114-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
device-mapper-event-1.02.114-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
device-mapper-event-libs-1.02.114-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016
device-mapper-persistent-data-0.6.0-2.el6    BUILT: Thu Jan 21 09:40:25 CET 2016
cmirror-2.02.140-3.el6    BUILT: Thu Jan 21 12:40:10 CET 2016

==========================
Test result:

lvremove strace output:

...
mlock(0x7f81a38fb000, 4096)             = 0
mlock(0x7f81a38fc000, 536576)           = 0
mlock(0x7f81a3b7e000, 4096)             = 0
...

Comment 19 errata-xmlrpc 2016-05-11 01:18:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0964.html