Bug 1289486

Summary: RHEL7: lvresize on root volume hangs or does not complete, lvmetad blocked due to suspended LVM device
Product: Red Hat Enterprise Linux 7
Reporter: Simon Reber <sreber>
Component: lvm2
Sub component: LVM Metadata / lvmetad
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: agk, cmarthal, dwysocha, heinzm, jbrassow, lvm-team, mkarg, msnitzer, prajnoha, prockai, rbednar, teigland, zkabelac
Version: 7.1
Target Milestone: rc
Hardware: x86_64
OS: Linux
Type: Bug
Fixed In Version: lvm2-2.02.160-1.el7
Doc Type: Bug Fix
Last Closed: 2016-11-04 04:13:22 UTC

Description Simon Reber 2015-12-08 09:42:21 UTC
Description of problem:

The customer is running RHEL 7 with kernel 3.10.0-229.el7.x86_64 (RHEL 7.1) on VMware and reports a system hang when growing the root filesystem (which is XFS).
Once the system is stuck, no further interaction is possible and only a couple of hung_task messages appear on the console (they are not captured in /var/log/messages, because /var is on the root filesystem).

After a reboot, resizing works again without any issue, but obviously the hang and the resulting reboot should not happen.

So we enabled hung_task_panic to crash the system when the hang occurs again, and we have now received a core dump from the customer for another occurrence of the same issue.
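For reference, a minimal sketch of one way to enable the hung_task panic at runtime (the timeout value is an assumption; 120 seconds is the kernel default, not a value taken from the customer's configuration):

# sysctl -w kernel.hung_task_panic=1
# sysctl -w kernel.hung_task_timeout_secs=120

The settings can be made persistent by placing them in a file under /etc/sysctl.d/.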


Version-Release number of selected component (if applicable):
lvm2-2.02.115-3.el7.x86_64
3.10.0-229.el7.x86_64

How reproducible:
Not always (not yet reproduced internally)

Steps to Reproduce:
1. N/A

Actual results:

The system gets into a hung state and can only be recovered with a reboot. After the reboot, resizing works fine again.

Expected results:

Resize of the root filesystem should just work without any issue

Additional info:

We have been looking at the vmcore but are not sure whether the problem is caused in the VMware or storage layer, or somewhere in the XFS or LVM layer.

https://bugzilla.redhat.com/show_bug.cgi?id=1278920 was pointed out, but after careful checking we found that LVM is compiled correctly in RHEL 7, so it cannot be the same issue.

Comment 2 David Teigland 2015-12-09 16:55:23 UTC
What is the sequence of commands that are run when there's a problem?
Is the lvextend/lvresize command deadlocking or xfs_growfs?
Is xfs_growfs being run after the lvextend/lvresize command is finished?
Can we tell if the dm device is suspended?

Comment 3 Simon Reber 2015-12-10 07:58:49 UTC
(In reply to David Teigland from comment #2)
> What is the sequence of commands that are run when there's a problem?
> Is the lvextend/lvresize command deadlocking or xfs_growfs?
> Is xfs_growfs being run after the lvextend/lvresize command is finished?
> Can we tell if the dm device is suspended?
They simply run `lvresize -L +6G -r /dev/mapper/path_to_lv`, so I'm unable to tell at which command the problem starts.
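To narrow down which step hangs, the combined operation could be split into its two parts instead of using -r (sketch only; the device path is the placeholder used above, and xfs_growfs is given the mount point of the root filesystem):

# lvextend -L +6G /dev/mapper/path_to_lv
# xfs_growfs /

If the lvextend step alone already hangs, the filesystem grow is not involved.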

Not sure whether the dm device is suspended. Maybe we can find this information within the provided vmcore?
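On a live system, the suspend state of the dm devices could be checked with dmsetup, for example:

# dmsetup info | grep -E 'Name|State'

A device left suspended shows "State: SUSPENDED" instead of "State: ACTIVE".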

Comment 9 David Teigland 2015-12-10 17:06:39 UTC
It appears that lvresize wrote the VG with the new size, suspended the dm device to apply the change to the kernel, and then vanished somehow, leaving the dm device suspended.  lvresize didn't leave a core file, did it?  Capturing the output of lvresize with -vvvv would probably help.

Comment 12 David Teigland 2015-12-10 22:26:17 UTC
I think we've identified the problem.  Until a fix is ready, disabling the use of lvmetad should avoid this problem.  Set "use_lvmetad=0" in lvm.conf.  lvmetad does not provide much or any benefit in environments like this with so few devices.

The problem seems to be that lvresize communicates with lvmetad while the dm device is suspended.  Both the communication and lvmetad itself involve allocating memory, which in this case triggers memory reclaim against the suspended device; the reclaim needs to write to the suspended device, which blocks because the device is suspended, causing a deadlock.
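A minimal sketch of the work-around described above (use_lvmetad lives in the "global" section of /etc/lvm/lvm.conf; stopping the lvmetad units afterwards is assumed to be the usual accompanying step and is not stated above):

    global {
        use_lvmetad = 0
    }

# systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket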

Comment 14 Simon Reber 2015-12-17 14:50:51 UTC
(In reply to David Teigland from comment #12)
> I think we've identified the problem.  Until a fix is ready, disabling the
> use of lvmetad should avoid this problem.  Set "use_lvmetad=0" in lvm.conf. 
> lvmetad does not provide much or any benefit in environments like this with
> so few devices.
> 
> The problem seems to be that lvresize communicates with lvmetad while the
> dm device is suspended.  The communication and lvmetad involve allocating
> memory, which in this case triggers memory reclaim on the suspended device,
> which involves writing to the suspended device, which blocks because the
> device is suspended, causing a deadlock.
The customer confirmed today that after applying the suggested work-around ("use_lvmetad=0") the problem did not re-occur (they resized the root filesystems on 10 systems today).

So I'm wondering how we could solve this permanently, without needing the work-around. I assume it will take time, but was wondering if we already have an idea of how to do it.

Comment 15 David Teigland 2015-12-17 15:30:46 UTC
The solution is to send the message to lvmetad (lvmetad_vg_update()) after the LVs have been resumed.  Unfortunately, the way LVs are resumed is not very straightforward -- it is a hidden side effect of VG locking.  The update should happen after resume and before unlock.

Comment 21 Roman Bednář 2016-09-16 13:29:40 UTC
Marking verified with the latest rpms. No errors or system hangs occurred during several lvextend runs on an XFS rootfs while lvmetad was running.

3.10.0-506.el7.x86_64
lvm2-2.02.165-2.el7.x86_64.rpm

# dmidecode | grep -i vmware
	Manufacturer: VMware, Inc.
	Product Name: VMware Virtual Platform
	Serial Number: VMware-42 36 bc 33 87 ca f8 12-54 25 53 c1 fc 49 c9 55
	Description: VMware SVGA II

# systemctl is-active lvm2-lvmetad.service 
active

# vgs
  VG   #PV #LV #SN Attr   VSize  VFree  
  rhel   1   2   0 wz--n- 15.00g 600.00m

# lvs
  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root rhel -wi-ao---- 13.39g                                                    
  swap rhel -wi-a-----  1.02g 
                                                   
# lvextend -L+100M -r rhel/root
  Size of logical volume rhel/root changed from 13.39 GiB (3429 extents) to 13.49 GiB (3454 extents).
  Logical volume rhel/root successfully resized.
meta-data=/dev/mapper/rhel-root  isize=512    agcount=4, agsize=877824 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=3511296, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 3511296 to 3536896

# lvs
  LV   VG   Attr       LSize  ......
  root rhel -wi-ao---- 13.49g
  swap rhel -wi-a----- 1.02g 
                                                   
# lvextend -L+100M -r rhel/root
  Size of logical volume rhel/root changed from 13.49 GiB (3454 extents) to 13.59 GiB (3479 extents).
  Logical volume rhel/root successfully resized.
meta-data=/dev/mapper/rhel-root  isize=512    agcount=5, agsize=877824 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=3536896, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 3536896 to 3562496

# lvs rhel/root
  LV   VG   Attr       LSize  .....
  root rhel -wi-ao---- 13.59g
                                                  
# lvextend -L+100M -r rhel/root
  Size of logical volume rhel/root changed from 13.59 GiB (3479 extents) to 13.69 GiB (3504 extents).
  Logical volume rhel/root successfully resized.
meta-data=/dev/mapper/rhel-root  isize=512    agcount=5, agsize=877824 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=3562496, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 3562496 to 3588096

# lvs rhel/root
  LV   VG   Attr       LSize  ......
  root rhel -wi-ao---- 13.69g

Comment 23 errata-xmlrpc 2016-11-04 04:13:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html