Bug 1303571

Summary: Bricks used by glusterfs get unmounted from their respective nodes while attempting to stress.
Product: Red Hat Enterprise Linux 7
Reporter: Ambarish <asoman>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Other
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED NOTABUG
Docs Contact:
Severity: urgent
Priority: unspecified
CC: agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Version: 7.2
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-01 11:02:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ambarish 2016-02-01 10:44:07 UTC
Description of problem:

The volumes are created using RHGS (Red Hat Gluster Storage), which uses the underlying LVM PVs as the building blocks for its bricks.
While running parallel I/O, the bricks get unmounted from the nodes (roughly 20 minutes after the workload is started).
This looks like an LVM issue (outside the scope of RHGS).
The VGs, PVs and LVs are intact.

One of the problematic LVs is RHS_vg6/RHS_lv6 (from server 10.70.37.134).
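
For context, each brick here is an XFS filesystem on a thin LV that sits in a thin pool inside its own VG, mounted under /rhs/brickN. A minimal provisioning sketch for one such brick is shown below; the VG/LV names and mount point follow the logs, while the device and sizes are illustrative assumptions, not taken from this setup:

# Illustrative sketch only -- device and sizes are assumptions
pvcreate /dev/vdi
vgcreate RHS_vg6 /dev/vdi
lvcreate -L 50G -T RHS_vg6/RHS_pool6                # thin pool
lvcreate -V 100G -T RHS_vg6/RHS_pool6 -n RHS_lv6    # thin volume in the pool
mkfs.xfs -i size=512 /dev/RHS_vg6/RHS_lv6
mkdir -p /rhs/brick6
mount /dev/RHS_vg6/RHS_lv6 /rhs/brick6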


*SNIPPET FROM LOGS* :


Jan 30 19:01:36 dhcp37-134 lvm[16942]: WARNING: Device for PV Kc8B3r-1Qg1-kfVy-VU6c-1WlR-rczA-0q0eRO not found or rejected by a filter.
Jan 30 19:01:36 dhcp37-134 lvm[16942]: Cannot change VG RHS_vg4 while PVs are missing.
Jan 30 19:01:36 dhcp37-134 lvm[16942]: Consider vgreduce --removemissing.
Jan 30 19:01:36 dhcp37-134 lvm[16942]: Failed to extend thin RHS_vg4-RHS_pool4-tpool.
Jan 30 19:01:36 dhcp37-134 lvm[16942]: Unmounting thin volume RHS_vg4-RHS_pool4-tpool from /rhs/brick4.

Jan 30 19:12:01 dhcp37-134 lvm[16942]: WARNING: Device for PV PuNir0-yPa4-qC9a-aO5a-GnwY-2iOl-7Y2vAg not found or rejected by a filter.
Jan 30 19:12:01 dhcp37-134 lvm[16942]: Cannot change VG RHS_vg5 while PVs are missing.
Jan 30 19:12:01 dhcp37-134 lvm[16942]: Consider vgreduce --removemissing.
Jan 30 19:12:01 dhcp37-134 lvm[16942]: Failed to extend thin RHS_vg5-RHS_pool5-tpool.
Jan 30 19:12:01 dhcp37-134 lvm[16942]: Unmounting thin volume RHS_vg5-RHS_pool5-tpool from /rhs/brick5.

Jan 30 19:37:20 dhcp37-134 kernel: XFS (dm-26): Unmounting Filesystem
Jan 30 19:37:20 dhcp37-134 kernel: XFS (dm-21): Unmounting Filesystem

Jan 31 12:20:48 dhcp37-134 lvm[16942]: WARNING: Device for PV g1FsKG-lxkE-cxe5-VKFx-YZnp-HCEQ-kJmQkK not found or rejected by a filter.
Jan 31 12:20:48 dhcp37-134 lvm[16942]: Cannot change VG RHS_vg6 while PVs are missing.
Jan 31 12:20:48 dhcp37-134 lvm[16942]: Consider vgreduce --removemissing.
Jan 31 12:20:48 dhcp37-134 lvm[16942]: Failed to extend thin RHS_vg6-RHS_pool6-tpool.
Jan 31 12:20:48 dhcp37-134 lvm[16942]: Unmounting thin volume RHS_vg6-RHS_pool6-tpool from /rhs/brick6.
Jan 31 12:20:49 dhcp37-134 lvm[16942]: WARNING: Device for PV ENWJcR-Q8Cj-2ld6-hTa3-lrC7-ugfw-vne6em not found or rejected by a filter.
Jan 31 12:20:49 dhcp37-134 lvm[16942]: Cannot change VG RHS_vg7 while PVs are missing.
Jan 31 12:20:49 dhcp37-134 lvm[16942]: Consider vgreduce --removemissing.
Jan 31 12:20:49 dhcp37-134 lvm[16942]: Failed to extend thin RHS_vg7-RHS_pool7-tpool.
Jan 31 12:20:49 dhcp37-134 lvm[16942]: Unmounting thin volume RHS_vg7-RHS_pool7-tpool from /rhs/brick7.
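
The "Device for PV ... not found or rejected by a filter" messages are emitted while dmeventd's thin-pool monitoring tries to auto-extend the pools, and the extension fails because the VG appears to be missing a PV. A hedged sketch of commands that could be run on the affected node to correlate the missing PV UUIDs and check how full the pools were (output not captured in this report):

# Illustrative diagnostics -- match the PV UUIDs from the log against known PVs
pvs -o pv_name,pv_uuid,vg_name
# Check thin-pool data/metadata usage for the affected VGs
lvs -a -o lv_name,vg_name,data_percent,metadata_percent
# Inspect any device filter configured in lvm.conf
grep -E '^[[:space:]]*(global_)?filter' /etc/lvm/lvm.conf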



*SNIPPET FROM DMESG* :

[/dev/vdi was the partition used for brick6]

[root@dhcp37-103 core]# dmesg|grep vdi
[258262.255796]  vdi: unknown partition table
[259125.585924]  vdi: unknown partition table
[259125.633420]  vdi: unknown partition table
[259125.636292]  vdi: unknown partition table
[259125.668662]  vdi: unknown partition table
[259147.521809]  vdi: unknown partition table
[259178.239697]  vdi: unknown partition table
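
"unknown partition table" is what the kernel prints when it rescans a whole-disk device carrying no partition table, which is expected when the disk is used directly as an LVM PV; the repetition likely just reflects repeated rescans of /dev/vdi. A hedged way to confirm the disk is a bare PV rather than a partitioned device:

# Illustrative check -- confirm /dev/vdi carries an LVM PV label, not a partition table
blkid /dev/vdi
pvs /dev/vdi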


*VOLUME CONFIGURATION* :

[root@dhcp37-134 tmp]# gluster v info khal
 
Volume Name: khal
Type: Tier
Volume ID: b261b0d8-e9dc-4014-90f5-0e869755e146
Status: Started
Number of Bricks: 26
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 4 x 2 = 8
Brick1: 10.70.37.134:/rhs/brick7/A1
Brick2: 10.70.37.134:/rhs/brick6/A1
Brick3: 10.70.37.134:/rhs/brick5/A1
Brick4: 10.70.37.103:/rhs/brick7/A1
Brick5: 10.70.37.134:/rhs/brick4/A1
Brick6: 10.70.37.103:/rhs/brick6/A1
Brick7: 10.70.37.134:/rhs/brick3/A1
Brick8: 10.70.37.103:/rhs/brick5/A1
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 9 x 2 = 18
Brick9: 10.70.37.218:/rhs/brick1/A1
Brick10: 10.70.37.41:/rhs/brick1/A1
Brick11: 10.70.37.218:/rhs/brick2/A1
Brick12: 10.70.37.41:/rhs/brick2/A1
Brick13: 10.70.37.218:/rhs/brick3/A1
Brick14: 10.70.37.41:/rhs/brick3/A1
Brick15: 10.70.37.218:/rhs/brick4/A1
Brick16: 10.70.37.41:/rhs/brick4/A1
Brick17: 10.70.37.218:/rhs/brick5/A1
Brick18: 10.70.37.41:/rhs/brick5/A1
Brick19: 10.70.37.103:/rhs/brick1/A1
Brick20: 10.70.37.218:/rhs/brick6/A1
Brick21: 10.70.37.103:/rhs/brick2/A1
Brick22: 10.70.37.218:/rhs/brick7/A1
Brick23: 10.70.37.103:/rhs/brick3/A1
Brick24: 10.70.37.134:/rhs/brick2/A1
Brick25: 10.70.37.103:/rhs/brick4/A1
Brick26: 10.70.37.134:/rhs/brick1/A1
Options Reconfigured:
cluster.self-heal-daemon: on
features.quota-deem-statfs: off
features.inode-quota: on
features.quota: on
cluster.watermark-hi: 50
cluster.watermark-low: 20
cluster.read-freq-threshold: 1
cluster.write-freq-threshold: 1
performance.io-cache: off
performance.quick-read: off
features.record-counters: on
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
[root@dhcp37-134 tmp]# 


Version-Release number of selected component (if applicable):

[root@dhcp37-134 tmp]# cat /etc/redhat-storage-release 
Red Hat Gluster Storage Server 3.1 Update 2

[root@dhcp37-134 tmp]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.2 (Maipo)
[root@dhcp37-134 tmp]# 


How reproducible:
Tried Once

Steps to Reproduce:

Set up a 9 x 2 volume. Add a 4 x 2 hot tier. Run parallel I/O from multiple clients (a command sketch follows).
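
A hedged sketch of the corresponding gluster commands; host names, brick paths and brick counts are illustrative (the real setup used 9 x 2 cold-tier and 4 x 2 hot-tier bricks), and the tier-attach syntax is the glusterfs 3.7-era form, given here as an assumption:

# Illustrative sketch only -- hosts, paths, counts and tier syntax are assumptions
gluster volume create testvol replica 2 \
    server1:/rhs/brick1/A1 server2:/rhs/brick1/A1 \
    server1:/rhs/brick2/A1 server2:/rhs/brick2/A1
gluster volume start testvol
gluster volume tier testvol attach replica 2 \
    server1:/rhs/hotbrick1/A1 server2:/rhs/hotbrick1/A1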

Actual results:

4 of the 8 hot-tier bricks get unmounted abruptly about 20 minutes into the workload.

Expected results:

Bricks should not be unmounted, brick processes should not be killed, and I/O should run successfully without hangs or crashes.

Additional info:


*NODES* [root/redhat]:

10.70.37.103
10.70.37.134


*CLIENTS* [root/redhat]:

10.70.37.199
10.70.37.87
10.70.37.96
10.70.37.61


*WORKLOAD DESCRIPTION*:

The following were run in parallel from different threads.

Client 1 -> dd
Client 2 -> dd
Client 3 -> Linux untar
Client 4 -> Linux untar + Media copy
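
A hedged sketch of what such client workloads typically look like; the mount point, file sizes and source paths are illustrative assumptions:

# Illustrative sketch only -- mount point, sizes and source paths are assumptions
# Clients 1 and 2: large sequential writes onto the gluster mount
dd if=/dev/zero of=/mnt/khal/ddfile.$(hostname) bs=1M count=10240 conv=fsync
# Client 3: kernel source untar (metadata-heavy, many small files)
mkdir -p /mnt/khal/untar.$(hostname)
tar -xf /tmp/linux-4.4.tar.xz -C /mnt/khal/untar.$(hostname)
# Client 4: untar as above plus a bulk media copy
cp -r /tmp/media /mnt/khal/media.$(hostname)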

Comment 1 Ambarish 2016-02-01 10:53:29 UTC
Tier logs, brick logs and sosreports have been copied here:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1303571/

Comment 3 Ambarish 2016-02-01 10:57:01 UTC
dmesg output copied to the same location.

Comment 4 Ambarish 2016-02-01 10:58:57 UTC
The environment is preserved for further debugging by the LVM team.

Comment 5 Zdenek Kabelac 2016-02-01 11:02:04 UTC
(In reply to Ambarish from comment #0)
> Description of problem:
> 
> The volumes are created using RHGS(Red Hat Gluster Storage) which uses the
> underlying LVM PVs as bricks as the building block.
> While running parallel I/O,the bricks get unmounted from the nodes (after
> ~20 mins of starting the workload).
> This looks like an LVM issue(outside the scope of RHGS).
> The VGs,PVs and LV are intact.
> 
> One of the problematic LVs is RHS_vg6/RHS_lv6 (from Server : 10.70.37.134)
> 
> 
> *SNIPPET FROM LOGS* :
> 
> 
> Jan 30 19:01:36 dhcp37-134 lvm[16942]: WARNING: Device for PV
> Kc8B3r-1Qg1-kfVy-VU6c-1WlR-rczA-0q0eRO not found or rejected by a filter.
> Jan 30 19:01:36 dhcp37-134 lvm[16942]: Cannot change VG RHS_vg4 while PVs
> are missing.
> Jan 30 19:01:36 dhcp37-134 lvm[16942]: Consider vgreduce --removemissing.
> Jan 30 19:01:36 dhcp37-134 lvm[16942]: Failed to extend thin
> RHS_vg4-RHS_pool4-tpool.
> Jan 30 19:01:36 dhcp37-134 lvm[16942]: Unmounting thin volume
> RHS_vg4-RHS_pool4-tpool from /rhs/brick4.
> 


So you get clear log output of what happens.

It's intended lvm2 behavior: a failed thin-pool extension is currently tied to unmounting every related thin volume, to avoid a bigger disaster (pool overfill).

If you want higher 'occupancy' of the thin pool, raise the threshold; lvm2 currently tries to avoid overfilling the thin pool by dropping thin volumes from use (and thus from potentially generating further load on the thin pool).

As a fix: provide more space in the VG so the thin-pool resize does not fail.
Use a higher percentage (up to 95%) for the thin-pool resize.
Use a smaller resize step (down to 1%), though more resize operations will occur and this may slow down thin-pool usage a bit more.
(A configuration sketch follows.)
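
The threshold and resize step mentioned above presumably correspond to the thin-pool auto-extend settings in lvm.conf; a hedged configuration sketch with illustrative values inside the suggested ranges:

# /etc/lvm/lvm.conf -- illustrative values only
activation {
    # start auto-extending once the pool is this % full (can be raised, e.g. towards 95)
    thin_pool_autoextend_threshold = 80
    # grow the pool by this % of its size on each extension (can be lowered towards 1)
    thin_pool_autoextend_percent = 10
}
# and/or give the VG more room so the auto-extend cannot fail, e.g.:
#   vgextend RHS_vg6 /dev/<new-pv>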

lvm2 currently does not provide configurable options for dmeventd behavior;
if you want lvm2 behavior other than described, please open an RFE.

We do plan to provide some more 'fine-grained' policy modes.

Comment 6 Zdenek Kabelac 2016-02-01 11:02:51 UTC
No need for further debugging.

It works as designed, thus closing this BZ.