Description of problem:

As the number of Gluster volumes grows, so does the number of underlying devices (4 per brick). In this case, with ~220 volumes, there are 880 devices per server. On a server reboot, the LVM scan took over 20 minutes to complete. Since we support up to 1000 volumes, the current 20-minute scan time is likely to grow to 90+ minutes on a fully loaded system.

After rebooting a server with two 4 TB drives, it took a long time for all the lvm2-pvscan services to complete. Those two devices are used by Gluster in Container-Native Storage.

[root@server ~]# systemctl list-units lvm2-pvscan*
UNIT                      LOAD   ACTIVE SUB    DESCRIPTION
lvm2-pvscan@253:3.service loaded active exited LVM2 PV scan on device 253:3
lvm2-pvscan@253:4.service loaded active exited LVM2 PV scan on device 253:4
lvm2-pvscan@253:5.service loaded active exited LVM2 PV scan on device 253:5
lvm2-pvscan@8:16.service  loaded active exited LVM2 PV scan on device 8:16
lvm2-pvscan@8:2.service   loaded active exited LVM2 PV scan on device 8:2
lvm2-pvscan@8:32.service  loaded active exited LVM2 PV scan on device 8:32

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

6 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

[root@server ~]# systemctl status -l lvm2-pvscan@8:32.service
* lvm2-pvscan@8:32.service - LVM2 PV scan on device 8:32
   Loaded: loaded (/usr/lib/systemd/system/lvm2-pvscan@.service; static; vendor preset: disabled)
   Active: active (exited) since Thu 2018-05-17 10:48:11 PDT; 22h ago
     Docs: man:pvscan(8)
 Main PID: 1288 (code=exited, status=0/SUCCESS)
   Memory: 0B
   CGroup: /system.slice/system-lvm2\x2dpvscan.slice/lvm2-pvscan@8:32.service

May 17 10:36:27 server.dmz systemd[1]: Starting LVM2 PV scan on device 8:32...
May 17 10:36:27 server.dmz lvm[1288]: WARNING: lvmetad is being updated, retrying (setup) for 10 more seconds.
May 17 10:48:11 server.dmz lvm[1288]: Internal error: Reserved memory (21868544) not enough: used 45850624. Increase activation/reserved_memory?
May 17 10:48:11 server.dmz lvm[1288]: 202 logical volume(s) in volume group "vg_0b025213f4146415f5151250bf206d03" now active
May 17 10:48:11 server.dmz systemd[1]: Started LVM2 PV scan on device 8:32.

[root@server ~]# systemctl status -l lvm2-pvscan@8:16.service
* lvm2-pvscan@8:16.service - LVM2 PV scan on device 8:16
   Loaded: loaded (/usr/lib/systemd/system/lvm2-pvscan@.service; static; vendor preset: disabled)
   Active: active (exited) since Thu 2018-05-17 10:56:34 PDT; 22h ago
     Docs: man:pvscan(8)
 Main PID: 1304 (code=exited, status=0/SUCCESS)
   Memory: 0B
   CGroup: /system.slice/system-lvm2\x2dpvscan.slice/lvm2-pvscan@8:16.service

May 17 10:36:27 server.dmz systemd[1]: Starting LVM2 PV scan on device 8:16...
May 17 10:36:27 server.dmz lvm[1304]: WARNING: lvmetad is being updated, retrying (setup) for 10 more seconds.
May 17 10:56:34 server.dmz lvm[1304]: Internal error: Reserved memory (22237184) not enough: used 68055040. Increase activation/reserved_memory?
May 17 10:56:34 server.dmz lvm[1304]: 246 logical volume(s) in volume group "vg_fdef9e44dc1d58ce1759162de3d7b4d9" now active
May 17 10:56:34 server.dmz systemd[1]: Started LVM2 PV scan on device 8:16.

So between the reboot at 10:36 and 10:56 (20 minutes), other LVM operations would hang on the lock file:

[root@server ~]# pvs
^C  Interrupted...
  Giving up waiting for lock.
  /run/lock/lvm/V_vg_fdef9e44dc1d58ce1759162de3d7b4d9:aux: flock failed: Interrupted system call
  Can't get lock for vg_fdef9e44dc1d58ce1759162de3d7b4d9
  Cannot process volume group vg_fdef9e44dc1d58ce1759162de3d7b4d9
  Interrupted...
  Interrupted...
  PV                     VG                                  Fmt  Attr PSize   PFree
  /dev/mapper/mpatha_t2  docker-vg                           lvm2 a--  499.98g 299.49g
  /dev/mapper/mpathb_t2  vgvarlog01                          lvm2 a--   99.98g      0
  /dev/mapper/mpathc_t2  vgetcd                              lvm2 a--   49.98g      0
  /dev/sda2              rootvg                              lvm2 a--  278.36g 202.36g
  /dev/sdc               vg_0b025213f4146415f5151250bf206d03 lvm2 a--   <4.37t   3.84t

Version-Release number of selected component (if applicable):
lvm2-2.02.171-8.el7.x86_64
glusterfs-3.8.4-54.el7rhgs.x86_64
kernel-3.10.0-693.11.6.el7.x86_64

How reproducible: Always

Steps to Reproduce:
1. Create a large number of CNS volumes
2. Bring down the Gluster pod cleanly
3. Restart the server

Actual results:
LVM scanning takes a long time, preventing Gluster from being brought back up.

Expected results:
Need a way to perform restarts and maintenance in an efficient manner.

Additional info:
Are all VGs active? (Should they be?)
Do there happen to be LVs within LVs?
Is lvmetad up and running? (I don't think there's a need for it.)
Can you try just filtering out the brick LVs in lvm.conf:

global_filter = [ "r|brick|" ]
Filtering them out means that lvm will not waste time scanning those brick LVs for other PVs. You do not appear to be "stacking" PVs on top of LVs, in which case there is no reason to scan the LVs. I said the same up in comment 13. If you filter them out, then the pvscan commands that are run to scan each brick LV will do nothing. Nothing needs to be added to the lvmetad cache from these LVs.
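For concreteness, a sketch of that change in /etc/lvm/lvm.conf. This assumes the brick LVs' device-mapper names all contain the string "brick" (as heketi-created bricks typically do); adjust the regex to match the actual naming on the node:

```conf
devices {
    # Reject any device whose path matches "brick", so lvm never scans
    # brick LVs looking for nested (stacked) PVs. Devices that match no
    # pattern are still accepted by default, so everything else behaves
    # as before. global_filter also applies to lvmetad/pvscan.
    global_filter = [ "r|brick|" ]
}
```

After editing, running `pvscan --cache` repopulates the lvmetad cache under the new filter. The rejected LVs remain fully usable as bricks; they are simply no longer treated as candidate PVs.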
Moving the component to CNS Ansible, mainly to point out that we need to make this an install step that configures the LVM global setting while setting up the node.
Probably, though I'm not the right person to determine that nor make that change. I also don't know who would be.
David, isn't this a deja-vu with VDSM (see https://bugzilla.redhat.com/show_bug.cgi?id=1374545)? We need to disable LVs we do not need/use, we need to disable lvmetad, and we need a correct lvm.conf, no? (I don't remember if we need to re-run dracut?)
This bug may be the same as the recently fixed bug 1613141. I'd suggest trying the fix from that bug.

(In reply to Yaniv Kaul from comment #39)
> David, isn't this a deja-vu with VDSM (see
> https://bugzilla.redhat.com/show_bug.cgi?id=1374545 ) ? We need to disable
> LVs we do not need/usr, we need to disable lvmetad and we need a correct
> lvm.conf, no?
> (I don't remember if we need to re-run dracut?)

In the vdsm case there is shared storage, but here I don't think there is, so lvmetad can still legitimately be used. Also, I don't think there are any PVs layered on the LVs (from guests or otherwise), which means there should be no foreign LVs (e.g. from guests) that need to be excluded.

Adding the LVs to the filter will not cause them to disappear; it will just prevent lvm from scanning them for layered (guest) PVs. When there are hundreds or thousands of LVs, scanning them can waste a lot of time and cause contention with lvmetad.

Updating the initramfs to include the lvm.conf filter change would probably be best, although it's probably not necessary (there should be no pvscans or autoactivation happening in the initramfs).
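A sketch of applying this on a node after editing lvm.conf (all commands require root; the dracut step is only needed if you want early boot to honor the same filter, which per the above is probably optional):

```sh
# Print the filter lvm will actually use, confirming the lvm.conf edit
# took effect (lvmconfig is available in the lvm2 version in use here)
lvmconfig devices/global_filter

# Repopulate the lvmetad cache under the new filter
pvscan --cache

# Optional: rebuild the initramfs for the running kernel so the embedded
# lvm.conf carries the same global_filter during early boot
dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
```

This is a sketch, not a tested procedure; on a production node the filter regex should be verified against the actual brick device names (e.g. with `lvs -o lv_name,lv_dm_path`) before rebooting.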