Bug 575967

Summary: lvm doesn't scale LV operations well
Product: Red Hat Enterprise Linux 5
Component: lvm2
Version: 5.5
Status: CLOSED ERRATA
Severity: medium
Priority: low
Reporter: Corey Marthaler <cmarthal>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: Corey Marthaler <cmarthal>
CC: agk, coughlan, dwysocha, heinzm, iannis, jbrassow, mbroz, prockai
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: lvm2-2.02.88-1.el5
Doc Type: Bug Fix
Doc Text: Some of the lvm2 operations were optimized to speed up activation and deactivation time when running over a large group of logical volumes.
Last Closed: 2012-02-21 06:02:36 UTC

Description Corey Marthaler 2010-03-22 20:23:17 UTC
Description of problem:
I think this is already general knowledge in RHEL 5.5, and may be fixed to a degree in RHEL 6. I have a new test that spawns a bunch of LV creates, then keeps new create attempts running while others finish. This test appears to lead to operation timeouts.

This kind of testing becomes very difficult because cleaning up after creating this many volumes (2-4k) seems to take forever (45 min - 1 hour+) if the operations don't end up hanging altogether.

In the cases where the operations hang and require a reboot, that reboot then takes 45 minutes or more depending on how many LVs were created. It's then another 45 minutes or more to attempt the cleanup again and once again start testing.
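
(For reference, the scale test boils down to something like the sketch below. The VG name, LV size, create count, and parallelism are placeholders, not the exact harness used here.)

    # hypothetical reproducer: spawn lvcreate jobs in parallel and keep
    # new create attempts starting while earlier ones finish
    vg=create_scale
    for i in $(seq 1 4000); do
        lvcreate -n "lv_$i" -L 4M "$vg" &
        # throttle: wait until fewer than 16 creates are in flight
        while [ "$(jobs -r | wc -l)" -ge 16 ]; do sleep 1; done
    done
    wait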
 
Version-Release number of selected component (if applicable):
2.6.18-190.el5

lvm2-2.02.56-8.el5    BUILT: Fri Feb 12 02:40:43 CST 2010
lvm2-cluster-2.02.56-7.el5    BUILT: Mon Feb  8 10:24:29 CST 2010
device-mapper-1.02.39-1.el5    BUILT: Wed Nov 11 12:31:44 CST 2009

How reproducible:
Often

Comment 1 Corey Marthaler 2010-03-23 17:30:23 UTC
The last vgchange that I ran in order to clean up and continue testing took 3.5 hours.

[root@grant-01 ~]# time vgchange -an create_scale
  0 logical volume(s) in volume group "create_scale" now active

real    210m25.493s
user    201m38.415s
sys     8m39.714s

Comment 2 Milan Broz 2010-03-24 05:27:45 UTC
Is this with cluster locking, or is it also reproducible with local locking?

(The final vgchange - is it the only lvm command running, or are others still in flight?)
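
(One quick way to answer this on a given host is to check the configured locking mode, or to override it for a single command; a sketch assuming a stock lvm.conf layout:)

    # locking_type 1 = local file-based locking, 3 = clustered (clvmd)
    grep -E '^[[:space:]]*locking_type' /etc/lvm/lvm.conf
    # or force local locking for one command to rule clustering out:
    vgchange -an create_scale --config 'global { locking_type = 1 }'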

Comment 3 Corey Marthaler 2010-03-24 14:21:33 UTC
This is with local locking only, and no other lvm commands still running.

Comment 4 Alasdair Kergon 2010-07-09 21:13:53 UTC
3.5 hours?!  Roughly how many LVs?

What's the equivalent time on RHEL6 now?

Comment 5 Alasdair Kergon 2010-07-09 21:17:33 UTC
Maybe post the first part of the -vvvv log (covering the deactivation of the first few volumes) from the RHEL6 version of this here.
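
(Something like the following would capture just the start of the verbose deactivation log; the line count is arbitrary:)

    # -vvvv output goes to stderr; keep only the first few hundred lines
    vgchange -an -vvvv create_scale 2>&1 | head -n 500 > vgchange-vvvv.log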

Comment 6 Alasdair Kergon 2010-10-14 14:42:04 UTC
Please retest to see what progress we've made in recent releases.

Comment 7 Corey Marthaler 2010-10-14 22:51:30 UTC
It does seem to be quite a bit faster with the current 5.6 rpms. However, what's with the "Internal error: Maps lock 16633856 < unlock 17121280" message?

[root@grant-01 ~]# lvs -a -o +devices | wc -l
4276

[root@grant-01 ~]# time vgchange -an create_scale
  Internal error: Maps lock 16633856 < unlock 17121280
  0 logical volume(s) in volume group "create_scale" now active

real    9m20.291s
user    7m33.234s
sys     1m42.607s

Comment 9 Corey Marthaler 2010-10-15 16:37:34 UTC
FWIW, the lvremove cmd took 53 minutes to remove the 4276 volumes listed in comment #7.

Comment 10 Corey Marthaler 2010-10-15 16:38:31 UTC
Milan, 

Do you know the 6.0 bug for the "Internal error:" issue so that I can clone it for 5.6?

Comment 11 Milan Broz 2010-10-15 17:53:25 UTC
(In reply to comment #10)
> Do you know the 6.0 bug for the "Internal error:" issue so that I can clone it
> for 5.6?

We discussed this with Zdenek already:
- the kernel bug for 5.6 is already filed - bug #643426

I noticed that there is a related fix in lvm2:
"Read whole /proc/self/maps file before working with maps entries."
so maybe clone a bug for that fix as well. I think both parts (kernel+lvm2) should be fixed in 5.6.
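
(For context: the lock/unlock byte counts in the "Internal error: Maps lock ... < unlock ..." message come from summing the sizes of the process's memory mappings, so if /proc/self/maps changes between the read at lock time and the read at unlock time, the totals disagree. A rough gawk illustration of the summing - not the actual lvm2 code:)

    # sum mapping sizes from /proc/self/maps ("start-end" are hex addresses)
    gawk -F'[- ]' '{ sum += strtonum("0x" $2) - strtonum("0x" $1) }
                   END { print sum, "bytes mapped" }' /proc/self/maps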

Comment 12 Milan Broz 2010-10-15 20:33:59 UTC
> I noticed that there is a related fix in lvm2:
> "Read whole /proc/self/maps file before working with maps entries."
> so maybe clone a bug for that fix as well.

bug #643500 now

Comment 14 RHEL Program Management 2011-01-11 20:25:03 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 15 RHEL Program Management 2011-01-11 23:04:31 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 18 Zdenek Kabelac 2011-08-22 12:26:48 UTC
Most of the speedup patches for the vgchange operation on a large set of LVs are now part of upstream version 2.02.87 (bug 658639).

The bug will be resolved with the next rebase.

Comment 19 Milan Broz 2011-10-17 21:36:14 UTC
Fixed in lvm2-2.02.88-1.el5.

Comment 22 Milan Broz 2011-12-06 23:07:15 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Some of the lvm2 operations were optimized to speed up activation and deactivation time when running over a large group of logical volumes.

Comment 23 Corey Marthaler 2011-12-09 23:08:57 UTC
This still exists in the latest rpms.

2.6.18-274.el5

lvm2-2.02.88-5.el5    BUILT: Fri Dec  2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5    BUILT: Fri Dec  2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009


SCENARIO - [many_snaps]
Create 500 snapshots of an origin volume
  Writing physical volume data to disk "/dev/sdb1"
  Writing physical volume data to disk "/dev/sdc1"
Making origin volume
Making 500 snapshots of origin volume
1 2 3 4 5 6 7 8 9 10 [...] 360 361 362 363 364
Although the snap create passed, errors were found in its output
  Rounding up size to full physical extent 52.00 MB
  Logical volume "500_364" created
  Internal error: Maps lock 39600128 < unlock 39723008
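
(The scenario above amounts to a loop like this sketch; the VG name, origin size, and snapshot size are placeholders:)

    # create one origin LV, then take 500 snapshots of it
    lvcreate -n origin -L 4G many_snaps
    for i in $(seq 1 500); do
        lvcreate -s -n "snap_$i" -L 52M /dev/many_snaps/origin
    done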

Comment 24 Alasdair Kergon 2011-12-09 23:37:44 UTC
That internal error should not be appearing and needs to be fixed.

Comment 25 Corey Marthaler 2011-12-09 23:59:38 UTC
This still occurs with mirrored volumes as well, not just snapshots.

[root@grant-01 ~]# vgchange -an mirror_sanity
  0 logical volume(s) in volume group "mirror_sanity" now active
  Internal error: Maps lock 33738752 < unlock 35778560

Comment 26 Alasdair Kergon 2011-12-10 00:04:12 UTC
It was related to a kernel patch, so we need to check that the kernel has the right code in it and that the corresponding version of the userspace code is included.

Comment 27 Milan Broz 2011-12-19 10:12:39 UTC
IMHO this message should be removed completely or moved to debug level.

Corey, I think the "lock < unlock" message should be a new bug; this bug was about time issues...

Comment 28 Alasdair Kergon 2011-12-19 13:05:54 UTC
Or flag 'internal error', which it is...

Comment 29 Corey Marthaler 2011-12-19 19:05:09 UTC
Filed bug 769053 for the error messages listed in the above comments. Please provide the proper acks for that bug to be included in 5.8.

Moving this back into ON_QA...

Comment 30 Corey Marthaler 2012-01-13 00:02:08 UTC
It appears that the deactivation is now much faster; however, the removes still take about an hour. Marking verified.

2.6.18-301.el5

lvm2-2.02.88-5.el5    BUILT: Fri Dec  2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5    BUILT: Fri Dec  2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009



[root@grant-01 ~]# lvs | wc -l
4372
[root@grant-01 ~]# time vgchange -an create_scale
  0 logical volume(s) in volume group "create_scale" now active

real    0m12.639s
user    0m3.291s
sys     0m7.211s

[root@grant-01 ~]# vgs
  VG           #PV #LV  #SN Attr   VSize   VFree  
  VolGroup00     1    2   0 wz--n-  74.38G      0 
  create_scale   3 4372   0 wz--n- 373.17G 151.15G
[root@grant-01 ~]# time lvremove create_scale
  Logical volume "Sc4l3_3114" successfully removed
  Logical volume "Sc4l3_3181" successfully removed
  Logical volume "Sc4l3_3161" successfully removed
  [...]
  Logical volume "Sc4l3_23166" successfully removed
  Logical volume "Sc4l3_23057" successfully removed
  Logical volume "Sc4l3_23140" successfully removed

real    58m57.955s
user    45m32.209s
sys     6m16.708s

Comment 31 errata-xmlrpc 2012-02-21 06:02:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0161.html