Bug 769053 - 'Internal error: Maps lock < unlock messages' when running hundreds of lvm ops
Summary: 'Internal error: Maps lock < unlock messages' when running hundreds of lvm ops
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Alasdair Kergon
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 784372
 
Reported: 2011-12-19 19:03 UTC by Corey Marthaler
Modified: 2012-02-21 06:06 UTC
CC List: 10 users

Fixed In Version: lvm2-2.02.88-6.el5
Doc Type: Bug Fix
Doc Text:
If preallocated memory is too low, lvm2 can issue an error message like "Internal error: Maps lock < unlock". The message was changed to "Reserved memory not enough. Increase activation/reserved_memory?" to provide the administrator with better information about the source of the problem. The amount of preallocated memory can be changed in lvm.conf (the activation/reserved_memory option).
Clone Of:
Environment:
Last Closed: 2012-02-21 06:06:00 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2012:0161 (normal, SHIPPED_LIVE): lvm2 bug fix and enhancement update - Last Updated: 2012-02-20 15:07:59 UTC

Description Corey Marthaler 2011-12-19 19:03:44 UTC
Description of problem:
Opening this bug for the error messages listed in bug 575967.


2.6.18-274.el5

lvm2-2.02.88-5.el5    BUILT: Fri Dec  2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5    BUILT: Fri Dec  2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009


SCENARIO - [many_snaps]
Create 500 snapshots of an origin volume
  Writing physical volume data to disk "/dev/sdb1"
  Writing physical volume data to disk "/dev/sdc1"
Making origin volume
Making 500 snapshots of origin volume
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185
186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205
206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225
226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245
246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265
266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285
286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305
306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325
326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345
346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364
Although the snap create passed, errors were found in it's output
  Rounding up size to full physical extent 52.00 MB
  Logical volume "500_364" created
  Internal error: Maps lock 39600128 < unlock 39723008


This still occurs with mirrored volumes as well, not just snapshots.

[root@grant-01 ~]# vgchange -an mirror_sanity
  0 logical volume(s) in volume group "mirror_sanity" now active
  Internal error: Maps lock 33738752 < unlock 35778560


Version-Release number of selected component (if applicable):
2.6.18-300.el5

lvm2-2.02.88-5.el5    BUILT: Fri Dec  2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5    BUILT: Fri Dec  2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009


How reproducible:
Often when running hundreds of lvm operations.

Comment 1 Milan Broz 2011-12-20 13:37:59 UTC
This is just a stupid error report in lvm which should not be in a stable release.

I'll remove it in the RHEL5.8 build (while upstream can discuss this for another few years).

Comment 3 Alasdair Kergon 2011-12-20 14:12:39 UTC
There was a similar problem upstream last year and in a RHEL6 kernel patch backport (private bug 638525).  Current upstream does not have this problem as far as I'm aware, suggesting there could be a problem with a RHEL5 kernel patch backport which we should try to identify.

Comment 4 Peter Rajnoha 2011-12-20 14:33:14 UTC
(In reply to comment #3)
> There was a similar problem upstream last year and in a RHEL6 kernel patch
> backport (private bug 638525).  Current upstream does not have this problem as
> far as I'm aware, suggesting there could be a problem with a RHEL5 kernel patch
> backport which we should try to identify.

It seems it reappears in some scenarios in RHEL6 as well - bug #740868.

Comment 5 Zdenek Kabelac 2011-12-20 16:18:17 UTC
Is it possible to repeat this test with an increased reserved_memory size - e.g.
in lvm.conf:

activation {
   reserved_memory = 32768
}


Is there any difference?

I've checked the RHEL5.8 kernel for mlock/munlock issues - and it seems to behave as well as the upstream kernel - so it should not be the cause of this problem here.
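
A quick way to double-check which value is actually in effect after editing the file (a minimal sketch, assuming the stock /etc/lvm/lvm.conf location and that this lvm2 build ships the "lvm dumpconfig" subcommand):

    # value as written in the config file (default path assumed)
    grep -n 'reserved_memory' /etc/lvm/lvm.conf

    # value as parsed by the tools, if dumpconfig is available in this build
    lvm dumpconfig | grep reserved_memory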

Comment 6 Milan Broz 2011-12-20 16:25:31 UTC
If it is a problem with pre-allocated memory overflow, I would really suggest detecting this situation and displaying a more useful message to the user.

"Internal error: Maps lock 33738752 < unlock 35778560" says absolutely nothing to an administrator; it is a message for developers.

It should be "Internal error: preallocated memory too low, please consider increasing reserved_memory in lvm.conf" or something similar.

Comment 8 Zdenek Kabelac 2011-12-20 17:24:48 UTC
(In reply to comment #6)
> If it is a problem with pre-allocated memory overflow, I would really suggest
> detecting this situation and displaying a more useful message to the user.
> 
> "Internal error: Maps lock 33738752 < unlock 35778560" says absolutely nothing
> to an administrator; it is a message for developers.
> 
> It should be "Internal error: preallocated memory too low, please consider
> increasing reserved_memory in lvm.conf" or something similar.

Well, we have two cases here which cannot be easily distinguished - one is our internal problem of allocating too much memory where we shouldn't, the other is too little preallocated memory.

The issue with preallocated memory is probably somewhat similar to our preallocated stack issue, where we are going to eliminate this variable.

However, right now I'm not sure whether there is an easy formula to estimate the maximum memory needed during activation when all the metadata is known up front - but one surely must exist.

Anyway, for now, users who are experimenting with large MDA sizes need to raise reserved_memory to higher values to avoid the Maps error being reported.
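
As a rough illustration using only the numbers reported above (and assuming the reserved pool is what ran short), the lock/unlock values from the vgchange example give a lower bound on the extra reserved_memory needed:

    # bytes reported in "Internal error: Maps lock 33738752 < unlock 35778560"
    lock=33738752
    unlock=35778560
    echo "shortfall: $(( (unlock - lock) / 1024 )) KB"   # prints: shortfall: 1992 KB
    # reserved_memory is given in KB, so the 8192 default would need at least
    # ~2000 KB more in this run; 32768 (as suggested above) leaves ample headroom.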

Comment 9 Alasdair Kergon 2011-12-20 20:00:44 UTC
So, it's a genuine internal error - it didn't pre-allocate sufficient memory.

500 snapshots of the same origin is not a sensible case we need to support though.

Comment 10 Corey Marthaler 2012-01-10 23:26:15 UTC
This bug does not appear to happen when reserved_memory is bumped up to 32768. However, I'll let the tests run a few more iterations tonight to make sure.


    # How much memory (in KB) to reserve for use while devices suspended
    #reserved_memory = 8192
    reserved_memory = 32768

Comment 11 Corey Marthaler 2012-01-11 13:04:16 UTC
Tests continued to run all night with the reserved_memory bumped up to 32768. I was then able to reproduce this issue right away when I set it back to the default of 8192.

Comment 12 Nenad Peric 2012-01-12 14:23:52 UTC
Ran tests with 500 mirrors as well. 

in lvm.conf:

reserved_memory = 32768 


The mirrors were created and deleted successfully without errors.

Comment 13 Alasdair Kergon 2012-01-12 18:32:33 UTC
In line with Milan's suggestion, I've changed the message to:

    Internal error: Reserved memory (33738752) not enough: used 35778560. Increase activation/reserved_memory?

upstream.

Comment 15 Milan Broz 2012-01-18 09:19:37 UTC
Error message changed in lvm2-2.02.88-6.el5 according to comment #13.

Comment 18 Milan Broz 2012-01-18 09:41:14 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
If preallocated memory is too low, lvm2 can issue an error message like "Internal error: Maps lock < unlock".
The message was changed to "Reserved memory not enough. Increase activation/reserved_memory?" to provide the administrator with better information about the source of the problem.
The amount of preallocated memory can be changed in lvm.conf (the activation/reserved_memory option).

Comment 19 Corey Marthaler 2012-01-18 20:39:45 UTC
Fix verified in the latest rpms.


2.6.18-301.el5
lvm2-2.02.88-6.el5    BUILT: Wed Jan 18 03:34:29 CST 2012
lvm2-cluster-2.02.88-6.el5    BUILT: Wed Jan 18 03:33:26 CST 2012
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009



SCENARIO - [many_snaps]
Create 500 snapshots of an origin volume
Recreating VG and PVs to increase metadata size
  Writing physical volume data to disk "/dev/sdb1"
  Writing physical volume data to disk "/dev/sdc1"
Making origin volume
Making 500 snapshots of origin volume
1 2 3 4 5 6 7 8 9 10 11 12 13 Although the snap create passed, errors were found in it's output
  Rounding up size to full physical extent 52.00 MB
  Logical volume "500_13" created
  Internal error: Reserved memory (21938176) not enough: used 22061056. Increase activation/reserved_memory?

Comment 20 errata-xmlrpc 2012-02-21 06:06:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0161.html

