Description of problem:
Opening this bug for the error messages listed in bug 575967.

2.6.18-274.el5
lvm2-2.02.88-5.el5 BUILT: Fri Dec 2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5 BUILT: Fri Dec 2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5 BUILT: Wed Nov 2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5 BUILT: Tue Dec 22 13:39:47 CST 2009

SCENARIO - [many_snaps] Create 500 snapshots of an origin volume
Writing physical volume data to disk "/dev/sdb1"
Writing physical volume data to disk "/dev/sdc1"
Making origin volume
Making 500 snapshots of origin volume
1 2 3 4 5 [... progress counter continues ...] 363 364

Although the snap create passed, errors were found in its output
Rounding up size to full physical extent 52.00 MB
Logical volume "500_364" created
Internal error: Maps lock 39600128 < unlock 39723008

This still occurs with mirrored volumes as well, not just snapshots.

[root@grant-01 ~]# vgchange -an mirror_sanity
0 logical volume(s) in volume group "mirror_sanity" now active
Internal error: Maps lock 33738752 < unlock 35778560

Version-Release number of selected component (if applicable):
2.6.18-300.el5
lvm2-2.02.88-5.el5 BUILT: Fri Dec 2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5 BUILT: Fri Dec 2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5 BUILT: Wed Nov 2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5 BUILT: Tue Dec 22 13:39:47 CST 2009

How reproducible:
Often when running hundreds of lvm operations.
This is just a stupid error report in lvm which should not be in a stable release. I'll remove it in the RHEL5.8 build (while upstream can discuss this for another few years).
There was a similar problem upstream last year and in a RHEL6 kernel patch backport (private bug 638525). Current upstream does not have this problem as far as I'm aware, suggesting there could be a problem with a RHEL5 kernel patch backport which we should try to identify.
(In reply to comment #3)
> There was a similar problem upstream last year and in a RHEL6 kernel patch
> backport (private bug 638525). Current upstream does not have this problem as
> far as I'm aware, suggesting there could be a problem with a RHEL5 kernel patch
> backport which we should try to identify.

It seems it reappears in some scenarios in RHEL6 as well - bug #740868.
Is it possible to repeat this test with an increased reserved_memory size - e.g. in lvm.conf:

activation {
    reserved_memory = 32768
}

Is there any difference?

I've checked the RHEL5.8 kernel for mlock/munlock issues - it seems to behave as well as the upstream kernel - so it should not be the cause of this problem here.
If it is a problem with pre-allocated memory overflow, I would really suggest detecting this situation and displaying a more useful message to the user.

"Internal error: Maps lock 33738752 < unlock 35778560" says absolutely nothing to an administrator; it is a message for a developer.

It should be "Internal error: preallocated memory too low, please consider increasing reserved_memory in lvm.conf" or similar.
(In reply to comment #6)
> If it is a problem with pre-allocated memory overflow, I would really suggest
> detecting this situation and displaying a more useful message to the user.
>
> "Internal error: Maps lock 33738752 < unlock 35778560" says absolutely nothing
> to an administrator; it is a message for a developer.
>
> It should be "Internal error: preallocated memory too low, please consider
> increasing reserved_memory in lvm.conf" or similar.

Well, we have two cases here which cannot be easily distinguished: one is an internal problem on our side, where we allocate too much memory when we shouldn't; the other is a preallocated memory pool that is too small. The preallocated memory issue is probably somewhat similar to our preallocated stack issue, where we are going to eliminate that variable. However, I'm not sure right now whether there is an easy formula to estimate the maximum memory needed during activation when we know all the metadata up front - but one surely must exist. Anyway, for now, users experimenting with large MDA sizes need to raise reserved_memory to avoid the Maps reporting.
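For reference, the workaround described above is a plain lvm.conf change. A minimal sketch (32768 KB is the value used in later testing on this bug, not a value derived from any formula; the right number depends on metadata size):

```
activation {
    # How much memory (in KB) to reserve for use while devices are suspended.
    # The default is 8192; raised here to avoid the "Maps lock < unlock"
    # internal error when activating volumes with large metadata areas.
    reserved_memory = 32768
}
```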
So, it's a genuine internal error - it didn't pre-allocate sufficient memory. 500 snapshots of the same origin is not a sensible case we need to support though.
This bug does not appear to happen when bumping reserved_memory up to 32768. I'll let the tests run a few more iterations tonight, however, to make sure.

# How much memory (in KB) to reserve for use while devices suspended
#reserved_memory = 8192
reserved_memory = 32768
Tests continued to run all night with the reserved_memory bumped up to 32768. I was then able to reproduce this issue right away when I set it back to the default of 8192.
Ran tests with 500 mirrors as well. In lvm.conf:

reserved_memory = 32768

The mirrors were created and deleted successfully without errors.
In line with Milan's suggestion, I've changed the message upstream to:

Internal error: Reserved memory (33738752) not enough: used 35778560. Increase activation/reserved_memory?
Error message changed in lvm2-2.02.88-6.el5 according to comment #13.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
If preallocated memory is too low, lvm2 can issue an error message like "Internal error: Maps lock < unlock". The message was changed to "Reserved memory not enough. Increase activation/reserved_memory?" to give the administrator better information about the source of the problem. Preallocated memory can be changed in lvm.conf (reserved_memory option).
Fix verified in the latest rpms.

2.6.18-301.el5
lvm2-2.02.88-6.el5 BUILT: Wed Jan 18 03:34:29 CST 2012
lvm2-cluster-2.02.88-6.el5 BUILT: Wed Jan 18 03:33:26 CST 2012
device-mapper-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5 BUILT: Wed Nov 2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5 BUILT: Tue Dec 22 13:39:47 CST 2009

SCENARIO - [many_snaps] Create 500 snapshots of an origin volume
Recreating VG and PVs to increase metadata size
Writing physical volume data to disk "/dev/sdb1"
Writing physical volume data to disk "/dev/sdc1"
Making origin volume
Making 500 snapshots of origin volume
1 2 3 4 5 6 7 8 9 10 11 12 13

Although the snap create passed, errors were found in its output
Rounding up size to full physical extent 52.00 MB
Logical volume "500_13" created
Internal error: Reserved memory (21938176) not enough: used 22061056. Increase activation/reserved_memory?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0161.html