Bug 198563
Summary: | clvmd panic in dm_mod:resize_pool while doing mirroring creation | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4 | CC: | agk, bstevens, dwysocha, heinzm, mbroz |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-04-27 14:56:19 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Corey Marthaler
2006-07-11 21:59:59 UTC
Reproduced this on taft-01. Here's a more complete stack trace: kmem_cache_create: duplicate cache dm-io-bio ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at slab:1453 invalid operand: 0000 [1] SMP CPU 2 Modules linked in: dm_cmirror(U) dlm(U) cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core buttond Pid: 6828, comm: clvmd Not tainted 2.6.9-39.1.ELsmp RIP: 0010:[<ffffffff80162e28>] <ffffffff80162e28>{kmem_cache_create+1330} RSP: 0018:00000102136e3be8 EFLAGS: 00010202 RAX: 000000000000002d RBX: 0000010037df06b0 RCX: ffffffff803e1be8 RDX: ffffffff803e1be8 RSI: 0000000000000246 RDI: ffffffff8048e140 RBP: 0000010217827380 R08: ffffffff803e1be8 R09: 0000010037df06b0 R10: 0000000100000000 R11: 0000000000000004 R12: ffffffff8048e140 R13: 0000000000002000 R14: 0000010000000000 R15: ffffffffa009f7c0 FS: 0000000041401960(005b) GS:ffffffff804e4e80(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000002a9556c000 CR3: 00000000dff9e000 CR4: 00000000000006e0 Process clvmd (pid: 6828, threadinfo 00000102136e2000, task 0000010216bf4030) Stack: 0000000000000202 00000102152f6600 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000010214673e00 000001021747b520 0000000000000040 0000000000000100 Call Trace:<ffffffffa0093ee7>{:dm_mod:resize_pool+219} <ffffffffa009527b>{:dm_mod:kcopyd_client_create+598} <ffffffffa00d8918>{:dm_mirror:mirror_ctr+1119} <ffffffffa00910bb>{:dm_mod:dm_table_add_target+328} <ffffffffa009329c>{:dm_mod:table_load+225} <ffffffffa00931bb>{:dm_mod:table_load+0} <ffffffffa0093c34>{:dm_mod:ctl_ioctl+600} <ffffffff80181c97>{sys_newstat+32} <ffffffff8018ad41>{sys_ioctl+853} <ffffffff8011024a>{system_call+126} Code: 0f 0b 6f 5a 32 80 ff ff ff ff ad 05 48 8b 1b eb 96 65 48 8b RIP <ffffffff80162e28>{kmem_cache_create+1330} RSP <00000102136e3be8> <0>Kernel panic - not syncing: Oops This is most likely a kernel bug. We'll need to determine the average user impact. If it's really unlikely, perhaps it can be pushed to RHEL4U5. It be interesting to know if this affects single machine mirrors as well... Any idea as to the number of mirrors that where present at the time? just one mirror was present. Just a note that I am still seeing this quite frequently on the taft cluster while doing clustered mirroring operations. I'm still trying to pinpoint what exact operation is causing this and whether or not the average user will see this panic. I'd like to be able to reproduce this on another cluster as well. These are the rpms that I'm running with: [root@taft-04 ~]# uname -ar Linux taft-04 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux [root@taft-04 ~]# rpm -q lvm2 lvm2-2.02.06-6.0.RHEL4 [root@taft-04 ~]# rpm -q lvm2-cluster lvm2-cluster-2.02.06-6.0.RHEL4 [root@taft-04 ~]# rpm -q device-mapper device-mapper-1.02.07-4.0.RHEL4 [root@taft-04 ~]# rpm -q cmirror cmirror-1.0.1-0 [root@taft-04 ~]# rpm -q cmirror-kernel cmirror-kernel-2.6.9-10.2 I can reproduce this with just the following loop: while true; do lvcreate -L 1G -n mirror1 -m 1 group; sleep 5; lvremove -f /dev/group/mirror1; done Updated reproduction on kernel 2.6.9-42.BRsmp: Jul 18 09:54:26 taft-04 dmeventd[4406]: Monitoring mirror device, mirror_1_7505-mirror1 for events Jul 18 09:54:26 taft-04 kernel: device-mapper: 0 - 2048 Jul 18 09:54:26 taft-04 kernel: device-mapper: Total = 2048 Jul 18 09:54:27 taft-04 kernel: device-mapper: 2000 out-of-sync regions remaining for IhV5Xlh2. Jul 18 09:54:59 taft-04 kernel: device-mapper: Cluster mirror log server is shutting down. Jul 18 09:55:10 taft-04 kernel: device-mapper: I'm the cluster mirror log server for lYxvWdXF Jul 18 09:55:10 taft-04 kernel: device-mapper: Disk Resume:: Jul 18 09:55:10 taft-04 kernel: device-mapper: Live nodes :: 1 Jul 18 09:55:10 taft-04 kernel: device-mapper: In-Use Regions :: 0 Jul 18 09:55:10 taft-04 kernel: device-mapper: Good IUR's :: 0 Jul 18 09:55:10 taft-04 kernel: device-mapper: Bad IUR's :: 0 Jul 18 09:55:10 taft-04 kernel: device-mapper: Sync count :: 0 Jul 18 09:55:10 taft-04 kernel: device-mapper: Disk Region count :: 18446744073709551615 Jul 18 09:55:10 taft-04 kernel: device-mapper: Region count :: 2048 Jul 18 09:55:10 taft-04 kernel: device-mapper: NOTE: Mapping has changed. Jul 18 09:55:10 taft-04 kernel: device-mapper: Marked regions:: Jul 18 09:55:10 taft-04 kernel: device-mapper: 0 - 2048 Jul 18 09:55:10 taft-04 kernel: device-mapper: Total = 2048 Jul 18 09:55:10 taft-04 dmeventd[4406]: Monitoring mirror device, mirror_1_7505-mirror1 for events Jul 18 09:55:10 taft-04 kernel: device-mapper: Out-of-sync regions:: Jul 18 09:55:10 taft-04 kernel: device-mapper: 0 - 2048 Jul 18 09:55:10 taft-04 kernel: device-mapper: Total = 2048 Jul 18 09:55:15 taft-04 kernel: device-mapper: 2000 out-of-sync regions remaining for lYxvWdXF. slab error in kmem_cache_destroy(): cache `dm-io-bio': Can't free all objects Call Trace:<ffffffff80162443>{kmem_cache_destroy+202} <ffffffffa0093cba>{:dm_mod:bio_set_exit+34} <ffffffffa0093e70>{:dm_mod:resize_pool+100} <ffffffffa00928c9>{:dm_mod:dev_remove+0} <ffffffffa0095424>{:dm_mod:kcopyd_client_destroy+146} <ffffffff801ea645>{__up_write+20} <ffffffffa00d8a7a>{:dm_mirror:mirror_dtr+101} <ffffffffa009090d>{:dm_mod:table_destroy+72} <ffffffffa0090020>{:dm_mod:dm_put+149} <ffffffffa009297d>{:dm_mod:dev_remove+180} <ffffffffa0093c34>{:dm_mod:ctl_ioctl+600} <ffffffff8018ae05>{sys_ioctl+853} <ffffffff8011026a>{system_call+126} Jul 18 09:55:31 taft-04 kernel: slab error in kmem_cache_destroy(): cache `dm-io-bio': Can't free as Jul 18 09:55:31 taft-04 kernel: Jul 18 09:55:31 taft-04 kernel: Call Trace:<ffffffff80162443>{kmem_cache_destroy+202} <ffffffffa009 Jul 18 09:55:31 taft-04 kernel: <ffffffffa0093e70>{:dm_mod:resize_pool+100} <ffffffffa00928c Jul 18 09:55:31 taft-04 kernel: <ffffffffa0095424>{:dm_mod:kcopyd_client_destroy+146} Jul 18 09:55:31 taft-04 kernel: <ffffffff801ea645>{__up_write+20} <ffffffffa00d8a7a>{:dm_mir Jul 18 09:55:31 taft-04 kernel: <ffffffffa009090d>{:dm_mod:table_destroy+72} <ffffffffa00900 Jul 18 09:55:31 taft-04 kernel: <ffffffffa009297d>{:dm_mod:dev_remove+180} <ffffffffa0093c34 Jul 18 09:55:31 taft-04 kernel: <ffffffff8018ae05>{sys_ioctl+853} <ffffffff8011026a>{system_ Jul 18 09:55:31 taft-04 kernel: device-mapper: Log context can not be found for request device-mapper: Log context can not be found for request Jul 18 09:55:36 taft-04 kernel: device-mapper: A cluster mirror log member has left. Jul 18 09:55:36 taft-04 kernel: device-mapper: Log context can not be found for request Jul 18 09:55:36 taft-04 kernel: device-mapper: Log context can not be found for request Jul 18 09:55:41 taft-04 kernel: device-mapper: Cluster mirror log server is shutting down. kmem_cache_create: duplicate cache dm-io-bio ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at slab:1453 invalid operand: 0000 [1] SMP CPU 0 Modules linked in: dm_cmirror(U) dlm(U) cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_d Pid: 4336, comm: clvmd Not tainted 2.6.9-42.BRsmp RIP: 0010:[<ffffffff80162edc>] <ffffffff80162edc>{kmem_cache_create+1330} RSP: 0000:000001021236dbe8 EFLAGS: 00010202 RAX: 000000000000002d RBX: 000001020c4c48b0 RCX: ffffffff803e1ee8 RDX: ffffffff803e1ee8 RSI: 0000000000000246 RDI: ffffffff8048e440 RBP: 000001020c462e80 R08: ffffffff803e1ee8 R09: 000001020c4c48b0 R10: 0000000100000000 R11: 0000000000000004 R12: ffffffff8048e440 R13: 0000000000002000 R14: 0000010000000000 R15: ffffffffa009f8c0 FS: 0000000041401960(005b) GS:ffffffff804e5080(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000031fa38fae0 CR3: 0000000000101000 CR4: 00000000000006e0 Process clvmd (pid: 4336, threadinfo 000001021236c000, task 00000102127937f0) Stack: 0000000000000202 0000010213802380 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00000102138a8600 00000102170472c0 0000000000000040 0000000000000100 Call Trace:<ffffffffa0093f23>{:dm_mod:resize_pool+279} <ffffffffa00952e7>{:dm_mod:kcopyd_client_cre <ffffffffa00d8981>{:dm_mirror:mirror_ctr+1119} <ffffffffa00910bb>{:dm_mod:dm_table_add_targe <ffffffffa009329c>{:dm_mod:table_load+225} <ffffffffa00931bb>{:dm_mod:table_load+0} <ffffffffa0093c34>{:dm_mod:ctl_ioctl+600} <ffffffff80181d5b>{sys_newstat+32} <ffffffff8018ae05>{sys_ioctl+853} <ffffffff8011026a>{system_call+126} Code: 0f 0b b2 5d 32 80 ff ff ff ff ad 05 48 8b 1b eb 96 65 48 8b RIP <ffffffff80162edc>{kmem_cache_create+1330} RSP <000001021236dbe8> <0>Kernel panic - not syncing: Oops have potential fix. will check in after testing. This issue appears to be fixed. Assuming this VERIFIED fix got released. Closing. Reopen if it's not yet resolved. |