Bug 448548 - cmirror panic in dm_mirror:dm_create_dirty_log during pvmove attempt running gulm
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cmirror-kernel
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-05-27 16:26 UTC by Corey Marthaler
Modified: 2010-05-20 16:01 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-05-20 16:01:21 UTC
Embargoed:



Description Corey Marthaler 2008-05-27 16:26:09 UTC
Description of problem:
I attempted to pvmove a clustered linear volume while running gulm, and saw this
panic on three of the four nodes in the cluster while running a 'vgscan'.

Here's the output from the test:

Adding /dev/sdc1 to the volume group on taft-01
Moving data from /dev/sdh1 to /dev/sdc1 on taft-01
    Wiping cache of LVM-capable devices
    Finding volume group "pv_shuffle"
    Executing: /sbin/modprobe dm-cmirror
    Archiving volume group "pv_shuffle" metadata (seqno 4).
    Creating logical volume pvmove0
    Moving 1280 extents of logical volume pv_shuffle/linear
    Updating volume group metadata
    Creating volume group backup "/etc/lvm/backup/pv_shuffle" (seqno 5).
  Error locking on node taft-03: device-mapper: reload ioctl failed: Invalid argument
  Error locking on node taft-02: device-mapper: reload ioctl failed: Invalid argument
  Error locking on node taft-04: device-mapper: reload ioctl failed: Invalid argument
  Error locking on node taft-01: device-mapper: reload ioctl failed: Invalid argument
  ABORTING: Temporary mirror activation failed.  Run pvmove --abort.


May 27 10:52:38 taft-02 qarshd[16076]: Running cmdline: vgscan
dm-cmirror: Couldn't register clustered_log service.  Reason: -107
dm-cmirror: Unable to connect to cluster infrastructure.
device-mapper: dm-mirror: Error creating mirror dirty log
device-mapper: error adding target to table
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
<ffffffffa02eb78b>{:dm_cmirror:cluster_ctr+411}
PML4 2065c4067 PGD 2069b0067 PMD 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in: dm_cmirror(U) gnbd(U) lock_nolock(U) gfs(U) cman(U)
lock_gulm(U) locd
Pid: 15670, comm: clvmd Not tainted 2.6.9-70.ELsmp
RIP: 0010:[<ffffffffa02eb78b>] <ffffffffa02eb78b>{:dm_cmirror:cluster_ctr+411}
RSP: 0018:00000102065adc28  EFLAGS: 00010286
RAX: fffffffffffffe18 RBX: fffffffffffffe18 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000010213eb6d61 RDI: 00000100dfdc3960
RBP: 0000010213eb6c00 R08: 0000000000000000 R09: 0000000000000000
R10: ffffff0000035000 R11: 0000000000000500 R12: ffffffffa02f3ec0
R13: 0000000000000002 R14: 0000010217967d60 R15: ffffff0000031080
FS:  0000000040a00960(005b) GS:ffffffff8050c580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000dffae000 CR4: 00000000000006e0
Process clvmd (pid: 15670, threadinfo 00000102065ac000, task 000001020c2dd7f0)
Stack: 0000010000013810 0000000000032000 0000000000000246 0000000000000246
       ffffffffa02f3fa0 ffffff000002a160 0000010217967d60 0000010037e4c610
       0000000000000002 ffffffffa0088271
Call Trace:<ffffffffa0088271>{:dm_mirror:dm_create_dirty_log+212}
       <ffffffffa008a7dc>{:dm_mirror:mirror_ctr+108}
<ffffffffa003f4fa>{:dm_mod:dm_spli
       <ffffffffa003f716>{:dm_mod:dm_table_add_target+328}
       <ffffffffa0041cac>{:dm_mod:table_load+523}
<ffffffffa0041aa1>{:dm_mod:table_load
       <ffffffffa00426a2>{:dm_mod:ctl_ioctl+602} <ffffffff80184796>{sys_newstat+32}
       <ffffffff8018db1d>{sys_ioctl+853} <ffffffff801102b6>{system_call+126}


Code: 48 8b 83 e8 01 00 00 0f 18 08 48 81 fa a0 3e 2f a0 74 68 48
RIP <ffffffffa02eb78b>{:dm_cmirror:cluster_ctr+411} RSP <00000102065adc28>
CR2: 0000000000000000
 <0>Kernel panic - not syncing: Oops
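
Editorial note on reading the oops above: the following is a minimal sketch, not part of the original report, that checks the register arithmetic using only values printed in the log. It assumes the "Code:" dump begins at the faulting instruction and that the first seven bytes (48 8b 83 e8 01 00 00) encode a 64-bit load from RBX plus a 32-bit displacement. The helper names (disp32_le, effective_address) are made up for illustration. If those assumptions hold, the faulting address is RBX + 0x1e8, which wraps to 0 and matches CR2. The "Reason: -107" value is looked up as an errno purely as a convenience; on Linux, errno 107 is ENOTCONN.

```python
import errno

MASK64 = (1 << 64) - 1  # registers are 64 bits wide on x86_64

def disp32_le(insn):
    """Return the signed 32-bit little-endian displacement stored in
    bytes 3..6 of a 7-byte 'REX.W opcode ModRM disp32' instruction."""
    return int.from_bytes(insn[3:7], "little", signed=True)

def effective_address(base, disp):
    """Add base and displacement with 64-bit wraparound."""
    return (base + disp) & MASK64

# Values copied from the oops in this report.
rbx = 0xFFFFFFFFFFFFFE18
cr2 = 0x0000000000000000
insn = bytes.fromhex("48 8b 83 e8 01 00 00")  # assumed faulting instruction

disp = disp32_le(insn)
fault_addr = effective_address(rbx, disp)

print("displacement:", hex(disp))
print("RBX as signed:", rbx - (1 << 64))
print("computed fault address:", hex(fault_addr), "CR2:", hex(cr2))

# "Reason: -107" from dm-cmirror, looked up as a positive errno number.
# The name depends on the platform running this script.
print("errno 107:", errno.errorcode.get(107, "unknown on this platform"))
```

Under these assumptions, RBX is exactly the negative of the displacement, which would be consistent with a structure pointer derived from a NULL member pointer. The log alone does not confirm which source line in cluster_ctr is responsible.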



Version-Release number of selected component (if applicable):
2.6.9-70.ELsmp
cmirror-1.0.1-1 Build Date: Tue 30 Jan 2007 05:28:02 PM CST
cmirror-kernel-2.6.9-41.3 Build Date: Mon 19 May 2008 02:00:31 PM CDT

Comment 2 Corey Marthaler 2009-09-02 21:16:45 UTC
This is still reproducible.

Comment 4 Jonathan Earl Brassow 2010-05-20 16:01:21 UTC
Closing WONTFIX.  I think we are in very little danger of a customer hitting this bug.  We don't support (and never have) cmirror + gulm.  The code should not 'oops' if attempted, but my argument is that it won't be attempted.  Feel free to reopen, but only if you are willing to reproduce and allow me to use the cluster.

