Bug 239482 - kernel bug in dm_cmirror_server after killing secondary leg plus node in cluster
Summary: kernel bug in dm_cmirror_server after killing secondary leg plus node in cluster
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cmirror
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-05-08 19:20 UTC by Corey Marthaler
Modified: 2010-01-12 02:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-16 18:14:44 UTC
Embargoed:



Description Corey Marthaler 2007-05-08 19:20:18 UTC
Description of problem:
I had GFS I/O going to 3 mirrors, each created from a different VG, and I then
killed the secondary leg of all 3 cmirrors, along with link-07 (one of the four
nodes in the cluster), at the same time.

Link-04, the node that panicked, had a different view of the storage than the
rest of the nodes in the cluster. The disk being killed on link-0[278] was
/dev/sda, but on link-04 it was /dev/sdh.


LINK-02:
  cmirror1            vg1 Mwi-ao 10.00G cmirror1_mlog 100.00 cmirror1_mimage_0(0),cmirror1_mimage_1(0)
  [cmirror1_mimage_0] vg1 iwi-ao 10.00G                      /dev/sdb1(0)
  [cmirror1_mimage_1] vg1 iwi-ao 10.00G                      /dev/sda1(0)
  [cmirror1_mlog]     vg1 lwi-ao  4.00M                      /dev/sdc1(0)
  cmirror2            vg2 Mwi-ao 10.00G cmirror2_mlog 100.00 cmirror2_mimage_0(0),cmirror2_mimage_1(0)
  [cmirror2_mimage_0] vg2 iwi-ao 10.00G                      /dev/sdd1(0)
  [cmirror2_mimage_1] vg2 iwi-ao 10.00G                      /dev/sda2(0)
  [cmirror2_mlog]     vg2 lwi-ao  4.00M                      /dev/sde1(0)
  cmirror3            vg3 Mwi-ao 10.00G cmirror3_mlog 100.00 cmirror3_mimage_0(0),cmirror3_mimage_1(0)
  [cmirror3_mimage_0] vg3 iwi-ao 10.00G                      /dev/sdg1(0)
  [cmirror3_mimage_1] vg3 iwi-ao 10.00G                      /dev/sda3(0)
  [cmirror3_mlog]     vg3 lwi-ao  4.00M                      /dev/sdf1(0)


LINK-04:
  cmirror1            vg1 wi-ao  10.00G cmirror1_mlog 100.00 cmirror1_mimage_0(0),cmirror1_mimage_1(0)
  [cmirror1_mimage_0] vg1 iwi-ao 10.00G                      /dev/sda1(0)
  [cmirror1_mimage_1] vg1 iwi-ao 10.00G                      /dev/sdh1(0)
  [cmirror1_mlog]     vg1 lwi-ao  4.00M                      /dev/sdb1(0)
  cmirror2            vg2 wi-ao  10.00G cmirror2_mlog 100.00 cmirror2_mimage_0(0),cmirror2_mimage_1(0)
  [cmirror2_mimage_0] vg2 iwi-ao 10.00G                      /dev/sdc1(0)
  [cmirror2_mimage_1] vg2 iwi-ao 10.00G                      /dev/sdh2(0)
  [cmirror2_mlog]     vg2 lwi-ao  4.00M                      /dev/sdd1(0)
  cmirror3            vg3 wi-ao  10.00G cmirror3_mlog 100.00 cmirror3_mimage_0(0),cmirror3_mimage_1(0)
  [cmirror3_mimage_0] vg3 iwi-ao 10.00G                      /dev/sdf1(0)
  [cmirror3_mimage_1] vg3 iwi-ao 10.00G                      /dev/sdh3(0)
  [cmirror3_mlog]     vg3 lwi-ao  4.00M                      /dev/sde1(0)


LINK-07:
  cmirror1            vg1 Mwi-ao 10.00G cmirror1_mlog 100.00 cmirror1_mimage_0(0),cmirror1_mimage_1(0)
  [cmirror1_mimage_0] vg1 iwi-ao 10.00G                      /dev/sdb1(0)
  [cmirror1_mimage_1] vg1 iwi-ao 10.00G                      /dev/sda1(0)
  [cmirror1_mlog]     vg1 lwi-ao  4.00M                      /dev/sdc1(0)
  cmirror2            vg2 Mwi-ao 10.00G cmirror2_mlog 100.00 cmirror2_mimage_0(0),cmirror2_mimage_1(0)
  [cmirror2_mimage_0] vg2 iwi-ao 10.00G                      /dev/sdd1(0)
  [cmirror2_mimage_1] vg2 iwi-ao 10.00G                      /dev/sda2(0)
  [cmirror2_mlog]     vg2 lwi-ao  4.00M                      /dev/sde1(0)
  cmirror3            vg3 Mwi-ao 10.00G cmirror3_mlog 100.00 cmirror3_mimage_0(0),cmirror3_mimage_1(0)
  [cmirror3_mimage_0] vg3 iwi-ao 10.00G                      /dev/sdg1(0)
  [cmirror3_mimage_1] vg3 iwi-ao 10.00G                      /dev/sda3(0)
  [cmirror3_mlog]     vg3 lwi-ao  4.00M                      /dev/sdf1(0)


LINK-08:
  cmirror1            vg1 Mwi-ao 10.00G cmirror1_mlog 100.00 cmirror1_mimage_0(0),cmirror1_mimage_1(0)
  [cmirror1_mimage_0] vg1 iwi-ao 10.00G                      /dev/sdb1(0)
  [cmirror1_mimage_1] vg1 iwi-ao 10.00G                      /dev/sda1(0)
  [cmirror1_mlog]     vg1 lwi-ao  4.00M                      /dev/sdc1(0)
  cmirror2            vg2 Mwi-ao 10.00G cmirror2_mlog 100.00 cmirror2_mimage_0(0),cmirror2_mimage_1(0)
  [cmirror2_mimage_0] vg2 iwi-ao 10.00G                      /dev/sdd1(0)
  [cmirror2_mimage_1] vg2 iwi-ao 10.00G                      /dev/sda2(0)
  [cmirror2_mlog]     vg2 lwi-ao  4.00M                      /dev/sde1(0)
  cmirror3            vg3 Mwi-ao 10.00G cmirror3_mlog 100.00 cmirror3_mimage_0(0),cmirror3_mimage_1(0)
  [cmirror3_mimage_0] vg3 iwi-ao 10.00G                      /dev/sdg1(0)
  [cmirror3_mimage_1] vg3 iwi-ao 10.00G                      /dev/sda3(0)
  [cmirror3_mlog]     vg3 lwi-ao  4.00M                      /dev/sdf1(0)
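
The per-node difference in device naming can be checked mechanically. The
following is a minimal sketch, not part of any LVM or cmirror tooling: the
dictionary is copied by hand from the mimage_1 (secondary leg) rows in the
listings above, and the helper names whole_disk and nodes_by_disk are
hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Secondary mirror legs (the *_mimage_1 rows) per node, copied by hand
# from the listings in this report.
SECONDARY_LEGS = {
    "link-02": {"cmirror1": "/dev/sda1", "cmirror2": "/dev/sda2", "cmirror3": "/dev/sda3"},
    "link-04": {"cmirror1": "/dev/sdh1", "cmirror2": "/dev/sdh2", "cmirror3": "/dev/sdh3"},
    "link-07": {"cmirror1": "/dev/sda1", "cmirror2": "/dev/sda2", "cmirror3": "/dev/sda3"},
    "link-08": {"cmirror1": "/dev/sda1", "cmirror2": "/dev/sda2", "cmirror3": "/dev/sda3"},
}


def whole_disk(partition):
    """Strip the trailing partition number, e.g. /dev/sda1 -> /dev/sda."""
    return re.sub(r"\d+$", "", partition)


def nodes_by_disk(legs):
    """Group node names by the whole-disk device backing their secondary legs."""
    groups = defaultdict(set)
    for node, lvs in legs.items():
        for device in lvs.values():
            groups[whole_disk(device)].add(node)
    return {disk: sorted(nodes) for disk, nodes in groups.items()}


if __name__ == "__main__":
    for disk, nodes in sorted(nodes_by_disk(SECONDARY_LEGS).items()):
        print(disk, "->", ", ".join(nodes))
```

Grouping by whole disk rather than by partition shows at a glance which nodes
name the failed device differently from the rest of the cluster.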



end_request: I/O error, dev sdh, sector 855178879
SCSI error : <1 0 1 1> return code = 0x10000
end_request: I/O error, dev sdh, sector 855178887
device-mapper: Write error during recovery (error = 0x1)
device-mapper: recovery failed on region 10564
dm-cmirror: Unable to locate record of recovery
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dm_cmirror_server:764
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U)
cman(U) mptfc qla2300 qla2xxx scsi_transport_fc md5 ipv6 parport_pc lp parport
autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core button battery ac
ohci_hcd hw_random k8_edac edac_mc tg3 dm_snapshot dm_zero dm_mirror ext3 jbd
dm_mod mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 5678, comm: cluster_log_ser Not tainted 2.6.9-55.ELlargesmp
RIP: 0010:[<ffffffffa02a6465>] <ffffffffa02a6465>{:dm_cmirror:cluster_log_serverd+4152}
RSP: 0000:0000010031ab9e38  EFLAGS: 00010216
RAX: 0000000000000033 RBX: 00000000fffffffa RCX: 0000000000000246
RDX: 0000000000d6692d RSI: 0000000000000246 RDI: ffffffff803e6580
RBP: 0000000000000000 R08: 00000000000927bf R09: 00000000fffffffa
R10: ffffffff80317aa0 R11: 0000ffff804015a0 R12: 0000010039bad400
R13: 0000000000000003 R14: 000001003880b9c0 R15: 000001003880b9e0
FS:  0000002a95563f00(0000) GS:ffffffff80500380(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a95610000 CR3: 0000000000101000 CR4: 00000000000006e0
Process cluster_log_ser (pid: 5678, threadinfo 0000010031ab8000, task 000001001d126030)
Stack: 0000010001663100 0000010000000073 0000000000000012 0000000080142ce7
       0000010001013a80 0000000000040001 000001001d126030 0000000000045d0f
       00003aacd760bf6c 00000100397c37f0
Call Trace:<ffffffff8013aa77>{do_exit+3151} <ffffffff80179335>{vfs_read+248}
       <ffffffff80110f47>{child_rip+8} <ffffffffa02a542d>{:dm_cmirror:cluster_log_serverd+0}
       <ffffffff80110f3f>{child_rip+0}

Code: 0f 0b e6 8d 2a a0 ff ff ff ff fc 02 48 85 ed 75 34 49 8d bc
RIP <ffffffffa02a6465>{:dm_cmirror:cluster_log_serverd+4152} RSP <0000010031ab9e38>
 <0>Kernel panic - not syncing: Oops
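
The numbers in the oops above are internally consistent, which can be verified
with a little arithmetic. The sketch below is a hedged illustration only. It
assumes that the BUG() trap on x86_64 kernels of this era is encoded as a ud2
instruction (0f 0b) followed by an 8-byte little-endian pointer to the source
file name and a 2-byte line number. That layout is an assumption and is not
stated anywhere in this report. The helper names symbol_offset and
decode_bug_frame are hypothetical.

```python
import struct

RIP = 0xFFFFFFFFA02A6465         # faulting address, from the RIP line
FUNC_START = 0xFFFFFFFFA02A542D  # cluster_log_serverd+0, from the call trace
# First 12 bytes of the "Code:" line.
CODE = bytes.fromhex("0f0be68d2aa0fffffffffc02")


def symbol_offset(rip, func_start):
    """Byte offset of the faulting instruction inside its function."""
    return rip - func_start


def decode_bug_frame(code):
    """Decode an assumed BUG() frame: ud2, 8-byte pointer, 2-byte line number."""
    if code[:2] != b"\x0f\x0b":
        raise ValueError("Code bytes do not start with ud2 (0f 0b)")
    # "<" selects little-endian with no padding: Q = 8 bytes, H = 2 bytes.
    filename_ptr, line = struct.unpack_from("<QH", code, 2)
    return filename_ptr, line


if __name__ == "__main__":
    print("offset:", symbol_offset(RIP, FUNC_START))
    ptr, line = decode_bug_frame(CODE)
    print("filename pointer: %#x, line: %d" % (ptr, line))
```

Under that assumption the offset comes out as 4152 and the line number as 764,
matching "cluster_log_serverd+4152" and "Kernel BUG at dm_cmirror_server:764"
in the trace.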



Version-Release number of selected component (if applicable):
2.6.9-55.ELlargesmp
lvm2-cluster-2.02.21-7.el4
cmirror-kernel-2.6.9-32.0

Comment 1 Corey Marthaler 2008-06-16 18:14:44 UTC
I'm going to close this bug because it hasn't been seen in a while and because
I believe it may be related to bz 450939.

