Bug 450939

Summary:

panic in cluster_log_ser during resync of two legged core log mirrors

Product:

[Retired] Red Hat Cluster Suite

Reporter:

Corey Marthaler <cmarthal>

Component:

cmirror-kernel

Assignee:

Jonathan Earl Brassow <jbrassow>

Status:

CLOSED ERRATA

QA Contact:

Cluster QE <mspqa-list>

Severity:

high

Docs Contact:

Priority:

high

Version:

CC:

edamato

Target Milestone:

---

Keywords:

Regression

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

RHBA-2008-0803

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2008-07-25 19:28:02 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
log from taft-01	none
log from taft-02	none
log from taft-03	none
log from taft-04	none

Description Corey Marthaler 2008-06-11 20:13:14 UTC

Description of problem:
Saw this panic on taft-01 after the failed logs were added back into the mirrors
and while they were in the process of resyncing.

Senario: Kill secondary leg of synced core log 2 leg mirror(s)

****** Mirror hash info for this scenario ******
* name:      syncd_secondary_core_2legs
* sync:      1
* mirrors:   2
* disklog:   0
* failpv:    /dev/sdf1
* legs:      2
* pvs:       /dev/sdg1 /dev/sdf1
************************************************

Creating mirror(s) on taft-01...
taft-01: lvcreate --corelog -m 1 -n syncd_secondary_core_2legs_1 -L 800M
helter_skelter /dev/sdg1:0-1000 /dev/sdf1:0-1000
taft-01: lvcreate --corelog -m 1 -n syncd_secondary_core_2legs_2 -L 800M
helter_skelter /dev/sdg1:0-1000 /dev/sdf1:0-1000

Waiting until all mirrors become fully syncd...
        0/2 mirror(s) are fully synced: ( 1=11.50% 2=0.50% )
        0/2 mirror(s) are fully synced: ( 1=31.50% 2=20.50% )
        0/2 mirror(s) are fully synced: ( 1=52.50% 2=42.00% )
        0/2 mirror(s) are fully synced: ( 1=72.50% 2=62.00% )
        0/2 mirror(s) are fully synced: ( 1=93.00% 2=82.50% )
        2/2 mirror(s) are fully synced: ( 1=100.00% 2=100.00% )

Creating gfs on top of mirror(s) on taft-01...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...

<start name="taft-01_1" pid="9519" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-01_2" pid="9521" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-02_1" pid="9523" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-02_2" pid="9525" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-03_1" pid="9527" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-03_2" pid="9529" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-04_1" pid="9531" time="Wed Jun 11 14:05:02 2008" type="cmd" />
<start name="taft-04_2" pid="9533" time="Wed Jun 11 14:05:02 2008" type="cmd" />
Sleeping 12 seconds to get some outsanding I/O locks before the failure

Disabling device sdf on taft-01
Disabling device sdf on taft-02
Disabling device sdf on taft-03
Disabling device sdf on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
Verifying the down conversion of the failed mirror(s)
  /dev/sdf1: open failed: No such device or address
Verifying LEG device /dev/sdg1 *IS* in the volume(s)
  /dev/sdf1: open failed: No such device or address
Verifying FAILED device /dev/sdf1 is *NOT* in the volume(s)
  /dev/sdf1: open failed: No such device or address
Verify the dm devices associated with /dev/sdf1 are no longer present

Verifying files (checkit) on mirror(s) on...

Enabling device sdf on taft-01
Enabling device sdf on taft-02
Enabling device sdf on taft-03
Enabling device sdf on taft-04

Recreating PV /dev/sdf1
  WARNING: Volume group helter_skelter is not consistent
  WARNING: Volume Group helter_skelter is not consistent
  WARNING: Volume group helter_skelter is not consistent
Extending the recreated PV back into VG helter_skelter
Up converting linear(s) back to mirror(s) on taft-01...
taft-01: lvconvert --corelog -m 1 helter_skelter/syncd_secondary_core_2legs_1
/dev/sdg1:0-1000 /dev/sdf1:0-1000
taft-01: lvconvert --corelog -m 1 helter_skelter/syncd_secondary_core_2legs_2
/dev/sdg1:0-1000 /dev/sdf1:0-1000
Verifying the up conversion from linear(s) to mirror(s)
Verifying device /dev/sdg1 *IS* in the mirror(s)
Verifying device /dev/sdf1 *IS* in the mirror(s)

Waiting until all mirrors become fully syncd...
        1/2 mirror(s) are fully synced: ( 1=100.00% 2=2.00% )
        1/2 mirror(s) are fully synced: ( 1=100.00% 2=23.00% )
        1/2 mirror(s) are fully synced: ( 1=100.00% 2=39.00% )
        1/2 mirror(s) are fully synced: ( 1=100.00% 2=51.00% )
[PANIC]

Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Someone is already recovering region
697/90GcsfRZ
Jun 11 14:10:18 taft-01 last message repeated 22 times
Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Client finishing recovery: 697/90GcsfRZ
Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Resync work completed by 2: 697/90GcsfRZ
Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Assigning recovery work to 2:
698/90GcsfRZ
Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Received recovery work from 2:
698/90GcsfRZ
Jun 11 14:10:18 taft-01 kernel: dm-cmirror: Someone is already recovering region
698/90GcsfRZ
Jun 11 14:10:18 taft-01 last message redm-cmirror: Recovery blocked by
outstanding write on region 831/90GcsfRZ
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dm_cmirror_server:696
invalid operand: 0000 [1] SMP
CPU 2
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U)
cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds
yenta_socket pcmcia_core cpufreq_powersave button battery ac uhci_hcd ehci_hcd
hw_random e1000 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix
libata qla2300 qla2xxx scsi_transport_fc megaraid_mbox megaraid_mm sd_mod scsi_mod
Pid: 20086, comm: cluster_log_ser Not tainted 2.6.9-71.ELsmp
RIP: 0010:[<ffffffffa028e7f1>]
<ffffffffa028e7f1>{:dm_cmirror:cluster_log_serverd+3897}
RSP: 0000:0000010202817e08  EFLAGS: 00010216
RAX: 000000000000004c RBX: 00000000fffffffa RCX: ffffffff803f5aa8
RDX: ffffffff803f5aa8 RSI: 0000000000000246 RDI: ffffffff803f5aa0
RBP: 000001021467c4d0 R08: ffffffff803f5aa8 R09: 00000000fffffffa
R10: 0000000100000000 R11: 0000ffff80410d80 R12: 00000101fff87800
R13: 0000000000000004 R14: 000001021467c4c0 R15: 000001021467c4e0
FS:  0000000000000000(0000) GS:ffffffff8050c680(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000005a23e8 CR3: 00000000dffae000 CR4: 00000000000006e0
Process cluster_log_ser (pid: 20086, threadinfo 0000010202816000, task
0000010006af97f0)
Stack: 474a465148554247 474a465148554247 474a465148554247 474a465148554247
       474a465148554247 474a465148554247 474a465148554247 474a465148554247
       474a465148554247 474a465148554247
Call Trace:
        <ffffffff80101000>{init_level4_pgt+0}
        <ffffffff80110fd3>{child_rip+8}
        <ffffffffa028d8b8>{:dm_cmirror:cluster_log_serverd+0}
        <ffffffff80110fcb>{child_rip+0}

Code: 0f 0b 1a 13 29 a0 ff ff ff ff b8 02 e9 b8 00 00 00 31 c0 49
RIP <ffffffffa028e7f1>{:dm_cmirror:cluster_log_serverd+3897} RSP <0000010202817e08>
 ----------- [cut here ] --------- [please bite here ] ---------
Kernel panic - not syncing: Oops
 <1>Kernel BUG at spinlock:76
invalid operand: 0000 [2] SMP
CPU 1
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U)
cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds
yenta_socket pcmcia_core cpufreq_powersave button battery ac uhci_hcd ehci_hcd
hw_random e1000 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix
libata qla2300 qla2xxx scsi_transport_fc megaraid_mbox megaraid_mm sd_mod scsi_mod
Pid: 3645, comm: klogd Not tainted 2.6.9-71.ELsmp
RIP: 0010:[<ffffffff80318f69>] <ffffffff80318f69>{_spin_unlock_irq+27}
RSP: 0018:0000010215253e50  EFLAGS: 00010002
RAX: 0000000000000001 RBX: 000000000000006e RCX: 00000000008be539
RDX: ffffffff804861c0 RSI: 0000000000000e68 RDI: ffffffff803f5ac0
RBP: 0000000000000000 R08: 0000010215252000 R09: 0000000000000008
R10: 0000000000000042 R11: 0000000000000005 R12: 0000000000000fff
R13: 000000000000003d R14: 000000552abb093d R15: 0000000000000000
FS:  0000002a958a5b00(0000) GS:ffffffff8050c600(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000005a5b58 CR3: 00000001fffa8000 CR4: 00000000000006e0
Process klogd (pid: 3645, threadinfo 0000010215252000, task 0000010006b767f0)
Stack: ffffffff8013897d 0000000000000000 0000010006b767f0 ffffffff801360c4
       0000010215253ea0 0000010215253ea0 0000000000000000 0000000000000000
       0000010006b767f0 ffffffff801360c4
Call Trace:
        <ffffffff8013897d>{do_syslog+405}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801b28e2>{kmsg_read+56}
        <ffffffff8017bac0>{vfs_read+207}
        <ffffffff8017bd1c>{sys_read+69}
        <ffffffff801102f6>{system_call+126}


Code: 0f 0b b1 24 33 80 ff ff ff ff 4c 00 c7 07 01 00 00 00 fb c3
RIP <ffffffff80318f69>{_spin_unlock_irq+27} RSP <0000010215253e50>
Badness in do_unblank_screen at drivers/char/vt.c:2878

Call Trace:
        <ffffffff8023b25c>{do_unblank_screen+61}
        <ffffffff80124388>{bust_spinlocks+28}
        <ffffffff80111b68>{oops_end+18}
        <ffffffff80111c95>{die+54}
        <ffffffff80112058>{do_invalid_op+145}
        <ffffffff80318f69>{_spin_unlock_irq+27}
        <ffffffff80317d24>{thread_return+0}
        <ffffffff80317d7c>{thread_return+88}
        <ffffffff80110e1d>{error_exit+0}
        <ffffffff80318f69>{_spin_unlock_irq+27}
        <ffffffff8013897d>{do_syslog+405}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801360c4>{autoremove_wake_function+0}
        <ffffffff801b28e2>{kmsg_read+56}
        <ffffffff8017bac0>{vfs_read+207}
        <ffffffff8017bd1c>{sys_read+69}
        <ffffffff801102f6>{system_call+126}

Version-Release number of selected component (if applicable):
2.6.9-71.ELsmp

lvm2-2.02.37-2.el4    BUILT: Wed Jun 11 07:03:46 CDT 2008
lvm2-cluster-2.02.37-2.el4    BUILT: Wed Jun 11 08:56:04 CDT 2008
device-mapper-1.02.25-2.el4    BUILT: Mon Jun  9 09:28:41 CDT 2008
cmirror-1.0.1-1    BUILT: Tue Jan 30 17:28:02 CST 2007
cmirror-kernel-2.6.9-41.4    BUILT: Tue Jun  3 13:54:29 CDT 2008

Comment 1 Corey Marthaler 2008-06-11 20:19:11 UTC

Created attachment 308986 [details]
log from taft-01

Comment 2 Corey Marthaler 2008-06-11 20:21:06 UTC

Created attachment 308987 [details]
log from taft-02

Comment 3 Corey Marthaler 2008-06-11 20:27:01 UTC

Created attachment 308989 [details]
log from taft-03

Comment 4 Corey Marthaler 2008-06-11 20:28:16 UTC

Created attachment 308990 [details]
log from taft-04

Comment 5 Corey Marthaler 2008-06-13 13:40:06 UTC

Reproduced this again last night while running the same test case:
Kill the secendary legs of two synced core log mirrors.

lvm2-2.02.37-3.el4
lvm2-cluster-2.02.37-3.el4

Comment 6 Corey Marthaler 2008-06-13 15:58:21 UTC

This now seems fairly easy to reproduce, and since this testcase worked before,
it appears to be a regression.

Comment 7 Corey Marthaler 2008-06-13 20:31:10 UTC

Hit this during machine recovery testing as well. Shot hayes-01, and hayes-03
paniced.

GFS: fsid=HAYES:2.0: Done
dm-cmirror: Error listening for server(2) response for KbfWJO2B: -110
dm-cmirror: Error listening for server(2) response for KbfWJO2B: -110
GFS: fsid=HAYES:2.0: jid=1: Trying to acquire journal lock...
GFS: fsid=HAYES:1.2: jid=0: Trying to acquire journal lock...
GFS: fsid=HAYES:2.0: jid=1: Looking at journal...
GFS: fsid=HAYES:1.2: jid=0: Busy
dm-cmirror: Recovery blocked by outstanding write on region 6899/KbfWJO2B
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dm_cmirror_server:696                                            
        
[...]

Comment 8 Corey Marthaler 2008-06-16 15:02:17 UTC

FWIW, this can be reproduced by failing the primary leg as well, doesn't have to
be the secondary one.

Comment 9 Jonathan Earl Brassow 2008-06-17 03:38:58 UTC

Judging from comment #1, you should be able to repo this bug just by doing
linear -> (corelog) mirror upconverts...  This should be true because there is
no mirror state carried along when a failed mirror goes to linear.

I'm not sure I see a way for this panic to be triggered during normal
operation...  I think I see a way during machine/disk failure; so it will be
important to know when exactly this panic is happening.

(BTW, Thanks for including the build dates of the rpms - that's helpful.)

Comment 10 Nate Straz 2008-06-17 13:16:02 UTC

I hit something like this while running a related cmirror failure case with the
following bits:

cmirror-1.0.1-1
cmirror-kernel-2.6.9-41.4
lvm2-2.02.37-3.el4
lvm2-cluster-2.02.37-3.el4

Senario: Kill secondary leg of non synced core log 2 leg mirror(s)

****** Mirror hash info for this scenario ******
* name:      nonsyncd_secondary_core_2legs
* sync:      0
* mirrors:   1
* disklog:   0
* failpv:    /dev/sdd1
* legs:      2
* pvs:       /dev/sde1 /dev/sdd1
************************************************

During a resync morph-02 panicked with the following messages:

dm-cmirror: Mark attempted to recovering region by 3: 679/fA32IibI
dm-cmirror:   lc->recovering_region = 679
dm-cmirror:   ru->ru_rw             = 2
dm-cmirror:   ru->ru_nodeid         = 4
dm-cmirror:   ru->ru_region         = 679
------------[ cut here ]------------
kernel BUG at
/builddir/build/BUILD/cmirror-kernel-2.6.9-41/hugemem/src/dm-cmirror-server.c:574!
invalid operand: 0000 [#1]
SMP 
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U)
cman(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc
cpufreq_powersave button battery ac uhci_hcd hw_random e1000 floppy dm_snapshot
dm_zero dm_mirror ext3 jbd dm_mod qla2300 ata_piix libata qla2xxx
scsi_transport_fc sd_mod scsi_mod
CPU:    1
EIP:    0060:[<82c15820>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-71.ELhugemem) 
EIP is at server_mark_region+0x198/0x1bb [dm_cmirror]
eax: 0000002d   ebx: 79ad8200   ecx: 75578f34   edx: 82c18b67
esi: 7961ff20   edi: 39c6c375   ebp: 00000000   esp: 75578f30
ds: 007b   es: 007b   ss: 0068
Process cluster_log_ser (pid: 24176, threadinfo=75578000 task=773a3330)
Stack: 82c18b67 000002a7 00000000 82c18b3a 00000004 82c18b0d 00000002 00000003 
       81da5180 39c6c200 75578000 fffffffa ff000000 39c6c200 82c1664a 022d6202 
       00000043 75578000 7d5d85b0 773a3330 7e692d00 81da5220 00000000 c73e0002 
Call Trace:
 [<82c1664a>] process_log_request+0x28f/0x47b [dm_cmirror]
 [<022d6202>] schedule+0x8c2/0x8ee
 [<82c169e1>] cluster_log_serverd+0x1ab/0x21f [dm_cmirror]
 [<82c16836>] cluster_log_serverd+0x0/0x21f [dm_cmirror]
 [<021041f5>] kernel_thread_helper+0x5/0xb
Code: 20 ff 73 08 68 0d 8b c1 82 e8 66 d3 50 7f ff 73 0c 68 3a 8b c1 82 e8 59 d3
50 7f ff 73 14 ff 73 10 68 67 8b c1 82 e8 49 d3 50 7f <0f> 0b 3e 02 95 8b c1 82
83 c4 1c eb 0c 8b 00 89 70 04 89 06 89 
 <0>Fatal exception: panic in 5 seconds
dm-cmirror: Client finishing recovery: 679/fA32IibI
Kernel panic - not syncing: Fatal exception

Comment 11 Corey Marthaler 2008-06-19 15:33:37 UTC

It appears that the kernel mod that Jon gave me fixes this issue.

Comment 12 Jonathan Earl Brassow 2008-06-20 18:19:45 UTC

Fix checked into CVS:
commit a0e6a6d02a4a55b98078dc874204c5555dbf74a4
Author: Jonathan Brassow <jbrassow>
Date:   Fri Jun 20 13:17:44 2008 -0500

    dm-cmirror.ko:  Fix for bug 450939

    - must reset 'in_sync' var upon resume
    -- Clean-up of some compile warnings
    -- Additional debugging statements

Comment 13 Corey Marthaler 2008-06-23 16:23:25 UTC

Looks like this is still reproducable after all. Moving back to 'assigned'.

Comment 14 Jonathan Earl Brassow 2008-06-24 14:28:00 UTC

There should be no need for failure to reproduce this bug.  Let's also try:

1) Create corelog mirror - using '--nosync' is fine
2) get I/O going on all machines except the server
3) Deactivate the server (and only the server) node via 'lvchange -aln vg/lv'

This should trigger the bug.  The server will migrate, the outstanding I/O will
cause the mirror to appear 'not-in-sync', but the in_sync var will still be set.
 Once recovery collides with a write, the BUG will trigger.

The failure should happen within 10 minutes.  If it doesn't, you can try again.

Comment 15 Corey Marthaler 2008-06-25 16:57:22 UTC

My test is now hitting this on non corelog mirrors

Senario: Kill primary leg of synced 3 leg mirror(s)

[...] # well after the failure was dealt with

Enabling device sdg on taft-01
Enabling device sdg on taft-02
Enabling device sdg on taft-03
Enabling device sdg on taft-04

Recreating PV /dev/sdg1
  WARNING: Volume group helter_skelter is not consistent
  WARNING: Volume Group helter_skelter is not consistent
  WARNING: Volume group helter_skelter is not consistent
Extending the recreated PV back into VG helter_skelter
Since we can't yet up convert existing mirrors, down converting to linear(s)
on taft-04 before re-converting back to original mirror(s)
Up converting linear(s) back to mirror(s) on taft-04...
taft-04: lvconvert -m 2 helter_skelter/syncd_primary_3legs_1 /dev/sdg1:0-1000
/dev/sdf1:0-1000 /dev/sde1:0-1000 /dev/sdh1:0-150
taft-04: lvconvert -m 2 helter_skelter/syncd_primary_3legs_2 /dev/sdg1:0-1000
/dev/sdf1:0-1000 /dev/sde1:0-1000 /dev/sdh1:0-150
Verifying the up conversion from linear(s) to mirror(s)
Verifying device /dev/sdg1 *IS* in the mirror(s)
Could not connect to remote host
[PANIC]

Comment 16 Jonathan Earl Brassow 2008-06-27 14:37:55 UTC

should be fixed with the latest updates:


commit 3ab00427e3eaf45e99d5f40fed6f3b459faccb14
Author: Jonathan Brassow <jbrassow>
Date:   Fri Jun 27 09:36:00 2008 -0500

    dm-cmirror.ko:  Fix for bug 450939, and other minor cleanups

    - If a write-recovery conflict is detected, halt recovery
      rather than calling BUG() (the fix bug 450939)
    - Minor code style cleanups

Comment 20 errata-xmlrpc 2008-07-25 19:28:02 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0803.html