Bug 228067
| Summary: | GFS I/O during mirror leg failure on three-legged mirror panicked in dm_mirror:mirror_end_io |
|---|---|
| Product: | [Retired] Red Hat Cluster Suite |
| Reporter: | Corey Marthaler <cmarthal> |
| Component: | cmirror |
| Assignee: | Jonathan Earl Brassow <jbrassow> |
| Status: | CLOSED WORKSFORME |
| QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium |
| Docs Contact: | |
| Priority: | high |
| Version: | 4 |
| CC: | agk, dwysocha, jbrassow, jnomura, mbroz, prockai, rkenna |
| Target Milestone: | --- |
| Target Release: | --- |
| Hardware: | All |
| OS: | Linux |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2007-08-24 18:56:33 UTC |
| Type: | --- |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Attachments: | |
bz 228070 is the issue that happened on link-02 during this panic.

```
Feb  9 08:31:37 link-02 kernel: device-mapper: Unable to read from primary mirror during recovery
Feb  9 08:31:37 link-02 kernel: device-mapper: Primary mirror device has failed while mirror is out of sync.
Feb  9 08:31:37 link-02 kernel: device-mapper: Unable to choose alternative primary device
Feb  9 08:31:37 link-02 kernel: device-mapper: recovery failed on region 5576
```

Created attachment 147826 [details] Analysis of oops

The call chain of the oops looks like: mirror_end_io()->rh_dec()->__rh_lookup(). It looks like the call to __rh_lookup() inside rh_dec() fails to find a region pointer, and the atomic_dec_and_test() instruction is what faults (full analysis in attachment).

```c
static void rh_dec(struct region_hash *rh, region_t region)
{
	unsigned long flags;
	struct region *reg;
	int should_wake = 0;

	read_lock(&rh->hash_lock);
	reg = __rh_lookup(rh, region);
	read_unlock(&rh->hash_lock);

	spin_lock_irqsave(&rh->region_lock, flags);
>>>OOPS>>>
	if (atomic_dec_and_test(&reg->pending)) {
		/*
		 * There is no pending I/O for this region.
		 * We can move the region to corresponding list for next action.
		 */
```

Was able to reproduce this panic. Bumping priority.

I propose setting a restriction that mirrors are limited to 2 sides for 4.5. This would defuse this bug. Once we agree on that, I'll open an RFE for 4.6 and make this bug dependent on that.

This bug looks more like a problem with a reg struct being forgotten (if the analysis is correct).

Once the machine (presumably the server) goes down with the panic, the server seems unable to migrate... Seems a lot like 217895 after the panic.

228104 and 228056 have been fixed and are waiting for a package build. I believe this bug will still exist after that fix, but it would be nice to confirm.

Continuing on comment #3... rh_dec should always find the reg... the only way it can't is if it's not on the hash_list.
The only way it gets removed from the hash_list is if it somehow got onto the clean_regions, recovered_regions, or failed_recovered_regions lists... Recovery can't start until all writes for the region have cleared, and new writes can't start on a region until its recovery is complete... That would seem to rule out the latter two ways of getting off the hash list. In order to get on the clean_regions list before the final write completes, the reg->pending count would have to be screwed up. I've reviewed the pending variable and I don't see a way that can get screwed up... That pretty much means that we looked up a region by the wrong number and there was no memory to allocate a new region. Very unlikely.

Looking at it another way... forget about whether the reg is lost or not. We would have to fail a lookup and fail an allocation to get a NULL reg.

Corey: have you been able to reproduce this one? It is _very_ interesting to see the node oops in mirror_end_io... We might be able to add some debugging to the kernel, but we need to be able to easily reproduce it first.

Waiting for reproduction.

This has not been reproduced with the latest packages. Marking verified.

Reproduced this last night on link-07 during cmirror failure, reopening...
```
Senario: Kill secondary leg of non synced 2 leg mirror
****** Mirror hash info for this scenario ******
* name: fail_secondary_non_synced_2_legs
* sync: 0
* disklog: 1
* failpv: /dev/sdb1
* legs: 2
* pvs: /dev/sda1 /dev/sdb1 /dev/sdg1
************************************************
Creating mirror on link-07...
qarsh root@link-07 lvcreate -m 1 -n fail_secondary_non_synced_2_legs -L 800M helter_skelter /dev/sda1:0-500 /dev/sdb1:0-500 /dev/sdg1:0-50
mirror is only 6.00% synced right now
Creating gfs on top of mirror on link-07...
Creating mnt point /mnt/fail_secondary_non_synced_2_legs on link-02...
Mounting gfs on link-02...
Creating mnt point /mnt/fail_secondary_non_synced_2_legs on link-04...
Mounting gfs on link-04...
Creating mnt point /mnt/fail_secondary_non_synced_2_legs on link-07...
Mounting gfs on link-07...
Creating mnt point /mnt/fail_secondary_non_synced_2_legs on link-08...
Mounting gfs on link-08...
Writing files to gfs on...
link-02
checkit starting with:
CREATE
Num files: 100
Random Seed: 29592
Verify XIOR Stream: /tmp/checkit_fail_secondary_non_synced_2_legs
Working dir: /mnt/fail_secondary_non_synced_2_legs
link-04
checkit starting with:
CREATE
Num files: 100
Random Seed: 29684
Verify XIOR Stream: /tmp/checkit_fail_secondary_non_synced_2_legs
Working dir: /mnt/fail_secondary_non_synced_2_legs
link-07
checkit starting with:
CREATE
Num files: 100
Random Seed: 30312
Verify XIOR Stream: /tmp/checkit_fail_secondary_non_synced_2_legs
Working dir: /mnt/fail_secondary_non_synced_2_legs
link-08
checkit starting with:
CREATE
Num files: 100
Random Seed: 28879
Verify XIOR Stream: /tmp/checkit_fail_secondary_non_synced_2_legs
Working dir: /mnt/fail_secondary_non_synced_2_legs
Disabling device sdb on link-02
Disabling device sdb on link-04
Disabling device sdb on link-07
Disabling device sdb on link-08
Attempting I/O to cause mirror down conversion on link-07
10+0 records in
10+0 records out
Didn't receive heartbeat for 120 seconds
Verifying the down conversion from mirror to linear
connect() failed on local socket: Connection refused
Skipping clustered volume group helter_skelter
down conversion didn't appear to work as a simple lvs fails
run_iteration() method failed at
/home/msp/cmarthal/work/rhel4/sts-root/lib/FI_engine.pm line 21.
[PANIC]
```
```
Apr 25 19:09:43 link-07 lvm[5439]: No longer monitoring mirror device
helter_skelter-fail_secondary_non_synced_2_legs for events
device-mapper: recovery failed on region 959
scsi2 (0:2): rejecting I/O to offline device
Unable to handle kernel NULL pointer dereference at 0000000000000038 RIP:
<ffffffffa00f8caa>{:dm_mirror:mirror_end_io+102}
PML4 1dc36067 PGD 1dc35067 PMD 0
Oops: 0002 [1] SMP
CPU 0
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U) cman(U) md5 ipv6
parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc ds yenta_socket
pcmcia_core button battery ac ohci_hcd hw_random k8_edac edac_mc tg3 floppy
dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc
mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 0, comm: swapper Not tainted 2.6.9-55.ELsmp
RIP: 0010:[<ffffffffa00f8caa>] <ffffffffa00f8caa>{:dm_mirror:mirror_end_io+102}
RSP: 0018:ffffffff8045a898 EFLAGS: 00010046
RAX: 0000000000000216 RBX: 000001003f47a830 RCX: 0000000000000216
RDX: 00000000000003bf RSI: 0000000000000216 RDI: 000001003f47a850
RBP: 000001003f47a818 R08: ffffff0000059000 R09: 000000000000003f
R10: ffffffffa00ac2d2 R11: ffffffffa00ac2d2 R12: 0000000000000000
R13: 000001003f47a800 R14: 0000000000000000 R15: 0000000000000000
FS:  0000002a95562b00(0000) GS:ffffffff804ed700(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff804f0000, task ffffffff803d7280)
Stack: 0000000100000000 0000010036794750 0000000000000000 00000100367f9980
       0000000000000000 00000100367fc628 000001001f63f600 ffffffffa00ac338
       0000000000000000 0000010036794750
Call Trace:<IRQ> <ffffffffa00ac338>{:dm_mod:clone_endio+102}
       <ffffffffa00ac44f>{:dm_mod:clone_endio+381}
       <ffffffff802512cf>{__end_that_request_first+238}
       <ffffffffa000703c>{:scsi_mod:scsi_end_request+40}
       <ffffffffa0007351>{:scsi_mod:scsi_io_completion+497}
       <ffffffffa0002d45>{:scsi_mod:scsi_softirq+213}
       <ffffffff8013cbec>{__do_softirq+88}
       <ffffffff8013cc95>{do_softirq+49} <ffffffff80113247>{do_IRQ+328}
       <ffffffff80110833>{ret_from_intr+0} <EOI>
       <ffffffff8010e749>{default_idle+0}
       <ffffffff8010e769>{default_idle+32} <ffffffff8010e7dc>{cpu_idle+26}
       <ffffffff804f367b>{start_kernel+470}
       <ffffffff804f31d5>{_sinittext+469}
Code: f0 41 ff 4c 24 38 0f 94 c0 84 c0 74 66 41 8b 44 24 10 83 f8
RIP <ffffffffa00f8caa>{:dm_mirror:mirror_end_io+102} RSP <ffffffff8045a898>
CR2: 0000000000000038
<0>Kernel panic - not syncing: Oops
```
Created attachment 153521 [details] panic trace

Attaching panic as a text file since it didn't look good in comment #15.

bug 238031 was fixed/built shortly after this bug was re-filed. That bug addressed issues of conflicting recovery/writes. Complain all you want about the performance of cluster mirrors during recovery, but you can't argue with the results... no more GFS panics. This bug may not be addressed with the latest packages... the mirror_end_io business seems new to me. I'd like to see it reproduced though.

Moving defect to 4.6 list to track for the next release if more information is uncovered.

Moving to 4.7 consideration until more data is provided.

Closing this bug due to the inability to reproduce.
Description of problem:

I had 4 (3-legged) cluster mirrors (created out of 2 separate VGs), each with GFS on top and all running filesystem I/O. I then failed /dev/hda, which was set up to be one of the three legs in all 4 mirrors. Just after the leg failed I hit the following panic on link-04. I also hit a different assertion on link-01 (bug to be filed next).

```
Feb  9 08:58:46 link-04 kernel: SCSI error : <1 0 1 1> return code = 0x10000
Feb  9 08:58:46 link-04 kernel: end_request: I/O error, dev sdh, sector 1277144231
Feb  9 08:58:46 link-04 kernel: SCSI error : <1 0 1 1> return code = 0x10000
Feb  9 08:58:46 link-04 kernel: end_request: I/O error, dev sdh, sector 1277144239
Feb  9 08:58:46 link-04 kernel: device-mapper: Write error during recovery (error = 0x1)
Feb  9 08:58:46 link-04 kernel: dm-cmirror: Error while listening for server response: -110
device-mapper: recovery failed on region 10346
SCSI error : <1 0 1 1> return code = 0x10000
end_request: I/O error, dev sdh, sector 1293230263
Unable to handle kernel NULL pointer dereference at 0000000000000038 RIP:
<ffffffffa00a4caa>{:dm_mirror:mirror_end_io+102}
PML4 32f45067 PGD 32f46067 PMD 0
Oops: 0002 [1] SMP
CPU 0
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) mptfc dm_cmirror(U) dlm(U) cman(U) md5 ipv6
parport_pc lp parport autofs4 sunrpc ds yenta_socket pcmcia_core button battery
ac ohci_hcd hw_random k8_edac edac_mc tg3 dm_snapshot dm_zero dm_mirror ext3
jbd dm_mod mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 5430, comm: xdoio Not tainted 2.6.9-46.ELsmp
RIP: 0010:[<ffffffffa00a4caa>] <ffffffffa00a4caa>{:dm_mirror:mirror_end_io+102}
RSP: 0018:ffffffff8045ad18 EFLAGS: 00010046
RAX: 0000000000000216 RBX: 000001003eef2c30 RCX: 0000000000000216
RDX: 000000000000286a RSI: 0000000000000216 RDI: 000001003eef2c50
RBP: 000001003eef2c18 R08: ffffff000017a000 R09: 00000000000000ff
R10: ffffffffa00582d2 R11: ffffffffa00582d2 R12: 0000000000000000
R13: 000001003eef2c00 R14: 0000000000000000 R15: 0000000000000000
FS:  0000002a95563f00(0000) GS:ffffffff804ed500(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000000101000 CR4: 00000000000006e0
Process xdoio (pid: 5430, threadinfo 0000010032fac000, task 0000010032bf27f0)
Stack: 0000000100000000 00000100326ced90 0000000000000000 0000010039b06080
       0000000000000000 0000010031fce8e0 000001003efbc600 ffffffffa0058338
       0000000000000000 00000100326ced90
Call Trace:<IRQ> <ffffffffa0058338>{:dm_mod:clone_endio+102}
       <ffffffffa005844f>{:dm_mod:clone_endio+381}
       <ffffffff80251b17>{__end_that_request_first+238}
       <ffffffffa000703c>{:scsi_mod:scsi_end_request+40}
       <ffffffffa0007351>{:scsi_mod:scsi_io_completion+497}
       <ffffffffa0002d45>{:scsi_mod:scsi_softirq+213}
       <ffffffff8013cfa8>{__do_softirq+88}
       <ffffffff8013d051>{do_softirq+49} <ffffffff80113247>{do_IRQ+328}
       <ffffffff80110833>{ret_from_intr+0} <EOI>
       <ffffffffa0227e80>{:gfs:get_block+0}
       <ffffffff801ed5a2>{copy_user_generic_c+8}
       <ffffffff8015c2bd>{generic_file_buffered_write+720}
       <ffffffffa005ac27>{:dm_mod:dm_table_any_congested+68}
       <ffffffff8013cf22>{current_fs_time+89}
       <ffffffff8015ca01>{__generic_file_aio_write_nolock+731}
       <ffffffff8015cc9f>{generic_file_aio_write_nolock+32}
       <ffffffff8015ce6d>{generic_file_write_nolock+158}
       <ffffffff80136020>{autoremove_wake_function+0}
       <ffffffffa022a5a2>{:gfs:do_write_buf+997}
       <ffffffffa02294a3>{:gfs:walk_vm+278}
       <ffffffffa022a1bd>{:gfs:do_write_buf+0}
       <ffffffff8016a3a9>{do_no_page+1023}
       <ffffffffa022a1bd>{:gfs:do_write_buf+0}
       <ffffffff8016a7f6>{handle_mm_fault+373}
       <ffffffffa022a88e>{:gfs:__gfs_write+201}
       <ffffffffa022a8b5>{:gfs:gfs_write+0}
       <ffffffff8017a585>{do_readv_writev+463}
       <ffffffff8030c4d1>{thread_return+0}
       <ffffffff8030c529>{thread_return+88}
       <ffffffff8016d797>{do_mmap_pgoff+1560}
       <ffffffff8017a72d>{sys_writev+69}
       <ffffffff8011026a>{system_call+126}
Code: f0 41 ff 4c 24 38 0f 94 c0 84 c0 74 66 41 8b 44 24 10 83 f8
RIP <ffffffffa00a4caa>{:dm_mirror:mirror_end_io+102} RSP <ffffffff8045ad18>
CR2: 0000000000000038
<0>Kernel panic - not syncing: Oops
```

link-07:

```
dm-cmirror: Failed to receive election results from server: -110
Feb  9 14:13:42 link-07 last message repeated 4 times
dm-cmirror: Failed to receive election results from server: -110
dm-cmirror: Failed to receive election results from server: -110
dm-cmirror: Failed to receive election results from server: -110
dm-cmirror: Failed to receive election results from server: -110
Feb  9 14:15:02 link-07 last message repeated 4 times
dm-cmirror: Failed to receive election results from server: -110
dm-cmirror: Failed to receive election results from server: -110
```

link-08:

```
Feb  9 08:59:45 link-08 kernel: SCSI error : <2 0 1 1> return code = 0x10000
Feb  9 08:59:45 link-08 kernel: end_request: I/O error, dev sdh, sector 10318201
Feb  9 08:59:45 link-08 kernel: device-mapper: Write error during recovery (error = 0x1)
Feb  9 08:59:45 link-08 kernel: dlm: clvmd: dlm_dir_rebuild_wait failed -1
Feb  9 08:59:45 link-08 kernel: device-mapper: recovery failed on region 10075
Feb  9 08:59:45 link-08 kernel: dlm: gfs1: dlm_dir_rebuild_wait failed -1
Feb  9 08:59:45 link-08 kernel: dlm: gfs3: dlm_dir_rebuild_wait failed -1
Feb  9 08:59:45 link-08 kernel: dlm: gfs2: dlm_dir_rebuild_wait failed -1
Feb  9 08:59:45 link-08 kernel: device-mapper: recovery failed on region 5575
Feb  9 08:59:45 link-08 kernel: SCSI error : <2 0 1 1> return code = 0x10000
Feb  9 08:59:45 link-08 kernel: end_request: I/O error, dev sdh, sector 20971905
```

Version-Release number of selected component (if applicable):

```
2.6.9-46.ELsmp
lvm2-2.02.21-1.el4
lvm2-cluster-2.02.21-3.el4
cmirror-kernel-smp-2.6.9-19.1
cmirror-1.0.1-1
```