Bug 472094

Summary: Panic in kcopyd during block level I/O to origin with snapshots (dm_snapshot:pending_complete)
Product: Red Hat Enterprise Linux 5
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 5.3
CC: agk, dwysocha, edamato, heinzm, jbrassow, mbroz, prockai
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2008-11-18 19:43:07 UTC Type: ---

Description Corey Marthaler 2008-11-18 18:10:55 UTC
Description of problem:
I was running our block-level I/O snapshot regression test and hit this panic.
    
 Making snapshot block_snap32 of origin volume
 Running block level I/O to the origin and verifying on snapshot 32
 b_iogen starting up with the following:

 Iterations:      500
 Seed:            26086
 Offset-mode:     random
 Single Pass:     off
 Overlap Flag:    on
 Mintrans:        512000
 Maxtrans:        5120000
 Syscalls:        write  writev
 Flags:          direct

 Test Devices:

 Path                                                      Size
                                                         (bytes)
 ---------------------------------------------------------------
 /dev/snapper/origin                                        4294967296
    Snap Devices:
            /dev/snapper/block_snap32
 Didn't receive heartbeat for 120 seconds
 block level IO failed with snapshot 32


EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
device-mapper: snapshots: Invalidating snapshot: Unable to allocate exception.
Buffer I/O error on device dm-3, logical block 7152
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7153
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7154
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7155
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7156
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7157
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7158
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7159
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7160
lost page write due to I/O error on dm-3
Buffer I/O error on device dm-3, logical block 7161
lost page write due to I/O error on dm-3
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
Unable to handle kernel paging request at 0000000000200200 RIP: 
 [<ffffffff8014d0b6>] list_del+0x8/0x71
PGD 1ef5bf067 PUD 203231067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:06.0/0000:08:00.2/0000:0b:03.0/0000:0c:06.1/irq
CPU 1 
Modules linked in: gfs(U) dlm configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo cryd
Pid: 25952, comm: kcopyd Tainted: G      2.6.18-117.el5 #1
RIP: 0010:[<ffffffff8014d0b6>]  [<ffffffff8014d0b6>] list_del+0x8/0x71
RSP: 0018:ffff8101d419fd20  EFLAGS: 00010246
RAX: 0000000000200200 RBX: ffff8101dd9dbf28 RCX: 0000000000000001
RDX: ffff8101f26a8df0 RSI: ffff8101f26a8ef0 RDI: ffff8101dd9dbf28
RBP: 0000000000002ae7 R08: 0000000000002ae4 R09: ffff8101f26a8f10
R10: 0000000000002ae7 R11: ffff8101f26a8d90 R12: ffff8101dd9dbf28
R13: ffff81021fbf1400 R14: ffff8101fd1532e8 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8101fff107c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000200200 CR3: 00000001fefc5000 CR4: 00000000000006e0
Process kcopyd (pid: 25952, threadinfo ffff8101d419e000, task ffff8101ffd41040)
Stack:  ffff8101f26a8d90 ffffffff881118e8 0000000000000040 0000000000000000
 ffff810218c5cdc0 0000000000000049 ffffffff881119dc ffff8101fd1532e8
 0000000000000000 ffffffff88112af9 ffff8101d4b43498 ffff81021876cdc0
Call Trace:
 [<ffffffff881118e8>] :dm_snapshot:pending_complete+0x114/0x1d1
 [<ffffffff881119dc>] :dm_snapshot:commit_callback+0x0/0x5
 [<ffffffff88112af9>] :dm_snapshot:persistent_commit+0xc1/0xdc
 [<ffffffff881119a5>] :dm_snapshot:copy_callback+0x0/0x37
 [<ffffffff880d8a52>] :dm_mod:run_complete_job+0x51/0x80
 [<ffffffff880d8764>] :dm_mod:process_jobs+0x2a/0xed
 [<ffffffff880d8a01>] :dm_mod:run_complete_job+0x0/0x80
 [<ffffffff880d8827>] :dm_mod:do_work+0x0/0x47
 [<ffffffff880d8841>] :dm_mod:do_work+0x1a/0x47
 [<ffffffff8004d267>] run_workqueue+0x94/0xe4
 [<ffffffff80049b20>] worker_thread+0x0/0x122
 [<ffffffff8009e7f0>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80049c10>] worker_thread+0xf0/0x122
 [<ffffffff8008b29a>] default_wake_function+0x0/0xe
 [<ffffffff8009e7f0>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8009e7f0>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800324f3>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009e7f0>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800323f5>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11


Code: 48 8b 10 48 39 fa 74 1b 48 89 fe 31 c0 48 c7 c7 6a b6 2a 80 
RIP  [<ffffffff8014d0b6>] list_del+0x8/0x71
 RSP <ffff8101d419fd20>
CR2: 0000000000200200
 <0>Kernel panic - not syncing: Fatal exception
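
Note on reading the oops: the faulting address in CR2 (also in RAX), 0000000000200200, matches the poison value that list_del() writes into the prev pointer of an entry it has just unlinked (LIST_POISON2; the next pointer gets LIST_POISON1, 0x00100100). That is consistent with pending_complete() calling list_del() on an entry that had already been removed from its list, although the trace alone does not prove it. A minimal Python sketch of that check follows. The helper name classify_fault_address is hypothetical, and the poison constants are assumed to be the ones used by this 2.6.18-based kernel.

```python
import re

# List poison values used by Linux kernels of this era (assumed for 2.6.18).
LIST_POISON1 = 0x00100100  # written to entry->next by list_del()
LIST_POISON2 = 0x00200200  # written to entry->prev by list_del()


def classify_fault_address(oops_text):
    """Describe the CR2 value found in an oops dump (hypothetical helper)."""
    match = re.search(r"CR2:\s*([0-9a-fA-F]+)", oops_text)
    if match is None:
        return "no CR2 line found"
    addr = int(match.group(1), 16)
    if addr == LIST_POISON1:
        return "LIST_POISON1 (next pointer of an already-deleted list entry)"
    if addr == LIST_POISON2:
        return "LIST_POISON2 (prev pointer of an already-deleted list entry)"
    return "not a list poison value"


# Register line copied from the oops in this report.
sample = "CR2: 0000000000200200 CR3: 00000001fefc5000 CR4: 00000000000006e0"
print(classify_fault_address(sample))
```

A match only narrows the search to a double list_del() or a use of a stale list entry; it does not say which code path unlinked the entry first.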


Version-Release number of selected component (if applicable):
2.6.18-117.el5

lvm2-2.02.40-6.el5    BUILT: Fri Oct 24 07:37:33 CDT 2008
lvm2-cluster-2.02.40-6.el5    BUILT: Fri Oct 24 07:38:44 CDT 2008
device-mapper-1.02.28-2.el5    BUILT: Fri Sep 19 02:50:32 CDT 2008
cmirror-1.1.34-5.el5    BUILT: Thu Nov  6 15:10:44 CST 2008
kmod-cmirror-0.1.21-2.el5    BUILT: Thu Nov  6 14:12:07 CST 2008

Comment 1 Corey Marthaler 2008-11-18 19:30:44 UTC
This is easily reproducible. Marking as a Regression.

Comment 2 Corey Marthaler 2008-11-18 19:33:19 UTC
FWIW, none of the snaps are close to being full:

  block_snap16 snapper    swi-a-  3.50G origin  44.93                         /dev/sdc1(0)   
  block_snap32 snapper    swi-a-  3.50G origin  31.84                         /dev/sdd1(0)   
  block_snap64 snapper    swi-a-  3.50G origin   7.23                         /dev/sde1(0)   
  origin       snapper    owi-a-  4.00G                                       /dev/sdb1(0)
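
The "not close to being full" observation can be checked mechanically from the listing above. This is a rough Python sketch, assuming the sixth whitespace-separated field of each snapshot row is the snapshot usage percentage and that an attribute string starting with "s" marks a snapshot volume. The function name snapshot_usage and the 80% threshold are illustrative choices, not part of LVM.

```python
# Rows copied from the listing in this comment.
LVS_OUTPUT = """\
  block_snap16 snapper    swi-a-  3.50G origin  44.93                         /dev/sdc1(0)
  block_snap32 snapper    swi-a-  3.50G origin  31.84                         /dev/sdd1(0)
  block_snap64 snapper    swi-a-  3.50G origin   7.23                         /dev/sde1(0)
  origin       snapper    owi-a-  4.00G                                       /dev/sdb1(0)
"""


def snapshot_usage(lvs_text):
    """Map snapshot LV name to its usage percentage (hypothetical helper)."""
    usage = {}
    for line in lvs_text.splitlines():
        fields = line.split()
        # Snapshot rows have an attr string starting with "s" and carry
        # the usage percentage as the sixth field; the origin row does not.
        if len(fields) >= 6 and fields[2].startswith("s"):
            usage[fields[0]] = float(fields[5])
    return usage


FULL_THRESHOLD = 80.0  # illustrative cut-off for "close to full"
usage = snapshot_usage(LVS_OUTPUT)
for name, pct in sorted(usage.items()):
    state = "near full" if pct >= FULL_THRESHOLD else "ok"
    print(f"{name}: {pct:.2f}% ({state})")
```

With these rows, every snapshot is under half full, so the "Unable to allocate exception" invalidation in the log is unlikely to be explained by a full snapshot.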

Comment 3 Corey Marthaler 2008-11-18 19:43:07 UTC

*** This bug has been marked as a duplicate of bug 465825 ***