Bug 214808
| Summary: | cmirror operations can cause clvmd to deadlock | ||
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
| Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 4 | CC: | agk, cfeist, dwysocha, mbroz, prockai |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2008-08-05 21:40:32 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 217626 | ||
| Bug Blocks: | |||
|
Description
Corey Marthaler
2006-11-09 16:32:59 UTC
I thought I saw this earlier, but before I got serious about reproducing it, I grabbed the latest RHEL4 branch. I haven't been able to reproduce it since. I'll let it run overnight, but I'm wondering if this is reproducible with the latest rpms (i.e. >= 11/13/2006).

Ok, I see this all the time now. I've found a bug in CMAN that is allowing the cluster mirroring code to get the same CMAN id as clvmd, which causes much breakage. I've supplied dave with a patch.

I appear to have hit this again last night running creations/deletions on a two-node cluster (kool and salem). Here's what was running:

[root@kool ~]# while true
> do
> lvcreate -m 1 -n kool -L 10G vg
> sleep 5
> lvchange -an /dev/vg/kool
> sleep 4
> lvremove -f /dev/vg/kool
> sleep 2
> done

[root@salem ~]# while true
> do
> lvcreate -m 1 -n salem -L 10G vg
> sleep 10
> lvchange -an /dev/vg/salem
> sleep 2
> lvremove -f /dev/vg/salem
> sleep 5
> done

I straced an lvs command and that is stuck in the same place as in the original report. The traces of the other hung cmds are also very similar to the trace in the original report.
Jan 31 10:13:05 kool kernel: dlm_recoverd S 0000000000000000 0 4132 6 29456 4131 (L-TLB)
Jan 31 10:13:05 kool kernel: 0000010078aadea8 0000000000000046 0000000000000004 000001007deaade8
Jan 31 10:13:05 kool kernel: 00000000fffffffb 0000000000000000 0000010001021a80 0000000080132fba
Jan 31 10:13:06 kool kernel: 000001007d2577f0 0000000000003526
Jan 31 10:13:06 kool kernel: Call Trace:<ffffffffa023b8a7>{:dlm:wake_astd+27} <ffffffffa024b44f>{:dlm:dlm_recoverd+60}
Jan 31 10:13:06 kool kernel: <ffffffffa024b413>{:dlm:dlm_recoverd+0} <ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 10:13:06 kool kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8}
Jan 31 10:13:06 kool kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0}
Jan 31 10:13:06 kool kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 10:13:06 kool kernel: dmeventd S ffffffff8030bb58 0 4212 1 29399 4156 (NOTLB)
Jan 31 10:13:06 kool kernel: 00000100727d5d78 0000000000000006 0000010001698030 0000000000000076
Jan 31 10:13:06 kool kernel: 0000000000000000 000000000000db00 000000d001029a80 0000000100000246
Jan 31 10:13:06 kool kernel: 00000100774ce030 00000000000001f2
Jan 31 10:13:06 kool kernel: Call Trace:<ffffffff80140528>{__mod_timer+293} <ffffffff8030cf6f>{schedule_timeout+367}
Jan 31 10:13:06 kool kernel: <ffffffff80140f52>{process_timeout+0} <ffffffff8018c5c3>{do_select+939}
Jan 31 10:13:06 kool kernel: <ffffffff8018c15d>{__pollwait+0} <ffffffff8018c942>{sys_select+820}
Jan 31 10:13:06 kool kernel: <ffffffff8011026a>{system_call+126}
Jan 31 10:13:06 kool kernel: dmeventd S 0000010077e3f000 0 29458 1 29460 29423 (NOTLB)
Jan 31 10:13:06 kool kernel: 0000010072325cf8 0000000000000006 000001007be5390c ffffffff00000073
Jan 31 10:13:06 kool kernel: 0000000000000000 0000000000000000 0000010001029a80 0000000100000000
Jan 31 10:13:07 kool kernel: 00000100715af030 000000000001653b
Jan 31 10:13:07 kool kernel: Call Trace:<ffffffffa0055870>{:dm_mod:dev_wait+0} <ffffffffa0052c52>{:dm_mod:dm_wait_event+151}
Jan 31 10:13:07 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:13:07 kool kernel: <ffffffff801eba99>{__up_read+16} <ffffffffa005598d>{:dm_mod:dev_wait+285}
Jan 31 10:13:07 kool kernel: <ffffffffa00565e3>{:dm_mod:ctl_ioctl+602} <ffffffff80182a1a>{sys_newstat+32}
Jan 31 10:13:07 kool kernel: <ffffffff8018bbc9>{sys_ioctl+853} <ffffffff8011026a>{system_call+126}
Jan 31 10:13:07 kool kernel:
Jan 31 10:13:07 kool kernel: lvcreate S 0000000000000012 0 29398 3934 (NOTLB)
Jan 31 10:13:07 kool kernel: 000001007bdd9bd8 0000000000000006 000001007bdd9e58 0000000000000000
Jan 31 10:13:07 kool kernel: 000001007bdd9b98 ffffffff80132a01 000001007d0d85e8 0000000101685e00
Jan 31 10:13:07 kool kernel: 0000010070f1c7f0 0000000000000f10
Jan 31 10:13:07 kool kernel: Call Trace:<ffffffff80132a01>{recalc_task_prio+337} <ffffffff80132a8f>{activate_task+124}
Jan 31 10:13:07 kool kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f1c>{prepare_to_wait+21}
Jan 31 10:13:07 kool kernel: <ffffffff8030804e>{unix_stream_recvmsg+592} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:13:07 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a93da>{sock_aio_read+297}
Jan 31 10:13:07 kool kernel: <ffffffff802a9520>{sock_aio_write+306} <ffffffff80179eb5>{do_sync_read+178}
Jan 31 10:13:07 kool kernel: <ffffffff80188047>{__user_walk+94} <ffffffff801826b4>{vfs_stat64+24}
Jan 31 10:13:08 kool kernel: <ffffffff8030c44d>{thread_return+0} <ffffffff8030c4a5>{thread_return+88}
Jan 31 10:13:08 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80179fc3>{vfs_read+226}
Jan 31 10:13:08 kool kernel: <ffffffff8017a20c>{sys_read+69} <ffffffff8011026a>{system_call+126}
Jan 31 10:13:08 kool kernel:
Jan 31 10:13:08 kool kernel: cluster_log_s S ffffffff8030bb58 0 29423 1 29458 29399 (L-TLB)
Jan 31 10:13:08 kool kernel: 000001007850f988 0000000000000046 0000010001698030 7fffffff00000086
Jan 31 10:13:08 kool kernel: 000001007850f988 0000000000000000 0000010001029a80 00000001804192c0
Jan 31 10:13:08 kool kernel: 00000100786f8030 0000000000000095
Jan 31 10:13:08 kool kernel: Call Trace:<ffffffff8030b8d7>{schedule+13} <ffffffff8030cee0>{schedule_timeout+224}

Jan 31 10:12:38 salem kernel: cluster_log_s S ffffffff8030bb58 0 26081 1 26105 4180 (L-TLB)
Jan 31 10:12:38 salem kernel: 0000010072379988 0000000000000046 0000010001697030 0000000000000078
Jan 31 10:12:38 salem kernel: 0000000000000000 0000000000000000 0000010001029a80 0000000100000000
Jan 31 10:12:38 salem kernel: 0000010072cb97f0 00000000000000d1
Jan 31 10:12:38 salem kernel: Call Trace:<ffffffffa01180af>{:tg3:tg3_start_xmit_dma_bug+1504}
Jan 31 10:12:38 salem kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f78>{prepare_to_wait_exclusive+21}
Jan 31 10:12:38 salem kernel: <ffffffff802aecf5>{skb_recv_datagram+373} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:12:38 salem kernel: <ffffffff802d03f1>{ip_push_pending_frames+855} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:12:38 salem kernel: <ffffffff802eb748>{udp_recvmsg+118} <ffffffff802ac722>{sock_common_recvmsg+48}
Jan 31 10:12:38 salem kernel: <ffffffff802a9234>{sock_recvmsg+284} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:12:38 salem kernel: <ffffffff80140528>{__mod_timer+293} <ffffffffa0260698>{:dm_cmirror:my_recvmsg+265}
Jan 31 10:12:39 salem kernel: <ffffffffa0260580>{:dm_cmirror:set_sigusr1+0} <ffffffffa0260580>{:dm_cmirror:set_sigusr1+0}
Jan 31 10:12:39 salem kernel: <ffffffffa025f9d9>{:dm_cmirror:cluster_log_serverd+752}
Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff80110f47>{child_rip+8}
Jan 31 10:12:39 salem kernel: <ffffffffa025f6e9>{:dm_cmirror:cluster_log_serverd+0}
Jan 31 10:12:39 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 10:12:39 salem kernel: kmirrord S ffffffffa009c644 0 26103 6 26104 4100 (L-TLB)
Jan 31 10:12:39 salem kernel: 0000010072d31e68 0000000000000046 0000010072bd3030 0000000000000069
Jan 31 10:12:39 salem kernel: 000001007d2f7080 00000000013ff800 0000000000000400 0000000037c1a200
Jan 31 10:12:39 salem kernel: 00000100782b2030 0000000000000030
Jan 31 10:12:39 salem kernel: Call Trace:<ffffffffa009c644>{:dm_mirror:do_work+0} <ffffffff80148019>{worker_thread+226}
Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67}
Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 10:12:39 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 10:12:39 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8}
Jan 31 10:12:39 salem kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0}
Jan 31 10:12:39 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 10:12:39 salem kernel: kcopyd S ffffffffa0057334 0 26104 6 26103 (L-TLB)
Jan 31 10:12:39 salem kernel: 00000100722b3e68 0000000000000046 00000100722b3de8 ffffffff00000069
Jan 31 10:12:39 salem kernel: 0000000000000000 0000000072f65a18 0000010001021a80 00000000774cf5e0
Jan 31 10:12:39 salem kernel: 0000010072bd3030 0000000000000143
Jan 31 10:12:39 salem kernel: Call Trace:<ffffffffa0057334>{:dm_mod:do_work+0} <ffffffff80148019>{worker_thread+226}
Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67}
Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 10:12:40 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 10:12:40 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8}
Jan 31 10:12:40 salem kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0}
Jan 31 10:12:40 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 10:12:40 salem kernel: lvchange S 0000000000000012 0 26107 3937 (NOTLB)
Jan 31 10:12:40 salem kernel: 0000010072bb3bd8 0000000000000006 0000010072bb3e58 0000000000000000
Jan 31 10:12:40 salem kernel: 0000010072bb3b98 ffffffff80132a01 000001007d0dc408 0000000001684e00
Jan 31 10:12:40 salem kernel: 0000010078ea0030 0000000000000d70
Jan 31 10:12:40 salem kernel: Call Trace:<ffffffff80132a01>{recalc_task_prio+337} <ffffffff80132a8f>{activate_task+124}
Jan 31 10:12:40 salem kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f1c>{prepare_to_wait+21}
Jan 31 10:12:40 salem kernel: <ffffffff8030804e>{unix_stream_recvmsg+592} <ffffffff80136020>{autoremove_wake_function+0}
Jan 31 10:12:40 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a93da>{sock_aio_read+297}
Jan 31 10:12:40 salem kernel: <ffffffff802a9520>{sock_aio_write+306} <ffffffff80179eb5>{do_sync_read+178}
Jan 31 10:12:40 salem kernel: <ffffffff80188047>{__user_walk+94} <ffffffff801826b4>{vfs_stat64+24}
Jan 31 10:12:40 salem kernel: <ffffffff8030c44d>{thread_return+0} <ffffffff8030c4a5>{thread_return+88}
Jan 31 10:12:40 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80179fc3>{vfs_read+226}
Jan 31 10:12:40 salem kernel: <ffffffff8017a20c>{sys_read+69} <ffffffff8011026a>{system_call+126}
Jan 31 10:12:40 salem kernel:

Is this still a problem?

sorry, didn't look at the dates of the last comment

I also appear to have hit this while testing a failure scenario. Again a two-node cluster: I had two cmirrors, each with gfs on top and I/O going from both nodes to both mirrors. I then killed one of the legs in each mirror and also killed one of the 2 nodes in the cluster. The I/O and access to each filesystem continued, but all lvm commands are now deadlocked.
Jan 31 17:46:27 salem kernel: cluster_log_s S 0000000000000000 0 4755
1 4768 4574 (L-TLB)
Jan 31 17:46:27 salem kernel: 00000100702e5988 0000000000000046 0000010037c23c10
ffffffff0000007d
Jan 31 17:46:27 salem kernel: ffffffff804d5ea0 00000000804184a0
0000010001029a80 00000001802b26c1
Jan 31 17:46:27 salem kernel: 000001007154d030 0000000000000073
Jan 31 17:46:27 salem kernel: Call Trace:<ffffffff802b2786>{process_backlog+136}
<ffffffff8030cee0>{schedule_timeout+224}
Jan 31 17:46:27 salem kernel: <ffffffff8013cfac>{__do_softirq+88}
<ffffffff80135f78>{prepare_to_wait_exclusive+21}
Jan 31 17:46:28 salem kernel: <ffffffff802aecf5>{skb_recv_datagram+373}
<ffffffff80136020>{autoremove_wake_function+0}
Jan 31 17:46:28 salem kernel:
<ffffffff802d03f1>{ip_push_pending_frames+855}
<ffffffff80136020>{autoremove_wake_function+0}
Jan 31 17:46:28 salem kernel: <ffffffff802eb748>{udp_recvmsg+118}
<ffffffff802ac722>{sock_common_recvmsg+48}
Jan 31 17:46:28 salem kernel: <ffffffff802a9234>{sock_recvmsg+284}
<ffffffff80132a01>{recalc_task_prio+337}
Jan 31 17:46:28 salem kernel:
<ffffffff80136020>{autoremove_wake_function+0} <ffffffff80140528>{__mod_timer+293}
Jan 31 17:46:28 salem kernel:
<ffffffffa0106698>{:dm_cmirror:my_recvmsg+265}
<ffffffffa0106580>{:dm_cmirror:set_sigusr1+0}
Jan 31 17:46:28 salem kernel:
<ffffffffa0106580>{:dm_cmirror:set_sigusr1+0}
<ffffffffa01059d9>{:dm_cmirror:cluster_log_serverd+752}
Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0}
<ffffffff80110f47>{child_rip+8}
Jan 31 17:46:28 salem kernel:
<ffffffffa01056e9>{:dm_cmirror:cluster_log_serverd+0}
Jan 31 17:46:28 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 17:46:28 salem kernel: kmirrord S 00000001003c0b32 0 4757
7 4759 4346 (L-TLB)
Jan 31 17:46:28 salem kernel: 00000100718b3c48 0000000000000046 0000000000000000
0000000000000073
Jan 31 17:46:28 salem kernel: ffffffff00000000 0000000000000010
0000010001029a80 0000000100000000
Jan 31 17:46:28 salem kernel: 000001006fc5c7f0 0000000000000099
Jan 31 17:46:28 salem kernel: Call Trace:<ffffffff80140528>{__mod_timer+293}
<ffffffff8030cf6f>{schedule_timeout+367}
Jan 31 17:46:28 salem kernel: <ffffffff80140f52>{process_timeout+0}
<ffffffffa009d4b0>{:dm_mirror:do_work+3692}
Jan 31 17:46:28 salem kernel: <ffffffff8030c44d>{thread_return+0}
<ffffffff8030c4a5>{thread_return+88}
Jan 31 17:46:28 salem kernel: <ffffffffa009c644>{:dm_mirror:do_work+0}
<ffffffff801480da>{worker_thread+419}
Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0}
<ffffffff801346b1>{__wake_up_common+67}
Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0}
<ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 17:46:28 salem kernel: <ffffffff80147f37>{worker_thread+0}
<ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 17:46:28 salem kernel: <ffffffff8014bdff>{kthread+200}
<ffffffff80110f47>{child_rip+8}
Jan 31 17:46:29 salem kernel:
<ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0}
Jan 31 17:46:29 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 17:46:29 salem kernel: kcopyd S ffffffffa00630e0 0 4759
7 4757 (L-TLB)
Jan 31 17:46:29 salem kernel: 0000010072461e68 0000000000000046 0000010072461de8
ffffffff00000064
Jan 31 17:46:29 salem kernel: 0000000000000001 000000007215b5e0
0000010001029a80 000000017215b1a8
Jan 31 17:46:29 salem kernel: 000001006fc5c030 000000000007dab6
Jan 31 17:46:29 salem kernel: Call Trace:<ffffffffa0057334>{:dm_mod:do_work+0}
<ffffffff80148019>{worker_thread+226}
Jan 31 17:46:29 salem kernel: <ffffffff80134660>{default_wake_function+0}
<ffffffff801346b1>{__wake_up_common+67}
Jan 31 17:46:29 salem kernel: <ffffffff80134660>{default_wake_function+0}
<ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 17:46:29 salem kernel: <ffffffff80147f37>{worker_thread+0}
<ffffffff8014be28>{keventd_create_kthread+0}
Jan 31 17:46:29 salem kernel: <ffffffff8014bdff>{kthread+200}
<ffffffff80110f47>{child_rip+8}
Jan 31 17:46:29 salem kernel:
<ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0}
Jan 31 17:46:29 salem kernel: <ffffffff80110f3f>{child_rip+0}
Jan 31 17:46:29 salem kernel: lvs S 000001007c204c0c 0 4871
3822 (NOTLB)
Jan 31 17:46:29 salem kernel: 00000100729e3bd8 0000000000000006 00000100729e3b48
ffffffff0000007d
Jan 31 17:46:29 salem kernel: 00000100729e3b98 0000000080132a01
0000010001029a80 0000000100000001
Jan 31 17:46:29 salem kernel: 00000100706c1030 0000000000000855
Jan 31 17:46:29 salem kernel: Call Trace:<ffffffff80132a8f>{activate_task+124}
<ffffffff8030cee0>{schedule_timeout+224}
Jan 31 17:46:29 salem kernel: <ffffffff80135f1c>{prepare_to_wait+21}
<ffffffff8030804e>{unix_stream_recvmsg+592}
Jan 31 17:46:29 salem kernel:
<ffffffff80136020>{autoremove_wake_function+0}
<ffffffff80136020>{autoremove_wake_function+0}
Jan 31 17:46:29 salem kernel: <ffffffff802a93da>{sock_aio_read+297}
<ffffffff802a9520>{sock_aio_write+306}
Jan 31 17:46:30 salem kernel: <ffffffff80179eb5>{do_sync_read+178}
<ffffffff80188047>{__user_walk+94}
Jan 31 17:46:30 salem kernel: <ffffffff801826b4>{vfs_stat64+24}
<ffffffff8030c44d>{thread_return+0}
Jan 31 17:46:30 salem kernel: <ffffffff8030c4a5>{thread_return+88}
<ffffffff80136020>{autoremove_wake_function+0}
Jan 31 17:46:30 salem kernel: <ffffffff80179fc3>{vfs_read+226}
<ffffffff8017a20c>{sys_read+69}
Jan 31 17:46:30 salem kernel: <ffffffff8011026a>{system_call+126}
2.6.9-43.ELsmp
lvm2-2.02.20-1.el4
lvm2-cluster-2.02.20-1.el4
cmirror-1.0.1-1
cmirror-kernel-smp-2.6.9-18.7
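
The hangs reported above were noticed because commands such as lvs never returned. A minimal sketch for flagging that automatically during a looping test follows. It assumes the coreutils `timeout` utility is available on the test node (it is not part of this bug report), and `run_with_deadline` is a hypothetical helper name. A command stuck in uninterruptible sleep (state D) cannot be killed, so this only catches interruptible waits like the lvs tasks blocked in unix_stream_recvmsg in the traces.

```shell
#!/bin/sh
# Sketch only: flag a command that does not return within a deadline.
# Assumption: coreutils `timeout` is installed; it exits with status 124
# when it has to terminate the command it was running.

# run_with_deadline SECONDS COMMAND [ARGS...]
# Prints "HUNG: <command>" and returns 1 if the deadline expires,
# otherwise returns the command's own exit status.
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HUNG: $*"
        return 1
    fi
    return "$status"
}

# Example use inside a create/remove loop (volume group "vg" as in the
# loops quoted in this report; the 60 second deadline is an arbitrary choice):
#   run_with_deadline 60 lvs vg || break
```

Stopping the loop at the first hang keeps the cluster in the deadlocked state so that task dumps and strace output can still be collected.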
changing the subject as this bug seems to appear with doing looping creation/deletions. (without)

please help me reproduce with the latest cmirror-kernel package (>= 2/21/2007)

Marking modified, as I believe this has been fixed in the process of fixing other bugs.

I appear to have hit this issue while attempting to create a cmirror. The lvcreate cmd deadlocked along with other lvm cmds such as lvs on the cluster. The create was attempted on link-07.
Feb 28 11:09:49 link-07 kernel: lvcreate D 000001003f43dc88 0 28471
29723 (NOTLB)
Feb 28 11:09:49 link-07 kernel: 0000010036bd7bb8 0000000000000002
0000000000000000 ffffffff00000069
Feb 28 11:09:49 link-07 kernel: 000001003f6c6fc0 0000000000000000
0000010020013e00 000000018024e958
Feb 28 11:09:49 link-07 kernel: 000001003ba217f0 00000000000081f7
Feb 28 11:09:49 link-07 kernel: Call
Trace:<ffffffff8025002c>{__generic_unplug_device+19}
<ffffffff80250065>{generic_unplug_device+24}
Feb 28 11:09:49 link-07 kernel:
<ffffffffa00aec73>{:dm_mod:dm_table_unplug_all+49}
Feb 28 11:09:49 link-07 kernel: <ffffffff8030ce47>{io_schedule+38}
<ffffffff8019bc8f>{__blockdev_direct_IO+3023}
Feb 28 11:09:49 link-07 kernel: <ffffffff80180a04>{blkdev_direct_IO+48}
<ffffffff80180959>{blkdev_get_blocks+0}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8015c612>{generic_file_direct_IO+78}
<ffffffff8015c69f>{generic_file_direct_write+103}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8015c9bc>{__generic_file_aio_write_nolock+662}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8015cc9f>{generic_file_aio_write_nolock+32}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8015ce6d>{generic_file_write_nolock+158}
<ffffffff8015d1b0>{generic_file_read+187}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80136020>{autoremove_wake_function+0} <ffffffff801941d0>{dnotify_parent+34}
Feb 28 11:09:49 link-07 kernel: <ffffffff801818a0>{blkdev_file_write+26}
<ffffffff8017a18e>{vfs_write+207}
Feb 28 11:09:49 link-07 kernel: <ffffffff8017a276>{sys_write+69}
<ffffffff8011026a>{system_call+126}
Feb 28 11:09:49 link-07 kernel: lvs S 0000000000000012 0 28574
4717 (NOTLB)
Feb 28 11:09:49 link-07 kernel: 000001001874dbd8 0000000000000002
000001001f889030 0000010001008c60
Feb 28 11:09:49 link-07 kernel: 000001001874db98 ffffffff80132a01
000000013f6fd570 0000000100000001
Feb 28 11:09:49 link-07 kernel: 000001003ba387f0 0000000000000876
Feb 28 11:09:49 link-07 kernel: Call
Trace:<ffffffff80132a01>{recalc_task_prio+337} <ffffffff80132a8f>{activate_task+124}
Feb 28 11:09:49 link-07 kernel: <ffffffff8030cf64>{schedule_timeout+224}
<ffffffff80135f1c>{prepare_to_wait+21}
Feb 28 11:09:49 link-07 kernel:
<ffffffff803080d2>{unix_stream_recvmsg+592}
<ffffffff80136020>{autoremove_wake_function+0}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a945e>{sock_aio_read+297}
Feb 28 11:09:49 link-07 kernel: <ffffffff802a95a4>{sock_aio_write+306}
<ffffffff80179eb1>{do_sync_read+178}
Feb 28 11:09:49 link-07 kernel: <ffffffff80188043>{__user_walk+94}
<ffffffff801826b0>{vfs_stat64+24}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80136020>{autoremove_wake_function+0} <ffffffff801941d0>{dnotify_parent+34}
Feb 28 11:09:49 link-07 kernel: <ffffffff80179fbf>{vfs_read+226}
<ffffffff8017a208>{sys_read+69}
Feb 28 11:09:49 link-07 kernel: <ffffffff8011026a>{system_call+126}
Feb 28 11:09:49 link-07 kernel: kmirrord S 0000000103f6dbff 0 28542
7 28428 (L-TLB)
Feb 28 11:09:49 link-07 kernel: 0000010036ef5c48 0000000000000046
0000000000000000 0000000000000073
Feb 28 11:09:49 link-07 kernel: 0000010000000000 0000000000000000
0000010020013e00 0000000100000000
Feb 28 11:09:49 link-07 kernel: 000001001a2877f0 000000000000005f
Feb 28 11:09:49 link-07 kernel: Call Trace:<ffffffff80140524>{__mod_timer+293}
<ffffffff8030cff3>{schedule_timeout+367}
Feb 28 11:09:49 link-07 kernel: <ffffffff80140f4e>{process_timeout+0}
<ffffffffa00f848f>{:dm_mirror:do_work+3692}
Feb 28 11:09:49 link-07 kernel: <ffffffff8030c4d1>{thread_return+0}
<ffffffff8030c529>{thread_return+88}
Feb 28 11:09:49 link-07 kernel: <ffffffffa00f7623>{:dm_mirror:do_work+0}
<ffffffff801480d6>{worker_thread+419}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80134660>{default_wake_function+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff80147f33>{worker_thread+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200}
<ffffffff80110f47>{child_rip+8}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0}
Feb 28 11:09:48 link-07 kernel: dmeventd S ffffffffa00b0870 0 28567
1 28575 28472 (NOTLB)
Feb 28 11:09:48 link-07 kernel: 000001001836fcf8 0000000000000002
0000000000000000 000000000000006d
Feb 28 11:09:48 link-07 kernel: 0000000000000000 0000000000000000
0000010001009f80 0000000000017c2a
Feb 28 11:09:48 link-07 kernel: 000001003f6fc030 0000000000008831
Feb 28 11:09:48 link-07 kernel: Call
Trace:<ffffffffa00b0870>{:dm_mod:dev_wait+0}
<ffffffffa00adc52>{:dm_mod:dm_wait_event+151}
Feb 28 11:09:48 link-07 kernel:
<ffffffff80136020>{autoremove_wake_function+0}
<ffffffff80136020>{autoremove_wake_function+0}
Feb 28 11:09:48 link-07 kernel: <ffffffff801eba95>{__up_read+16}
<ffffffffa00b098d>{:dm_mod:dev_wait+285}
Feb 28 11:09:48 link-07 kernel: <ffffffffa00b15e3>{:dm_mod:ctl_ioctl+602}
<ffffffff8018bbc5>{sys_ioctl+853}
Feb 28 11:09:48 link-07 kernel: <ffffffff8011026a>{system_call+126}
Feb 28 11:09:48 link-07 kernel: kmirrord S 000001003f8ceb40 0 28359
6 28161 (L-TLB)
Feb 28 11:09:48 link-07 kernel: 00000100372f3e68 0000000000000046
ffffffff803d7a00 0000000000000073
Feb 28 11:09:48 link-07 kernel: 0000010019c1e080 0000000000063000
0000010001009f80 0000000000000000
Feb 28 11:09:48 link-07 kernel: 000001003a4c4030 0000000000000051
Feb 28 11:09:48 link-07 kernel: Call
Trace:<ffffffffa00f7623>{:dm_mirror:do_work+0} <ffffffff80148015>{worker_thread+226}
Feb 28 11:09:48 link-07 kernel:
<ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67}
Feb 28 11:09:48 link-07 kernel:
<ffffffff80134660>{default_wake_function+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:48 link-07 kernel: <ffffffff80147f33>{worker_thread+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200}
<ffffffff80110f47>{child_rip+8}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0}
Feb 28 11:09:49 link-07 kernel: kmirrord S 000001003c272d40 0 28428
7 28542 28288 (L-TLB)
Feb 28 11:09:49 link-07 kernel: 000001001a023e68 0000000000000046
000001001a61b7f0 0000000000000073
Feb 28 11:09:49 link-07 kernel: 0000010036d08c40 0000000000063800
0000010020013e00 0000000100000000
Feb 28 11:09:49 link-07 kernel: 000001001f3127f0 0000000000000064
Feb 28 11:09:49 link-07 kernel: Call
Trace:<ffffffffa00f7623>{:dm_mirror:do_work+0} <ffffffff80148015>{worker_thread+226}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67}
Feb 28 11:09:49 link-07 kernel:
<ffffffff80134660>{default_wake_function+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff80147f33>{worker_thread+0}
<ffffffff8014be24>{keventd_create_kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200}
<ffffffff80110f47>{child_rip+8}
Feb 28 11:09:49 link-07 kernel:
<ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0}
Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0}
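
The task dumps in this report are long, and the interesting detail is usually which tasks sit in which state (in the Feb 28 dump, lvcreate is the only task shown in state D). The following is a rough sketch for pulling the task header lines out of a syslog file. It assumes each syslog entry starts on its own line in the "Mon DD HH:MM:SS host kernel: task state address ..." layout seen above; `list_tasks` and `blocked_tasks` are hypothetical helper names, not part of any LVM or cluster tool.

```shell
#!/bin/sh
# Sketch only: summarize task header lines from a kernel task dump in syslog.
# Assumption: fields are "Mon DD HH:MM:SS host kernel: task state hexaddr ...".

# list_tasks [FILE...]
# Prints "host task state" for every task header line; reads stdin if no file.
# Stack and call-trace lines are skipped because their 7th field is not a
# single state letter.
list_tasks() {
    awk '$5 == "kernel:" && $7 ~ /^[RSDTZ]$/ && length($8) == 16 && $8 ~ /^[0-9a-f]+$/ {
        print $4, $6, $7
    }' "$@"
}

# blocked_tasks [FILE...]
# Same as list_tasks, restricted to uninterruptible sleep (state D).
blocked_tasks() {
    list_tasks "$@" | awk '$3 == "D"'
}

# Example (the log path is an assumption about the test node):
#   blocked_tasks /var/log/messages
```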
pretty sure these issues have been addressed. try cmirror-kernel >= 2.6.9-30.1

assigned -> modified

I ran cmirror locking operations all night and wasn't able to trip this deadlock, marking verified in cmirror-kernel-2.6.9-32.0.

Fixed in current release (4.7). |