Bug 214808
Summary: | cmirror operations can cause clvmd to deadlock | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4 | CC: | agk, cfeist, dwysocha, mbroz, prockai |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2008-08-05 21:40:32 UTC | Type: | --- |
Bug Depends On: | 217626 | ||
Bug Blocks: |
Description
Corey Marthaler
2006-11-09 16:32:59 UTC
I thought I saw this earlier, but before I got serious about reproducing it, I grabbed the latest RHEL4 branch. I haven't been able to reproduce it since. I'll let it run overnight, but I'm wondering if this is reproducible with the latest rpms (i.e. >= 11/13/2006).

Ok, I see this all the time now. I've found a bug in CMAN that is allowing the cluster mirroring code to get the same CMAN id as clvmd, which causes much breakage. I've supplied dave with a patch.

I appear to have hit this again last night running creations/deletions on a two-node cluster (kool and salem). Here's what was running:

[root@kool ~]# while true
> do
> lvcreate -m 1 -n kool -L 10G vg
> sleep 5
> lvchange -an /dev/vg/kool
> sleep 4
> lvremove -f /dev/vg/kool
> sleep 2
> done

[root@salem ~]# while true
> do
> lvcreate -m 1 -n salem -L 10G vg
> sleep 10
> lvchange -an /dev/vg/salem
> sleep 2
> lvremove -f /dev/vg/salem
> sleep 5
> done

I straced an lvs command and it is stuck in the same place as in the original report. The traces of the other hung cmds are also very similar to the trace in the original report.
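For anyone re-running the loops above unattended, a minimal sketch of the same create/deactivate/remove cycle with a per-command timeout may help, so that a hung command is reported instead of silently blocking the loop. The names `guarded`, `mirror_cycle`, `RUN` and `CMD_TIMEOUT` are illustrative and not part of the report; `timeout` is assumed to be the GNU coreutils utility. A command stuck in uninterruptible sleep (state D) may not be killable by the timeout, so treat this as a detection aid only.

```shell
#!/bin/bash
# Sketch of the create/deactivate/remove cycle from the report.
# Assumptions (not from the report): RUN, CMD_TIMEOUT and the function
# names are hypothetical; 'timeout' is the GNU coreutils utility.

RUN=${RUN:-}                      # set RUN=echo for a dry run that only prints commands
CMD_TIMEOUT=${CMD_TIMEOUT:-300}   # seconds before a command is considered hung

# Run one command under a timeout; return non-zero if it fails or times out.
guarded() {
    if [ -n "$RUN" ]; then
        "$RUN" "$@"
        return
    fi
    timeout "$CMD_TIMEOUT" "$@"
    local rc=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$rc" -eq 124 ]; then
        echo "possible deadlock: '$*' did not finish in ${CMD_TIMEOUT}s" >&2
    fi
    return "$rc"
}

# One cycle for mirrored LV $1 in volume group $2, using the same
# lvcreate/lvchange/lvremove arguments as the loops in the report.
mirror_cycle() {
    local lv=$1 vg=$2
    guarded lvcreate -m 1 -n "$lv" -L 10G "$vg" || return 1
    guarded lvchange -an "/dev/$vg/$lv"         || return 1
    guarded lvremove -f "/dev/$vg/$lv"          || return 1
}
```

On kool this could be driven with `while mirror_cycle kool vg; do sleep 5; done`. The sketch drops the sleeps between individual commands, which in the original loops differ per node and may matter for how the two nodes interleave.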
Jan 31 10:13:05 kool kernel: dlm_recoverd S 0000000000000000 0 4132 6 29456 4131 (L-TLB) Jan 31 10:13:05 kool kernel: 0000010078aadea8 0000000000000046 0000000000000004 000001007deaade8 Jan 31 10:13:05 kool kernel: 00000000fffffffb 0000000000000000 0000010001021a80 0000000080132fba Jan 31 10:13:06 kool kernel: 000001007d2577f0 0000000000003526 Jan 31 10:13:06 kool kernel: Call Trace:<ffffffffa023b8a7>{:dlm:wake_astd+27} <ffffffffa024b44f>{:dlm:dlm_recoverd+60} Jan 31 10:13:06 kool kernel: <ffffffffa024b413>{:dlm:dlm_recoverd+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 10:13:06 kool kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8} Jan 31 10:13:06 kool kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0} Jan 31 10:13:06 kool kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 10:13:06 kool kernel: dmeventd S ffffffff8030bb58 0 4212 1 29399 4156 (NOTLB) Jan 31 10:13:06 kool kernel: 00000100727d5d78 0000000000000006 0000010001698030 0000000000000076 Jan 31 10:13:06 kool kernel: 0000000000000000 000000000000db00 000000d001029a80 0000000100000246 Jan 31 10:13:06 kool kernel: 00000100774ce030 00000000000001f2 Jan 31 10:13:06 kool kernel: Call Trace:<ffffffff80140528>{__mod_timer+293} <ffffffff8030cf6f>{schedule_timeout+367} Jan 31 10:13:06 kool kernel: <ffffffff80140f52>{process_timeout+0} <ffffffff8018c5c3>{do_select+939} Jan 31 10:13:06 kool kernel: <ffffffff8018c15d>{__pollwait+0} <ffffffff8018c942>{sys_select+820} Jan 31 10:13:06 kool kernel: <ffffffff8011026a>{system_call+126} Jan 31 10:13:06 kool kernel: dmeventd S 0000010077e3f000 0 29458 1 29460 29423 (NOTLB) Jan 31 10:13:06 kool kernel: 0000010072325cf8 0000000000000006 000001007be5390c ffffffff00000073 Jan 31 10:13:06 kool kernel: 0000000000000000 0000000000000000 0000010001029a80 0000000100000000 Jan 31 10:13:07 kool kernel: 00000100715af030 000000000001653b Jan 31 10:13:07 kool kernel: Call Trace:<ffffffffa0055870>{:dm_mod:dev_wait+0} 
<ffffffffa0052c52>{:dm_mod:dm_wait_event+151} Jan 31 10:13:07 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 10:13:07 kool kernel: <ffffffff801eba99>{__up_read+16} <ffffffffa005598d>{:dm_mod:dev_wait+285} Jan 31 10:13:07 kool kernel: <ffffffffa00565e3>{:dm_mod:ctl_ioctl+602} <ffffffff80182a1a>{sys_newstat+32} Jan 31 10:13:07 kool kernel: <ffffffff8018bbc9>{sys_ioctl+853} <ffffffff8011026a>{system_call+126} Jan 31 10:13:07 kool kernel: Jan 31 10:13:07 kool kernel: lvcreate S 0000000000000012 0 29398 3934 (NOTLB) Jan 31 10:13:07 kool kernel: 000001007bdd9bd8 0000000000000006 000001007bdd9e58 0000000000000000 Jan 31 10:13:07 kool kernel: 000001007bdd9b98 ffffffff80132a01 000001007d0d85e8 0000000101685e00 Jan 31 10:13:07 kool kernel: 0000010070f1c7f0 0000000000000f10 Jan 31 10:13:07 kool kernel: Call Trace:<ffffffff80132a01>{recalc_task_prio+337} <ffffffff80132a8f>{activate_task+124} Jan 31 10:13:07 kool kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f1c>{prepare_to_wait+21} Jan 31 10:13:07 kool kernel: <ffffffff8030804e>{unix_stream_recvmsg+592} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 10:13:07 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a93da>{sock_aio_read+297} Jan 31 10:13:07 kool kernel: <ffffffff802a9520>{sock_aio_write+306} <ffffffff80179eb5>{do_sync_read+178} Jan 31 10:13:07 kool kernel: <ffffffff80188047>{__user_walk+94} <ffffffff801826b4>{vfs_stat64+24} Jan 31 10:13:08 kool kernel: <ffffffff8030c44d>{thread_return+0} <ffffffff8030c4a5>{thread_return+88} Jan 31 10:13:08 kool kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80179fc3>{vfs_read+226} Jan 31 10:13:08 kool kernel: <ffffffff8017a20c>{sys_read+69} <ffffffff8011026a>{system_call+126} Jan 31 10:13:08 kool kernel: Jan 31 10:13:08 kool kernel: cluster_log_s S ffffffff8030bb58 0 29423 1 29458 29399 (L-TLB) Jan 31 10:13:08 kool kernel: 000001007850f988 0000000000000046 
0000010001698030 7fffffff00000086 Jan 31 10:13:08 kool kernel: 000001007850f988 0000000000000000 0000010001029a80 00000001804192c0 Jan 31 10:13:08 kool kernel: 00000100786f8030 0000000000000095 Jan 31 10:13:08 kool kernel: Call Trace:<ffffffff8030b8d7>{schedule+13} <ffffffff8030cee0>{schedule_timeout+224} Jan 31 10:12:38 salem kernel: cluster_log_s S ffffffff8030bb58 0 26081 1 26105 4180 (L-TLB) Jan 31 10:12:38 salem kernel: 0000010072379988 0000000000000046 0000010001697030 0000000000000078 Jan 31 10:12:38 salem kernel: 0000000000000000 0000000000000000 0000010001029a80 0000000100000000 Jan 31 10:12:38 salem kernel: 0000010072cb97f0 00000000000000d1 Jan 31 10:12:38 salem kernel: Call Trace:<ffffffffa01180af>{:tg3:tg3_start_xmit_dma_bug+1504} Jan 31 10:12:38 salem kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f78>{prepare_to_wait_exclusive+21} Jan 31 10:12:38 salem kernel: <ffffffff802aecf5>{skb_recv_datagram+373} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 10:12:38 salem kernel: <ffffffff802d03f1>{ip_push_pending_frames+855} <ffffffff80136020>{autoremove_wake_function +0} Jan 31 10:12:38 salem kernel: <ffffffff802eb748>{udp_recvmsg+118} <ffffffff802ac722>{sock_common_recvmsg+48} Jan 31 10:12:38 salem kernel: <ffffffff802a9234>{sock_recvmsg+284} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 10:12:38 salem kernel: <ffffffff80140528>{__mod_timer+293} <ffffffffa0260698>{:dm_cmirror:my_recvmsg+265} Jan 31 10:12:39 salem kernel: <ffffffffa0260580>{:dm_cmirror:set_sigusr1+0} <ffffffffa0260580>{:dm_cmirror:set_sigusr1+0 } Jan 31 10:12:39 salem kernel: <ffffffffa025f9d9>{:dm_cmirror:cluster_log_serverd+752} Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff80110f47>{child_rip+8} Jan 31 10:12:39 salem kernel: <ffffffffa025f6e9>{:dm_cmirror:cluster_log_serverd+0} Jan 31 10:12:39 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 10:12:39 salem kernel: kmirrord S ffffffffa009c644 0 26103 6 26104 4100 
(L-TLB) Jan 31 10:12:39 salem kernel: 0000010072d31e68 0000000000000046 0000010072bd3030 0000000000000069 Jan 31 10:12:39 salem kernel: 000001007d2f7080 00000000013ff800 0000000000000400 0000000037c1a200 Jan 31 10:12:39 salem kernel: 00000100782b2030 0000000000000030 Jan 31 10:12:39 salem kernel: Call Trace:<ffffffffa009c644>{:dm_mirror:do_work+0} <ffffffff80148019>{worker_thread+226} Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 10:12:39 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 10:12:39 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8} Jan 31 10:12:39 salem kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0} Jan 31 10:12:39 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 10:12:39 salem kernel: kcopyd S ffffffffa0057334 0 26104 6 26103 (L-TLB) Jan 31 10:12:39 salem kernel: 00000100722b3e68 0000000000000046 00000100722b3de8 ffffffff00000069 Jan 31 10:12:39 salem kernel: 0000000000000000 0000000072f65a18 0000010001021a80 00000000774cf5e0 Jan 31 10:12:39 salem kernel: 0000010072bd3030 0000000000000143 Jan 31 10:12:39 salem kernel: Call Trace:<ffffffffa0057334>{:dm_mod:do_work+0} <ffffffff80148019>{worker_thread+226} Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Jan 31 10:12:39 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 10:12:40 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 10:12:40 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8} Jan 31 10:12:40 salem kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0} Jan 31 
10:12:40 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 10:12:40 salem kernel: lvchange S 0000000000000012 0 26107 3937 (NOTLB) Jan 31 10:12:40 salem kernel: 0000010072bb3bd8 0000000000000006 0000010072bb3e58 0000000000000000 Jan 31 10:12:40 salem kernel: 0000010072bb3b98 ffffffff80132a01 000001007d0dc408 0000000001684e00 Jan 31 10:12:40 salem kernel: 0000010078ea0030 0000000000000d70 Jan 31 10:12:40 salem kernel: Call Trace:<ffffffff80132a01>{recalc_task_prio+337} <ffffffff80132a8f>{activate_task+124} Jan 31 10:12:40 salem kernel: <ffffffff8030cee0>{schedule_timeout+224} <ffffffff80135f1c>{prepare_to_wait+21} Jan 31 10:12:40 salem kernel: <ffffffff8030804e>{unix_stream_recvmsg+592} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 10:12:40 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a93da>{sock_aio_read+297} Jan 31 10:12:40 salem kernel: <ffffffff802a9520>{sock_aio_write+306} <ffffffff80179eb5>{do_sync_read+178} Jan 31 10:12:40 salem kernel: <ffffffff80188047>{__user_walk+94} <ffffffff801826b4>{vfs_stat64+24} Jan 31 10:12:40 salem kernel: <ffffffff8030c44d>{thread_return+0} <ffffffff8030c4a5>{thread_return+88} Jan 31 10:12:40 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80179fc3>{vfs_read+226} Jan 31 10:12:40 salem kernel: <ffffffff8017a20c>{sys_read+69} <ffffffff8011026a>{system_call+126} Jan 31 10:12:40 salem kernel:

Is this still a problem?

Sorry, didn't look at the dates of the last comment.

I also appear to have hit this while testing a failure scenario. Again a two-node cluster: I had two cmirrors, each with gfs on top, and I/O going from both nodes to both mirrors. I then killed one of the legs in each mirror and also killed one of the 2 nodes in the cluster. The I/O and access to each filesystem continued, but all lvm commands are now deadlocked.

Jan 31 17:46:27 salem kernel: cluster_log_s S 0000000000000000 0 4755 1 4768 4574 (L-TLB) Jan 31 17:46:27 salem kernel: 00000100702e5988 0000000000000046 0000010037c23c10 ffffffff0000007d Jan 31 17:46:27 salem kernel: ffffffff804d5ea0 00000000804184a0 0000010001029a80 00000001802b26c1 Jan 31 17:46:27 salem kernel: 000001007154d030 0000000000000073 Jan 31 17:46:27 salem kernel: Call Trace:<ffffffff802b2786>{process_backlog+136} <ffffffff8030cee0>{schedule_timeout+224} Jan 31 17:46:27 salem kernel: <ffffffff8013cfac>{__do_softirq+88} <ffffffff80135f78>{prepare_to_wait_exclusive+21} Jan 31 17:46:28 salem kernel: <ffffffff802aecf5>{skb_recv_datagram+373} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 17:46:28 salem kernel: <ffffffff802d03f1>{ip_push_pending_frames+855} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 17:46:28 salem kernel: <ffffffff802eb748>{udp_recvmsg+118} <ffffffff802ac722>{sock_common_recvmsg+48} Jan 31 17:46:28 salem kernel: <ffffffff802a9234>{sock_recvmsg+284} <ffffffff80132a01>{recalc_task_prio+337} Jan 31 17:46:28 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80140528>{__mod_timer+293} Jan 31 17:46:28 salem kernel: <ffffffffa0106698>{:dm_cmirror:my_recvmsg+265} <ffffffffa0106580>{:dm_cmirror:set_sigusr1+0} Jan 31 17:46:28 salem kernel: <ffffffffa0106580>{:dm_cmirror:set_sigusr1+0} <ffffffffa01059d9>{:dm_cmirror:cluster_log_serverd+7 52} Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff80110f47>{child_rip+8} Jan 31 17:46:28 salem kernel: <ffffffffa01056e9>{:dm_cmirror:cluster_log_serverd+0} Jan 31 17:46:28 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 17:46:28 salem kernel: kmirrord S 00000001003c0b32 0 4757 7 4759 4346 (L-TLB) Jan 31 17:46:28 salem kernel: 00000100718b3c48 0000000000000046 0000000000000000 0000000000000073 Jan 31 17:46:28 salem kernel: ffffffff00000000 0000000000000010 0000010001029a80 0000000100000000 Jan 31 17:46:28 salem kernel: 000001006fc5c7f0 
0000000000000099 Jan 31 17:46:28 salem kernel: Call Trace:<ffffffff80140528>{__mod_timer+293} <ffffffff8030cf6f>{schedule_timeout+367} Jan 31 17:46:28 salem kernel: <ffffffff80140f52>{process_timeout+0} <ffffffffa009d4b0>{:dm_mirror:do_work+3692} Jan 31 17:46:28 salem kernel: <ffffffff8030c44d>{thread_return+0} <ffffffff8030c4a5>{thread_return+88} Jan 31 17:46:28 salem kernel: <ffffffffa009c644>{:dm_mirror:do_work+0} <ffffffff801480da>{worker_thread+419} Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Jan 31 17:46:28 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 17:46:28 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 17:46:28 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8} Jan 31 17:46:29 salem kernel: <ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0} Jan 31 17:46:29 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 17:46:29 salem kernel: kcopyd S ffffffffa00630e0 0 4759 7 4757 (L-TLB) Jan 31 17:46:29 salem kernel: 0000010072461e68 0000000000000046 0000010072461de8 ffffffff00000064 Jan 31 17:46:29 salem kernel: 0000000000000001 000000007215b5e0 0000010001029a80 000000017215b1a8 Jan 31 17:46:29 salem kernel: 000001006fc5c030 000000000007dab6 Jan 31 17:46:29 salem kernel: Call Trace:<ffffffffa0057334>{:dm_mod:do_work+0} <ffffffff80148019>{worker_thread+226} Jan 31 17:46:29 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Jan 31 17:46:29 salem kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 17:46:29 salem kernel: <ffffffff80147f37>{worker_thread+0} <ffffffff8014be28>{keventd_create_kthread+0} Jan 31 17:46:29 salem kernel: <ffffffff8014bdff>{kthread+200} <ffffffff80110f47>{child_rip+8} Jan 31 17:46:29 salem kernel: 
<ffffffff8014be28>{keventd_create_kthread+0} <ffffffff8014bd37>{kthread+0} Jan 31 17:46:29 salem kernel: <ffffffff80110f3f>{child_rip+0} Jan 31 17:46:29 salem kernel: lvs S 000001007c204c0c 0 4871 3822 (NOTLB) Jan 31 17:46:29 salem kernel: 00000100729e3bd8 0000000000000006 00000100729e3b48 ffffffff0000007d Jan 31 17:46:29 salem kernel: 00000100729e3b98 0000000080132a01 0000010001029a80 0000000100000001 Jan 31 17:46:29 salem kernel: 00000100706c1030 0000000000000855 Jan 31 17:46:29 salem kernel: Call Trace:<ffffffff80132a8f>{activate_task+124} <ffffffff8030cee0>{schedule_timeout+224} Jan 31 17:46:29 salem kernel: <ffffffff80135f1c>{prepare_to_wait+21} <ffffffff8030804e>{unix_stream_recvmsg+592} Jan 31 17:46:29 salem kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 17:46:29 salem kernel: <ffffffff802a93da>{sock_aio_read+297} <ffffffff802a9520>{sock_aio_write+306} Jan 31 17:46:30 salem kernel: <ffffffff80179eb5>{do_sync_read+178} <ffffffff80188047>{__user_walk+94} Jan 31 17:46:30 salem kernel: <ffffffff801826b4>{vfs_stat64+24} <ffffffff8030c44d>{thread_return+0} Jan 31 17:46:30 salem kernel: <ffffffff8030c4a5>{thread_return+88} <ffffffff80136020>{autoremove_wake_function+0} Jan 31 17:46:30 salem kernel: <ffffffff80179fc3>{vfs_read+226} <ffffffff8017a20c>{sys_read+69} Jan 31 17:46:30 salem kernel: <ffffffff8011026a>{system_call+126}

2.6.9-43.ELsmp
lvm2-2.02.20-1.el4
lvm2-cluster-2.02.20-1.el4
cmirror-1.0.1-1
cmirror-kernel-smp-2.6.9-18.7

Changing the subject, as this bug seems to appear when doing looping creations/deletions. (without)

Please help me reproduce with the latest cmirror-kernel package (>= 2/21/2007).

Marking modified, as I believe this has been fixed in the process of fixing other bugs.

I appear to have hit this issue while attempting to create a cmirror. The lvcreate cmd deadlocked along with other lvm cmds such as lvs on the cluster. The create was attempted on link-07.

Feb 28 11:09:49 link-07 kernel: lvcreate D 000001003f43dc88 0 28471 29723 (NOTLB) Feb 28 11:09:49 link-07 kernel: 0000010036bd7bb8 0000000000000002 0000000000000000 ffffffff00000069 Feb 28 11:09:49 link-07 kernel: 000001003f6c6fc0 0000000000000000 0000010020013e00 000000018024e958 Feb 28 11:09:49 link-07 kernel: 000001003ba217f0 00000000000081f7 Feb 28 11:09:49 link-07 kernel: Call Trace:<ffffffff8025002c>{__generic_unplug_device+19} <ffffffff80250065>{generic_unplug_device+24} Feb 28 11:09:49 link-07 kernel: <ffffffffa00aec73>{:dm_mod:dm_table_unplug_all+49} Feb 28 11:09:49 link-07 kernel: <ffffffff8030ce47>{io_schedule+38} <ffffffff8019bc8f>{__blockdev_direct_IO+3023} Feb 28 11:09:49 link-07 kernel: <ffffffff80180a04>{blkdev_direct_IO+48} <ffffffff80180959>{blkdev_get_blocks+0} Feb 28 11:09:49 link-07 kernel: <ffffffff8015c612>{generic_file_direct_IO+78} <ffffffff8015c69f>{generic_file_direct_write+103} Feb 28 11:09:49 link-07 kernel: <ffffffff8015c9bc>{__generic_file_aio_write_nolock+662} Feb 28 11:09:49 link-07 kernel: <ffffffff8015cc9f>{generic_file_aio_write_nolock+32} Feb 28 11:09:49 link-07 kernel: <ffffffff8015ce6d>{generic_file_write_nolock+158} <ffffffff8015d1b0>{generic_file_read+187} Feb 28 11:09:49 link-07 kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff801941d0>{dnotify_parent+34} Feb 28 11:09:49 link-07 kernel: <ffffffff801818a0>{blkdev_file_write+26} <ffffffff8017a18e>{vfs_write+207} Feb 28 11:09:49 link-07 kernel: <ffffffff8017a276>{sys_write+69} <ffffffff8011026a>{system_call+126} Feb 28 11:09:49 link-07 kernel: lvs S 0000000000000012 0 28574 4717 (NOTLB) Feb 28 11:09:49 link-07 kernel: 000001001874dbd8 0000000000000002 000001001f889030 0000010001008c60 Feb 28 11:09:49 link-07 kernel: 000001001874db98 ffffffff80132a01 000000013f6fd570 0000000100000001 Feb 28 11:09:49 link-07 kernel: 000001003ba387f0 0000000000000876 Feb 28 11:09:49 link-07 kernel: Call Trace:<ffffffff80132a01>{recalc_task_prio+337} 
<ffffffff80132a8f>{activate_task+124} Feb 28 11:09:49 link-07 kernel: <ffffffff8030cf64>{schedule_timeout+224} <ffffffff80135f1c>{prepare_to_wait+21} Feb 28 11:09:49 link-07 kernel: <ffffffff803080d2>{unix_stream_recvmsg+592} <ffffffff80136020>{autoremove_wake_function+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff802a945e>{sock_aio_read+297} Feb 28 11:09:49 link-07 kernel: <ffffffff802a95a4>{sock_aio_write+306} <ffffffff80179eb1>{do_sync_read+178} Feb 28 11:09:49 link-07 kernel: <ffffffff80188043>{__user_walk+94} <ffffffff801826b0>{vfs_stat64+24} Feb 28 11:09:49 link-07 kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff801941d0>{dnotify_parent+34} Feb 28 11:09:49 link-07 kernel: <ffffffff80179fbf>{vfs_read+226} <ffffffff8017a208>{sys_read+69} Feb 28 11:09:49 link-07 kernel: <ffffffff8011026a>{system_call+126} Feb 28 11:09:49 link-07 kernel: kmirrord S 0000000103f6dbff 0 28542 7 28428 (L-TLB) Feb 28 11:09:49 link-07 kernel: 0000010036ef5c48 0000000000000046 0000000000000000 0000000000000073 Feb 28 11:09:49 link-07 kernel: 0000010000000000 0000000000000000 0000010020013e00 0000000100000000 Feb 28 11:09:49 link-07 kernel: 000001001a2877f0 000000000000005f Feb 28 11:09:49 link-07 kernel: Call Trace:<ffffffff80140524>{__mod_timer+293} <ffffffff8030cff3>{schedule_timeout+367} Feb 28 11:09:49 link-07 kernel: <ffffffff80140f4e>{process_timeout+0} <ffffffffa00f848f>{:dm_mirror:do_work+3692} Feb 28 11:09:49 link-07 kernel: <ffffffff8030c4d1>{thread_return+0} <ffffffff8030c529>{thread_return+88} Feb 28 11:09:49 link-07 kernel: <ffffffffa00f7623>{:dm_mirror:do_work+0} <ffffffff801480d6>{worker_thread+419} Feb 28 11:09:49 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Feb 28 11:09:49 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80147f33>{worker_thread+0} 
<ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200} <ffffffff80110f47>{child_rip+8} Feb 28 11:09:49 link-07 kernel: <ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0} Feb 28 11:09:48 link-07 kernel: dmeventd S ffffffffa00b0870 0 28567 1 28575 28472 (NOTLB) Feb 28 11:09:48 link-07 kernel: 000001001836fcf8 0000000000000002 0000000000000000 000000000000006d Feb 28 11:09:48 link-07 kernel: 0000000000000000 0000000000000000 0000010001009f80 0000000000017c2a Feb 28 11:09:48 link-07 kernel: 000001003f6fc030 0000000000008831 Feb 28 11:09:48 link-07 kernel: Call Trace:<ffffffffa00b0870>{:dm_mod:dev_wait+0} <ffffffffa00adc52>{:dm_mod:dm_wait_event+151} Feb 28 11:09:48 link-07 kernel: <ffffffff80136020>{autoremove_wake_function+0} <ffffffff80136020>{autoremove_wake_function+0} Feb 28 11:09:48 link-07 kernel: <ffffffff801eba95>{__up_read+16} <ffffffffa00b098d>{:dm_mod:dev_wait+285} Feb 28 11:09:48 link-07 kernel: <ffffffffa00b15e3>{:dm_mod:ctl_ioctl+602} <ffffffff8018bbc5>{sys_ioctl+853} Feb 28 11:09:48 link-07 kernel: <ffffffff8011026a>{system_call+126} Feb 28 11:09:48 link-07 kernel: kmirrord S 000001003f8ceb40 0 28359 6 28161 (L-TLB) Feb 28 11:09:48 link-07 kernel: 00000100372f3e68 0000000000000046 ffffffff803d7a00 0000000000000073 Feb 28 11:09:48 link-07 kernel: 0000010019c1e080 0000000000063000 0000010001009f80 0000000000000000 Feb 28 11:09:48 link-07 kernel: 000001003a4c4030 0000000000000051 Feb 28 11:09:48 link-07 kernel: Call Trace:<ffffffffa00f7623>{:dm_mirror:do_work+0} <ffffffff80148015>{worker_thread+226} Feb 28 11:09:48 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Feb 28 11:09:48 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:48 link-07 kernel: <ffffffff80147f33>{worker_thread+0} 
<ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200} <ffffffff80110f47>{child_rip+8} Feb 28 11:09:49 link-07 kernel: <ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0} Feb 28 11:09:49 link-07 kernel: kmirrord S 000001003c272d40 0 28428 7 28542 28288 (L-TLB) Feb 28 11:09:49 link-07 kernel: 000001001a023e68 0000000000000046 000001001a61b7f0 0000000000000073 Feb 28 11:09:49 link-07 kernel: 0000010036d08c40 0000000000063800 0000010020013e00 0000000100000000 Feb 28 11:09:49 link-07 kernel: 000001001f3127f0 0000000000000064 Feb 28 11:09:49 link-07 kernel: Call Trace:<ffffffffa00f7623>{:dm_mirror:do_work+0} <ffffffff80148015>{worker_thread+226} Feb 28 11:09:49 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff801346b1>{__wake_up_common+67} Feb 28 11:09:49 link-07 kernel: <ffffffff80134660>{default_wake_function+0} <ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80147f33>{worker_thread+0} <ffffffff8014be24>{keventd_create_kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff8014bdfb>{kthread+200} <ffffffff80110f47>{child_rip+8} Feb 28 11:09:49 link-07 kernel: <ffffffff8014be24>{keventd_create_kthread+0} <ffffffff8014bd33>{kthread+0} Feb 28 11:09:49 link-07 kernel: <ffffffff80110f3f>{child_rip+0}

Pretty sure these issues have been addressed. Try cmirror-kernel >= 2.6.9-30.1.

assigned -> modified

I ran cmirror locking operations all night and wasn't able to trip this deadlock. Marking verified in cmirror-kernel-2.6.9-32.0.

Fixed in current release (4.7).
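The task dumps in this report are long, and the most useful detail is often just which task is in which state (for example, lvcreate in state D on link-07 versus the other tasks in state S). Below is a small sketch for pulling that out of a raw syslog file. It assumes one syslog entry per line, in the layout seen in the dumps above, where a task header line ends in `(NOTLB)` or `(L-TLB)`. The function name `task_states` is hypothetical, not an existing tool.

```shell
#!/bin/bash
# Print "host task state" for every task header line in a kernel task dump.
# Assumed input layout (one syslog entry per line):
#   Mon DD HH:MM:SS host kernel: <task> <state> <hex> ... (NOTLB|L-TLB)
# task_states is an illustrative name, not part of any package.
task_states() {
    awk '$5 == "kernel:" && ($NF == "(NOTLB)" || $NF == "(L-TLB)") {
        # $4 = host, $6 = task name, $7 = scheduler state (S, D, ...)
        printf "%s %s %s\n", $4, $6, $7
    }' "$@"
}
```

For example, `task_states /var/log/messages | grep ' D$'` would list only tasks in uninterruptible sleep. Stack and call-trace lines are skipped because they do not end in one of the two markers.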