Created attachment 669418 [details] /etc/lvm/lvm.conf

Description of problem:
After a mirror leg/disk disappears from a mirrored LVM set-up on ARM, 'dmeventd' does not repair the LVM volumes, and since the remaining disk is suspended, the system cannot run any new processes: they wait for I/O in vain.

Version-Release number of selected component (if applicable):
LVM 2.02.97 + the patch from https://www.redhat.com/archives/lvm-devel/2012-November/msg00039.html (installed on Gentoo)

How reproducible:
Always.

Steps to Reproduce:
1. Install LVM on an ARM system (in my case a DreamPlug with a Kirkwood MV88F6281-A1 SoC and a Feroceon 88FR131 rev 1 (v5l), a.k.a. Marvell Sheeva, 1.2 GHz (ARMv5TE) CPU).
2. Following the official Red Hat docs, create the root partition on an LVM mirror (I used "-m 1 --mirrorlog mirrored --alloc anywhere" on a two-disk set-up, so the mirror log is itself mirrored across both disks at the same time; see the illustrative command sketched below).
3. Make sure the 'dmeventd' daemon is running (it starts from a runlevel script on boot in my case).
4. On the live system, unplug one of the (two) disks to simulate a dead disk.

Actual results:
Immediately: processes that are already running continue to work (in my case only 'htop'); any new process is marked "D" in 'htop' and appears deadlocked. In 'htop' the CPU is entirely waiting for I/O, and every new process just adds to the load without ever running. The 'dmeventd' process is sleeping.
On reboot with both disks in: it boots and works normally.
On reboot with only one disk in: LVM in the initramfs cannot find the device and the kernel eventually panics.

Expected results:
On removal of the disk, 'dmeventd' should have repaired the volumes, or at least converted them to an unmirrored mode. In any case, the system should have continued to run uninterrupted in all of the above use-cases, whether with both disks or only one, and whether on reboot or on a live system.

Additional info:
I already discussed this problem on irc://irc.freenode.net/#lvm and was directed here. One idea was that the problem may be 'dmeventd' failing to load and lock itself in memory on ARM, or an issue with the above-mentioned patch for ARM.
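
For reference, the mirror layout described in step 2 corresponds roughly to a command of the following shape (the volume group name, logical volume name, and size here are purely illustrative, not the actual values from this system):

  # two-way mirror whose mirror log is itself mirrored, with allocation allowed anywhere on the two PVs
  lvcreate -m 1 --mirrorlog mirrored --alloc anywhere -L 10G -n root vg0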
Created attachment 669419 [details] Kernel panic when booting with one disk only
Created attachment 669420 [details] output of `lvs -a --option +devices`

This is, of course, the output from before and after the bug occurs; while the bug is occurring I cannot get any output from `lvs` at all, as it waits for I/O indefinitely.
Created attachment 669421 [details] Kernel 3.6.11 config

The .config I use for kernel 3.6.11, which is the vanilla 3.6.11 kernel with some patches from Gentoo.
Created attachment 669422 [details] Initramfs The 'init' script of the initramfs I use.
After your original step 4, when the machine is hung, please gather diagnostics showing what it is waiting for. If you can't get 'ps' output, try SysRq as described here: http://kernel.org/doc/Documentation/sysrq.txt (e.g. dump the state of all tasks to the console with 't').
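
If the console does not pass the SysRq key combination through, the same dump can usually be triggered from any shell that is still responsive (a rough sketch; it assumes the kernel was built with CONFIG_MAGIC_SYSRQ and that an already-running shell survives the hang):

  # allow all SysRq functions
  echo 1 > /proc/sys/kernel/sysrq
  # dump the state of all tasks to the kernel log / console
  echo t > /proc/sysrq-trigger
  # additionally, dump only the blocked (D state) tasks
  echo w > /proc/sysrq-trigger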
An additional sysrq trace after device removal/failure:

[ 111.766032] dmeventd S ffff88003fc13cc0 0 530 1 0x00000000
[ 111.766032] ffff88003c8318f8 0000000000000082 ffff88003a49dc40 ffff88003c831fd8
[ 111.766032] ffff88003c831fd8 ffff88003c831fd8 ffffffff81c13420 ffff88003a49dc40
[ 111.766032] 0000000000000000 ffff88003c831ac0 00000000000f423d 0000000000000000
[ 111.766032] Call Trace:
[ 111.766032] [<ffffffff81622a19>] schedule+0x29/0x70
[ 111.766032] [<ffffffff81621bcc>] schedule_hrtimeout_range_clock+0x12c/0x170
[ 111.766032] [<ffffffff81084070>] ? update_rmtp+0x70/0x70
[ 111.766032] [<ffffffff81084af4>] ? hrtimer_start_range_ns+0x14/0x20
[ 111.766032] [<ffffffff81621c23>] schedule_hrtimeout_range+0x13/0x20
[ 111.766032] [<ffffffff811a3569>] poll_schedule_timeout+0x49/0x70
[ 111.766032] [<ffffffff811a40b8>] do_select+0x638/0x700
[ 111.766032] [<ffffffff811a3680>] ? __pollwait+0xf0/0xf0
[ 111.766032] [<ffffffff8109798f>] ? check_preempt_wakeup+0x1cf/0x270
[ 111.766032] [<ffffffff8108efb5>] ? check_preempt_curr+0x85/0xa0
[ 111.766032] [<ffffffff8108effc>] ? ttwu_do_wakeup+0x2c/0xf0
[ 111.766032] [<ffffffff8108f346>] ? ttwu_do_activate.constprop.81+0x66/0x70
[ 111.766032] [<ffffffff81092319>] ? try_to_wake_up+0x1d9/0x2c0
[ 111.766032] [<ffffffff81092412>] ? default_wake_function+0x12/0x20
[ 111.766032] [<ffffffff811a36e6>] ? pollwake+0x66/0x70
[ 111.766032] [<ffffffff81043b71>] ? pvclock_clocksource_read+0x61/0xf0
[ 111.766032] [<ffffffff811a9e51>] ? update_time+0x81/0xc0
[ 111.766032] [<ffffffff811ae4f0>] ? __mnt_want_write+0x40/0x60
[ 111.766032] [<ffffffff811a9f33>] ? file_update_time+0xa3/0xf0
[ 111.766032] [<ffffffff811a4375>] core_sys_select+0x1f5/0x350
[ 111.766032] [<ffffffff81043b71>] ? pvclock_clocksource_read+0x61/0xf0
[ 111.766032] [<ffffffff81042d49>] ? kvm_clock_read+0x19/0x20
[ 111.766032] [<ffffffff81042d59>] ? kvm_clock_get_cycles+0x9/0x10
[ 111.766032] [<ffffffff810ac44c>] ? ktime_get_ts+0x4c/0xf0
[ 111.766032] [<ffffffff811a3a62>] ? poll_select_set_timeout+0x72/0x90
[ 111.766032] [<ffffffff811a4589>] sys_select+0xb9/0x110
[ 111.766032] [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
[ 111.766032] dmeventd S ffff88003fc13cc0 0 531 1 0x00000000
[ 111.766032] ffff88003a967d48 0000000000000082 ffff880036654530 ffff88003a967fd8
[ 111.766032] ffff88003a967fd8 ffff88003a967fd8 ffff88003cb7c530 ffff880036654530
[ 111.766032] 0000000000000000 ffff880036ee7800 0000000000000001 ffff880036ee7938
[ 111.766032] Call Trace:
[ 111.766032] [<ffffffff81622a19>] schedule+0x29/0x70
[ 111.766032] [<ffffffff814b5fa1>] dm_wait_event+0x81/0xd0
[ 111.766032] [<ffffffff810807f0>] ? wake_up_bit+0x40/0x40
[ 111.766032] [<ffffffff814bb267>] dev_wait+0x47/0xc0
[ 111.766032] [<ffffffff814bb220>] ? table_deps+0x170/0x170
[ 111.766032] [<ffffffff814bbd1b>] ctl_ioctl+0x18b/0x2c0
[ 111.766032] [<ffffffff814bbe63>] dm_ctl_ioctl+0x13/0x20
[ 111.766032] [<ffffffff811a2939>] do_vfs_ioctl+0x99/0x580
[ 111.766032] [<ffffffff8128380a>] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30
[ 111.766032] [<ffffffff81284c37>] ? file_has_perm+0x97/0xb0
[ 111.766032] [<ffffffff811a2eb9>] sys_ioctl+0x99/0xa0
[ 111.766032] [<ffffffff810725c1>] ? sys_rt_sigprocmask+0xa1/0xd0
[ 111.766032] [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
[ 111.766032] dmeventd D ffff88003fc13cc0 0 532 1 0x00000000
[ 111.766032] ffff88003c84daf8 0000000000000082 ffff88003cb7c530 ffff88003c84dfd8
[ 111.766032] ffff88003c84dfd8 ffff88003c84dfd8 ffffffff81c13420 ffff88003cb7c530
[ 111.766032] ffff88003c84db18 ffff880036665000 ffff880036665068 ffff880036665024
[ 111.766032] Call Trace:
[ 111.766032] [<ffffffff81622a19>] schedule+0x29/0x70
[ 111.766032] [<ffffffff81252bdf>] start_this_handle.isra.8+0x37f/0x480
[ 111.766032] [<ffffffff810807f0>] ? wake_up_bit+0x40/0x40
[ 111.766032] [<ffffffff81252ee8>] jbd2__journal_start+0xc8/0x110
[ 111.766032] [<ffffffff8121b61f>] ? ext4_create+0x6f/0x170
[ 111.766032] [<ffffffff81252f43>] jbd2_journal_start+0x13/0x20
[ 111.766032] [<ffffffff81231afb>] ext4_journal_start_sb+0x5b/0x130
[ 111.766032] [<ffffffff8121b61f>] ext4_create+0x6f/0x170
[ 111.766032] [<ffffffff8119f455>] vfs_create+0xb5/0x110
[ 111.766032] [<ffffffff8119fe12>] do_last+0x962/0xdf0
[ 111.766032] [<ffffffff8119c898>] ? inode_permission+0x18/0x50
[ 111.766032] [<ffffffff811a035a>] path_openat+0xba/0x4d0
[ 111.766032] [<ffffffff8117ad21>] ? kmem_cache_alloc+0x31/0x160
[ 111.766032] [<ffffffff811a09d1>] do_filp_open+0x41/0xa0
[ 111.766032] [<ffffffff811aca8d>] ? alloc_fd+0x4d/0x120
[ 111.766032] [<ffffffff8118fd36>] do_sys_open+0xf6/0x1e0
[ 111.766032] [<ffffffff8118fe41>] sys_open+0x21/0x30
[ 111.766032] [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Created attachment 704687 [details] ps aux

As requested, here is the output of `ps aux` taken just after I pull out one of the disks. I tried to get a dump via SysRq, but it didn't work (maybe minicom doesn't pass it through).
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Is there anything else I can do to help push this forward?
Sounds like there may be no mirror plugin operating for dmeventd? Do we have logs showing that dmeventd has at least recognized the failure and attempted to respond to it? It should not be sleeping.

The second issue is that the system boot-up process should be attempting partial activation, but is not:

  Refusing activation of partial LV layman. Use --partial to override.
  0 logical volume(s) in volume group "mirrored" now active

This has been resolved in recent releases.

Finally, don't use "mirror" for your system disks, use the "raid1" segment type instead. It is a big improvement over "mirror".
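
As a possible workaround for the boot-time half of this (a sketch only; the volume group name is taken from the message quoted above, and the exact invocation may vary by release), the degraded VG can usually be activated by hand from the initramfs rescue shell:

  # activate the VG even though one of its physical volumes is missing
  vgchange -ay --partial mirrored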
If the reporter wishes to provide the system logs so we can determine how dmeventd responded to the failure, they are invited to reopen this bug. Otherwise, I am closing this issue WONTFIX.

As discussed in comment 10, there are really two issues here:
1) handling the failure while the system is alive
2) dealing with the failure when the system boots

#1 is fixed in the current release, I believe. There is no plan to fix #2 (instead, the user should use the "raid1" segment type for these logical volumes).
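
For anyone who ends up here, a rough sketch of the suggested migration to raid1 (the VG/LV names "mirrored"/"layman" are taken from the messages quoted earlier, the size in the second command is illustrative, and the in-place conversion needs to run while the mirror is still healthy):

  # convert an existing "mirror" segment-type LV to the "raid1" segment type in place
  lvconvert --type raid1 mirrored/layman

  # or create new logical volumes with the raid1 type from the start
  lvcreate --type raid1 -m 1 -L 10G -n root mirrored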
I am happy to share the system logs if you let me know which files you need and under which conditions to capture them.
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'. (As we did not run this process for some time, it could affect also pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23