Bug 890442 - On arm dmeventd fails to repair LVM if system disk is mirrored
Status: NEW
Product: Fedora
Classification: Fedora
Component: lvm2
Version: 23
Hardware: arm Linux
Priority: unspecified
Severity: unspecified
Assigned To: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
 
Reported: 2012-12-26 19:06 EST by Matija Šuklje
Modified: 2015-07-15 10:54 EDT (History)
Doc Type: Bug Fix
Last Closed: 2014-08-14 10:40:04 EDT
Type: Bug

Attachments
/etc/lvm/lvm.conf (34.79 KB, application/octet-stream), 2012-12-26 19:06 EST, Matija Šuklje
Kernel panic when booting with one disk only (3.05 KB, application/octet-stream), 2012-12-26 19:07 EST, Matija Šuklje
output of `lvs -a --option +devices` (5.38 KB, application/octet-stream), 2012-12-26 19:10 EST, Matija Šuklje
Kernel 3.6.11 config (59.93 KB, application/octet-stream), 2012-12-26 19:17 EST, Matija Šuklje
Initramfs (709 bytes, text/plain), 2012-12-26 19:25 EST, Matija Šuklje
ps aux (12.00 KB, text/plain), 2013-03-03 18:38 EST, Matija Šuklje

Description Matija Šuklje 2012-12-26 19:06:01 EST
Created attachment 669418
/etc/lvm/lvm.conf

Description of problem:

After one disk (mirror leg) disappears from a mirrored LVM volume on ARM, 'dmeventd' does not repair the volume. Because the remaining disk stays suspended, the system cannot run any new processes; they wait for I/O in vain.

Version-Release number of selected component (if applicable):

LVM 2.02.97 plus the patch from https://www.redhat.com/archives/lvm-devel/2012-November/msg00039.html
(installed on Gentoo)

How reproducible:

Always.

Steps to Reproduce:

1. Install LVM on an ARM system
(DreamPlug with Kirkwood MV88F6281-A1 SoC and Feroceon 88FR131 rev 1 (v5l) a.k.a. Marvell Sheeva 1.2 GHz (ARMv5TE) CPU in my case)

2. Following the official Red Hat docs, create a root partition in an LVM mirror
(I used "-m 1 --mirrorlog mirrored --alloc anywhere" on a two-disk setup, so the mirror log is mirrored on both disks at the same time).

3. Make sure the 'dmeventd' daemon is running.
(it starts as a runlevel script on boot in my case)

4. On a live system, unplug one of the (two) disks to simulate a dead disk.
  
Actual results:

Immediate:

Processes that are already running continue to work (in my case only 'htop'); any new processes are marked as "D" in 'htop' and seem deadlocked. 'htop' shows all CPU time as waiting for I/O, and every new process just adds to the load and doesn’t run. The 'dmeventd' process is sleeping.

On reboot with both disks in:

It boots and works normally.

On reboot with only one disk in:

LVM in the initramfs cannot find the device and eventually the kernel panics.

Expected results:

On removal of the disk, 'dmeventd' should have fixed the volumes, or at least changed them to an unmirrored mode.

In any case, the system should have continued to run uninterrupted in all of the above cases: with one disk or both, on a live system or across a reboot.


Additional info:

I already discussed this problem on irc://irc.freenode.net/#lvm and was directed here. One idea raised there was that 'dmeventd' may fail to load and lock itself in memory on ARM, or that the above-mentioned ARM patch is at fault.
Comment 1 Matija Šuklje 2012-12-26 19:07:15 EST
Created attachment 669419
Kernel panic when booting with one disk only
Comment 2 Matija Šuklje 2012-12-26 19:10:02 EST
Created attachment 669420
output of `lvs -a --option +devices`

This output was captured before and after the bug occurs, not during it.

While the bug is occurring, I cannot get any output from `lvs`, as it waits for I/O indefinitely.
Comment 3 Matija Šuklje 2012-12-26 19:17:36 EST
Created attachment 669421
Kernel 3.6.11 config

The .config I use for kernel 3.6.11, which is a vanilla 3.6.11 kernel with some patches from Gentoo.
Comment 4 Matija Šuklje 2012-12-26 19:25:44 EST
Created attachment 669422
Initramfs

The 'init' script of the initramfs I use.
Comment 5 Alasdair Kergon 2013-01-02 14:03:26 EST
After your original step 4 when the machine is hung, please get diagnostics to show what it is waiting for.  If you can't get 'ps' output, try sysrq as described here:  http://kernel.org/doc/Documentation/sysrq.txt e.g. with 't' dumped to console.
Comment 6 Peter Rajnoha 2013-01-11 07:01:44 EST
An additional sysrq trace after device removal/failure:


[  111.766032] dmeventd        S ffff88003fc13cc0     0   530      1 0x00000000
[  111.766032]  ffff88003c8318f8 0000000000000082 ffff88003a49dc40 ffff88003c831fd8
[  111.766032]  ffff88003c831fd8 ffff88003c831fd8 ffffffff81c13420 ffff88003a49dc40
[  111.766032]  0000000000000000 ffff88003c831ac0 00000000000f423d 0000000000000000
[  111.766032] Call Trace:
[  111.766032]  [<ffffffff81622a19>] schedule+0x29/0x70
[  111.766032]  [<ffffffff81621bcc>] schedule_hrtimeout_range_clock+0x12c/0x170
[  111.766032]  [<ffffffff81084070>] ? update_rmtp+0x70/0x70
[  111.766032]  [<ffffffff81084af4>] ? hrtimer_start_range_ns+0x14/0x20
[  111.766032]  [<ffffffff81621c23>] schedule_hrtimeout_range+0x13/0x20
[  111.766032]  [<ffffffff811a3569>] poll_schedule_timeout+0x49/0x70
[  111.766032]  [<ffffffff811a40b8>] do_select+0x638/0x700
[  111.766032]  [<ffffffff811a3680>] ? __pollwait+0xf0/0xf0
[  111.766032]  [<ffffffff8109798f>] ? check_preempt_wakeup+0x1cf/0x270
[  111.766032]  [<ffffffff8108efb5>] ? check_preempt_curr+0x85/0xa0
[  111.766032]  [<ffffffff8108effc>] ? ttwu_do_wakeup+0x2c/0xf0
[  111.766032]  [<ffffffff8108f346>] ? ttwu_do_activate.constprop.81+0x66/0x70
[  111.766032]  [<ffffffff81092319>] ? try_to_wake_up+0x1d9/0x2c0
[  111.766032]  [<ffffffff81092412>] ? default_wake_function+0x12/0x20
[  111.766032]  [<ffffffff811a36e6>] ? pollwake+0x66/0x70
[  111.766032]  [<ffffffff81043b71>] ? pvclock_clocksource_read+0x61/0xf0
[  111.766032]  [<ffffffff811a9e51>] ? update_time+0x81/0xc0
[  111.766032]  [<ffffffff811ae4f0>] ? __mnt_want_write+0x40/0x60
[  111.766032]  [<ffffffff811a9f33>] ? file_update_time+0xa3/0xf0
[  111.766032]  [<ffffffff811a4375>] core_sys_select+0x1f5/0x350
[  111.766032]  [<ffffffff81043b71>] ? pvclock_clocksource_read+0x61/0xf0
[  111.766032]  [<ffffffff81042d49>] ? kvm_clock_read+0x19/0x20
[  111.766032]  [<ffffffff81042d59>] ? kvm_clock_get_cycles+0x9/0x10
[  111.766032]  [<ffffffff810ac44c>] ? ktime_get_ts+0x4c/0xf0
[  111.766032]  [<ffffffff811a3a62>] ? poll_select_set_timeout+0x72/0x90
[  111.766032]  [<ffffffff811a4589>] sys_select+0xb9/0x110
[  111.766032]  [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
[  111.766032] dmeventd        S ffff88003fc13cc0     0   531      1 0x00000000
[  111.766032]  ffff88003a967d48 0000000000000082 ffff880036654530 ffff88003a967fd8
[  111.766032]  ffff88003a967fd8 ffff88003a967fd8 ffff88003cb7c530 ffff880036654530
[  111.766032]  0000000000000000 ffff880036ee7800 0000000000000001 ffff880036ee7938
[  111.766032] Call Trace:
[  111.766032]  [<ffffffff81622a19>] schedule+0x29/0x70
[  111.766032]  [<ffffffff814b5fa1>] dm_wait_event+0x81/0xd0
[  111.766032]  [<ffffffff810807f0>] ? wake_up_bit+0x40/0x40
[  111.766032]  [<ffffffff814bb267>] dev_wait+0x47/0xc0
[  111.766032]  [<ffffffff814bb220>] ? table_deps+0x170/0x170
[  111.766032]  [<ffffffff814bbd1b>] ctl_ioctl+0x18b/0x2c0
[  111.766032]  [<ffffffff814bbe63>] dm_ctl_ioctl+0x13/0x20
[  111.766032]  [<ffffffff811a2939>] do_vfs_ioctl+0x99/0x580
[  111.766032]  [<ffffffff8128380a>] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30
[  111.766032]  [<ffffffff81284c37>] ? file_has_perm+0x97/0xb0
[  111.766032]  [<ffffffff811a2eb9>] sys_ioctl+0x99/0xa0
[  111.766032]  [<ffffffff810725c1>] ? sys_rt_sigprocmask+0xa1/0xd0
[  111.766032]  [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
[  111.766032] dmeventd        D ffff88003fc13cc0     0   532      1 0x00000000
[  111.766032]  ffff88003c84daf8 0000000000000082 ffff88003cb7c530 ffff88003c84dfd8
[  111.766032]  ffff88003c84dfd8 ffff88003c84dfd8 ffffffff81c13420 ffff88003cb7c530
[  111.766032]  ffff88003c84db18 ffff880036665000 ffff880036665068 ffff880036665024
[  111.766032] Call Trace:
[  111.766032]  [<ffffffff81622a19>] schedule+0x29/0x70
[  111.766032]  [<ffffffff81252bdf>] start_this_handle.isra.8+0x37f/0x480
[  111.766032]  [<ffffffff810807f0>] ? wake_up_bit+0x40/0x40
[  111.766032]  [<ffffffff81252ee8>] jbd2__journal_start+0xc8/0x110
[  111.766032]  [<ffffffff8121b61f>] ? ext4_create+0x6f/0x170
[  111.766032]  [<ffffffff81252f43>] jbd2_journal_start+0x13/0x20
[  111.766032]  [<ffffffff81231afb>] ext4_journal_start_sb+0x5b/0x130
[  111.766032]  [<ffffffff8121b61f>] ext4_create+0x6f/0x170
[  111.766032]  [<ffffffff8119f455>] vfs_create+0xb5/0x110
[  111.766032]  [<ffffffff8119fe12>] do_last+0x962/0xdf0
[  111.766032]  [<ffffffff8119c898>] ? inode_permission+0x18/0x50
[  111.766032]  [<ffffffff811a035a>] path_openat+0xba/0x4d0
[  111.766032]  [<ffffffff8117ad21>] ? kmem_cache_alloc+0x31/0x160
[  111.766032]  [<ffffffff811a09d1>] do_filp_open+0x41/0xa0
[  111.766032]  [<ffffffff811aca8d>] ? alloc_fd+0x4d/0x120
[  111.766032]  [<ffffffff8118fd36>] do_sys_open+0xf6/0x1e0
[  111.766032]  [<ffffffff8118fe41>] sys_open+0x21/0x30
[  111.766032]  [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
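A task dump like the one above can be summarized offline. What follows is only a sketch, assuming the `sysrq-t` line layout shown in this comment. The function name `summarize_tasks` and the shortened `SAMPLE` excerpt are illustrative and not part of any LVM or kernel tool. It records each task's state letter and keeps only the frames that are not marked "?".

```python
import re

# Task header: "[time] <name> <state> <addr> <stack> <pid> <ppid> <flags>"
TASK_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(\S+)\s+([RSDTZX])\s+[0-9a-f]+\s+\d+\s+(\d+)\s+\d+\s+0x[0-9a-f]+"
)
# Frame: "[time]  [<addr>] symbol+0x..."; an optional "?" marks an unreliable frame.
FRAME_RE = re.compile(r"^\[\s*[\d.]+\]\s+\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+")

# Shortened excerpt of the trace in this comment (threads 531 and 532 only).
SAMPLE = """\
[  111.766032] dmeventd        S ffff88003fc13cc0     0   531      1 0x00000000
[  111.766032] Call Trace:
[  111.766032]  [<ffffffff81622a19>] schedule+0x29/0x70
[  111.766032]  [<ffffffff814b5fa1>] dm_wait_event+0x81/0xd0
[  111.766032]  [<ffffffff810807f0>] ? wake_up_bit+0x40/0x40
[  111.766032] dmeventd        D ffff88003fc13cc0     0   532      1 0x00000000
[  111.766032] Call Trace:
[  111.766032]  [<ffffffff81622a19>] schedule+0x29/0x70
[  111.766032]  [<ffffffff81252bdf>] start_this_handle.isra.8+0x37f/0x480
[  111.766032]  [<ffffffff81252ee8>] jbd2__journal_start+0xc8/0x110
"""


def summarize_tasks(dump_text):
    """Return one dict per task: name, state, pid, and its reliable frames."""
    tasks = []
    for line in dump_text.splitlines():
        header = TASK_RE.match(line)
        if header:
            tasks.append({
                "name": header.group(1),
                "state": header.group(2),
                "pid": int(header.group(3)),
                "frames": [],
            })
            continue
        frame = FRAME_RE.match(line)
        # Skip "?" frames: they are stack leftovers, not the real call chain.
        if frame and tasks and not frame.group(1):
            tasks[-1]["frames"].append(frame.group(2))
    return tasks


for task in summarize_tasks(SAMPLE):
    print(task["pid"], task["state"], " <- ".join(task["frames"][:3]))
```

On this excerpt, thread 531 is sleeping in dm_wait_event, while thread 532 is in state D, blocked while starting a jbd2 journal transaction.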
Comment 7 Matija Šuklje 2013-03-03 18:38:33 EST
Created attachment 704687
ps aux

As requested, here is the output of `ps aux` just after I pull out one of the disks.

I tried to get a dump via SysRq, but it didn’t work (maybe minicom doesn’t pass it through).
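A saved `ps aux` capture can be filtered offline for processes in uninterruptible sleep. This is a rough sketch, assuming the procps-style column layout where STAT is the eighth column. The rows in `SAMPLE_PS` are made up for illustration and are not taken from the attachment, and the name `uninterruptible` is hypothetical.

```python
# Hypothetical rows in procps "ps aux" layout; not the attached output.
SAMPLE_PS = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1   1740   580 ?        Ss   18:30   0:01 init [3]
root       530  0.0  0.4  20000  2100 ?        SLsl 18:30   0:00 /sbin/dmeventd
root       812  0.0  0.1   2500   700 pts/0    D+   18:37   0:00 ls /
"""


def uninterruptible(ps_aux_text):
    """Return (pid, stat, command) for rows whose STAT starts with 'D'."""
    rows = []
    for line in ps_aux_text.splitlines():
        # Split into at most 11 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or malformed row
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("D"):
            rows.append((pid, stat, command))
    return rows


for pid, stat, command in uninterruptible(SAMPLE_PS):
    print(pid, stat, command)
```

A row in state "D" only shows that a process is waiting on I/O. Finding what it is waiting for still needs a kernel stack, for example from a sysrq task dump.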
Comment 8 Fedora End Of Life 2013-04-03 12:01:15 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Comment 9 Matija Šuklje 2013-06-25 07:06:23 EDT
Is there anything else I can do to help push this forward?
Comment 10 Jonathan Earl Brassow 2014-08-14 10:27:02 EDT
Sounds like there may be no mirror plugin operating for dmeventd?  Do we have logs showing that dmeventd has at least recognized the failure and attempted to respond to it?  It should not be sleeping.

The second issue is that the system boot-up process should be attempting partial activation, but is not:
  Refusing activation of partial LV layman. Use --partial to override.
  0 logical volume(s) in volume group "mirrored" now active
This has been resolved in recent releases.

Finally, don't use "mirror" for your system disks, use the "raid1" segment type instead.  It is a big improvement over "mirror".
Comment 11 Jonathan Earl Brassow 2014-08-14 10:40:04 EDT
If the reporter wishes to provide the system logs so we can determine how dmeventd responded to the failure, they are invited to reopen this bug.  Otherwise, I am closing this issue WONTFIX.

As discussed in comment 10, there are really two issues here:
1) handling the failure while the system is alive
2) dealing with the failure when the system boots

#1 is fixed in the current release, I believe.  There is no plan to fix #2 (instead, the user should use the "raid1" segment type for these logical volumes).
Comment 12 Matija Šuklje 2014-08-14 11:13:11 EDT
I am happy to share the system logs, if you let me know which files and under which conditions.
Comment 13 Jan Kurik 2015-07-15 10:54:20 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23
