Bug 985638 - Raid 1 array gets stuck in read only after reboot.
Summary: Raid 1 array gets stuck in read only after reboot.
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: 19
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Peter Rajnoha
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 1014067
Blocks: 1007406
 
Reported: 2013-07-18 00:30 UTC by Alan
Modified: 2014-01-15 14:31 UTC
CC List: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1007406 (view as bug list)
Environment:
Last Closed: 2014-01-15 14:31:12 UTC
Type: Bug
Embargoed:


Attachments
array and lv info. (1.76 KB, text/plain)
2013-07-18 00:31 UTC, Alan
lvmdump-AdRe (438.23 KB, application/x-compressed-tar)
2013-08-22 14:53 UTC, Adam Rempter
lsblk-AdRe (1.43 KB, text/plain)
2013-08-22 14:55 UTC, Adam Rempter
lvmdump-AdRe-beforeFix (433.67 KB, application/x-compressed-tar)
2013-08-22 16:05 UTC, Adam Rempter

Description Alan 2013-07-18 00:30:16 UTC
Description of problem:
Raid 1 array gets stuck in read only after reboot.

Version-Release number of selected component (if applicable):


How reproducible:
Happens at every reboot

Steps to Reproduce:
1. systemctl reboot

Actual results:
The filesystem on logical volume lv_opt cannot be mounted.

Expected results:
The array should be active after reboot.

Additional info:
To correct this issue I have to use fdisk to write the partition table back to the disk.

Comment 1 Alan 2013-07-18 00:31:48 UTC
Created attachment 775029 [details]
array and lv info.

Comment 2 Alan 2013-07-18 00:35:54 UTC
Errors from journalctl -x

Jul 17 14:21:01 knucklehead.dbonenet.com kernel: device-mapper: table: 253:4: linear: dm-linear: Device lookup failed
Jul 17 14:21:01 knucklehead.dbonenet.com kernel: device-mapper: ioctl: error adding target to table
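
For cross-reference, the 253:4 in that message is the device-mapper major:minor pair; it can be matched to a DM device name with something like:

dmsetup info -c -o name,major,minor    # lists each DM device with its major:minor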

Comment 3 Alan 2013-07-18 13:40:43 UTC
Just for background, this box had been running Fedora 18 with the same two
array set-ups without any problem. This issue started after a fedup upgrade to Fedora 19.

Comment 4 Alan 2013-07-19 18:42:35 UTC
Also noticed that the device nodes for this array are not created in /dev at boot.
They are not created until I write the partition table to the disk.

I rebooted this box at 7:35.

[root@knucklehead ~]# ls -ltr /dev
total 0
drwxr-xr-x   2 root root           0 Jul 19 03:35 pts
drwxrwxrwt   2 root root          40 Jul 19 03:35 mqueue
drwxr-xr-x   2 root root          60 Jul 19 03:35 raw
drwxr-xr-x   3 root root          60 Jul 19 03:35 bus
lrwxrwxrwx   1 root root          15 Jul 19 07:35 stdout -> /proc/self/fd/1
lrwxrwxrwx   1 root root          15 Jul 19 07:35 stdin -> /proc/self/fd/0
lrwxrwxrwx   1 root root          15 Jul 19 07:35 stderr -> /proc/self/fd/2
lrwxrwxrwx   1 root root          13 Jul 19 07:35 fd -> /proc/self/fd
lrwxrwxrwx   1 root root          11 Jul 19 07:35 core -> /proc/kcore
srw-rw-rw-   1 root root           0 Jul 19 07:35 log
drwxr-xr-x   2 root root          80 Jul 19 07:35 dri
prw-------   1 root root           0 Jul 19 07:35 initctl
crw-rw----   1 root lp        6,   3 Jul 19 07:35 lp3
crw-rw----   1 root lp        6,   2 Jul 19 07:35 lp2
crw-rw----   1 root lp        6,   1 Jul 19 07:35 lp1
crw-rw----   1 root lp        6,   0 Jul 19 07:35 lp0
crw-------   1 root root     10, 238 Jul 19 07:35 vhost-net
crw-------   1 root root    108,   0 Jul 19 07:35 ppp
drwxr-xr-x   2 root root          60 Jul 19 07:35 net
drwxr-xr-x  14 root root         300 Jul 19 07:35 cpu
crw-------   1 root root     10, 234 Jul 19 07:35 btrfs-control
drwxr-xr-x   2 root root         180 Jul 19 07:35 bsg
crw-------   1 root root    251,   5 Jul 19 07:35 usbmon5
crw-------   1 root root    251,   8 Jul 19 07:35 usbmon8
crw-------   1 root root    251,   7 Jul 19 07:35 usbmon7
crw-------   1 root root    251,   6 Jul 19 07:35 usbmon6
crw-------   1 root root    251,   4 Jul 19 07:35 usbmon4
crw-------   1 root root    251,   2 Jul 19 07:35 usbmon2
crw-------   1 root root    251,   1 Jul 19 07:35 usbmon1
crw-------   1 root root    251,   3 Jul 19 07:35 usbmon3
crw-rw----   1 root video    29,   0 Jul 19 07:35 fb0
crw-------   1 root root    249,   0 Jul 19 07:35 fw0
crw-rw----+  1 root cdrom    21,   1 Jul 19 07:35 sg1
crw-rw----   1 root disk     21,   0 Jul 19 07:35 sg0
crw-rw----   1 root disk     21,   2 Jul 19 07:35 sg2
crw-rw----   1 root disk     21,   3 Jul 19 07:35 sg3
crw-------   1 root root    250,   4 Jul 19 07:35 hidraw4
crw-rw----   1 root disk     21,   4 Jul 19 07:35 sg4
crw-------   1 root root    250,   2 Jul 19 07:35 hidraw2
crw-------   1 root root    250,   1 Jul 19 07:35 hidraw1
crw-rw----   1 root dialout   4,  66 Jul 19 07:35 ttyS2
crw-rw----   1 root dialout   4,  64 Jul 19 07:35 ttyS0
crw-rw----   1 root disk     21,   5 Jul 19 07:35 sg5
crw-rw----   1 root dialout   4,  67 Jul 19 07:35 ttyS3
crw-rw----   1 root dialout   4,  65 Jul 19 07:35 ttyS1
crw-------   1 root root    254,   0 Jul 19 07:35 rtc0
lrwxrwxrwx   1 root root           4 Jul 19 07:35 rtc -> rtc0
crw-------   1 root root    250,   3 Jul 19 07:35 hidraw3
crw-------   1 root root    250,   0 Jul 19 07:35 hidraw0
brw-rw----   1 root disk      7,   0 Jul 19 07:35 loop0
brw-rw----   1 root disk      7,   7 Jul 19 07:35 loop7
brw-rw----   1 root disk      7,   5 Jul 19 07:35 loop5
brw-rw----   1 root disk      7,   4 Jul 19 07:35 loop4
brw-rw----   1 root disk      7,   3 Jul 19 07:35 loop3
brw-rw----   1 root disk      7,   1 Jul 19 07:35 loop1
brw-rw----   1 root disk      7,   6 Jul 19 07:35 loop6
brw-rw----   1 root disk      7,   2 Jul 19 07:35 loop2
crw-rw-rw-   1 root root      1,   9 Jul 19 07:35 urandom
crw-rw-rw-   1 root root      1,   8 Jul 19 07:35 random
crw-r-----   1 root kmem      1,   4 Jul 19 07:35 port
crw-------   1 root root      1,  12 Jul 19 07:35 oldmem
crw-rw-rw-   1 root root      1,   3 Jul 19 07:35 null
crw-r-----   1 root kmem      1,   1 Jul 19 07:35 mem
crw-r--r--   1 root root      1,  11 Jul 19 07:35 kmsg
crw-rw-rw-   1 root root      1,   7 Jul 19 07:35 full
crw-rw-rw-   1 root root      1,   5 Jul 19 07:35 zero
crw-------   1 root root     10,  63 Jul 19 07:35 vga_arbiter
crw-------   1 root root     10, 231 Jul 19 07:35 snapshot
crw-------   1 root root     10, 144 Jul 19 07:35 nvram
crw-------   1 root root     10,  60 Jul 19 07:35 network_throughput
crw-------   1 root root     10,  61 Jul 19 07:35 network_latency
crw-------   1 root root     10, 227 Jul 19 07:35 mcelog
crw-------   1 root root     10, 237 Jul 19 07:35 loop-control
crw-------   1 root root     10, 228 Jul 19 07:35 hpet
crw-------   1 root root     10,  62 Jul 19 07:35 cpu_dma_latency
crw-------   1 root root     10, 235 Jul 19 07:35 autofs
crw-------   1 root root      5,   1 Jul 19 07:35 console
crw--w----   1 root tty       4,  18 Jul 19 07:35 tty18
crw--w----   1 root tty       4,  17 Jul 19 07:35 tty17
crw--w----   1 root tty       4,  16 Jul 19 07:35 tty16
crw--w----   1 root tty       4,  15 Jul 19 07:35 tty15
crw--w----   1 root tty       4,  13 Jul 19 07:35 tty13
crw--w----   1 root tty       4,  12 Jul 19 07:35 tty12
crw--w----   1 root tty       4,  11 Jul 19 07:35 tty11
crw--w----   1 root tty       4,  10 Jul 19 07:35 tty10
crw--w----   1 root tty       4,   1 Jul 19 07:35 tty1
crw--w----   1 root tty       4,   0 Jul 19 07:35 tty0
crw-rw-rw-   1 root tty       5,   0 Jul 19 07:35 tty
crw--w----   1 root tty       4,  26 Jul 19 07:35 tty26
crw--w----   1 root tty       4,  24 Jul 19 07:35 tty24
crw--w----   1 root tty       4,  23 Jul 19 07:35 tty23
crw--w----   1 root tty       4,  22 Jul 19 07:35 tty22
crw--w----   1 root tty       4,  21 Jul 19 07:35 tty21
crw--w----   1 root tty       4,  20 Jul 19 07:35 tty20
crw--w----   1 root tty       4,   2 Jul 19 07:35 tty2
crw--w----   1 root tty       4,  19 Jul 19 07:35 tty19
crw--w----   1 root tty       4,  28 Jul 19 07:35 tty28
crw--w----   1 root tty       4,  27 Jul 19 07:35 tty27
crw--w----   1 root tty       4,  25 Jul 19 07:35 tty25
crw--w----   1 root tty       4,  14 Jul 19 07:35 tty14
crw--w----   1 root tty       4,  31 Jul 19 07:35 tty31
crw--w----   1 root tty       4,  30 Jul 19 07:35 tty30
crw--w----   1 root tty       4,   3 Jul 19 07:35 tty3
crw--w----   1 root tty       4,  29 Jul 19 07:35 tty29
crw--w----   1 root tty       4,  37 Jul 19 07:35 tty37
crw--w----   1 root tty       4,  35 Jul 19 07:35 tty35
crw--w----   1 root tty       4,  34 Jul 19 07:35 tty34
crw--w----   1 root tty       4,  32 Jul 19 07:35 tty32
crw--w----   1 root tty       4,   4 Jul 19 07:35 tty4
crw--w----   1 root tty       4,  39 Jul 19 07:35 tty39
crw--w----   1 root tty       4,  38 Jul 19 07:35 tty38
crw--w----   1 root tty       4,  36 Jul 19 07:35 tty36
crw--w----   1 root tty       4,  33 Jul 19 07:35 tty33
crw-------   1 root root    250,   5 Jul 19 07:35 hidraw5
crw--w----   1 root tty       4,  49 Jul 19 07:35 tty49
crw--w----   1 root tty       4,  48 Jul 19 07:35 tty48
crw--w----   1 root tty       4,  47 Jul 19 07:35 tty47
crw--w----   1 root tty       4,  45 Jul 19 07:35 tty45
crw--w----   1 root tty       4,  44 Jul 19 07:35 tty44
crw--w----   1 root tty       4,  43 Jul 19 07:35 tty43
crw--w----   1 root tty       4,  42 Jul 19 07:35 tty42
crw--w----   1 root tty       4,  41 Jul 19 07:35 tty41
crw--w----   1 root tty       4,  40 Jul 19 07:35 tty40
crw--w----   1 root tty       4,  56 Jul 19 07:35 tty56
crw--w----   1 root tty       4,  55 Jul 19 07:35 tty55
crw--w----   1 root tty       4,  54 Jul 19 07:35 tty54
crw--w----   1 root tty       4,  53 Jul 19 07:35 tty53
crw--w----   1 root tty       4,  52 Jul 19 07:35 tty52
crw--w----   1 root tty       4,  51 Jul 19 07:35 tty51
crw--w----   1 root tty       4,  50 Jul 19 07:35 tty50
crw--w----   1 root tty       4,   5 Jul 19 07:35 tty5
crw--w----   1 root tty       4,  46 Jul 19 07:35 tty46
crw-------   1 root root    250,   6 Jul 19 07:35 hidraw6
crw-rw----   1 root tty       7, 128 Jul 19 07:35 vcsa
crw-rw----   1 root tty       7,   0 Jul 19 07:35 vcs
crw--w----   1 root tty       4,   9 Jul 19 07:35 tty9
crw--w----   1 root tty       4,   8 Jul 19 07:35 tty8
crw--w----   1 root tty       4,   7 Jul 19 07:35 tty7
crw--w----   1 root tty       4,  63 Jul 19 07:35 tty63
crw--w----   1 root tty       4,  62 Jul 19 07:35 tty62
crw--w----   1 root tty       4,  61 Jul 19 07:35 tty61
crw--w----   1 root tty       4,  60 Jul 19 07:35 tty60
crw--w----   1 root tty       4,   6 Jul 19 07:35 tty6
crw--w----   1 root tty       4,  59 Jul 19 07:35 tty59
crw--w----   1 root tty       4,  57 Jul 19 07:35 tty57
crw-rw----   1 root tty       7, 129 Jul 19 07:35 vcsa1
crw-------   1 root root    251,   0 Jul 19 07:35 usbmon0
brw-rw----   1 root disk      8,   0 Jul 19 07:35 sda
crw-------   1 root root     10, 223 Jul 19 07:35 uinput
crw--w----   1 root tty       4,  58 Jul 19 07:35 tty58
crw-rw----   1 root disk     21,   6 Jul 19 07:35 sg6
drwxr-xr-x   6 root root         120 Jul 19 07:35 disk
crw-rw----   1 root tty       7,   1 Jul 19 07:35 vcs1
crw-------   1 root root    253,   0 Jul 19 07:35 watchdog0
crw-------   1 root root     10, 130 Jul 19 07:35 watchdog
brw-rw----   1 root disk      8,  80 Jul 19 07:35 sdf
crw-rw-rw-+  1 root kvm      10, 232 Jul 19 07:35 kvm
brw-rw----   1 root disk      8,   2 Jul 19 07:35 sda2
brw-rw----   1 root disk      9, 127 Jul 19 07:35 md127
brw-rw----   1 root disk      8,   1 Jul 19 07:35 sda1
brw-rw----   1 root disk    253,   1 Jul 19 07:35 dm-1
brw-rw----   1 root disk    253,   0 Jul 19 07:35 dm-0
brw-rw----   1 root disk      9, 126 Jul 19 07:35 md126
brw-rw----+  1 root cdrom    11,   0 Jul 19 07:35 sr0
lrwxrwxrwx   1 root root           3 Jul 19 07:35 cdrom -> sr0
brw-rw----   1 root disk    259,   2 Jul 19 07:35 md126p3
brw-rw----   1 root disk    259,   0 Jul 19 07:35 md126p1
brw-rw----   1 root disk      8,  34 Jul 19 07:35 sdc2
drwxr-xr-x   2 root root          60 Jul 19 07:35 vg_knucklehead_02
brw-rw----   1 root disk      8,  35 Jul 19 07:35 sdc3
brw-rw----   1 root disk      8,  33 Jul 19 07:35 sdc1
brw-rw----   1 root disk      9, 125 Jul 19 07:35 md125
drwxr-xr-x   2 root root         100 Jul 19 07:35 fedora_knucklehead
brw-rw----   1 root disk    253,   2 Jul 19 07:35 dm-2
brw-rw----   1 root disk    253,   3 Jul 19 07:35 dm-3
brw-rw----   1 root disk    259,   1 Jul 19 07:35 md126p2
brw-rw----   1 root disk      8,  64 Jul 19 07:35 sde
brw-rw----   1 root disk      8,  48 Jul 19 07:35 sdd
brw-rw----   1 root disk      8,  19 Jul 19 07:35 sdb3
crw-rw----   1 root tty       7,   2 Jul 19 07:35 vcs2
brw-rw----   1 root disk      8,  17 Jul 19 07:35 sdb1
crw-rw----   1 root tty       7, 134 Jul 19 07:35 vcsa6
crw-rw----   1 root tty       7, 133 Jul 19 07:35 vcsa5
crw-rw----   1 root tty       7, 132 Jul 19 07:35 vcsa4
crw-rw----   1 root tty       7, 131 Jul 19 07:35 vcsa3
crw-rw----   1 root tty       7, 130 Jul 19 07:35 vcsa2
crw-rw----   1 root tty       7,   6 Jul 19 07:35 vcs6
crw-rw----   1 root tty       7,   5 Jul 19 07:35 vcs5
crw-rw----   1 root tty       7,   4 Jul 19 07:35 vcs4
crw-rw----   1 root tty       7,   3 Jul 19 07:35 vcs3
brw-rw----   1 root disk      8,  18 Jul 19 07:35 sdb2
drwxr-xr-x   4 root root         580 Jul 19 07:35 input
drwxr-xr-x   3 root root         420 Jul 19 07:35 snd
crw-rw-r--+  1 root root     10,  59 Jul 19 07:35 rfkill
drwxr-xr-x   2 root root          80 Jul 19 07:35 usb
crw-rw-rw-   1 root root     10, 229 Jul 19 07:35 fuse
drwxr-xr-x   2 root root        4640 Jul 19 07:35 char
drwxr-xr-x   3 root root           0 Jul 19 07:35 hugepages
drwxr-xr-x   2 root root          60 Jul 19 07:39 vg_knucklehead_01
drwxr-xr-x   2 root root         160 Jul 19 07:39 mapper
brw-rw----   1 root disk    253,   4 Jul 19 07:39 dm-4
brw-rw----   1 root disk    259,   4 Jul 19 07:39 md124p2
brw-rw----   1 root disk    259,   5 Jul 19 07:39 md124p3
brw-rw----   1 root disk    259,   3 Jul 19 07:39 md124p1
drwxr-xr-x   2 root root         800 Jul 19 07:39 block
drwxr-xr-x   2 root root         240 Jul 19 07:39 md
brw-rw----   1 root disk      9, 124 Jul 19 07:39 md124
drwxrwxrwt   2 root root         160 Jul 19 07:40 shm
brw-rw----   1 root disk      8,  16 Jul 19 09:06 sdb
brw-rw----   1 root disk      8,  32 Jul 19 09:06 sdc
crw-rw-rw-   1 root tty       5,   2 Jul 19 14:17 ptmx

Comment 5 Adam Rempter 2013-07-25 18:47:06 UTC
I have the same behaviour as described above. I use Intel RAID.
It worked fine until the update to F19.

lspci:

00:1f.2 RAID bus controller: Intel Corporation 82801 SATA Controller [RAID mode] (rev 05)


$ sudo mdadm --detail /dev/md127 
/dev/md127:
        Version : imsm
     Raid Level : container
  Total Devices : 2

Working Devices : 2


           UUID : 4dde83ab:330b7152:84e99874:57043ebe
  Member Arrays : /dev/md/Volume1_0

    Number   Major   Minor   RaidDevice

       0       8       16        -        /dev/sdb
       1       8        0        -        /dev/sda


Restarting the array with mdadm helps, but it is not really a workaround.

I got errors like:

Jul 15 17:05:02 dalek kernel: [  125.719344] device-mapper: table: 253:5: linear: dm-linear: Device lookup failed
Jul 15 17:05:02 dalek kernel: [  125.719400] device-mapper: ioctl: error adding target to table
Jul 15 17:05:02 dalek kernel: [  125.720430] device-mapper: table: 253:6: linear: dm-linear: Device lookup failed
Jul 15 17:05:02 dalek kernel: [  125.720488] device-mapper: ioctl: error adding target to table

and in messages:

Jul 25 20:35:38 dalek udisksd[1572]: Error creating watch for file /sys/devices/virtual/block/md127/md/degraded: No such file or directory (g-file-error-quark, 4)

Device nodes in /dev are not created when it occurs. I hit it every 2 or 3 reboots.

Comment 6 Alan 2013-07-25 19:48:13 UTC
There is a bug report, attributed to SELinux, that I found listed in Common Fedora 19 Bugs: https://bugzilla.redhat.com/show_bug.cgi?id=975649

I do not have SELinux enabled, but there is an entry there that describes my problem
exactly. After a reboot the PV that sits on the RAID 1 is made up of the wrong device. Instead of using the RAID device /dev/md124p3 it is using one of the members of the RAID device, either /dev/sdb3 or /dev/sdc3.

[root@knucklehead ~]# cat /proc/mdstat
Personalities : [raid0] [raid1] 
md124 : active (auto-read-only) raid1 sdb[1] sdc[0]
      293032960 blocks super external:/md125/0 [2/2] [UU]
      
md125 : inactive sdb[1](S) sdc[0](S)
      6184 blocks super external:imsm
       
md126 : active raid0 sdd[1] sde[0]
      156296192 blocks super external:/md127/0 128k chunks
      
md127 : inactive sde[1](S) sdd[0](S)
      5032 blocks super external:imsm

[root@knucklehead ~]# pvs
  PV           VG                 Fmt  Attr PSize   PFree
  /dev/md126p3 fedora_knucklehead lvm2 a--   74.04g    0 
  /dev/sda2    vg_knucklehead_02  lvm2 a--  698.63g    0 
  /dev/sdb3    vg_knucklehead_01  lvm2 a--   39.45g    0 


I have been using his workaround to get things in order after every reboot.

cat /proc/mdstat
pvs
vgchange -an vg_knucklehead_01
mdadm --stop /dev/md124
mdadm -As
vgchange -ay
mount /opt
cat /proc/mdstat

Alan
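
A hedged alternative to re-running this after every boot (assuming the mirror members really are always /dev/sdb3 and /dev/sdc3, which may not hold if device names shift between boots): an lvm.conf filter that rejects the member partitions, so only the assembled MD device is ever scanned as a PV. A minimal sketch for the devices section of /etc/lvm/lvm.conf:

# Sketch only - the member device names below are assumptions taken from the
# pvs output above and are not guaranteed to be stable across reboots.
filter = [ "r|^/dev/sdb3$|", "r|^/dev/sdc3$|", "a|.*|" ]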

Comment 7 Adam Rempter 2013-07-26 06:06:05 UTC
(In reply to Alan from comment #6)
> There is a bug report that was due to selinux I found listed in Common
> Fedora 19 Bugs. https://bugzilla.redhat.com/show_bug.cgi?id=975649

I have seen this one. I think I tried the patch posted there, with no luck;
then I completely disabled SELinux, but the problem still occurs.
I have RAID 0.


> I have been using his work around to get things in order after every reboot.
> 
> cat /proc/mdstat
> pvs
> vgchange -an vg_knucklehead_01
> mdadm --stop /dev/md124
> mdadm -As
> vgchange -ay
> mount /opt
> cat /proc/mdstat
> 
> Alan

The workaround works for me as well, but it is a pain to run it after every reboot...

Comment 8 Ling Li 2013-08-08 19:37:22 UTC
Same here: upgraded to Fedora 19 and got read-only status after each reboot. The RAID was clean. I now put the following lines in /etc/rc.d/rc.local:

mdadm --readwrite /dev/md127
pvscan --cache
vgchange -aa

Comment 9 Peter Rajnoha 2013-08-21 09:38:12 UTC
Please, can you confirm that you have devices/md_component_detection=1 set in your /etc/lvm/lvm.conf file? Also, can you still reproduce the problem if you set global/use_lvmetad=0 in lvm.conf?
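
For reference, a minimal way to check (and, for the test, change) those two settings, assuming the stock Fedora path /etc/lvm/lvm.conf and the lvm2 package's unit names:

grep -E 'md_component_detection|use_lvmetad' /etc/lvm/lvm.conf
# md_component_detection lives in the "devices" section, use_lvmetad in "global".
# If use_lvmetad is switched off for the test, lvmetad itself should also be stopped:
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket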

Comment 10 Alan 2013-08-21 10:18:33 UTC
md_component_detection=1 is set in /etc/lvm/lvm.conf

Cannot reproduce when using use_lvmetad=0 in lvm.conf.

So far, over two reboots after setting use_lvmetad=0, the RAID array has
come up correctly and the correct entries for it exist in /dev/mapper.
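
One related point, assuming the dracut lvm module copies /etc/lvm/lvm.conf into the initramfs (as it does on stock Fedora): after changing use_lvmetad, the initramfs may need regenerating so that early boot sees the same setting:

dracut -f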

Comment 11 Adam Rempter 2013-08-21 15:24:12 UTC
$ grep -i use_lvmetad /etc/lvm/lvm.conf 
    # If lvmetad has been running while use_lvmetad was 0, it MUST be stopped
    # before changing use_lvmetad to 1 and started again afterwards.
    use_lvmetad = 0

$ grep -i md_component_detection /etc/lvm/lvm.conf 
    md_component_detection = 1

Rebooted, and the error came again:
Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file /sys/devices/virtual/block/md126/md/sync_action: No such file or directory (g-file-error-quark, 4)
Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file /sys/devices/virtual/block/md126/md/degraded: No such file or directory (g-file-error-quark, 4)
Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file /sys/devices/virtual/block/md127/md/sync_action: No such file or directory (g-file-error-quark, 4)
Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file /sys/devices/virtual/block/md127/md/degraded: No such file or directory (g-file-error-quark, 4)

Comment 12 Peter Rajnoha 2013-08-22 08:49:12 UTC
(In reply to Adam Rempter from comment #11)
> rebooted and error, came again:
> Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file
> /sys/devices/virtual/block/md126/md/sync_action: No such file or directory
> (g-file-error-quark, 4)
> Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file
> /sys/devices/virtual/block/md126/md/degraded: No such file or directory
> (g-file-error-quark, 4)
> Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file
> /sys/devices/virtual/block/md127/md/sync_action: No such file or directory
> (g-file-error-quark, 4)
> Aug 21 17:16:22 dalek udisksd[2860]: Error creating watch for file
> /sys/devices/virtual/block/md127/md/degraded: No such file or directory
> (g-file-error-quark, 4)

If that problem happens, please try to run "lvmdump -m -u -l" (to collect LVM context and diagnostic info packed in a tgz), collect the output of "lsblk -a" as well, and attach both to this bz report. Thanks.

Comment 13 Peter Rajnoha 2013-08-22 10:30:55 UTC
OK, I've managed to reproduce the problem reported by Alan (with use_lvmetad=1), but I still need to investigate Adam's case.

The one I reproduced (with use_lvmetad=1):


[0] raw/~ # mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sda /dev/sdb
mdadm: /dev/sda appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Aug 22 12:15:05 2013
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: /dev/sdb appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Aug 22 12:15:05 2013
mdadm: size set to 130944K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

[0] raw/~ # pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created

[0] raw/~ # vgcreate vg /dev/md0
  Volume group "vg" successfully created

[0] raw/~ # lvcreate -l1 vg
  Logical volume "lvol0" created

[0] raw/~ # dmsetup info -c -o name,attr
Name             Stat
fedora_raw-swap  L--w
fedora_raw-root  L--w

[0] raw/~ # vgchange -ay vg
  1 logical volume(s) in volume group "vg" now active

[0] raw/~ # dmsetup info -c -o name,attr
Name             Stat
fedora_raw-swap  L--w
vg-lvol0         L--w
fedora_raw-root  L--w

[0] raw/~ # vgchange -an vg
  0 logical volume(s) in volume group "vg" now active

[0] raw/~ # dmsetup info -c -o name,attr
Name             Stat
fedora_raw-swap  L--w
fedora_raw-root  L--w

[0] raw/~ # mdadm -S /dev/md0
mdadm: stopped /dev/md0

[0] raw/~ # dmsetup info -c -o name,attr
Name             Stat
fedora_raw-swap  L--w
vg-lvol0         ---w
fedora_raw-root  L--w


And the associated log from dmesg after mdadm -S:

[ 5046.297812] md: md0 still in use.
[ 5046.333193] device-mapper: table: 253:2: linear: dm-linear: Device lookup failed
[ 5046.336647] device-mapper: ioctl: error adding target to table
[ 5046.500320] md0: detected capacity change from 134086656 to 0
[ 5046.503839] md: md0 stopped.
[ 5046.505986] md: unbind<sdb>
[ 5046.506862] md: export_rdev(sdb)
[ 5046.507715] md: unbind<sda>
[ 5046.510534] md: export_rdev(sda)

And udev log after mdadm -S:

[1] raw/~ # udevadm monitor --kernel    
monitor will print the received events for:
KERNEL - the kernel uevent

KERNEL[5046.296878] change   /devices/virtual/block/md0 (block)
KERNEL[5046.330424] add      /devices/virtual/bdi/253:2 (bdi)
KERNEL[5046.331208] add      /devices/virtual/block/dm-2 (block)
KERNEL[5046.511436] change   /devices/virtual/block/md0 (block)
KERNEL[5046.511777] change   /devices/virtual/block/md0 (block)
KERNEL[5046.512331] remove   /devices/virtual/bdi/9:0 (bdi)
KERNEL[5046.512496] remove   /devices/virtual/block/md0 (block)


So there's a CHANGE event followed by a REMOVE event when stopping the MD array. The CHANGE event for md0 causes the autoactivation to trigger. By the time the LVM on top of MD is autoactivated, the MD itself has already been removed. That explains why we end up with vg-lvol0 created but not loaded - the MD underneath is already gone when we try to load the table:

[ 5046.333193] device-mapper: table: 253:2: linear: dm-linear: Device lookup failed

So what we need is to filter out that CHANGE event that comes just before the REMOVE... That's going to be a little bit harder, since the CHANGE event before REMOVE is exactly the same as the CHANGE event that notifies about MD array activation (there's no variable we could compare against). I'll have a think.

This happens only with use_lvmetad=1 (so LVM autoactivation is used as well). I still need more info for Adam's problem (it might be something else with similar symptoms... but I hope to see more from his lvmdump).
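
For anyone hitting the half-created device in the meantime, a minimal recovery sketch based on the reproduction above (the VG name "vg" and device names come from that example and are only assumptions for any other setup):

dmsetup info -c -o name,attr    # "---w" means the node exists but no table is loaded
dmsetup remove vg-lvol0         # drop the empty device-mapper device
mdadm -As                       # reassemble the MD array from mdadm.conf / scan
vgchange -ay vg                 # activate the VG again, now on top of the MD device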

Comment 14 Peter Rajnoha 2013-08-22 10:33:57 UTC
(I've managed to reproduce this only with MD array deactivation - which I don't think is called during boot; the opposite is true - arrays are being activated... so I'm wondering if anything during the boot process calls that MD deactivation for some reason...)

Comment 15 Adam Rempter 2013-08-22 14:53:44 UTC
Created attachment 789240 [details]
lvmdump-AdRe

Comment 16 Adam Rempter 2013-08-22 14:55:46 UTC
Created attachment 789252 [details]
lsblk-AdRe

Comment 17 Adam Rempter 2013-08-22 14:56:52 UTC
(In reply to Peter Rajnoha from comment #12)

> If that problem happens, please, try to run "lvmdump -m -u -l" (to collect
> lvm context and diagnostic info packed in tgz) and also the output of "lsblk
> -a" and attach it to this bz report. Thanks.

I have attached the requested files. Let me know if you need anything else.

Comment 18 Adam Rempter 2013-08-22 16:05:38 UTC
Created attachment 789265 [details]
lvmdump-AdRe-beforeFix

This one is actually from before running vgchange -an / vgchange -ay. Maybe it will be more of use.

Comment 19 Peter Rajnoha 2013-09-11 11:42:32 UTC
This patch should fix the issues reported:

  https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=8d1d83504dcf9c86ad42d34d3bd0b201d7bab8f6

I'll try to add that in the next F19 update.

Comment 20 Peter Rajnoha 2013-10-01 11:29:24 UTC
(In reply to Peter Rajnoha from comment #19)
> This patch should fix the issues reported:
> 
>  
> https://git.fedorahosted.org/cgit/lvm2.git/commit/
> ?id=8d1d83504dcf9c86ad42d34d3bd0b201d7bab8f6
> 
> I'll try to add that in next F19 update.

We also need a fix for bug #1014067 (dracut).

Comment 21 Peter Rajnoha 2014-01-15 13:14:42 UTC
We hit several more issues with LVM on MD and there were more patches submitted for both LVM and MD in the meantime. Fedora 20 should have all the changes in. Would it be acceptable for you to upgrade to F20?

Comment 22 Alan 2014-01-15 14:09:15 UTC
I have not had any issues since implementing the workaround of setting
use_lvmetad=0 on F19; things have been OK.
I am planning to upgrade to F20 once the dust settles a bit, so if this issue
is resolved in F20 that is perfectly acceptable.

Comment 23 Peter Rajnoha 2014-01-15 14:31:12 UTC
OK, thanks!

