Bug 505683

Summary: after F11 fresh install, pre-existing non-boot data RAID1 disk cannot be mounted
Product: Fedora
Reporter: Keith A. Woodbury <keith.woodbury>
Component: dmraid
Assignee: Heinz Mauelshagen <heinzm>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 11
CC: agajania, agk, benhalicki, bmr, dwysocha, hdegoede, heinzm, lvm-team, mbroz, prockai
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-05-18 20:18:05 UTC

Description Keith A. Woodbury 2009-06-12 22:02:35 UTC
Description of problem:
After installing F11, the pre-existing non-boot RAID1 volume cannot be mounted.  Neither dmraid nor mdadm can be used to bring up a volume that will mount.

[root@madmax kwoodbury]# dmraid -ay -vvvv -dddd
WARN: locking /var/lock/dmraid/.lock
NOTICE: /dev/sdc: asr     discovering
NOTICE: /dev/sdc: ddf1    discovering
NOTICE: /dev/sdc: hpt37x  discovering
NOTICE: /dev/sdc: hpt37x metadata discovered
NOTICE: /dev/sdc: hpt45x  discovering
NOTICE: /dev/sdc: isw     discovering
NOTICE: /dev/sdc: jmicron discovering
NOTICE: /dev/sdc: lsi     discovering
NOTICE: /dev/sdc: nvidia  discovering
NOTICE: /dev/sdc: pdc     discovering
NOTICE: /dev/sdc: sil     discovering
NOTICE: /dev/sdc: via     discovering
NOTICE: /dev/sdb: asr     discovering
NOTICE: /dev/sdb: ddf1    discovering
NOTICE: /dev/sdb: hpt37x  discovering
NOTICE: /dev/sdb: hpt37x metadata discovered
NOTICE: /dev/sdb: hpt45x  discovering
NOTICE: /dev/sdb: isw     discovering
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi     discovering
NOTICE: /dev/sdb: nvidia  discovering
NOTICE: /dev/sdb: pdc     discovering
NOTICE: /dev/sdb: sil     discovering
NOTICE: /dev/sdb: via     discovering
NOTICE: /dev/sda: asr     discovering
NOTICE: /dev/sda: ddf1    discovering
NOTICE: /dev/sda: hpt37x  discovering
NOTICE: /dev/sda: hpt45x  discovering
NOTICE: /dev/sda: isw     discovering
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi     discovering
NOTICE: /dev/sda: nvidia  discovering
NOTICE: /dev/sda: pdc     discovering
NOTICE: /dev/sda: sil     discovering
NOTICE: /dev/sda: via     discovering
DEBUG: _find_set: searching hpt37x_cabdfbfhhj
DEBUG: _find_set: not found hpt37x_cabdfbfhhj
DEBUG: _find_set: searching hpt37x_cabdfbfhhj
DEBUG: _find_set: not found hpt37x_cabdfbfhhj
NOTICE: added /dev/sdc to RAID set "hpt37x_cabdfbfhhj"
DEBUG: _find_set: searching hpt37x_cabdfbfhhj
DEBUG: _find_set: found hpt37x_cabdfbfhhj
DEBUG: _find_set: searching hpt37x_cabdfbfhhj
DEBUG: _find_set: found hpt37x_cabdfbfhhj
NOTICE: added /dev/sdb to RAID set "hpt37x_cabdfbfhhj"
DEBUG: checking hpt37x device "/dev/sdc"
DEBUG: checking hpt37x device "/dev/sdb"
DEBUG: set status of set "hpt37x_cabdfbfhhj" to 16
RAID set "hpt37x_cabdfbfhhj" was activated
INFO: Activating mirror raid set "hpt37x_cabdfbfhhj"
NOTICE: discovering partitions on "hpt37x_cabdfbfhhj"
NOTICE: /dev/mapper/hpt37x_cabdfbfhhj: dos     discovering
WARN: unlocking /var/lock/dmraid/.lock
DEBUG: freeing devices of RAID set "hpt37x_cabdfbfhhj"
DEBUG: freeing device "hpt37x_cabdfbfhhj", path "/dev/sdc"
DEBUG: freeing device "hpt37x_cabdfbfhhj", path "/dev/sdb"

Now dmraid reports a nominal RAID1 set:

[root@madmax kwoodbury]# dmraid -r
/dev/sdc: hpt37x, "hpt37x_cabdfbfhhj", mirror, ok, 156249999 sectors, data@ 0
/dev/sdb: hpt37x, "hpt37x_cabdfbfhhj", mirror, ok, 156249989 sectors, data@ 10
[root@madmax kwoodbury]# 
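Note that the two members report slightly different sizes and data offsets (156249999 sectors at offset 0 on /dev/sdc versus 156249989 sectors at offset 10 on /dev/sdb).  If the maintainers need the raw HPT37X metadata, it can be captured without writing to the disks; a minimal sketch (set and format names taken from the output above, options per dmraid(8)):

dmraid -rD              # dump the discovered metadata and offsets to files for attaching here
dmraid -n -f hpt37x     # print the native HighPoint metadata to stdout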

But fdisk says there is no partition table:

[root@madmax kwoodbury]# fdisk -l

Disk /dev/sda: 120.0 GB, 120000000000 bytes
255 heads, 63 sectors/track, 14589 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0002dbb2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              26       14589   116981311   8e  Linux LVM

Disk /dev/dm-0: 117.6 GB, 117671198720 bytes
255 heads, 63 sectors/track, 14306 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-0 doesn't contain a valid partition table

Disk /dev/dm-1: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
16 heads, 63 sectors/track, 155061 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x0002ba5f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1      155009    78124504+  fd  Linux raid autodetect

Disk /dev/sdc: 80.0 GB, 80000000000 bytes
16 heads, 63 sectors/track, 155009 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x00029815

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1      155009    78124504+  fd  Linux raid autodetect

Disk /dev/dm-2: 79.9 GB, 79999994368 bytes
255 heads, 63 sectors/track, 9726 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-2 doesn't contain a valid partition table
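Since fdisk sees no partition table on /dev/dm-2 and dmraid's own partition discovery ("dos discovering") turned up nothing, it may be worth probing the mapped device directly.  A read-only sketch (the mapper name is taken from the dmraid output above):

kpartx -l /dev/mapper/hpt37x_cabdfbfhhj    # list the partition mappings that would be created, if any
blkid /dev/mapper/hpt37x_cabdfbfhhj        # probe for filesystem signatures
file -s /dev/mapper/hpt37x_cabdfbfhhj      # read the first blocks and guess the contents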

And mount doesn't know what to do with the device:

[root@madmax kwoodbury]# mount -t ext3 /dev/dm-2 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/dm-2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so


And dmesg says there is no file system:

[root@madmax kwoodbury]# dmesg | tail
vortex: IRQ fifo error
fuse init (API version 7.11)
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
md: md127 stopped.
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md127: detected capacity change from 79999401984 to 0
VFS: Can't find ext3 filesystem on dev dm-2.
[root@madmax kwoodbury]# 
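One possible explanation (an assumption, not something the report confirms): the old array appears to have lived on the partitions /dev/sd[bc]1, which start at sector 63, while the dmraid mapping covers the whole disks from sector 0, so any ext3 superblock would sit 63 sectors into /dev/dm-2 rather than at its start.  That can be tested read-only with a loop device at the assumed offset (63 x 512 = 32256 bytes; device name from the output above):

LOOP=$(losetup -o 32256 -f --show /dev/mapper/hpt37x_cabdfbfhhj)   # map the set starting at the assumed partition offset
blkid "$LOOP"        # does an ext3 signature appear now?
losetup -d "$LOOP"   # detach the loop device again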


Version-Release number of selected component (if applicable):


How reproducible:
I tried the installation twice and got the same result.


Steps to Reproduce:
1.  Fresh install of F11 on a system with an existing dmraid (RAID1) set
2.  Try to mount the dmraid set
  
Actual results:
 as above


Expected results:
 mounted system with existing data accessible

Additional info:

Comment 1 Heinz Mauelshagen 2009-06-23 15:16:05 UTC
The partition table shows that there are MD RAID autodetect partition types on /dev/sd[bc]1.
Did you run those two partitions as an MD RAID set before?

OTOH, dmraid detects HPT37X metadata signatures at the beginning of /dev/sd[bc], which means the partitions could just as well have been mirrored using dmraid before.

Please clarify how you accessed /dev/sd[bc] before with respect to RAID.

What does "dmsetup ls" show after "dmraid -ay" ?

Comment 2 Keith A. Woodbury 2009-06-26 22:16:52 UTC
Following the F11 install, and after this partition would not mount under either mdadm or dmraid, I used fdisk to change the partition type from Linux to Linux raid autodetect, hoping that this would help the situation.  It did not appear to make any difference.

******************************************************************
*** Please clarify, how you accessed /dev/sd[bc] before WRT RAID.

I'm fairly positive that I previously accessed the sd[bc] set using mdadm under F6 (the previous system on this box).  However, the drives also cannot be accessed using mdadm.  The behavior now does not reproduce what I saw right after the initial install, because I let fsck touch the disk when the "invalid partition table" error occurred; after about two suggestions from fsck I aborted.  With mdadm, I first had to stop the auto-detected /dev/md127 (it would not mount), then reassemble /dev/md0 from the existing /dev/sd[bc]1, and this /dev/md0 also complained of an "invalid partition table".

Here is what I get now.  First, with no dmraid and no mdadm active, here is the fdisk report:

[root@madmax kwoodbury]# fdisk -l

Disk /dev/sda: 120.0 GB, 120000000000 bytes
255 heads, 63 sectors/track, 14589 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0002dbb2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              26       14589   116981311   8e  Linux LVM

Disk /dev/dm-0: 117.6 GB, 117671198720 bytes
255 heads, 63 sectors/track, 14306 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-0 doesn't contain a valid partition table

Disk /dev/dm-1: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
16 heads, 63 sectors/track, 155061 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x0002ba5f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1      155009    78124504+  fd  Linux raid autodetect

Disk /dev/sdc: 80.0 GB, 80000000000 bytes
16 heads, 63 sectors/track, 155009 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x00029815

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1      155009    78124504+  fd  Linux raid autodetect

Now I try to assemble with mdadm; it fails, complaining about sdc1:

[root@madmax kwoodbury]# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
mdadm: cannot open device /dev/sdc1: Device or resource busy
mdadm: /dev/sdc1 has no superblock - assembly aborted
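"Device or resource busy" usually means something else already holds /dev/sdc1 or the whole disk, for example the auto-assembled md127 or an active dmraid mapping over /dev/sdc.  A quick check before assembling (a sketch; nothing here changes state):

cat /proc/mdstat                  # is md127 (or another array) still claiming sdc1?
dmsetup ls                        # is a dmraid mapping currently active?
dmsetup deps hpt37x_cabdfbfhhj    # which underlying devices does that mapping hold open?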

But oddly, if I assemble with only one drive, /dev/sdb1, then it assembles OK:

[root@madmax kwoodbury]# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
[root@madmax kwoodbury]# mdadm --assemble /dev/md0 /dev/sdb1
mdadm: /dev/md/0 has been started with 1 drive (out of 2).

But the array still cannot be mounted:

[root@madmax kwoodbury]# mount /dev/md0 /raid1
mount: Stale NFS file handle

I'm pretty sure this bogus "stale NFS file handle" error is a result of the failed fsck operation.  Prior to that I got "invalid partition table" complaints.
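Before letting fsck change anything else, it may help to compare the MD superblocks on the two halves and to look at the filesystem header read-only.  A sketch (none of these commands writes to the disks):

mdadm --examine /dev/sdb1 /dev/sdc1   # compare array UUIDs, event counts, and update times
dumpe2fs -h /dev/md0                  # print the ext3 superblock header, if one is found
e2fsck -n /dev/md0                    # read-only check, answering "no" to every proposed fix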

*******************************************************************
*** What does "dmsetup ls" show after "dmraid -ay" ?  

[root@madmax kwoodbury]# dmraid -ay
RAID set "hpt37x_cabdfbfhhj" was activated
[root@madmax kwoodbury]# dmsetup ls
vg_madmax-lv_swap       (253, 1)
vg_madmax-lv_root       (253, 0)
hpt37x_cabdfbfhhj       (253, 2)
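"dmsetup ls" confirms the mapping exists; "dmsetup table" additionally shows how the mirror is put together, i.e. which physical devices and offsets back hpt37x_cabdfbfhhj, which can then be compared against the partition layout above.  A sketch:

dmsetup table hpt37x_cabdfbfhhj   # show the mirror target, its legs, and their offsets
dmsetup info hpt37x_cabdfbfhhj    # show open count, state, and UUID of the mapping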

Comment 3 Ben Halicki 2009-11-15 13:05:37 UTC
Hi All,

Not sure if this is related, but I noticed a similar situation when I upgraded from FC10 (i386) to RHEL 5.4 (x86_64) using an Intel (ISW) controller.  I can replicate this problem as follows:
1.  Install FC10
2.  Configure Raid 10 (01) array using Intel BIOS.
3.  Activate RAID array (dmraid -ay)
4.  Partition RAID array (fdisk /dev/mapper/isw_bjbhhhjecc_DATA)
5.  Format RAID array (mke2fs /dev/mapper/isw_bjbhhhjecc_DATA)
6.  Mount RAID array (mount /dev/mapper/isw_bjbhhhjecc_DATA /mnt/raid)
7.  ... Upgrade to RHEL 5.4 ...
8.  Activate RAID array under RHEL 5.4 (dmraid -ay)
9.  Mount RAID array (mount /dev/mapper/isw_bjbhhhjecc_DATA /mnt/raid)

Produces this error when I try to mount:
[root@filesvr zone]# mount /dev/mapper/isw_cidcgffehg_DATA /mnt/data
mount: wrong fs type, bad option, bad superblock on /dev/mapper/isw_cidcgffehg_DATA,
       missing codepage or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

tail /var/log/messages:
Nov 16 00:01:55 filesvr kernel: EXT2-fs error (device dm-2): ext2_check_descriptors: Block bitmap for group 1920 not in group (block 0)!
Nov 16 00:01:55 filesvr kernel: EXT2-fs: group descriptors corrupted!

dmesg..
EXT2-fs error (device dm-2): ext2_check_descriptors: Block bitmap for group 1920 not in group (block 0)!
EXT2-fs: group descriptors corrupted!

If I use an FC10 boot disk, I can then remount the raid array under FC10.  If I boot back into RHEL, I get the mount issues.
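Since the array mounts fine from an FC10 boot disk, the data itself is evidently intact and the difference lies in how each dmraid version assembles the set.  A read-only look at the superblock under RHEL 5.4 might help narrow down whether the mapping is shifted or the legs are ordered differently; a sketch (mapper name taken from the mount command above; mke2fs -n is a dry run that only prints where the backup superblocks would be):

dumpe2fs -h /dev/mapper/isw_cidcgffehg_DATA          # read the primary ext2 superblock, if visible
mke2fs -n /dev/mapper/isw_cidcgffehg_DATA            # dry run: list backup superblock locations, write nothing
e2fsck -n -b 32768 /dev/mapper/isw_cidcgffehg_DATA   # read-only check against the usual first backup superblock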

Comment 4 Hans de Goede 2009-11-15 18:24:28 UTC
(In reply to comment #3)
> Hi All,
> 
> Not sure if this is related, but I noticed a similar situation when I upgraded
> from FC10 (i386) to RHEL 5.4 (x86_64) using Intel controller (ISW).  I can
> replicate this problem as follows:
> 1.  Install FC10
> 2.  Configure Raid 10 (01) array using Intel BIOS.

<snip>

There was a bug in dmraid a while ago (F-10 sounds about right) where it did not get right which disk was part of which subset. When the RAID set is used only for data, and only from Linux, things seem to work normally, but booting from it would not work because the BIOS assumed a different drive order (accessing it from Windows would have issues too).

This means that an Intel RAID 10 set used with the wrong old order effectively needs to be completely reformatted for use with the fixed dmraid (and then should never be used with the old buggy version).
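For anyone wanting to check whether their set is affected by that ordering problem, the subset layout the current dmraid computes can be compared against what the BIOS shows without modifying anything.  A sketch (set name taken from comment 3):

dmraid -s -s isw_cidcgffehg_DATA   # repeat -s to list the RAID 10 subsets and their member disks
dmsetup table                      # show which /dev/sdX legs back each stripe/mirror mapping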

I hope this explains what you're seeing,

Regards,

Hans

Comment 5 Bug Zapper 2010-04-27 14:50:39 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping