Bug 533703 - libdmraid reports RAID 5 sets as valid sets, but the kernel does not support them.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: dmraid
Version: 12
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Heinz Mauelshagen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: anaconda_trace_hash:f48bfa50619e4d47a...
Duplicates: 218192 554602 603418 675583 744045
Depends On:
Blocks:
Reported: 2009-11-08 16:54 UTC by Henrik Nordström
Modified: 2011-10-11 13:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-04 03:27:03 UTC
Type: ---
Embargoed:


Attachments
Attached traceback automatically from anaconda. (23.93 KB, text/plain)
2009-11-08 16:54 UTC, Henrik Nordström
Attached traceback automatically from anaconda. (68.85 KB, text/plain)
2009-12-07 10:38 UTC, Matt Joiner
Attached traceback automatically from anaconda. (83.19 KB, text/plain)
2009-12-12 16:04 UTC, Terje Røsten
Attached traceback automatically from anaconda. (98.85 KB, text/plain)
2010-01-10 20:47 UTC, Xiaotian Sun
Attached traceback automatically from anaconda. (65.84 KB, text/plain)
2010-01-12 05:31 UTC, JOhn Westerdale
Attached traceback automatically from anaconda. (82.21 KB, text/plain)
2010-02-07 23:32 UTC, Miked

Description Henrik Nordström 2009-11-08 16:54:44 UTC
The following was filed automatically by anaconda:
anaconda 12.45 exception report
Traceback (most recent call first):
  File "/usr/lib64/python2.6/site-packages/block/__init__.py", line 35, in dm_log
    raise Exception, message
  File "/usr/lib64/python2.6/site-packages/block/device.py", line 719, in get_map
    self._RaidSet__map = _dm.map(name=self.name, table=self.rs.dmTable)
  File "/usr/lib64/python2.6/site-packages/block/device.py", line 822, in activate
    self.map.dev.mknod(self.prefix+self.name)
  File "/usr/lib/anaconda/storage/devicetree.py", line 1626, in handleUdevDMRaidMemberFormat
    rs.activate(mknod=True)
  File "/usr/lib/anaconda/storage/devicetree.py", line 1738, in handleUdevDeviceFormat
    self.handleUdevDMRaidMemberFormat(info, device)
  File "/usr/lib/anaconda/storage/devicetree.py", line 1282, in addUdevDevice
    self.handleUdevDeviceFormat(info, device)
  File "/usr/lib/anaconda/storage/devicetree.py", line 1974, in populate
    self.addUdevDevice(dev)
  File "/usr/lib/anaconda/storage/__init__.py", line 339, in reset
    self.devicetree.populate()
  File "/usr/lib/anaconda/storage/__init__.py", line 81, in storageInitialize
    storage.reset()
  File "/usr/lib/anaconda/dispatch.py", line 200, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 123, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/gui.py", line 1195, in nextClicked
    self.anaconda.dispatch.gotoNext()
Exception: device-mapper: reload ioctl failed: Invalid argument

Comment 1 Henrik Nordström 2009-11-08 16:54:50 UTC
Created attachment 368049 [details]
Attached traceback automatically from anaconda.

Comment 2 Hans de Goede 2009-11-08 17:05:06 UTC
This one has my name written all over it I'm afraid, assigning it to me.

Comment 3 Henrik Nordström 2009-11-08 17:39:15 UTC
System state is

1. System originally set up with nvidia fakeraid RAID5.

2. RAID support then disabled in system BIOS, but without first going into the RAID setup utility to delete the RAID array.

3. Install from liveusb image.

It seems anaconda then thinks that the drives should be in a RAID5 fakeraid setup, and crashes.

Full story is as follows:

Had F10 installed on the system, with normal Linux raid setup (not fakeraid).

Tried upgrading to F12 and Anaconda crashed, with the same traceback as in this report I think (not sure, didn't save that one).

Wiped the drives clean (dd if=/dev/zero of=/dev/sd[abc] count=100) as I really wanted to test other things with F12 than fakeraid confusion issues..

Fiddled around in the bios enabling/disabling RAID etc just to be sure it's fully disabled.

Tried to install F12 again from the same liveusb image which worked.


As the other tests have been finished I came back to this, trying to figure out why it crashed, and got the theory that maybe fakeraid had once been enabled in the BIOS many years ago, and it seems it may have been. I also have some memory of using a nodmraid option in earlier installs.

Searching old bug reports, I found Bug #474049, which seems to be very much related to this.

Comment 4 Henrik Nordström 2009-11-08 17:54:18 UTC
nodmraid works around the problem.

also don't seem to be able to reproduce the original problem where I had F10 installed & running fine and F12 installer crashing.

Installing F12 with nodmraid option asked me to reinitialize the sda drive, and starting the installer again after installing F12 finds the drives just fine even without nodmraid option.

Comment 5 Adam Williamson 2009-11-08 19:03:46 UTC
Hans, as per Bug #499733, the intended behaviour is to ignore disks in this kind of state entirely by default, and to show them as individual disks with 'nodmraid' parameter. Crashing the installer is never intended =)

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
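
To illustrate the intended behaviour described in comment 5 (skip a set that cannot be activated instead of crashing), here is a minimal Python sketch. It is not anaconda's code: `activate_sets`, the `activate` callback, `fake_activate` and the set names are all hypothetical, and the kernel failure is simulated.

```python
def activate_sets(raid_sets, activate):
    """Try to activate each RAID set; collect failures instead of raising.

    `activate` is a hypothetical callback standing in for whatever
    actually loads the device-mapper table for one set.
    """
    active, ignored = [], []
    for rs in raid_sets:
        try:
            activate(rs)
        except Exception as exc:  # the traceback shows a plain Exception being raised
            ignored.append((rs, str(exc)))
        else:
            active.append(rs)
    return active, ignored


def fake_activate(name):
    # Simulated failure: pretend the kernel rejects any RAID 5 map.
    if "raid5" in name:
        raise Exception("device-mapper: reload ioctl failed: Invalid argument")


active, ignored = activate_sets(["nvidia_raid1", "nvidia_raid5"], fake_activate)
print("activated:", active)
print("ignored:  ", ignored)
```

The point of the sketch is only that the failing set ends up in an "ignored" list the installer can report on, rather than propagating the exception up to the GUI.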

Comment 6 Henrik Nordström 2009-11-08 20:29:35 UTC
Some new data.. after the successful nodmraid installation the installer only sees sda, hiding sdb & sdc..

One more datapoint is that for some reason, when I ran the installer with nodmraid, sda had been cleaned and no longer contained a partition table. Not sure why that happened. Need to investigate more.

Comment 7 Henrik Nordström 2009-11-08 23:38:50 UTC
The cleared partition table was the nvidia RAID manager's doing.

It is still a mystery why sda did not have a RAID signature while the other two drives did, however. I don't think we will ever figure that one out.

Comment 8 Hans de Goede 2009-11-09 10:41:53 UTC
(In reply to comment #3)
> System state is
> 
> 1. System originally set up with nvidia fakeraid RAID5.
> 

Ah, RAID 5. We do not support BIOS RAID 5; the problem is that dmraid does support it, but the kernel support is not yet upstream, which explains the "Invalid argument" error.

Changing component to dmraid for now, as the RAID 5 support really should go
upstream, or libdmraid should somehow tell us this ain't going to work, rather
than giving us a device map which the kernel does not grok.
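
One way to picture the "tell us this ain't going to work" check is to compare the target types named in a device-mapper table against the targets the running kernel offers. The Python sketch below is an assumption-laden illustration, not libdmraid's or anaconda's logic: the function names are hypothetical, the sample table line and version numbers are made up, and it relies only on the general table layout `<start> <length> <target_type> [args...]` and on `dmsetup targets` printing one target name per line followed by a version.

```python
def parse_supported_targets(dmsetup_targets_output):
    """Collect target names from `dmsetup targets` style text."""
    names = set()
    for line in dmsetup_targets_output.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the target name
    return names


def unsupported_targets(table_text, supported):
    """Return target types used in a dm table that are not in `supported`."""
    missing = []
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a '<start> <length> <target_type> ...' line
        target = fields[2]
        if target not in supported and target not in missing:
            missing.append(target)
    return missing


# Made-up sample data for illustration only.
kernel_targets = parse_supported_targets(
    "striped v1.2.0\nlinear v1.0.3\nmirror v1.12.0\n"
)
table = "0 1024 raid45 core 2 65536 nosync"
print(unsupported_targets(table, kernel_targets))
```

A caller could refuse to load a table for which this list is non-empty, instead of letting the reload ioctl fail.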

Comment 9 Bug Zapper 2009-11-16 15:19:16 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Matt Joiner 2009-12-07 10:38:46 UTC
Created attachment 376646 [details]
Attached traceback automatically from anaconda.

Comment 11 Matt Joiner 2009-12-07 10:45:40 UTC
I suspect one of my disks has some marker or flag indicating RAID/DM, as I previously attempted this under Gentoo. If anyone can indicate to me how to check and remove such flags, and/or disable dmraid during installation, that would be great.

Comment 12 Matt Joiner 2009-12-07 10:48:22 UTC
[liveuser@localhost ~]$ liveinst --nodmraid
umount: /media/*: not found
21:46:56 Starting graphical installation.
Loading /lib/kbd/keymaps/i386/qwerty/us.map.gz
ERROR: isw: wrong number of devices in RAID set "isw_djchbgbjib_Volume0" [1/2] on /dev/sda

Comment 13 Hans de Goede 2009-12-07 10:55:59 UTC
Matt, see:
https://fedoraproject.org/wiki/Common_F11_bugs#dmraid-nodmraid

Comment 14 Matt Joiner 2009-12-07 13:35:21 UTC
Hans, this worked great. Thanks very much.

Comment 15 Dave Wysochanski 2009-12-08 18:27:56 UTC
Can this be closed or duped?

Comment 16 Hans de Goede 2009-12-09 09:14:06 UTC
(In reply to comment #15)
> Can this be closed or duped?  

No, but the summary desperately needs updating; doing that now.

The issue is (quoting from comment #8):

"Changing component to dmraid for now, as the RAID 5 support really should go
upstream, or libdmraid should somehow tell us this ain't going to work, rather
than giving us a device map which the kernel does not grok."

Note when I say "RAID 5 support really should go upstream" I'm talking about the
devicemapper raid kernel code.

Comment 17 Heinz Mauelshagen 2009-12-09 17:02:48 UTC
The whole upstream raid consolidation politics is against including dm-raid45 I'm afraid. Nevertheless we're still trying to get it upstream.

I wonder when we'll start having only one filesystem...

Folks, please speak up and weigh in!

Comment 18 Hans de Goede 2009-12-09 18:42:05 UTC
Hmm,

Will the dmraid45 stuff be included in the RHEL-6 kernel ? Also how about adding it as a patch to the Fedora kernel (if Dave Jones agrees) ?

Comment 19 Heinz Mauelshagen 2009-12-10 12:04:57 UTC
Yes, it'll be part of the RHEL-6 kernel.
Sure, I'd like to see it added to Fedora.

Comment 20 Terje Røsten 2009-12-12 16:04:21 UTC
Created attachment 377885 [details]
Attached traceback automatically from anaconda.

Comment 21 Xiaotian Sun 2010-01-10 20:47:00 UTC
Created attachment 382874 [details]
Attached traceback automatically from anaconda.

Comment 22 JOhn Westerdale 2010-01-12 05:31:06 UTC
Created attachment 383178 [details]
Attached traceback automatically from anaconda.

Comment 23 Hans de Goede 2010-01-15 07:30:35 UTC
*** Bug 554602 has been marked as a duplicate of this bug. ***

Comment 24 Miked 2010-02-07 23:32:26 UTC
Created attachment 389441 [details]
Attached traceback automatically from anaconda.

Comment 25 JOhn Westerdale 2010-02-08 06:12:40 UTC
Should the OS install process consider any "hidden" configuration data on the back of drives, and present it to the Installing Admin?  Perhaps a simple alert that says "ignore the dmraid structures found on disk"? would side step this issue?

Comment 26 Hans de Goede 2010-02-08 08:33:02 UTC
(In reply to comment #25)
> Should the OS install process consider any "hidden" configuration data on the
> back of drives

Yes it should, as people expect BIOS RAID to just work, and it does, except for this small RAID5 snafu.

> and present it to the Installing Admin?  Perhaps a simple alert
> that says "ignore the dmraid structures found on disk"? would side step this
> issue?    

This assumes the person doing the install knows what he is doing. If there is a real RAID5 set there, with part being used by windows, accessing the drives directly will be disastrous.

For people who know what they are doing we have the nodmraid cmdline option to get the installer to just see the raw disks. Note that this is not the preferred solution though: if you still have BIOS RAID metadata you should clear it, see:
https://fedoraproject.org/wiki/Common_F11_bugs#dmraid-nodmraid

Comment 27 Miked 2010-02-08 10:56:00 UTC
Additional information about my system.

Currently running Fedora 10.  This problem does not exist in this version.
Nvidia BIOS RAID is disabled and has been since upgrading from Windows Server when Fedora 9 was installed.
System has 6 SATA drives. 
4 500GB drives with several partitions; each partition set across the drives forms a Linux software RAID 5 array, apart from mirrors for the boot and swap partitions. The RAID 5 arrays are then managed under LVM2.
2 1TB drives with several partitions; each partition set across the drives is mirrored with Linux software RAID. These are then managed under LVM2, apart from one partition for /usr.

As I am not using BIOS RAID, why do I have this problem, which also occurred while attempting to upgrade to Fedora 11? The disks were all completely repartitioned when Fedora 9 was installed.

Comment 28 Hans de Goede 2010-02-08 11:59:51 UTC
Miked,

Re-partitioning does not remove the BIOS RAID metadata, which lives at the end of the disks. When you stopped using BIOS RAID you should have marked the disks as not being part of a BIOS RAID set in the RAID BIOS before disabling it. As you did not do that, they still contain the metadata identifying them as a RAID set.
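
The same reasoning applies to the `dd ... count=100` wipe mentioned in comment 3: zeroing the start of a disk does not reach metadata stored at the end. A small Python sketch of the arithmetic follows, assuming dd's default 512-byte block size. The 500 GB disk size and the 1 MiB "tail" are arbitrary illustrative figures; the real location and size of the metadata depend on the BIOS RAID format, and the function names are hypothetical.

```python
SECTOR = 512  # dd's default block size in bytes


def wiped_bytes(count, block_size=SECTOR):
    """Bytes zeroed from the start of the disk by `dd count=<count>`."""
    return count * block_size


def tail_untouched(disk_bytes, count, tail_bytes, block_size=SECTOR):
    """True if the wipe stops before the last `tail_bytes` of the disk."""
    return wiped_bytes(count, block_size) <= disk_bytes - tail_bytes


disk = 500 * 10**9   # illustrative 500 GB disk
tail = 1024 * 1024   # illustrative 1 MiB region at the end of the disk
print(wiped_bytes(100))                # bytes cleared at the start
print(tail_untouched(disk, 100, tail)) # whether the end of the disk survives
```

With these figures the wipe covers about 50 KiB at the front of the disk, nowhere near the end, which is consistent with the metadata surviving both repartitioning and a short dd.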

In F-10 and before we used to ignore BIOS RAID metadata we did not understand, which is a very bad thing to do, as that can damage the RAID SET, causing data loss. So now we ignore drives which have BIOS RAID metadata we do not understand.

As explained in the link I gave you:
https://fedoraproject.org/wiki/Common_F11_bugs#dmraid-nodmraid

You can use dmraid -rE /dev/sd? to remove the BIOS RAID metadata from the disks.

Comment 29 Miked 2010-02-09 15:42:45 UTC
Thank you for the explanation, metadata removed fine.  Just to be sure what I was doing, I used dmraid -r /dev/sd? to start with and confirmed that the only data to be removed was that from the obsolete BIOS RAID before adding the E.   The install now finds the disks and can continue.

I seem to recall that on the original upgrade from Windows Server I had hoped that I could have continued to use the nvidia RAID. When it was obvious that this would not be possible I simply changed over to Linux software RAID, without thinking that the old RAID would need to be broken up in the BIOS rather than just disabled.
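
The inspect-then-erase sequence from comments 28 and 29 can be sketched as a dry-run plan in Python. Only the `dmraid -r` and `dmraid -rE` invocations come from this report; the helper name `dmraid_cleanup_commands` and the device names are hypothetical, and the sketch deliberately prints the commands rather than running them, since erasing metadata from a disk that belongs to a live RAID set would be destructive.

```python
import shlex


def dmraid_cleanup_commands(devices):
    """Return (inspect_cmd, erase_cmd) pairs per device; nothing is executed."""
    plan = []
    for dev in devices:
        inspect_cmd = ["dmraid", "-r", dev]   # list the RAID metadata found on dev
        erase_cmd = ["dmraid", "-rE", dev]    # erase that metadata from dev
        plan.append((inspect_cmd, erase_cmd))
    return plan


# Hypothetical device names; substitute the disks that carry stale metadata.
for inspect_cmd, erase_cmd in dmraid_cleanup_commands(["/dev/sdb", "/dev/sdc"]):
    print("inspect:", shlex.join(inspect_cmd))
    print("erase:  ", shlex.join(erase_cmd))
```

Reviewing the output of the inspect step before running the erase step mirrors what was done above: confirm that only the obsolete BIOS RAID data is listed, then add the E.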

Comment 30 Hans de Goede 2010-07-01 07:57:51 UTC
*** Bug 603418 has been marked as a duplicate of this bug. ***

Comment 31 Hans de Goede 2010-07-01 07:59:09 UTC
*** Bug 218192 has been marked as a duplicate of this bug. ***

Comment 32 Bryan Mason 2010-07-14 20:56:03 UTC
any plans to include a fix in Fedora 14 or sooner?  I am crossing my fingers...

Comment 33 Heinz Mauelshagen 2010-07-15 05:43:25 UTC
We are working on a dm-target utilizing the MD RAID 4/5/6 personalities in order to get that upstream, which will solve the problem. Crossing my fingers that it'll make it upstream ASAP...

Comment 34 Heinz Mauelshagen 2010-10-22 09:36:03 UTC
Update:

The MD side patches are already upstream.

The missing DM side ones are being reviewed by LVM Team A right now aiming to upstream them ASAP.

Once they go via mainline into rawhide etc., dmraid will be able to utilize the offered RAID5 kernel functionality out of the box.

Comment 35 Bug Zapper 2010-11-04 06:40:45 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 36 Bug Zapper 2010-12-04 03:27:03 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 37 Chris Lumens 2011-02-06 23:52:16 UTC
*** Bug 675583 has been marked as a duplicate of this bug. ***

Comment 38 Chris Lumens 2011-10-11 13:30:09 UTC
*** Bug 744045 has been marked as a duplicate of this bug. ***

