Bug 475024
Summary: | duplicate label message in install with hardware defined software RAID
---|---
Product: | [Fedora] Fedora
Reporter: | Ray Todd Stevens <raytodd>
Component: | anaconda
Assignee: | Hans de Goede <hdegoede>
Status: | CLOSED DUPLICATE
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high
Priority: | low
Version: | 10
CC: | bloch, bobgus, hdegoede, raina
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2009-02-13 08:10:08 UTC
Description
Ray Todd Stevens
2008-12-06 19:53:26 UTC
By the way, this is the same server on which I also reported bug 446845.

Yes, it sounds to me like we are seeing your two hard drives as separate drives with separate filesystems, instead of all together as one RAID device. Can you attach /tmp/anaconda.log and /tmp/syslog to this bug report?

I would assume that you are quite correct, although I will say that this machine was generated under 8 and upgraded to 9 with no problems. So something somewhere has changed. Exactly where would I find these files? The system never mounted any volumes that I can find, so exactly what do I need to do here?

OK, figured it out. Here is my setup right now:

/dev/mapper/VolGroup00-LogVol00 470762416 357141132 89322176 80% /
/dev/mapper/ddf1_4035305a8680c3272020202020202020eb4e47603a354a45p1 194442 175257 9146 96% /boot
tmpfs 1036808 48 1036760 1% /dev/shm

Under fc9 I can get a logic partition table under sda and sdb, but it works. The raid control is symbios.

Created attachment 326400 [details]
anaconda log from run
There should be a simpler way of doing this.
Created attachment 326401 [details]
yum log
You didn't say you needed this, but figured it didn't hurt.
Created attachment 326402 [details]
the syslog from the failed run
Yes, there should be a simpler way of doing this. That's one of the things I'm working on for F11.

A couple of thoughts. I had to manually, through ifconfig, enable the network and then do an scp. Just having a simple way to get the network up would be a big help. Also, how about the simple ability to use a flash drive? Maybe if the flash drive is connected and has a text file called dumpit in the root with a list of files, then anaconda at its termination, regardless of what that is, simply dumps those files to the flash drive.

Yep, those are all things I'm hoping to take care of with a little script you can run on tty2. The idea is to run a script that'll collect all the information for you, bring up the network if needed, and save to an existing bug, make a new bug, or save to a local disk. Of course, this plan could completely change during the course of working on it. But that's what I'm hoping to do.

OK, I think I can see the problem in the log. Both sda and sdb are coming up. Now this appears to also happen in my fc9 regular boot, but the boot then is somehow smart enough to only try to mount the raid volume. Wonder what is different?

Interestinger and interestinger. Played with this a little more. Found three identical partition tables: sda, sdb, and dm-0. The raid system seems to use a software based system called dmraid. Now as I said, this worked with fc9, but not with fc10.

OK, I have three machines all with the same problem. All are using the dmraid stuff. This is almost definitely where the problem is.

In reading through the bugzilla bug reports, it appears that I am by far not the only one experiencing this error. It appears that something has changed in the new fc10 OS: if a drive is part of a software raid system that is not directly the fedora software raid system, it is now scanned and mounted along with the regular raid device. This did not used to occur. Could this be something unfixed by fixing one of my previous bugs?
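The "dumpit" flash-drive idea suggested earlier in this comment could be sketched roughly as follows. This is only an illustration of the proposal, not anything anaconda actually does: the function name `dump_listed_files`, the one-path-per-line file format, and the `#` comment convention are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def dump_listed_files(mount_point, list_name="dumpit"):
    """Copy every file named in <mount_point>/<list_name> onto that drive.

    Hypothetical format: one absolute path per line, blank lines and
    lines starting with '#' are ignored. Returns the source paths that
    were actually copied; missing files are skipped so that one bad
    entry does not stop the rest of the dump.
    """
    mount = Path(mount_point)
    list_file = mount / list_name
    if not list_file.is_file():
        return []  # no request file on this drive, nothing to do

    copied = []
    for line in list_file.read_text().splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        source = Path(entry)
        if source.is_file():
            # Files land in the root of the drive under their base name.
            shutil.copy2(source, mount / source.name)
            copied.append(str(source))
    return copied
```

With a drive mounted at some path, a dumpit file listing /tmp/anaconda.log and /tmp/syslog would cause both logs to be copied to the root of that drive, which is the behaviour the comment asks for.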
I was thinking back: in fc8 and fc9 I had a problem that I filed a bug report on, where I had some drives that had been part of a dmraid configuration and were now being used in a machine that didn't support dmraid. However, they still apparently had some markings on them somewhere, probably in the boot sector, and there was no way to install on them without putting "nodmraid" in the boot lines of the install. I did find out that I no longer need this parameter to do the install. Could this change be what is now causing this problem?

OK, interesting: from what I can tell, one of the systems that is experiencing this problem doesn't have volume labels. It appears that it might be a conflict between lvm and dmraid (or other hardware defined software raid systems).

It seems to come back to the fact that the dmraid drive is found, but so are all of its components. All of them seem to be loaded and then the problem starts. Somehow, being part of a hardware defined software raid system should exclude a drive from then being loaded as a regular drive. This is the way it appears to work in previous versions, but now it appears that these drives are loaded and treated as their own separate drives, which seems to be the problem.

It appears that just ignoring this message will not work, as it also appears that in all probability an update would then wipe the system out by destroying the raid connection and only loading on the first drive in the array.

PS: this is if you are using raid 1, which is not the default. You default everything to raid 0, which seems to me to be a less than wise choice. Then if one drive fails, both are useless. About 4 times the failure risk.

Well, I have found some more out. First, this may or may not be a dmraid driver problem. The dm-0 drive is loaded first, before the sd drives. When I do an fdisk on it I get a corrupted partition table.
In the logs it looks like it should have loaded, but as I said the partition table is basically garbage. Then the sda and sdb drives are loaded. I wonder: if dm-0 had valid data (none of the other dm drives, dm-1 through dm-9, exist), would the sd drives in fact load?

Also, as an aside, I did try to easily pull the logs and about anything else I could think of off the system and send them again, but I tried this with a flash drive. This flash drive loads fine on a fc9 machine in normal run mode, but instead of loading as /media/drive it loaded as /dev/sdc and nothing else, and also had a corrupted, but differently corrupted, partition table. More and more interesting.

One more note from my past experiences. In looking through my system logs I found what may be the entire key here. I had a system that I needed to do several things to. I was going to upgrade it to fc10 and then add some drives and redo the volume structure a little. However, I ran into this same problem. I decided to do the whole thing as one try. Not good. Even as a scratch install with totally new drives, the dmraid thing would not let me install. I just decided to forget the whole dmraid thing and go with linux software raid. This worked fine and so it is up, and I didn't file a report on it. But there seems to be something very broken with the dmraid stuff. From a quick look through the bugs here, it also appears that other hardware defined software raid systems are having problems with fc10. This may be where to look for the problem. I also have a set of drives with dmraid and fc9 ready to upgrade to fc10 that are set aside and not in use, so I can test something if need be.

I've just seen this on a Dell Precision 390. I had been unable to upgrade to F9 owing to a problem with HAL and the hard drives on there. There's a RAID1 array created using Intel ISW. Anaconda seems to be seeing both the array and one of the members when claiming that there are multiple '/boot' labels.
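The symptom described above (the array and its member disks all being scanned, so the same label turns up more than once) can be illustrated with a small sketch. This is not anaconda's code; the function names are hypothetical, and the partition device names in the sample data are assumed for illustration, loosely modelled on an isw RAID 1 pair.

```python
from collections import defaultdict


def find_duplicate_labels(labels_by_device):
    """Return {label: [devices]} for every label seen on more than one device."""
    devices_by_label = defaultdict(list)
    for device, label in labels_by_device.items():
        devices_by_label[label].append(device)
    return {
        label: sorted(devices)
        for label, devices in devices_by_label.items()
        if len(devices) > 1
    }


def drop_raid_members(labels_by_device, members_by_array):
    """Drop partitions that live on a disk belonging to a RAID set.

    Simplification: a partition is matched to its disk by name prefix,
    e.g. /dev/sda1 belongs to /dev/sda.
    """
    member_disks = {disk for disks in members_by_array.values() for disk in disks}
    return {
        device: label
        for device, label in labels_by_device.items()
        if not any(device.startswith(disk) for disk in member_disks)
    }


# Assumed sample data: a mirrored pair plus the assembled array.
seen = {
    "/dev/sda1": "/boot",
    "/dev/sdb1": "/boot",
    "/dev/mapper/isw_bedfhddfij_ARRAYp1": "/boot",
}
arrays = {"/dev/mapper/isw_bedfhddfij_ARRAY": ["/dev/sda", "/dev/sdb"]}

duplicates_before = find_duplicate_labels(seen)
duplicates_after = find_duplicate_labels(drop_raid_members(seen, arrays))
```

In this sample, `duplicates_before` reports `/boot` on three devices, while `duplicates_after` is empty once the member disks are excluded, which matches the behaviour the reporter expected from earlier releases.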
I just had an offline discussion with Adam about his situation, and I think it is relevant, so I am going to post the important part of this discussion back here with additional comments.
On Mon, Jan 12, 2009 at 08:43:47AM -0500, Ray Todd Stevens wrote:
> I am curious. You say upgrading TO FC9. I am having this problem
> upgrading FC9 to FC10. Would you be upgrading to FC9 or FC10?
>
> Also I found one interesting thing was that if I booted to rescue mode
> and looked at the partition tables of the drives both drives still
> existed, and had the same valid partition table, but that the raid drive
> existed, claimed to have a partition table, but had garbage for that
> partition table. What is your experience with this?
Hello
The box was originally running Fedora 7 and had been stuck there because
of a separate Anaconda problem, namely that it crashed when parsing the
hard disk names. That has finally been fixed in the F10 installer.
Anyway, I yum upgraded to F9 and then tried preupgrade to get to F10,
which is when I saw the message about duplicate /boot labels.
On this box, /dev/sda, /dev/sdb (the two component drives) and
/dev/mapper/isw_bedfhddfij_ARRAY all have valid partition tables.
Interesting. Yeah, I had this same fc7 problem on several boxes, including one that is experiencing this problem. I have tried the preupgrade thing and the full upgrade from all three types of disks. All with the same problem.
You will also note that I had the same problem with a scratch install. This appears to be a problem related to the RAID system and is pretty well embedded.
Incidentally it will properly install as a RAID 0 system, and if you install FC 9 on RAID 0 and then upgrade this works fine. Now I do notice that for the full software raid system everything defaults to RAID 0. I am not sure why one would actually use RAID 0 without any additional protection, but that is the default here, and I wonder if RAID 0 is the only thing normally tested.
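To put rough numbers on the RAID 0 versus RAID 1 concern, here is a minimal sketch. It assumes each drive fails independently with the same probability `p` over some period, which is a simplification, and the value 0.05 is purely illustrative rather than a measured failure rate.

```python
def raid0_loss_probability(p, drives=2):
    """Striping: the data is lost if any one drive fails."""
    return 1 - (1 - p) ** drives


def raid1_loss_probability(p, drives=2):
    """Mirroring: the data is lost only if every drive fails."""
    return p ** drives


# Illustrative per-drive failure probability (an assumption, not data).
p = 0.05
striped = raid0_loss_probability(p)
mirrored = raid1_loss_probability(p)
```

Under these assumptions a two-drive stripe is roughly twice as likely to lose data as a single drive, while a two-drive mirror is far less likely to, which is why defaulting to RAID 0 looks like the riskier choice.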
I also note a number of other reports of this same set of symptoms in bugzilla, but attributed to other parts of the system and/or other sets of hardware. It might be good for someone to condense them into a single report, find out which part of the system is causing the problem, and assign it there.
Another possible clue. I have noticed that FC10 has some kind of a quirk where the system that identifies disk drives doesn't seem to either complete or communicate with anaconda properly. There are a number of bug reports out there on this one too. Check the info on Bug #474399.

There seems to be some duplication.

There are issues in F-10 with anaconda not seeing dmraid setups as a raid set but rather as 2 separate disks; that *might* be what is happening here. I've managed to reproduce and, I believe, fix this using a system with isw raid. I've provided updates.img files for this here:
http://people.atrpms.net/~hdegoede/updates474399-i386.img
http://people.atrpms.net/~hdegoede/updates474399-x86_64.img

To use this with an i386 install using isw "hardware" raid, type the following at the installer boot screen (press <tab> to get to the cmdline editor):
updates=http://people.atrpms.net/~hdegoede/updates474399-i386.img

For an x86_64 install use:
updates=http://people.atrpms.net/~hdegoede/updates474399-x86_64.img

Please let me know if this resolves the issue for you.

Hello
Though my hardware is adaptec AAR-1220SA (dmraid ddf), I have installed F10 without problem by your updates.img file.
Thanks

(In reply to comment #23)
> There are issues in F-10 with anaconda not seeing dmraid setups as a raid set
> but rather as 2 separate disks, that *might* be what is happening here. I've
> managed to reproduce and I believe fix this using a system with isw raid.
> I've provided updates.img files for this here;
> http://people.atrpms.net/~hdegoede/updates474399-i386.img
> http://people.atrpms.net/~hdegoede/updates474399-x86_64.img
>
> To use this with an i386 install using isw "hardware" raid type the following
> at the installer bootscreen (press <tab> to get to the cmdline editor):
> updates=http://people.atrpms.net/~hdegoede/updates474399-i386.img
>
> For an x86_64 install use:
> updates=http://people.atrpms.net/~hdegoede/updates474399-x86_64.img
>
> Please let me know if this resolves the issue for you.

(In reply to comment #24)
> Hello
>
> Though my hardware is adaptec AAR-1220SA(dmraid ddf),
> I have installed F10 without problem by your updates.img file.
>
> Thanks

The Adaptec AAR-1220SA looks like a nice board. I am working with the RAID on my Asus P5K-E motherboard. According to the User Manual, it is "Intel Matrix Storage Technology through the onboard Intel ICH9R RAID controller".

I concur. I am the original reporter, and it seems to fix the dmraid problem too. It might be good to try to find out how many of the other raid bugs this fixes.

With some 'hardware' RAID setups (see Bug #474399 and my system, Comment #25), the problem still remains. The latest commentary on Bug #474399 seems to indicate that if the dmraid code is compiled into the kernel, some hardware raid (mine) does not work. If the dmraid code is compiled as loadable modules, then hardware raid does work. At the moment, the kernel used with FC10 apparently has the dmraid code compiled directly into the kernel.
----
I have upgraded from fc8 to fc9, but not yet fc10.

Interesting. I have dmraid installed and booting under fc10 with the patch above. I am running raid 1 on two 500 mb SATA drives, if that helps.

(In reply to comment #28)
> Interesting I have dmraid installed and booting under fc10 under the patch
> above. I am running raid 1 on two 500 mb drives SATA. If that helps.

1) Are you running hardware or software RAID? (software works..)
2) If hardware, what kind of hardware? (some hardware does not work..)

I am running dmraid, which is a form of raid that is defined in the settings in the BIOS, but is then fully executed in the drivers in the OS. As far as I can tell, there is absolutely no hardware support on the board for the raid other than the ability to do these BIOS settings. Full software raid has been running fine (the md0 stuff). From what I can tell, much of the full hardware raid stuff is also working. It is this hybrid that seems to be a problem, and actually seems to be a very bad design idea. I have been moving away from it.

Closing this per Comment #26.

*** This bug has been marked as a duplicate of bug 474399 ***