Bug 842776

Summary: Can't install from multi-pathed RAID
Product: Red Hat Enterprise Linux 6
Reporter: stephen.paavola
Component: anaconda
Assignee: David Lehman <dlehman>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Release Test Team <release-test-team-automation>
Severity: medium
Docs Contact:
Priority: low
Version: 6.2
CC: atodorov, sbueno, stephen.paavola
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-03-13 20:03:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description stephen.paavola 2012-07-24 15:04:20 UTC
Description of problem:

I've got a multi-pathed RAID with an EXT3 partition that I use for installing onto my system disk. On the CentOS 5 series, this worked if I used sdb1 (one of the multi-pathed devices) as the --partition argument to harddrive. The multi-pathing didn't get set up until the reboot to the installed system.
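
For reference, the relevant kickstart lines look roughly like this (the install-tree directory is illustrative, not my exact config):

install
harddrive --partition=sdb1 --dir=/centos/install-tree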

With 6.2 I've got multiple challenges:

I can use "sdb1" to get the stage2 image loaded. However, the first thing Anaconda does when processing the kickstart file is configure and start the DM devices, at which point sdb1 is no longer available.

I've tried a number of variations of kernel boot parameters and kickstart options. I can usually get the stage2 file found, but the farthest I get ends with the following error being reported:

"The installation source given by device ['sdb1'] could not be found."

I've also tried substituting "/dev/sdb1", "dm-1", and "/dev/dm-1" for sdb1, and I've seen the same error for all of them.

Looking at the log files, Anaconda correctly identifies the multi-paths, correctly identifies the file systems on my 2 partitions, and is able to mount the EXT3 partition. So far I've failed to get it to actually do the install unless I physically remove one SAS cable from the drive. This isn't an acceptable long-term solution. The system will be embedded, and the SAS cables and DVD won't be easily accessible. My preferred process is to modify the GRUB configuration on the system disk and reboot the system to do the install/reinstall.

Version-Release number of selected component (if applicable):


How reproducible: very


Steps to Reproduce:
1. Put the 6.2 install directory on the RAID, including the .iso files and the images directory
2. Put the kickstart configuration file in the install directory
3. Add ks=hd:sdb1:path-to-cfg to GRUB's kernel parameters (see the example stanza below)
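
A GRUB stanza along these lines is what step 3 means (the kernel/initrd paths and the ks file name are placeholders, not my actual setup):

title Reinstall 6.2
        root (hd0,0)
        kernel /install/vmlinuz ks=hd:sdb1:/ks.cfg
        initrd /install/initrd.img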
  
Actual results:

Error message above

Expected results:

6.2 installation

Additional info:

Comment 2 stephen.paavola 2012-08-06 19:01:21 UTC
I've continued looking at this problem, trying to find a workaround. One thing I tried was setting the --partition in the harddrive statement to "UUID=". I noticed 3 things from anaconda.log:

1) Anaconda is ignoring my stage2= argument to the kernel boot command in favor of the partition/path in the harddrive line. A surprise, but not a problem.

2) Anaconda finds the stage2 file using the UUID= argument. It will also find it as /dev/sdb1, and variations of /dev/disk/...

3) Anaconda isn't able to deal with UUID= to find the ISOs in my repository, or any other "partition" spec I've tried that is actually valid once multipath is enabled to the RAID. It seems to only understand /dev/sd*, and that doesn't work.

Also note that this is Anaconda 13.21.117, not the 17.x version shipped with current Fedora. I haven't figured out how to build/use the 17.x version on CentOS 6.2.
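
For concreteness, the UUID form of the harddrive line mentioned above looks like this (the UUID value and directory are placeholders):

harddrive --partition=UUID=0123abcd-4567-89ef-0123-456789abcdef --dir=/centos/install-tree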

Comment 3 stephen.paavola 2012-08-15 20:49:12 UTC
I've been able to get my install working by modifying Anaconda. The issues were:

When the installer first starts, the RAID has 2 block devices for the RAID partition, either of which will work. When Anaconda enables multipath, a 3rd block device gets created, but the original 2 can no longer be mounted.

The problems with the original code, and the resolution:

1) I'm using UUID= to specify the partition I want to install from. udev_resolve_devspec in storage/udev.py returns the first block device that matches the UUID, not necessarily one that can be mounted. I added code that looks first in /dev/disk/by-uuid for the UUID before scanning all possible block devices. /dev/disk/by-uuid appears to always link to a block device that works. The scan could/should probably be deleted.

2) DeviceTree.populate in storage/devicetree.py doesn't mark the install partition as protected. This results in Anaconda later complaining that it couldn't find the repository. My fix was to add a function much like udev_resolve_devspec that returns ALL devices that match the devspec, not just the first one found. I added code to DeviceTree.populate, after it has enabled multipath, that adds all of them to protectedDevNames.

Here are my patches - I hope someone gets them into a release and completes the testing. I've only tested with the UUID= form.



diff -rc /mnt/ins/usr/lib/anaconda/storage/devicetree.py storage/devicetree.py
*** /mnt/ins/usr/lib/anaconda/storage/devicetree.py	2012-07-02 08:07:46.000000000 +0000
--- storage/devicetree.py	2012-08-09 18:46:42.000000000 +0000
***************
*** 2066,2071 ****
--- 2066,2072 ----
          self.populated = False
  
          # resolve the protected device specs to device names
+         log.debug("DeviceTree.populate: my protectedDevSpecs %s" % (self.protectedDevSpecs))
          for spec in self.protectedDevSpecs:
              name = udev_resolve_devspec(spec)
              if name:
***************
*** 2128,2133 ****
--- 2129,2140 ----
          with open("/etc/multipath.conf", "w+") as mpath_cfg:
              mpath_cfg.write(cfg)
  
+         # resolve the protected device specs to device names again;
+         # multipath may have created new devices
+         for spec in self.protectedDevSpecs:
+             names = udev_resolve_all_devspec(spec)
+             self.protectedDevNames.extend(names)
+ 
          # Now, loop and scan for devices that have appeared since the two above
          # blocks or since previous iterations.
          while True:
diff -rc /mnt/ins/usr/lib/anaconda/storage/udev.py storage/udev.py
*** /mnt/ins/usr/lib/anaconda/storage/udev.py	2012-07-02 08:07:46.000000000 +0000
--- storage/udev.py	2012-08-09 18:48:04.000000000 +0000
***************
*** 34,39 ****
--- 34,44 ----
      if not devspec:
          return None
  
+     if devspec.startswith("UUID="):
+         ret = os.path.realpath("/dev/disk/by-uuid/%s" % devspec[5:])
+         if os.path.exists(ret):  # realpath returns a path even when the symlink is missing
+             return ret[5:]  # strip the leading "/dev/" to get a device name
+ 
      import devices as _devices
      ret = None
      for dev in udev_get_block_devices():
***************
*** 62,67 ****
--- 67,102 ----
      if ret:
          return udev_device_get_name(ret)
  
+ def udev_resolve_all_devspec(devspec):
+     ret = []
+     if not devspec:
+         return ret
+ 
+     import devices as _devices
+     for dev in udev_get_block_devices():
+         if devspec.startswith("LABEL="):
+             if udev_device_get_label(dev) == devspec[6:]:
+                 ret.append(udev_device_get_name(dev))
+ 
+         elif devspec.startswith("UUID="):
+             if udev_device_get_uuid(dev) == devspec[5:]:
+                 ret.append(udev_device_get_name(dev))
+ 
+         elif udev_device_get_name(dev) == _devices.devicePathToName(devspec):
+             ret.append(udev_device_get_name(dev))
+ 
+         else:
+             spec = devspec
+             if not spec.startswith("/dev/"):
+                 spec = os.path.normpath("/dev/" + spec)
+ 
+             for link in dev["symlinks"]:
+                 if spec == link:
+                     ret.append(udev_device_get_name(dev))
+ 
+     del _devices
+     return ret
+ 
  def udev_resolve_glob(glob):
      import fnmatch
      ret = []

Comment 5 RHEL Program Management 2013-05-06 21:13:44 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 6 David Lehman 2013-08-01 19:10:30 UTC
Are you using multipath or raid or both? If raid, what type and level?

All these questions and others could be answered if you provided /tmp/storage.log from the installation environment, taken after any failure has occurred.

Comment 7 Samantha N. Bueno 2014-01-13 21:18:59 UTC
(In reply to David Lehman from comment #6)
> Are you using multipath or raid or both? If raid, what type and level?
> 
> All these questions and others could be answered if you provided
> /tmp/storage.log from the installation environment, taken after any failure
> has occurred.

Echoing Dave's request for info; please provide us with logs so we can debug this further or confirm that this is not present in 6.5.

Comment 9 Chris Lumens 2014-03-13 20:03:01 UTC
Closing due to lack of details asked for in comment 6 and comment 7.  Feel free to reopen if you can provide the information requested.  Thanks.

Comment 10 stephen.paavola 2019-10-29 13:34:42 UTC
The system I was using to demonstrate this problem has been decommissioned, so this bug should remain closed.