Bug 626964 - snapshot11 boot from external sas array but root device is not in multipath listing
Status: CLOSED DUPLICATE of bug 636668
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.0
Hardware: All Linux
Priority: low, Severity: high
Target Milestone: rc
Target Release: 6.1
Assigned To: Ben Marzinski
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 564512 563334 580566
Reported: 2010-08-24 14:45 EDT by Anthony Cheung
Modified: 2015-09-16 19:46 EDT
Doc Type: Bug Fix
Last Closed: 2010-11-14 23:48:25 EST

Attachments
several system files and commands output (20.00 KB, application/x-tar), 2010-08-24 14:45 EDT, Anthony Cheung
host messages file (85.84 KB, text/plain), 2010-08-27 11:22 EDT, Anthony Cheung
lsscsi output (313 bytes, text/plain), 2010-08-27 11:22 EDT, Anthony Cheung
original created during OS installation (13.37 MB, application/octet-stream), 2010-08-27 17:33 EDT, Anthony Cheung
new one created afterward (13.21 MB, application/octet-stream), 2010-08-27 17:35 EDT, Anthony Cheung
new files from snapshot13 (20.00 KB, application/x-tar), 2010-09-21 11:12 EDT, Anthony Cheung
patch to make multipath try to assemble the maps with debugging on in the initramfs (430 bytes, patch), 2010-09-22 12:36 EDT, Ben Marzinski

Description Anthony Cheung 2010-08-24 14:45:28 EDT
Created attachment 440728 [details]
several system files and commands output

Description of problem:
The root device is on an external SAS array with multiple paths available. The OS installed and booted successfully, but multipath -ll does not show the root device as managed by multipathd.


Version-Release number of selected component (if applicable):
Snapshot 11

How reproducible:
Always

Steps to Reproduce:
1. Present one LUN from the external SAS array. During install, select the LUN from the Multipath Device tab.
2. After installation completes and the server boots, multipath -ll shows nothing.
  
Actual results:


Expected results:


Additional info:
/lib/udev/rules.d/40-multipath is still missing OPTIONS+="link_priority=5".

However, adding that option to 40-multipath does not fix the problem.
Comment 2 Tom Coughlan 2010-08-27 09:02:56 EDT
The diskby.txt seems to show that sda was configured, but sdb (the second path) was not configured. 

Please post /var/log/messages showing the start-up messages.

If there is no sdb, you could try something like:

echo "- - -" > /sys/class/scsi_host/host<h>/scan

to see if it gets configured on a second try, after the system is up. Post any associated messages for that as well. 

Ben, please take a look at the config files and see if they are correct.
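
The rescan suggested above can be sketched for every SCSI host at once rather than a single host<h>. This is a minimal sketch, assuming Python is available on the host and the script runs as root; the function name rescan_scsi_hosts and its sysfs_root parameter are hypothetical and exist only for illustration.

```python
from pathlib import Path


def rescan_scsi_hosts(sysfs_root="/sys/class/scsi_host"):
    """Write the wildcard "- - -" (channel, target, LUN) to each host's scan file.

    Returns the names of the hosts that were asked to rescan.
    sysfs_root is a parameter only so the function can be tried on a scratch directory.
    """
    rescanned = []
    for host in sorted(Path(sysfs_root).glob("host*")):
        scan_file = host / "scan"
        if scan_file.exists():
            # Same effect as: echo "- - -" > /sys/class/scsi_host/host<h>/scan
            scan_file.write_text("- - -\n")
            rescanned.append(host.name)
    return rescanned
```

Calling rescan_scsi_hosts() with no arguments targets the real sysfs tree, so any messages the rescan produces would still need to be collected from /var/log/messages afterward.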
Comment 3 Anthony Cheung 2010-08-27 11:21:35 EDT
sdb is configured on the host (see the new lsscsi output attachment). I think the /dev/disk subdirectories show only one instance of each device/partition, symlinked to just one path of the device. But diskby.txt also shows nothing with a dm- name. In contrast, on a working host, one would see something like this:
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-name-mpatha -> ../../dm-0
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-name-mpathap1 -> ../../dm-1
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-name-mpathap2 -> ../../dm-2
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-name-mpathap3 -> ../../dm-3
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-uuid-mpath-36001438002a56fd60000800000180000 -> ../../dm-0
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-uuid-part1-mpath-36001438002a56fd60000800000180000 -> ../../dm-1
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-uuid-part2-mpath-36001438002a56fd60000800000180000 -> ../../dm-2
lrwxrwxrwx 1 root root 10 Aug 19 11:17 dm-uuid-part3-mpath-36001438002a56fd60000800000180000 -> ../../dm-3
lrwxrwxrwx 1 root root  9 Aug 19 11:17 scsi-36001438002a56fd60000800000180000 -> ../../sdb
lrwxrwxrwx 1 root root  9 Aug 19 11:17 wwn-0x6001438002a56fd60000800000180000 -> ../../sdb

I am also attaching the message file.
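
The check described above (looking for dm-name and dm-uuid links next to the scsi- and wwn- links) could be automated roughly as follows. This is a sketch only: the default directory /dev/disk/by-id is an assumption about where the listing came from, and the function name multipath_links is hypothetical.

```python
import os
from pathlib import Path


def multipath_links(by_id_dir="/dev/disk/by-id"):
    """Return {link name: target basename} for device-mapper multipath symlinks.

    A link counts if it is named dm-name-*, or dm-uuid-* with "mpath-" in the name,
    which matches the entries in the working-host listing above.
    """
    links = {}
    for entry in sorted(Path(by_id_dir).iterdir()):
        name = entry.name
        is_dm_name = name.startswith("dm-name-")
        is_dm_mpath_uuid = name.startswith("dm-uuid-") and "mpath-" in name
        if entry.is_symlink() and (is_dm_name or is_dm_mpath_uuid):
            links[name] = os.path.basename(os.readlink(str(entry)))
    return links
```

An empty result on the affected host would correspond to what diskby.txt shows: the paths exist, but no multipath map was built on top of them.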
Comment 4 Anthony Cheung 2010-08-27 11:22:04 EDT
Created attachment 441542 [details]
host messages file
Comment 5 Anthony Cheung 2010-08-27 11:22:33 EDT
Created attachment 441543 [details]
lsscsi output
Comment 6 Ben Marzinski 2010-08-27 12:49:29 EDT
Looking at the attached multipath.conf, you are blacklisting every device except the ones with the wwid of 3600c0ff000f0043958fbf64901000000 (sda and sdb), so those are the only devices that could be multipathed. Looking at the multipath output, it appears that multipath does think that these devices are valid. However, it can't create the multipath device because sda is already in use (as the root device). Whatever problem you are seeing, it's happening in the initramfs, so what is in /etc/multipath.conf is irrelevant. multipath needs to create the multipath device before the root filesystem is mounted. Could you attach your initramfs-2.6.32-63.el6.x86_64.img?
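
The "already in use" condition described above can be illustrated by scanning the contents of /proc/mounts for a path device or one of its partitions. This is a rough sketch, not how multipath itself decides: it assumes mounts are listed by /dev/<disk><partition number> names (the root filesystem may instead appear under another name), and the function name mounts_using is hypothetical.

```python
import re


def mounts_using(disk, mounts_text):
    """Return (device, mount point) pairs for a disk or its numbered partitions.

    disk is a bare kernel name such as "sda"; mounts_text is the text of /proc/mounts.
    """
    pattern = re.compile(r"^/dev/%s\d*$" % re.escape(disk))
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 is the mount point.
        if len(fields) >= 2 and pattern.match(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits
```

On a running system the text would come from reading /proc/mounts; any hit for sda or sdb means that path is mounted directly instead of through a /dev/mapper device.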
Comment 7 Anthony Cheung 2010-08-27 17:33:36 EDT
Created attachment 441616 [details]
original created during OS installation

Argh, I attached these earlier but to the wrong bug! Here they are.
Comment 8 Anthony Cheung 2010-08-27 17:35:29 EDT
Created attachment 441617 [details]
new one created afterward

As an experiment, this new one was created after the multipath{} section of multipath.conf was commented out.
Comment 9 Anthony Cheung 2010-08-27 19:58:46 EDT
Well, I tried another server with a similar setup, and that one is working fine. It installed, booted, and has the root device under multipath management as expected.

Let me know if you find anything obviously wrong in the initramfs. Otherwise, I can try installing again on the original server.
Comment 10 Anthony Cheung 2010-09-10 13:46:08 EDT
Hmm, somehow there has been no response to the needinfo?

I went ahead and tried snapshot13. I have 2 servers, and I am installing both to the same external SAS array. Each server has just 1 LUN presented to it, with 2 available paths. I am pretty sure I ran the install the same way on both servers. At the end, 1 server (snowy) came up as expected: multipathd running, boot LUN showing in the multipath -ll listing, root directory mounted from /dev/mapper/mpathbp1. But the other server (jersey) is still not coming up right. As before, the boot LUN does not have a multipath mapping. The only difference I can see is that jersey is using a 3Gb LSI SAS adapter (driver mptsas) and snowy is using a 6Gb LSI SAS adapter (driver mpt2sas).

Let me know what info I can provide.
Comment 11 Paul Hinchman 2010-09-17 17:09:16 EDT
Please respond; the product launch depends on a resolution to this.
Comment 12 Ben Marzinski 2010-09-20 13:43:49 EDT
The best guess I have is that multipath might not be building the device because something else has the file open already, either temporarily or permanently.

The only other possible issue I see is that multipath.conf sets the alias to be mpathb.  The mpathX names are reserved for user_friendly_names.  If that alias is already taken by a device with a default user_friendly_name by the time that multipath tries to build the device with the hardcoded alias, it will fail.
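
The alias concern above can be turned into a quick check on a configured alias. This is a sketch under a stated assumption: that user_friendly_names are the prefix mpath followed only by lowercase letters, as in the mpatha and mpathb names seen in this bug. The function name looks_like_friendly_name is hypothetical.

```python
import re


def looks_like_friendly_name(alias, prefix="mpath"):
    """Return True if alias has the shape of an automatically assigned name.

    Assumed shape: the prefix followed by one or more lowercase letters.
    """
    return re.fullmatch(re.escape(prefix) + r"[a-z]+", alias) is not None
```

An alias for which this returns True is one that could collide with an automatically assigned name; an alias such as bootlun would not.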
Comment 13 Anthony Cheung 2010-09-21 11:12:46 EDT
Created attachment 448721 [details]
new files from snapshot13

Ben, the attachments you looked at were from snapshot11. Here are the new ones; multipath.conf is the original (no changes at all), taken right after the snapshot13 install. There is only 1 device from the array presented to the system, the boot LUN, so no other device should be using the mpathb name.
Comment 14 Ben Marzinski 2010-09-22 12:29:45 EDT
I see the same thing here. multipath is all set up to make the device. However, it can't because the device is apparently in use. If you give me the output of /var/log/messages from when you run this command, I can verify that. But since the root device is obviously mounted, it's pretty clear that the device is already in use. Looking at the initramfs stuff you sent me, multipath should be getting called before the code to mount the root filesystem. If there aren't other devices being mounted, then my best guess is that something is temporarily keeping the device busy.

My best idea for debugging this would be to edit the initramfs to print out debug messages.
Comment 15 Ben Marzinski 2010-09-22 12:36:10 EDT
Created attachment 448981 [details]
patch to make multipath try to assemble the maps with debugging on in the initramfs

This patches pre-trigger/02multipathd.sh in the initramfs to make multipath run with the verbosity turned up before it starts multipathd.  If you manually remake your initramfs with this patch added, you should get multipath to print out what's going on to the console during boot.

Can you please reboot with the patch and send me the console messages from during bootup. Hopefully, this will tell me why multipath isn't building the path in the initramfs.
Comment 16 Ben Marzinski 2010-09-22 13:00:41 EDT
If you have problems remaking the initrd, let me know, and I can either give you some instructions, or if you post the initrd you want patched, I can do it for you.
Comment 17 Anthony Cheung 2010-09-22 13:55:03 EDT
Ben, I should use the patch to change /usr/share/dracut/modules.d/90multipath/multipathd.sh then make a new initramfs, correct?
Comment 18 Ben Marzinski 2010-09-22 15:50:16 EDT
Yeah, that should work.
Comment 19 Beth Zeranski 2010-09-22 16:08:35 EDT
Anthony,
Can you try this in RC?

thanks,
 Beth
Comment 20 Ben Marzinski 2010-11-14 23:04:53 EST
Is there a reason that this bug can't just be closed as a duplicate of 636668? Was it supposed to be used for something else?
Comment 21 Beth Zeranski 2010-11-14 23:48:25 EST
Dup'ing works for me. As Ben recommended, dup'ing to 636668.

thanks,
 Beth

*** This bug has been marked as a duplicate of bug 636668 ***
Comment 22 Anthony Cheung 2010-11-15 11:07:28 EST
This bug was where 636668 started. Later on, Beth marked it as a duplicate of 636668, which has limited group/viewing access.