Bug 676329 - mpathconf should be able to help setup multipathing in early boot.
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-multipath
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Ben Marzinski
QA Contact: yanfu,wang
Keywords: FutureFeature
Depends On:
Blocks: 756082
Reported: 2011-02-09 09:51 EST by Steve Dickson
Modified: 2015-06-08 19:20 EDT
CC List: 15 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-06-08 19:20:00 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
A picture of the error (1.84 MB, image/jpeg)
2011-02-09 09:52 EST, Steve Dickson
The change log diff between -86 and -90 (17.99 KB, application/octet-stream)
2011-02-09 13:29 EST, Steve Dickson

Description Steve Dickson 2011-02-09 09:51:23 EST
Description of problem:
RHEL 6.1 kernels later than 2.6.32-86.el6 fail to boot with the
following error:

fsck.ext3: Device or resource busy while trying to open /dev/sda1
Filesystem mounted or opened exclusively by another program?

The error does not happen on RHEL5, F14 or RHEL6.0 kernels.

Version-Release number of selected component (if applicable):


How reproducible:
every time

Steps to Reproduce:
1. Reboot with a kernel later than 2.6.32-86.el6
Comment 1 Steve Dickson 2011-02-09 09:52:22 EST
Created attachment 477834
A picture of the error
Comment 7 Ben Marzinski 2011-02-09 12:49:36 EST
Unless there is a multipath device that includes /dev/sda and that you can see from the debug shell, it seems unlikely that this is multipath.  It's failing in rc.sysinit, which runs after multipathd gets shut down in the initramfs and before the multipathd service is started.  Multipath has already been run at this point, but once the process exited, all the file descriptors it had open were freed.  So the only way multipath could still have the device open would be if it created a multipath device on top of it.

Running

# dmsetup table

from the debug shell should tell you if there is a dm device that is using sda. Also, is sysfs mounted or can you mount it?

If so, does

# ls /sys/block/sda/holders

say anything?
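
A minimal sketch of the check described above, assuming sysfs is mounted and the kernel exposes the dm/name attribute for device-mapper devices. The function name list_holders and its optional second argument are hypothetical and exist only for illustration; the second argument lets the function run against a fake directory tree.

```shell
# list_holders DISK [SYSFS_ROOT]
# Print "dm-N<TAB>map-name" for every device-mapper device stacked on DISK.
# SYSFS_ROOT defaults to /sys (hypothetical parameter, for testing only).
list_holders() {
    disk=$1
    sysfs=${2:-/sys}
    for holder in "$sysfs/block/$disk/holders"/*; do
        # An empty holders directory leaves the glob unexpanded; skip it.
        [ -e "$holder" ] || continue
        dm=$(basename "$holder")
        # dm/name holds the map name that "dmsetup table" prints.
        name=$(cat "$sysfs/block/$dm/dm/name" 2>/dev/null) || name=unknown
        printf '%s\t%s\n' "$dm" "$name"
    done
}
```

Running `list_holders sda` from the debug shell prints nothing when no dm device sits on top of sda, which is the expected state if multipath is not holding the disk.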
Comment 8 Ben Marzinski 2011-02-09 13:00:11 EST
Can you run the fsck from the debug shell? That would let us know if whatever was holding sda1 is still doing it.
Comment 9 Steve Dickson 2011-02-09 13:27:44 EST
The regression happened between 2.6.32-86.el6 and 2.6.32-90.el6. I will attach the changelog diff...
Comment 10 Steve Dickson 2011-02-09 13:29:36 EST
Created attachment 477880
The change log diff between -86 and -90
Comment 11 Steve Dickson 2011-02-09 13:33:27 EST
(In reply to comment #7)
> Unless there is a multipath device which includes /dev/sda that you can see
> from the debug shell, it seems unlikely that this is multipath.  It's failing
> in rc.sysinit, which is after multipathd gets shut down in the initramfs, and
> before the multipathd service is started.  Multipath has already been run at
> this point, but once the process exited, all the file descriptors it had open
> were freed.  So the only way multipath could have the device open would be if
> it created a multipath device on top of it.
> 
> Running
> 
> # dmsetup table
> 
> from the debug shell should tell you if there is a dm device that is using sda.
> Also, is sysfs mounted or can you mount it?
> 
> If so, does
> 
> # ls /sys/block/sda/holders
> 
> say anything?

Unfortunately I cannot, because the debug shell will not let me in... every
time I type something, a new line is echoed along with the
(Type the password or Ctrl-D) prompt...
Comment 12 Steve Dickson 2011-02-09 14:03:22 EST
(In reply to comment #11)
> > Running
> > 
> > # dmsetup table
> > 
> > from the debug shell should tell you if there is a dm device that is using sda.
> > Also, is sysfs mounted or can you mount it?
> > 
> > If so, does
> > 
> > # ls /sys/block/sda/holders
> > 
> > say anything?
> 
> Unfortunately I cannot, because the debug shell will not let me in... every
> time I type something, a new line is echoed along with the
> (Type the password or Ctrl-D) prompt...
Let me correct myself... The system is just hung at this point...
Typing the root password does nothing... but typing Ctrl-D does reboot
the box...
Comment 13 Steve Dickson 2011-02-09 14:40:49 EST
I hacked up a sysinit file... Here is your information:

dmsetup table

System_OS-rhel6_var: 0 6291456 linear 253:8 4194688
1ATA     GB0750C8047                             5QKZ024P            : 0 1465149168 multipath 0 0 1 1 round-robin 0 1 1 8:48 1000 
1ATA     GB0750C8047                             5QK084QE            : 0 1465149168 multipath 0 0 1 1 round-robin 0 1 1 8:64 1000 
System_OS-rhel6_root: 0 4194304 linear 253:8 384
Home-Dirs: 0 41943040 linear 253:9 384
36001438005deb47100005000004d0000: 0 41943040 multipath 1 queue_if_no_path 0 2 1 round-robin 0 2 1 8:80 100 8:16 100 round-robin 0 2 1 8:96 100 8:112 100 
1ATA     GB0750C8047                             5QK080VV            : 0 1465149168 multipath 0 0 1 1 round-robin 0 1 1 8:32 1000 
36001438005deb47100005000004d0000p1: 0 41929587 linear 253:1 63
1ATA     GB0750C8047                             5QK085MY            p2: 0 1463046850 linear 253:3 2097215
36000393000014fbf01000000d8d7adba: 0 3907043328 multipath 0 0 1 1 round-robin 0 1 1 8:128 1000 
1ATA     GB0750C8047                             5QK085MY            p1: 0 2097152 linear 253:3 63
System_OS-f13_usr: 0 16580608 linear 253:8 31261056
1ATA     GB0750C8047                             5QK085MY            : 0 1465149168 multipath 0 0 1 1 round-robin 0 1 1 8:0 1000 
System_OS-f13_var: 0 6291456 linear 253:8 52035968
System_OS-f13_root: 0 4194304 linear 253:8 47841664
1ATA     GB0750C8047                             5QK080VV            p1: 0 1465144002 linear 253:6 63
System_OS-sys_swap: 0 4194304 linear 253:8 10486144
System_OS-rhel6_usr: 0 16580608 linear 253:8 14680448

ls -a /sys/block/sda/holders
.
..
dm-3


Should dm-3 exist?
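
For reading the table above: the multipath lines name their paths as major:minor pairs. A small sketch for decoding them, assuming the standard SCSI disk major 8 with 16 minors per disk; the helper name sd_name is hypothetical.

```shell
# sd_name MAJOR:MINOR
# Translate a major:minor pair from a "dmsetup table" line into an sd name.
# Only SCSI disk major 8 is handled (16 minors per disk, sda through sdp).
sd_name() {
    major=${1%%:*}
    minor=${1##*:}
    if [ "$major" != 8 ] || [ "$minor" -gt 255 ]; then
        echo "unsupported device number: $1" >&2
        return 1
    fi
    disk=$((minor / 16))    # which disk: 0 = sda, 1 = sdb, ...
    part=$((minor % 16))    # 0 = whole disk, otherwise partition number
    letter=$(printf 'abcdefghijklmnop' | cut -c $((disk + 1)))
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}
```

With this, `sd_name 8:0` gives sda, the path in the map whose name ends in 5QK085MY. The p1 and p2 linear maps for that name sit on 253:3, which suggests that map is the dm-3 listed under /sys/block/sda/holders.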
Comment 14 Mike Snitzer 2011-02-09 15:09:19 EST
(In reply to comment #13)
> ls -a /sys/block/sda/holders
> .
> ..
> dm-3
> 
> 
> Should dm-3 exist?

No.

I'm willing to bet this doesn't have anything to do with the kernel version and that it is an issue with a newer dracut's multipath support.

When the newer kernels were installed, a new initramfs was generated for each of them via dracut.  dracut could easily have been updated since the earlier, working kernel was installed (and that earlier kernel still has the initramfs generated by the older dracut).

To test this theory, just rebuild your -86.el6 initramfs with the system's new dracut (after first saving the old, working -86.el6 initramfs).
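
A sketch of that test, assuming the usual RHEL 6 image path /boot/initramfs-<kernel version>.img. The function name rebuild_initramfs, the DRY_RUN variable, and the .known-good suffix are hypothetical choices for illustration.

```shell
# rebuild_initramfs KVER
# Save the existing initramfs for KVER, then regenerate it with the
# currently installed dracut. With DRY_RUN=1 the commands are only printed.
rebuild_initramfs() {
    kver=$1
    img="/boot/initramfs-$kver.img"
    # run: print the command in dry-run mode, execute it otherwise.
    run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi; }
    # Keep the working image so the old boot path can be restored.
    run cp -p "$img" "$img.known-good" &&
    # -f overwrites the existing image for the given kernel version.
    run dracut -f "$img" "$kver"
}
```

For example, `DRY_RUN=1 rebuild_initramfs 2.6.32-86.el6.x86_64` shows the two commands without touching /boot; the x86_64 suffix is an assumption, since the thread does not state the architecture.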
Comment 15 Steve Dickson 2011-02-09 15:55:37 EST
Building a new initramfs with the dracut command from dracut-004 caused
the working kernel (-86) to fail the same way the new kernels do. So it appears
to be a dracut problem....
Comment 16 Ben Marzinski 2011-02-09 17:01:56 EST
Could you try rebuilding the initramfs with an older version of dracut? I'm not convinced that this is necessarily a dracut problem either.  It seems possible that someone changed their multipath.conf file, and didn't blacklist /dev/sda.

Could you please attach /etc/multipath.conf to the bugzilla.  If the problem is that /etc/multipath.conf is incorrect, then fixing that should solve the problem on all versions of dracut.

However, this does bring up an issue.  Since in this setup no devices need to be set up in early boot, dracut should not even be using the multipath module. This could probably be considered a feature rather than a bug.  I am willing to write some code that will let mpathconf determine if multipath needs to be set up in early boot and also let it spit out an edited multipath.conf file that only sets up multipathing on the early boot devices.  Then if the multipath module was specifically added or omitted, dracut could call mpathconf to see if multipathing was necessary, and it could also use the edited multipath.conf file, to keep the initramfs from starting multipathing on unnecessary devices.  How does that sound as a 6.2 idea?
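
To make the idea of an edited multipath.conf concrete, here is a rough sketch of one possible output. This is not existing mpathconf behavior: the function name early_boot_conf is hypothetical, and the sketch assumes the early boot devices are already known by WWID. It relies on the blacklist and blacklist_exceptions sections of multipath.conf.

```shell
# early_boot_conf [WWID...]
# Emit a multipath.conf that blacklists every WWID except the ones given.
# With no arguments, everything is blacklisted.
early_boot_conf() {
    printf 'blacklist {\n\twwid ".*"\n}\n'
    # No early boot devices: the blanket blacklist is the whole file.
    [ $# -gt 0 ] || return 0
    printf 'blacklist_exceptions {\n'
    for wwid in "$@"; do
        printf '\twwid "%s"\n' "$wwid"
    done
    printf '}\n'
}
```

Called with the WWID of a device that must be multipathed before the root filesystem is mounted, it prints a blanket blacklist plus one exception for that device.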
Comment 17 Steve Dickson 2011-02-10 08:52:06 EST
(In reply to comment #16)
> Could you try rebuilding the initramfs with an older version of dracut? I'm not
> convinced that this is necessarily a dracut problem either.  It seems possible
> that someone changed their multipath.conf file, and didn't blacklist /dev/sda.
> 
> Could you please attach /etc/multipath.conf to the bugzilla.  If the problem is
> that /etc/multipath.conf is incorrect, then fixing that should solve the
> problem on all versions of dracut.
The /etc/multipath.conf is empty... What should it contain and how should I populate it?
Comment 18 Steve Dickson 2011-02-10 09:59:15 EST
To answer my own question, I found a rhel5 example similar to this:

$ cat /etc/multipath.conf 
blacklist {
	devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*"
	devnode "^(hd|xvd)[a-z]*"
	device {
		vendor  "*"
		product "*"
	}
}

Then I reinstalled a -113 kernel and rebooted... The hang still occurred,
so blacklisting /dev/sda1 did not work.
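
A quick way to sanity-check a devnode pattern like the one above, assuming multipath's matching behaves like a POSIX extended regular expression (grep -E is used here as a stand-in). The helper name devnode_blacklisted is hypothetical.

```shell
# devnode_blacklisted PATTERN NAME
# Succeed if NAME matches PATTERN as a POSIX extended regular expression.
devnode_blacklisted() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Example: report which device names the first devnode pattern would cover.
# This only exercises the pattern; it says nothing about which copy of
# multipath.conf was actually read during boot.
report_matches() {
    pattern=$1
    shift
    for dev in "$@"; do
        if devnode_blacklisted "$pattern" "$dev"; then
            echo "$dev: blacklisted"
        else
            echo "$dev: not blacklisted"
        fi
    done
}
```

For instance, `report_matches '^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*' sda sda1 sdb` reports sda and sda1 as blacklisted and sdb as not blacklisted.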
Comment 19 Ben Marzinski 2011-02-10 10:16:15 EST
That's odd.  Did a multipath device using /dev/sda actually get created, like in comment #13?

If you don't want multipath to create any device, you could try:

blacklist {
    devnode ".*"
}

That's the typical way to blacklist everything, but yours should have worked as well.  It certainly blacklisted everything for me when I just tried it. If you want me to take a look at the machine, just give me the login info, and I'd be glad to.  Otherwise, can you attach the initramfs that was created when you reinstalled the -113 kernel, so I can take a look at that?
Comment 20 Steve Dickson 2011-02-10 11:00:38 EST
(In reply to comment #19)
> That's odd.  Did a multipath device using /dev/sda actually get created, like
> in comment #13?
> 
> If you don't want multipath to create any device, you could try:
> 
> blacklist {
>     devnode ".*"
> }
> 
This worked.... 

> That's the typical way to blacklist everything, but yours should have worked
> as well.  It certainly blacklists everything for me, when I just tried it. If
> you want me to take a look at the machine, just give me the login info, and I'd
> be glad to.  Otherwise, can you attach the initramfs that was created when you
> reinstalled the -113 kernel, so I can take a look at that?
If you like I'll send you the info...
Comment 21 Harald Hoyer 2011-02-14 05:08:57 EST
So, this is not a dracut problem... more like a multipath one.
Comment 23 Ben Marzinski 2011-04-28 12:38:36 EDT
I'm keeping this bug open as a feature request to make mpathconf able to determine if multipathing needs to be set up in early boot, and to output a multipath.conf file that blacklists all unnecessary devices, so only the essential ones are set up in early boot.
Comment 24 Sayan Saha 2011-05-27 14:09:39 EDT
This won't make it in RHEL 6.2. Moving this to RHEL 6.3.
Comment 28 Suzanne Yeghiayan 2012-05-18 16:45:58 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 29 Ben Marzinski 2015-06-08 19:20:00 EDT
RHEL 7 solves the related issues differently, and the solutions don't need to be backported to RHEL6, since it currently works the way it is.
