Bug 1124450

Summary: Wrong default multipath configuration for EL6
Product: [Retired] oVirt
Reporter: Patrick Hurrelmann <redhat>
Component: vdsm
Assignee: Nir Soffer <nsoffer>
Status: CLOSED CURRENTRELEASE
QA Contact: Gil Klein <gklein>
Severity: urgent
Docs Contact:
Priority: high
Version: 3.4
CC: amureini, bazulay, bugs, ecohen, gklein, iheim, mgoldboi, nsoffer, rbalakri, redhat, s.kieske, yeylon
Target Milestone: ---
Keywords: EasyFix
Target Release: 3.5.1
Flags: redhat: needinfo-
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version: ovirt-3.5.1_rc1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-01-21 16:05:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1108711
Attachments:
Description Flags
Proposed patch none

Description Patrick Hurrelmann 2014-07-29 14:08:09 UTC
Created attachment 922160
Proposed patch

Description of problem:

Systems configured to boot from SAN via iSCSI fail to boot after vdsm changes the multipath.conf originally created by anaconda. The next kernel update embeds the faulty multipath.conf into the newly generated initrd, and the boot then fails because multipath is not able to enumerate the multipath disks.

Version-Release number of selected component (if applicable):

Linux test-host.example.com 2.6.32-431.20.5.el6.x86_64 #1 SMP Fri Jul 25 08:34:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Installed Packages
vdsm.x86_64    4.14.9-0.el6

How reproducible:
Always

Steps to Reproduce:
1. Install system with root on SAN (iSCSI multipath)
2. Install vdsm (vdsm overwrites multipath.conf created by anaconda)
3. Update kernel (faulty multipath.conf is embedded into initrd)

Actual results:
Boot fails, multipath cannot enumerate the disks

Expected results:
Boot succeeds, multipath disks can be enumerated

Additional info:

It all boils down to the default value set for option getuid_callout
in multipath.conf. Vdsm sets it to the following:

getuid_callout  "/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/%n"

According to the man page and other documentation, the default value on
EL6 is "/lib/udev/scsi_id --whitelisted --device=/dev/%n". Although
/sbin/scsi_id is a valid symlink to /lib/udev/scsi_id, the link itself
(/sbin/scsi_id) is _not_ included in the generated initrd. The binary
/lib/udev/scsi_id is included, and changing the default config to use
/lib/udev/scsi_id instead makes it all work again. After regenerating
the initrd (necessary because multipath.conf is embedded in it), iSCSI
boot works again and the previously logged device-mapper errors are
gone, too.
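
The path change described above can be sketched as a small text rewrite. This is a minimal illustration and not the attached patch: the helper name `fix_getuid_callout` and the sample configuration are assumptions made for the example; only the two scsi_id paths and the callout options come from this report.

```python
import re

# Paths taken from the report: the symlink that is missing from the initrd
# and the real binary that is included in it.
SYMLINK_PATH = "/sbin/scsi_id"
BINARY_PATH = "/lib/udev/scsi_id"

# Matches a getuid_callout line whose command starts with the symlink path.
_CALLOUT_RE = re.compile(
    r'^(\s*getuid_callout\s+")' + re.escape(SYMLINK_PATH) + r'(?=[\s"])',
    re.MULTILINE,
)


def fix_getuid_callout(conf_text):
    """Return conf_text with getuid_callout pointing at the real binary.

    Hypothetical helper for illustration; all other settings are untouched.
    """
    return _CALLOUT_RE.sub(lambda m: m.group(1) + BINARY_PATH, conf_text)


# Sample input (assumed layout of a defaults section).
sample = '''defaults {
    getuid_callout  "/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/%n"
}
'''

fixed = fix_getuid_callout(sample)
print(fixed)
```

Because multipath.conf is embedded in the initrd, a rewrite like this only takes effect at boot once the initrd has been regenerated.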

Comment 1 Allon Mureinik 2014-08-19 14:16:46 UTC
Patrick - thanks for the patch!

Nir, Patrick - who's submitting this to gerrit?

Comment 2 Nir Soffer 2014-08-19 15:16:50 UTC
I will post Patrick's patch to gerrit.

Comment 3 Nir Soffer 2014-11-02 20:32:39 UTC
We need more than this patch - we also need to upgrade existing multipath.conf files created by vdsm versions without this fix.

So the final fix will have to be:
1. Fix the path to scsi_id
2. Bump multipath configuration version, so multipath.conf will be
   upgraded with the correct path when upgrading vdsm.
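
The version-bump idea in step 2 can be illustrated with a small sketch. It assumes, purely for illustration, that the generated multipath.conf carries a revision tag on its first line and that an upgrade rewrites any file whose tag is older than the current one. The tag strings and the function name `needs_upgrade` are hypothetical and are not taken from the vdsm source.

```python
# Hypothetical revision tags; the real tag format used by vdsm may differ.
CURRENT_TAG = "# REVISION 1.1"
OLD_TAGS = ("# REVISION 1.0",)


def needs_upgrade(conf_text):
    """Return True if the file was generated by an older revision.

    In this sketch, files with an unknown first line are treated as
    user-managed and are left alone.
    """
    lines = conf_text.splitlines()
    first_line = lines[0].strip() if lines else ""
    return first_line in OLD_TAGS


# A file written by an older revision, still using the symlink path.
old_conf = (
    "# REVISION 1.0\n"
    "defaults {\n"
    '    getuid_callout "/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/%n"\n'
    "}\n"
)

# The same file after an upgrade: new tag and corrected scsi_id path.
new_conf = old_conf.replace("# REVISION 1.0", CURRENT_TAG).replace(
    "/sbin/scsi_id", "/lib/udev/scsi_id"
)

print(needs_upgrade(old_conf))
print(needs_upgrade(new_conf))
```

Bumping the tag together with the path fix means hosts that already received the faulty file are detected and rewritten on upgrade, rather than only new installations getting the correct path.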

Comment 4 Nir Soffer 2014-11-03 11:58:00 UTC
Related to bug 1108711

Comment 5 Nir Soffer 2014-11-05 17:08:58 UTC
Patrick, please check your patch in gerrit:
http://gerrit.ovirt.org/34839

Comment 6 Patrick Hurrelmann 2014-11-24 12:27:01 UTC
Sorry for the delay. The patch looks fine and it is already merged.
Thanks, Nir, for committing.

Comment 7 Nir Soffer 2014-12-21 22:58:20 UTC
Allon, any reason that this should not go into 3.5.1?

Comment 8 Allon Mureinik 2014-12-22 08:01:02 UTC
It's already been delayed THREE times (https://www.ovirt.org/OVirt_3.5.z_Release_Management) - at this point we want to decrease uncertainty, not increase it.

Is there a super-pressing need for this fix in 3.5.1?

Comment 9 Nir Soffer 2014-12-22 13:14:11 UTC
(In reply to Allon Mureinik from comment #8)
> It's already been delayed THREE times
> (https://www.ovirt.org/OVirt_3.5.z_Release_Management) - at this point we
> want to decrease uncertainty, not increase it.
> 
> Is there a super-pressing need for this fix in 3.5.1?

I don't see any urgency.

Comment 10 Sandro Bonazzola 2015-01-15 14:25:34 UTC
This is an automated message: 
This bug should be fixed in oVirt 3.5.1 RC1, moving to QA

Comment 11 Sandro Bonazzola 2015-01-21 16:05:27 UTC
oVirt 3.5.1 has been released. If problems still persist, please make note of it in this bug report.