Bug 1053505

Summary: Kernel panic when reinstalling to a device with multiple paths
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-node
Version: 2.1.0
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Keywords: Regression
Whiteboard: node
Reporter: Guohua Ouyang <gouyang>
Assignee: Ryan Barry <rbarry>
QA Contact: Virtualization Bugs <virt-bugs>
CC: aberezin, aburden, bsarathy, cshao, dfediuck, fdeutsch, fdinitto, gklein, gouyang, hadong, huiwa, iheim, jboggs, juwu, leiwang, ovirt-maint, rbalakri, rbarry, virt-bugs, yaniwang, ycui
Doc Type: Known Issue
Doc Text:
Previously, reinstalling the Hypervisor to a device with multipath devices resulted in a kernel panic on some hardware devices. Now, disks selected for wiping are checked to see if they are listed more than once and the Hypervisor is reinstalled as expected.
oVirt Team: Node
Type: Bug
Bug Blocks: 885987
Last Closed: 2015-02-11 20:50:24 UTC

Description Guohua Ouyang 2014-01-15 10:48:12 UTC
Created attachment 850437: kernel panic

Description of problem:
Reinstalling RHEV-H on an ibm-x3650m3 with RAID-5 runs into a kernel panic.
On screen, it reports:
"Found duplicate PV ceMGFGzzzAZS..HhW: using /dev/sdc4 not /dev/sdb4"
"dracut: FATAL: LV '/dev/sdc4' is a member of VG 'HostVG' and must be included in $storage_init"
"dracut: FATAL: Not all member PVs of HostVG are given in the storage_init parameter, exiting"

The workaround is to add "storage_init=/dev/sdb,/dev/sdc" to the kernel command line.
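
To build that storage_init line you need every disk backing HostVG. A minimal sketch of how to collect that list, assuming standard LVM tooling is available (device names taken from this report):

# List the PVs LVM associates with HostVG. With duplicate PVs, LVM
# typically keeps only one path, and the "Found duplicate PV ... using
# /dev/sdc4 not /dev/sdb4" warning names the other.
pvs --noheadings -o pv_name,vg_name | awk '$2 == "HostVG" {print $1}'
# Here both /dev/sdb and /dev/sdc back HostVG, so strip the partition
# suffix and join the parent disks with commas:
#   storage_init=/dev/sdb,/dev/sdc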

[root@ibm-x3650m3-01 admin]# multipath -ll
3600605b0042656f017e364b30fb88f64 dm-0 IBM,ServeRAID M5015
size=408G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:2:0:0 sda 8:0  active ready running
36006016002213200006e15b43d38e211 dm-6 DGC,RAID 5
size=100G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 7:0:0:0 sdb 8:16 active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 7:0:1:0 sdc 8:32 active ready running

[root@ibm-x3650m3-01 admin]# lspci | grep -i raid
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)

However, reinstalling rhevh-6.4 over rhevh-6.4 is okay, and reinstalling rhevh-6.4 over rhevh-6.5 is okay.

Version-Release number of selected component (if applicable):
rhevh-6.5-20140110.1.iso

How reproducible:
always

Steps to Reproduce:
1. Install rhevh-6.5-20140110.1.
2. Try to reinstall it.

Actual results:
Kernel panic during reinstall, with the dracut errors quoted above.

Expected results:
Reinstall completes without a kernel panic.

Comment 19 Fabian Deutsch 2014-02-27 16:29:44 UTC
Could it be that the IO errors appear because the multipath device is not used?

Comment 22 Fabian Deutsch 2014-06-12 11:15:41 UTC
Any update here?

Otherwise I will consider closing this as not a bug due to missing data.

Comment 23 Ryan Barry 2014-06-12 14:25:51 UTC
Fabian -

I'm not able to reproduce this since I don't have the correct environment, but I have a tentative fix for this that I need to resubmit to get an image built for testing...

The problem occurs in our dracut cleanup scripts when multipathed devices show up as multiple disparate devices with the same volume name.

I'll get a build out for testing.
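
For reference, a minimal sketch of the kind of check the Doc Text above describes (this is not the actual ovirt-node patch): before wiping, deduplicate the selected disks by PV UUID so that two paths to the same multipathed LUN (sdb/sdc here) are only handled once. The script name and argument convention are assumptions for illustration.

#!/bin/bash
# dedupe-disks.sh -- hypothetical illustration, not the shipped fix.
# Deduplicate a list of disks before wiping: two paths to the same
# multipathed LUN carry the same LVM PV UUID, so they collapse to one
# entry. Devices without a PV label fall back to their own name.
declare -A seen
deduped=()
for dev in "$@"; do
    # pvs prints the PV UUID if the device carries an LVM label.
    uuid=$(pvs --noheadings -o pv_uuid "$dev" 2>/dev/null | tr -d ' ')
    key=${uuid:-$dev}
    if [[ -z ${seen[$key]} ]]; then
        seen[$key]=1
        deduped+=("$dev")
    fi
done
printf 'unique disk: %s\n' "${deduped[@]}"

On the hardware in this report, "./dedupe-disks.sh /dev/sdb4 /dev/sdc4" would print only /dev/sdb4, since both paths carry the same PV UUID.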

Comment 37 Guohua Ouyang 2014-12-15 05:58:54 UTC
Verified on the 6.6-20141119.0 build.

Steps:
1. Install an old build, 6.5-20140430.0.
2. Reinstall the 6.5-20140430.0 build; the kernel panic described in this bug appears.
3. Reboot and reinstall the 6.6-20141119.0 build; no kernel panic happens and it boots into the TUI installer page.

Comment 39 errata-xmlrpc 2015-02-11 20:50:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0160.html