Bug 1178941

Summary: Sometimes a storage device cannot be read if two storage devices point to the same backing file (simulating multipath with SATA)
Product: [Fedora] Fedora
Reporter: Fabian Deutsch <fdeutsch>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 21
CC: agk, amit.shah, berrange, bmarzins, cfergeau, crobinso, cshao, dfediuck, dwmw2, ecohen, fdeutsch, gklein, heinzm, huiwa, iheim, itamar, jbrassow, leiwang, lilu, lsurette, msnitzer, pbonzini, prajnoha, rjones, scottt.tw, virt-maint, yaniwang, yanwang, ycui
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-05 05:08:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (Description / Flags):
virt-manager definition for vm with mpathed disk / none
Parts of dracut log from a failed boot / none
complete log of a failed attempt / none

Description Fabian Deutsch 2015-01-05 17:15:52 UTC
Description of problem:
After installation, RHEV-H cannot boot from a multipathed (mpath) disk whose WWID is given on the kernel command line.

Version-Release number of selected component (if applicable):
 rhev-hypervisor7-7.0-20150105.0.1

How reproducible:
Always

Steps to Reproduce:
1. Install RHEV-H in a VM with a multipathed SATA disk (two SATA disks pointing to the same image file, no caching, same serial)
2. Reboot
3. 

Actual results:
Dracut fails to boot

Expected results:
RHEV-H boots

Additional info:

Comment 2 Fabian Deutsch 2015-01-05 20:10:54 UTC
Exact steps to reproduce:

1. Create a VM with multipathed SATA bus or use a real mpath host
2. Install RHEV-H from comment 1 onto the mpathed disk
3. Reboot

Comment 3 Fabian Deutsch 2015-01-05 20:12:12 UTC
A fix for this bug is necessary for RHEV 3.5 RC.

Comment 4 Fabian Deutsch 2015-01-05 20:25:46 UTC
Created attachment 976576 [details]
virt-manager definition for vm with mpathed disk

The XML definition can be used to create the VM needed to reproduce this bug.
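
The attached definition is not reproduced in this report. As a rough sketch of the setup from the description (two SATA disks pointing to the same image file, no caching, same serial), the following Python snippet builds two libvirt <disk> elements. The image path, the serial string, and the helper name make_sata_disk are hypothetical, and the attached definition may well differ:

```python
import xml.etree.ElementTree as ET


def make_sata_disk(image_path, target_dev, serial, fmt="qcow2"):
    """Build one libvirt <disk> element for a SATA disk without host caching."""
    disk = ET.Element("disk", {"type": "file", "device": "disk"})
    # cache="none" mirrors the "no caching" part of the reproduction steps.
    ET.SubElement(disk, "driver", {"name": "qemu", "type": fmt, "cache": "none"})
    ET.SubElement(disk, "source", {"file": image_path})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "sata"})
    # Both disks get the same serial so the guest sees them as one device.
    ET.SubElement(disk, "serial").text = serial
    return disk


# Hypothetical values, not taken from the attachment.
IMAGE = "/var/lib/libvirt/images/rhevh-mpath.qcow2"
SERIAL = "mpath-test-0001"

disks = [make_sata_disk(IMAGE, dev, SERIAL) for dev in ("sda", "sdb")]
for d in disks:
    print(ET.tostring(d, encoding="unicode"))
```

The printed elements would go into the <devices> section of a domain definition; whether a given libvirt or qemu version accepts two writable disks on one image without further options is not something this sketch checks.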

Comment 5 Fabian Deutsch 2015-01-06 14:55:33 UTC
Created attachment 976832 [details]
Parts of dracut log from a failed boot

This bug seems to be hard to reproduce when rd.debug is set, which suggests that it might be a race condition (the debug output changes the time deltas between calls).

Anyhow, the attachment should illustrate what is happening: dracut is caught in a loop trying to find the filesystem labeled Root, but it never appears, because the partitions that contain the filesystem are not discovered (they are not seen in the logs).

Comment 6 Fabian Deutsch 2015-01-06 15:18:08 UTC
Created attachment 976837 [details]
complete log of a failed attempt

This logfile is a complete log from a failed attempt

Comment 7 Fabian Deutsch 2015-01-06 17:06:17 UTC
Okay, the issue seems to be in qemu:

In rare cases, the partition table cannot be read from one of the two devices pointing to the same backing file.
Booting then fails because multipath chooses the device that cannot be read as the active device.
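
One way to see which of the two devices is affected might be to read sector 0 from each path and look for the 0x55AA boot signature that both an MBR and a GPT protective MBR carry. The following is a minimal sketch under that assumption; the function names are hypothetical, and the demo uses temporary files in place of the real path devices:

```python
import os
import tempfile

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of an MBR or protective MBR


def has_partition_table(path):
    """Return True if sector 0 of `path` is readable and ends with 0x55AA."""
    try:
        with open(path, "rb") as dev:
            sector = dev.read(SECTOR_SIZE)
    except OSError:
        return False
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def check_paths(paths):
    """Map each path device (or image file) to its check result."""
    return {p: has_partition_table(p) for p in paths}


# Demo on two temporary files standing in for the two path devices.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "path_a.img")
    bad = os.path.join(tmp, "path_b.img")
    with open(good, "wb") as f:
        f.write(b"\x00" * 510 + BOOT_SIGNATURE)
    with open(bad, "wb") as f:
        f.write(b"\x00" * SECTOR_SIZE)
    results = check_paths([good, bad])
    print([os.path.basename(p) for p, ok in results.items() if ok])
```

In a guest, the paths would be the two block devices behind the multipath map (for example /dev/sda and /dev/sdb, which is an assumption about device naming). The read goes through the guest page cache, so a cached sector could hide the problem.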

This happens with the SATA bus, switching to IDE seems to solve it (after the first few tries).

Comment 8 Fabian Deutsch 2015-01-06 19:02:36 UTC
Switching to IDE did not solve the issue, but switching to raw instead of qcow2 looks promising for now.
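
The report does not say how the raw image was obtained. If an existing qcow2 image were converted rather than recreated, the qemu-img call could be assembled roughly as below; the file names and the helper name are hypothetical, and the command is only printed, not executed:

```python
import shlex


def qemu_img_convert_cmd(src, dst, src_fmt="qcow2", dst_fmt="raw"):
    """Build the argv for converting a disk image with qemu-img."""
    return ["qemu-img", "convert", "-f", src_fmt, "-O", dst_fmt, src, dst]


# Hypothetical file names.
cmd = qemu_img_convert_cmd("rhevh-mpath.qcow2", "rhevh-mpath.raw")
print(shlex.join(cmd))
```

After a conversion, the disk definitions of the VM would also need their driver type changed from qcow2 to raw.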

Comment 9 Fedora End Of Life 2015-11-04 11:12:28 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 21 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Cole Robinson 2015-11-04 23:37:39 UTC
Fabian, can you still reproduce this with Fedora 23 qemu?

Comment 11 Fabian Deutsch 2015-11-05 05:08:37 UTC
Actually, I can no longer reproduce it even on Fedora 22.