Bug 445476

Summary: Unable to create a PV guest with > 3 disks
Product: Red Hat Enterprise Linux 5
Component: xen
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Target Milestone: rc
Keywords: ZStream
Reporter: Daniel Berrangé <berrange>
Assignee: Daniel Berrangé <berrange>
QA Contact: Virtualization Bugs <virt-bugs>
CC: gozen, mmatsuya, vsd-xen-devel, xen-maint
Doc Type: Bug Fix
Last Closed: 2009-01-20 21:11:44 UTC
Bug Blocks: 445789

Description Daniel Berrangé 2008-05-07 02:15:24 UTC
Description of problem:
I have a guest with 4 disks configured

disk = [ "file:/var/lib/xen/images/rhel5pv.img,xvda,w",
"tap:aio:/var/lib/xen/images/demo-disk-b.img,xvdb,w",
"tap:aio:/var/lib/xen/images/demo-disk-c.img,xvdc,w",
"tap:aio:/var/lib/xen/images/demo-disk-c.img,xvdd,w"]

And it often fails to start up, giving a hotplug error

# virsh start rhel5pv
libvir: Xen Daemon error : POST operation failed: (xend.err 'Device 51760 (tap)
could not be connected. xenstore-read backend/tap/3/51904/params failed.')
error: Failed to start domain rhel5pv
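
The device numbers in the message can be decoded into guest device names. A minimal sketch, assuming the classic Xen numbering for xvd devices (block major 202, 16 minors per disk); the function name decode_xvd is made up for illustration and is not part of any Xen tool:

```python
XVD_MAJOR = 202        # assumption: block major number used for xvd devices
MINORS_PER_DISK = 16   # assumption: each xvd disk owns 16 minors (disk + partitions)


def decode_xvd(devnum):
    """Turn a virtual device number such as 51760 into a name such as 'xvdd'."""
    major, minor = devnum >> 8, devnum & 0xFF
    if major != XVD_MAJOR:
        raise ValueError("not an xvd device number: %d" % devnum)
    disk, partition = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, so no suffix is added.
    return name + str(partition) if partition else name


for devnum in (51760, 51904):
    print(devnum, decode_xvd(devnum))
```

Under that assumption 51760 is xvdd, the fourth disk in the config above, while 51904 decodes to xvdm, which is not one of the four disks configured for this guest.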

If I drop it down to only 3 disks, it'll start fairly reliably. If I increase it
to have 8 disks, it'll never start. 
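
For anyone reproducing this with different disk counts, the disk list can be generated rather than typed by hand. This is only a sketch: the helper names are hypothetical, and the demo-disk-<letter>.img file names are an assumed naming pattern (the images must already exist on the host).

```python
import string


def make_disk_list(count, image_dir="/var/lib/xen/images", root_image="rhel5pv.img"):
    """Build a Xen 'disk' list with one file: root disk and count-1 tap:aio disks."""
    if not 1 <= count <= 16:
        # Stay within single-letter names xvda..xvdp.
        raise ValueError("count must be between 1 and 16")
    disks = ["file:%s/%s,xvda,w" % (image_dir, root_image)]
    for letter in string.ascii_lowercase[1:count]:
        # Hypothetical image name pattern: demo-disk-b.img, demo-disk-c.img, ...
        disks.append("tap:aio:%s/demo-disk-%s.img,xvd%s,w" % (image_dir, letter, letter))
    return disks


def format_disk_setting(disks):
    """Render the list as a 'disk = [...]' line for a guest config file."""
    return "disk = [ " + ",\n".join('"%s"' % d for d in disks) + " ]"


print(format_disk_setting(make_disk_list(8)))
```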


Version-Release number of selected component (if applicable):
xen-3.0.3-61.el5

How reproducible:
Always

Steps to Reproduce:
1. Configure a guest with 10 disks
2. Start it
  
Actual results:
Fails to boot

Expected results:
Boots

Additional info:

Removing the following code from /etc/xen/scripts/blktap

if [ "$mode" != '!' ]
then
    result=$(check_blktap_sharing "$file" "$mode")
    [ "$result" = 'ok' ] || ebusy "$file already in use by other domain"
fi


fixes the problem, allowing the guest to start with as many as 16 disks.

This code was added in RHEL-5.2 in the patch xen-blktap-sharing.patch for bug 223259

IMHO, the patch needs to be reverted - it causes a serious regression from
RHEL-5.1, for minimal gain.

Comment 4 Daniel Berrangé 2008-05-08 13:41:50 UTC
Fix built for 5.3 in

$ brew latest-pkg dist-5E-qu-candidate xen
Build                                     Tag                   Built by
----------------------------------------  --------------------  ----------------
xen-3.0.3-65.el5                          dist-5E-qu-candidate  berrange

* Thu May  8 2008 Daniel P. Berrange <berrange> - 3.0.3-65.el5
- Remove blktap sharing patch which prevents guests with large
  numbers of disks booting (rhbz #445476)



Please clone this bug for 5.2.z

Comment 7 Fujitsu Xen Team 2008-05-16 10:25:08 UTC
Hi Berrange-san,

We tested starting a domain with more than 4 disks with the blktap patch applied.
We saw no problem.

We do not think the patch is related to your test result.
We would like to check for a race condition involving another domain's disk and
test again.

Thank you.                                      Fujitsu) Nishi



Comment 8 Daniel Berrangé 2008-05-16 15:43:07 UTC
The problem is a race condition that depends on many factors. On some machines
more than 3 disks causes a problem; on other machines it only affects guests
with more than 8 disks. When it hits, the errors are clearly coming from the
blktap sharing patch. The problem is that the sharing check is trying to read
info about other disks out of xenstore while XenD is still writing the entries
into xenstore.
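
The shape of the race can be illustrated with a small simulation. This is not the blktap script and not the fix that was shipped (the fix removed the sharing patch); it only shows why a single read against a store that is still being populated fails intermittently, while a reader that polls eventually sees the entry. All names here are hypothetical, and the key and value are placeholders.

```python
import time


class KeyNotReady(Exception):
    """Raised when a key has not been written to the store yet."""


def make_slow_store(entries, ready_after):
    """Return a reader that fails `ready_after` times per key before succeeding,
    imitating a writer that has not finished populating the store."""
    calls = {}

    def read_key(key):
        calls[key] = calls.get(key, 0) + 1
        if calls[key] <= ready_after:
            raise KeyNotReady(key)
        return entries[key]

    return read_key


def read_with_retry(read_key, key, attempts=5, delay=0.01):
    """Call read_key(key) until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return read_key(key)
        except KeyNotReady:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay)


# Placeholder key and value, for illustration only.
reader = make_slow_store({"backend/tap/3/51904/params": "placeholder"}, ready_after=2)
print(read_with_retry(reader, "backend/tap/3/51904/params"))
```

A single call to the simulated reader fails here, which matches the pattern of the failure becoming more likely as more disks are being set up at the same time.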

Comment 13 errata-xmlrpc 2009-01-20 21:11:44 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0118.html