Bug 672224

Summary: Domain reboot failed when two disks attached
Product: Red Hat Enterprise Linux 5
Reporter: Qixiang Wan <qwan>
Component: xen
Assignee: Xen Maintenance List <xen-maint>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Priority: urgent
Version: 5.7
CC: leiwang, mrezanin, mshao, pbonzini, xen-maint, yuzhang, yuzhou
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: xen-3.0.3-123.el5
Doc Type: Bug Fix
Last Closed: 2011-07-21 09:14:04 UTC

Attachments:
xend log (no flags)

Description Qixiang Wan 2011-01-24 13:30:02 UTC
Created attachment 474959: xend log

Description of problem:
When a domain has two disks attached (both HVM and PV guests are affected), xend fails to recreate the domain on reboot.

Version-Release number of selected component (if applicable):
xen-3.0.3-122

How reproducible:
100%

Steps to Reproduce:
1. Create a domain with 2 disks attached (a fuller example config sketch follows these steps):
disk = ['tap:aio:/data/images/xen/xen-autotest/client/tests/xen/images/RHEL-Server-6.0-32-pv.raw,xvda,w', 'tap:aio:/data/images/xen/xen-autotest/client/tests/xen/images/2nd_disk.raw,xvdb,w']

2. Reboot the domain.
3. The domain fails to reboot.
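
A fuller example config, for completeness. Only the disk list above is taken from the report; the name, memory size, bootloader, vif, and on_reboot values below are assumptions typical of a RHEL PV guest, not the reporter's actual config:

# guest.cfg -- hypothetical minimal PV config for reproducing the reboot failure
name = "xen-autotest-pv"          # assumed guest name
memory = 2048                     # assumed; a 2 GiB guest appears later in the xend log
vcpus = 1
bootloader = "/usr/bin/pygrub"    # typical for a RHEL PV guest
disk = ['tap:aio:/data/images/xen/xen-autotest/client/tests/xen/images/RHEL-Server-6.0-32-pv.raw,xvda,w',
        'tap:aio:/data/images/xen/xen-autotest/client/tests/xen/images/2nd_disk.raw,xvdb,w']
vif = ['bridge=xenbr0']           # assumed network configuration
on_reboot = 'restart'             # the reboot is what triggers the failing domain recreate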
  
Actual results:
Domain recreate failed

Expected results:
Domain should reboot successfully

Additional info:
This is a regression, probably introduced by:
b64805ad92e53384c8070cc96bca68cd2a3b19aa
"Remove overhead when checking duplicate devices"

xend log is attached.

Comment 1 Qixiang Wan 2011-01-25 05:53:56 UTC
This also causes xm restore to fail:

[1] Restoring a guest that has just been saved fails:
$ xm save vm1 /tmp/vm1.save
$ xm restore /tmp/vm1.save
Error: Restore failed
Usage: xm restore <CheckpointFile>

Restore a domain from a saved state.

[2] Restore succeeds when an xm list command is issued before the restore:
$ xm save vm1 /tmp/vm1.save
$ xm list
$ xm restore /tmp/vm1.save    (restore succeeds)

error log in xend for scenario [1]:

Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py", line 281, in domain_restore_fd
    return XendCheckpoint.restore(self, fd, relocating=relocating)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 283, in restore
    dominfo = xd.restore_(vmconfig, relocating)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py", line 306, in restore_
    dominfo = XendDomainInfo.restore(config, relocating)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 328, in restore
    vm.createDevices()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2410, in createDevices
    self.createDevice(n, c)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1503, in createDevice
    self.device_duplicate_check(deviceClass, devconfig)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 697, in device_duplicate_check
    raise VmError('The uname "%s" is already used by another domain' %
VmError: The uname "/data/images/xen/xen-autotest/client/tests/xen/images/RHEL-Server-6.0-32-pv.raw" is already used by another domain
[2011-01-25 14:12:35 xend 26993] DEBUG (XendCheckpoint:101) Available memory: 5758 MiB, guest requires: 2048 MiB
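
As an aside, the sketch below illustrates in simplified form the kind of uname/device duplicate check the traceback points at (XendDomainInfo.device_duplicate_check). It is a hypothetical reconstruction, not the actual xend code: the data structures, the function signature, and the suggested failure mode (the restoring or rebooting domain's own, still-registered devices being treated as belonging to "another domain") are assumptions.

# Hypothetical sketch only -- not the actual xend implementation.
class VmError(Exception):
    pass

def device_duplicate_check(uname, dev, registered_devices, current_domid):
    """Reject a disk whose uname or device name is already in use.

    registered_devices: {domid: [{'uname': ..., 'dev': ...}, ...]}
    A check of this shape misfires on reboot/restore if the domain being
    recreated is not excluded, because its own disks are still registered.
    """
    for domid, devices in registered_devices.items():
        if domid == current_domid:
            continue  # skip the domain being recreated; its old entries are not conflicts
        for existing in devices:
            if existing['uname'] == uname:
                raise VmError('The uname "%s" is already used by another domain' % uname)
            if existing['dev'] == dev:
                raise VmError('The device "%s" is already defined' % dev)

registered = {3: [{'uname': '/data/images/guest.raw', 'dev': 'xvda'}]}
# Passes: neither the uname nor the device name clashes with another domain.
device_duplicate_check('/data/images/2nd_disk.raw', 'xvdb', registered, current_domid=7)
try:
    # Raises VmError: xvda is already defined for domain 3.
    device_duplicate_check('/data/images/other.raw', 'xvda', registered, current_domid=7)
except VmError as e:
    print(e)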

Comment 7 Qixiang Wan 2011-02-10 08:29:44 UTC
The problem hasn't been fixed completely in the -123 build:
[1] restore works now
[2] attaching multiple NICs with a NULL MAC address to a DomU works now
[3] creating a DomU with multiple disks still fails

So the test failed due to [3].
It can be reproduced by creating a DomU with 2 disks attached:

$ cat guest.cfg
...
disk = [ "file:/data/images/rhel-server-5.5-32-hvm.img,hda,w", "file:/data/images/2nd_disk.img,hda,w" ]
...

$ xm create guest.cfg
Using config file "guest.cfg".
Error: The device "hda" is already defined

$ cat xend.log
...
[2011-02-10 16:47:42 xend.XendDomainInfo 8906] DEBUG (XendDomainInfo:647) Checking for duplicate for uname: /data/images/rhel-server-5.5-32-hvm.img [file:/data/images/rhel-server-5.5-32-hvm.img], dev: hda, mode: w
[2011-02-10 16:47:42 xend 8906] DEBUG (blkif:27) exception looking up device number for hda: [Errno 2] No such file or directory: '/dev/hda'
[2011-02-10 16:47:42 xend 8906] DEBUG (blkif:27) exception looking up device number for hda: [Errno 2] No such file or directory: '/dev/hda'
[2011-02-10 16:47:42 xend.XendDomainInfo 8906] ERROR (XendDomainInfo:243) Domain construction failed
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 236, in create
    vm.initDomain()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2184, in initDomain
    self.createDevices()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2440, in createDevices
    self.createDevice(n, c)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1516, in createDevice
    self.device_duplicate_check(deviceClass, devconfig)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 674, in device_duplicate_check
    raise VmError('The device "%s" is already defined' %
VmError: The device "hda" is already defined
...

Comment 9 Miroslav Rezanina 2011-02-11 05:32:52 UTC
Just a note: cat guest.cfg shows two hda devices configured, so the failure is correct behavior; you are trying to use a duplicate device. Is this a typo, or do you really use two hda devices?

Comment 10 Qixiang Wan 2011-02-11 05:42:23 UTC
(In reply to comment #9)
> Just a note: cat guest.cfg shows two hda devices configured, so the failure is
> correct behavior; you are trying to use a duplicate device. Is this a typo, or
> do you really use two hda devices?

You're right, sorry for the config mistake. Moving to ON_QA.
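
For reference, a corrected version of the disk line from comment 7 gives each disk a distinct device name; using hdb for the second disk is an assumption here, chosen to mirror the xvda/xvdb layout in the original description:

disk = [ "file:/data/images/rhel-server-5.5-32-hvm.img,hda,w", "file:/data/images/2nd_disk.img,hdb,w" ]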

Comment 11 Paolo Bonzini 2011-02-11 12:37:37 UTC
How could the domain boot at all the first time?  Maybe that's another bug?

Comment 14 errata-xmlrpc 2011-07-21 09:14:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1070.html
