Bug 544950

Summary: Trying to save a PV guest with a PCI device assigned makes the PV guest hang
Product: Red Hat Enterprise Linux 5
Reporter: Yufang Zhang <yuzhang>
Component: xen
Assignee: Linqing Lu <lilu>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: low
Version: 5.4
CC: clalance, ddutile, drjones, gshipley, leiwang, minovotn, mshao, xen-maint
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: xen-3.0.3-115.el5
Doc Type: Bug Fix
Last Closed: 2011-01-13 22:19:47 UTC
Bug Blocks: 514500
Attachments:
  xend.log
  xend.log for above comment
  xend.log for above comment
  The whole xend.log
  full xend.log of above comment
  full xend.log for comment #14
  Bash script to unbind Intel 82541PI and bind to pciback
  Patch to implement correct resume handling on failed save
  log info for the "xm save" operation

Description Yufang Zhang 2009-12-07 05:56:21 UTC
Created attachment 376587 [details]
xend.log

Description of problem:
When trying to save a PV guest with a PCI device assigned to it, the PV guest hangs there without any response.

Version-Release number of selected component (if applicable):
xen-3.0.3-94.el5

How reproducible:
Always

Steps to Reproduce:

In Domain0:

# xm pci-list-assignable-devices
0000:03:00.0

# xm cr /etc/xen/test_pv pci="0000:03:00.0"
Using config file "/etc/xen/test_pv".
file /root/pv.img
Started domain PvDomain

# xm save PvDomain PvDomain.save
Error: Migration not permitted with assigned PCI device.
Usage: xm save <Domain> <CheckpointFile>

Save a domain state to restore later.

# xm li 
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3409     4 r-----   1646.1
migrating-PvDomain                         6      511     4 -b----     17.0


Also, we got error output from within the PV guest after trying to save it:

pcifront pci-0: pciback not responding!!!
get no response from backend for disable MSI
pcifront pci-0: pciback not responding!!!
pcifront pci-0: pciback not responding!!!
pcifront pci-0: pciback not responding!!!
pcifront pci-0: 22 freeing event channel 15

  
Actual results:
Xen prevents you from saving the PV guest, but the PV guest hangs there with no response after you try to save it. Also, the name of the PV guest changes from 'PvDomain' to 'migrating-PvDomain'.

Expected results:
Xen prevents you from saving the PV guest, and the guest keeps working fine even though xm save failed.

Additional info:

Comment 2 Michal Novotny 2010-07-20 13:01:10 UTC
(In reply to comment #0)
> [...]

Well, could you please try with the latest version of the xen package, Yufang? There was some code added to reconnect the backend devices, and obviously the PCI backend device was not reconnected correctly. You can try the latest virttest version of the xen package at http://people.redhat.com/mrezanin/xen .

Please let us know when testing is done; unfortunately I don't have hardware to do PCI passthrough with.

Thanks,
Michal

Comment 3 Yufang Zhang 2010-07-21 06:36:54 UTC
(In reply to comment #2)
> [...]

Hi Michal,
I tested this bug with the latest xen package (xen-3.0.3-114.el5); the problem still exists but with different behaviour:
(1) Create a RHEL5.4 PV guest with a PCI device assigned
(2) Try to save the PV guest

Xend does give error output saying that saving a PV guest with a PCI device assigned is not permitted, but the PV guest disappeared after that, and the xend logs tell us that the VM was destroyed. You can find detailed information in the xend.log in the next comment.

Comment 4 Yufang Zhang 2010-07-21 06:38:15 UTC
Created attachment 433307 [details]
xend.log for above comment

Comment 5 Michal Novotny 2010-07-21 07:01:07 UTC
(In reply to comment #4)
> Created an attachment (id=433307) [details]
> xend.log for above comment    

Well, it looks like permissive mode is disabled according to the following lines:
...
[2010-07-21 14:01:26 xend 3684] INFO (pciquirk:91) NO quirks found for PCI device [8086:10b9:8086:1093]
[2010-07-21 14:01:26 xend 3684] DEBUG (pciquirk:131) Permissive mode NOT enabled for PCI device [8086:10b9:8086:1093]
...

This is most probably a configuration issue, since according to this output permissive mode is not permitted for the PCI device with ID 8086:10b9:8086:1093.

According to the code, the configuration should be in the /etc/xen/xend-pci-permissive.sxp file, so I *think* there should be a definition like:

(unconstrained_dev_ids
     ('8086:10b9:8086:1093')
)

According to the /usr/lib64/python2.4/site-packages/xen/xend/server/pciif.py file there's a call to the xc.domain_ioport_permission() function, but I guess this is not right, and there may be a bug where it gets there even if the device is not permitted, which may be the reason why the guest disappears after that.

Looking at the code, it should also write to the /sys/bus/pci/drivers/pciback/permissive node; however, that has no effect here. It seems there's a 3 (No such process) error in the xc.domain_ioport_permission() function which makes it fail. xc_domain_ioport_permission() is a hypercall to XEN_DOMCTL_ioport_permission, but I'm none the wiser from that. Could you please attach the full xend.log?
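For reference, a rough sketch of the kind of grant pciif.py issues here, using the xend Python bindings; the port range below is purely illustrative (pciif.py derives the real ranges from the device's I/O resources), and the exact keyword names are an assumption based on the RHEL-5 era bindings:

    # Hedged sketch: granting a frontend domain access to a PCI device's
    # I/O port range. This is what ends up as the XEN_DOMCTL_ioport_permission
    # hypercall that fails with error 3 (ESRCH) in the log above, e.g. when
    # the target domain is already gone. Port values are made up.
    import xen.lowlevel.xc

    xc = xen.lowlevel.xc.xc()

    def grant_ioports(fe_domid, first_port, nr_ports):
        # Raises (or returns a negative rc, depending on binding version)
        # when the underlying hypercall fails.
        xc.domain_ioport_permission(domid=fe_domid,
                                    first_port=first_port,
                                    nr_ports=nr_ports,
                                    allow_access=True)

    grant_ioports(fe_domid=1, first_port=0xc000, nr_ports=0x40)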

Thanks,
Michal

Comment 6 Yufang Zhang 2010-07-21 07:05:27 UTC
Things got worse when testing with the latest virttest version of the xen package: trying to start a PV guest with a PCI device assigned just failed.

# rpm -qa | grep xen
kernel-xen-devel-2.6.18-206.el5
xen-3.0.3-113.el5virttest30
xen-debuginfo-3.0.3-113.el5virttest30
kernel-xen-2.6.18-206.el5
xen-devel-3.0.3-113.el5virttest30
xen-libs-3.0.3-113.el5virttest30

# xm pci-list-assignable-device
0000:03:00.0

# xm cr /tmp/rhel5.4-64-pv.cfg 
Using config file "/tmp/rhel5.4-64-pv.cfg".
Using <class 'grub.GrubConf.GrubConfigFile'> to parse /grub/menu.lst
Error: (22, 'Invalid argument')

Some interesting information can be found in the xend.log in the next comment.

Comment 7 Yufang Zhang 2010-07-21 07:08:12 UTC
Created attachment 433310 [details]
xend.log for above comment

Comment 8 Yufang Zhang 2010-07-21 07:09:55 UTC
Created attachment 433311 [details]
The whole xend.log

Comment 9 Yufang Zhang 2010-07-21 07:10:46 UTC
(In reply to comment #5)
> [...]

Michal, the full xend.log is in comment #8.

Comment 10 Michal Novotny 2010-07-21 07:18:44 UTC
Well, I don't know what caused the invalid argument issue, but I'm building a new version of the xen package with some debugging messages added for testing purposes, since I can't test this one myself. I doubt the invalid argument message is PCI-related, but I *think* that for PCI device assignment you should have the device ID in the pci-list-assignable-devices output.

Michal

Comment 11 Michal Novotny 2010-07-21 07:25:40 UTC
(In reply to comment #10)
> [...]

Could you please try the xen package from http://people.redhat.com/minovotn/xen and provide me the xend.log from testing?

Thanks,
Michal

Comment 12 Yufang Zhang 2010-07-21 07:37:30 UTC
(In reply to comment #10)
> [...]

The problem is that I restarted xend before creating the PV guest. I can reproduce this scenario even with xen-3.0.3-114:

(1) Boot the host
(2) Unbind the PCI device from its original device driver and bind it to pciback (see the sketch at the end of this comment)
(3) # xm pci-list-assignable-device
      0000:03:00.0
(4) Restart xend
(5) Try to create the machine with the PCI device assigned

At step (5), you see the error output:
# xm cr /tmp/rhel5.4-64-pv.cfg 
Using config file "/tmp/rhel5.4-64-pv.cfg".
Using <class 'grub.GrubConf.GrubConfigFile'> to parse /grub/menu.lst
Error: (22, 'Invalid argument')

Without restarting xend, I can create the PV guest with the PCI device assigned successfully.
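A minimal sketch of what step (2) amounts to through sysfs, assuming the pciback module is already loaded; only the device BDF comes from the reproducer above, the helper itself is illustrative:

    # Hedged sketch: release the device from its current driver and hand
    # it to pciback via sysfs. Error handling is omitted for brevity.
    import os

    BDF = "0000:03:00.0"

    def write_sysfs(path, value):
        f = open(path, "w")
        try:
            f.write(value)
        finally:
            f.close()

    def rebind_to_pciback(bdf):
        driver_link = "/sys/bus/pci/devices/%s/driver" % bdf
        if os.path.exists(driver_link):
            # Detach from whatever driver currently owns the device.
            write_sysfs(os.path.join(driver_link, "unbind"), bdf)
        pciback = "/sys/bus/pci/drivers/pciback"
        # Announce the slot to pciback, then bind the device to it.
        write_sysfs(os.path.join(pciback, "new_slot"), bdf)
        write_sysfs(os.path.join(pciback, "bind"), bdf)

    rebind_to_pciback(BDF)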

Comment 13 Yufang Zhang 2010-07-21 07:39:25 UTC
Created attachment 433322 [details]
full xend.log of above comment

Comment 14 Yufang Zhang 2010-07-21 09:33:22 UTC
(In reply to comment #11)
> [...]

Reproduced this bug with your latest xen package:

# xm pci-list-assignable-device 
0000:03:00.0

# xm cr /tmp/rhel5.4-64-pv.cfg 
Using config file "/tmp/rhel5.4-64-pv.cfg".
Using <class 'grub.GrubConf.GrubConfigFile'> to parse /grub/menu.lst
Started domain rhel5-pv-x84_64

# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3409     4 r-----     90.4
rhel5-pv-x84_64                            1      512     1 r-----      9.2

# xm pci-list 1
domain   bus   slot   func
0    3     0      0      

# xm save 1 1.save
Error: Migration not permitted with assigned PCI device.
Usage: xm save <Domain> <CheckpointFile>

Save a domain state to restore later.

# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3409     4 r-----     94.0
rhel5-pv-x84_64                            2      512     1 --p---      0.0

The VM disappears after a while:
# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3409     4 r-----     95.2

Comment 15 Yufang Zhang 2010-07-21 09:34:49 UTC
Created attachment 433354 [details]
full xend.log for comment #14

Comment 16 Michal Novotny 2010-07-21 09:48:33 UTC
(In reply to comment #15)
> Created an attachment (id=433354) [details]
> full xend.log for comment #14    

This is strange; this shouldn't be happening, since it's already been fixed by something. Obviously not for all the cases:

[2010-07-21 17:45:47 xend.XendDomainInfo 4277] ERROR (XendDomainInfo:2811) Failed to restart domain 1.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2797, in restart
    new_dom.waitForDevices()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2489, in waitForDevices
    self.waitForDevices_(c)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1484, in waitForDevices_
    return self.getDeviceController(deviceClass).waitForDevices()
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 162, in waitForDevices
    return map(self.waitForDevice, self.deviceIDs())
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 172, in waitForDevice
    raise VmError("Device %s (%s) could not be connected. "
VmError: Device 0 (vkbd) could not be connected. Hotplug scripts not working.

I need to have a closer look. So, is creating the guest working fine? Honestly, I don't know whether saving a PV guest with a PCI device shouldn't be treated the same as migration, i.e. whether it shouldn't be possible at all.

Michal

Comment 17 Michal Novotny 2010-07-21 10:47:19 UTC
Now I can see the issue is in this code:

            rc = xc.physdev_map_pirq(domid = fe_domid,
                                   index = dev.irq,
                                   pirq  = dev.irq)

where fe_domid is the ID of the domain the device is being attached to (8 in my case), and dev.irq equals 255, but I don't know whether this is OK.

xc_physdev_map_pirq() is the libxc function which calls the hypervisor with the PHYSDEVOP_map_pirq operation, and the return code 22 (-EINVAL) is the code coming from xc_physdev_map_pirq() for the case where the pirq is unset. The pirq seems to be set to 255, so it shouldn't be returning -EINVAL from there, but there's also a call to the xc_physdev_op() function. According to the definition in xen/arch/x86/x86_64/compat.c, there's a define for do_physdev_op to be substituted by compat_physdev_op, which resides in the hypervisor code too. Isn't it possible there's something not enabled on the hypervisor command line, or something like that? I don't understand the PCI passthrough stuff, so it's just my guess.
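To pin down where that 22 can come from, a hedged Python transcription of the two -EINVAL sources described above; the real checks live in C in libxc and the hypervisor, and the hypercall here is only a stub:

    # Illustrative only: libxc's own "pirq unset" guard versus an -EINVAL
    # propagated back from the PHYSDEVOP_map_pirq hypercall itself.
    import errno

    def physdev_op_stub(domid, index, pirq):
        # Stand-in for xc_physdev_op(PHYSDEVOP_map_pirq, ...); the
        # hypervisor side can also reject the request with -EINVAL.
        return 0

    def map_pirq_sketch(fe_domid, index, pirq):
        if pirq is None:
            # libxc bails out before any hypercall is made.
            return -errno.EINVAL
        return physdev_op_stub(fe_domid, index, pirq)

    # pirq is set (255), so a 22 seen here must come from the hypervisor side.
    print map_pirq_sketch(8, 255, 255)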

Michal

Comment 18 Lei Wang 2010-07-23 07:10:15 UTC
Hi Michal,

1. We assume that it is correct for Xen to prevent us from saving a PV guest with a PCI device assigned, as stated in the bug's Description/Expected results.

2. The problem now is that, after giving the error message below:
>Error: Migration not permitted with assigned PCI device.
>Usage: xm save <Domain> <CheckpointFile>

>Save a domain state to restore later.

the VM should still be on and working properly, not destroyed/disappeared.

3. There seems to be no other special configuration that we should enable to support PCI passthrough.

Comment 19 Michal Novotny 2010-07-23 07:42:46 UTC
(In reply to comment #18)
> [...]

Hi Lei,
I see what you mean. If the expected behaviour is for the save to fail but the guest to be resumed, then this is the problem I've been coping with.

Nevertheless, could you please give me access to some machine with a PCI passthrough device connected to the guest for further testing? I was using some remote machine, but I don't remember which one and I can't find it in the history now :(

Michal

Comment 20 Yufang Zhang 2010-07-23 09:29:53 UTC
(In reply to comment #17)
> [...]

Yeah. It seems that we forgot to add "iommu=1" to the kernel command line to enable the IOMMU. But I thought we didn't have to do that if we only want to do PCI passthrough with a PV guest. I don't know whether this has any impact on the scenario in this bug.

Comment 21 Michal Novotny 2010-07-23 10:10:34 UTC
(In reply to comment #19)
> [...]

Well, you have to have 'iommu=1' (IOMMU enabled) on the hypervisor command line, and on the testing machine you don't have it enabled; and since it's a Dell 760, there's a hardware bug there, as described in bug 541788. The machine used for testing has two major issues: first, the IOMMU is disabled (i.e. no 'iommu=1' on the HV command line), and second, you can't enable it because of the hardware bug. A better machine, on which the IOMMU can be enabled, is necessary for testing this bug.

Michal

Comment 22 Michal Novotny 2010-07-23 10:12:42 UTC
(In reply to comment #20)
> [...]

Yufang,
I *think* this is necessary for PV guests too. Also, the test case leiwang gave me a link to has IOMMU in its summary ([IOMMU]Try to save a PV guest when a pci device assigned to it (PV)), so I guess it really is necessary to have it enabled for both HVM and PV guests to be able to do PCI passthrough.

Michal

Comment 23 Andrew Jones 2010-07-23 10:47:58 UTC
(In reply to comment #22)
> I *think* this is necessary for PV guests too. 

I don't think this is correct. PCI passthrough has been supported for PV guests since RHEL 5.0, but that parameter was introduced in RHEL 5.4 when support for passthrough to HVM guests was implemented. You would need it, as well as another parameter (but on dom0's command line), if you wanted to pass through a VF using VT-d, but I don't think that currently works for PV guests anyway. Rather than making assumptions about what's needed and what isn't, I suggest you do some research and/or an experiment or two to figure out the current requirements and what's supported.

Comment 24 Bill Burns 2010-07-23 11:21:51 UTC
Adding Don to this as he should be able to help clear things up.

Comment 25 Michal Novotny 2010-07-23 12:04:03 UTC
(In reply to comment #23)
> [...]

Well, the problem on the test machine (the Dell 760) was that I was getting an -EINVAL error every time I tried to boot the PV guest with the PCI device attached, so I guess this may be either IOMMU-specific or related to the BIOS bug on this machine. Currently I'm working on an "Intel(R) Xeon(R) CPU X5550 @ 2.67GHz" machine with no such issues: it boots the guest, and the guest is finally able to see the PCI device. On the Dell machine with the IOMMU disabled, I was unable to even start the guest because of -EINVAL coming from the do_physdev_op() call in libxc, which accesses the HV AFAIK. I can do the experiment, but only after I finish work on this save issue, since I'm finally able to reproduce the behaviour described in this bugzilla.

Michal

Comment 26 Michal Novotny 2010-07-23 12:53:11 UTC
Well, according to the log file, the problem there is a crash. The guest is dying because it crashed, and adding an option to start a new instance of the guest/reboot is not the right fix, since the root cause is most likely the crash itself. There are some repetitions of the "Dev %s still active, looping..." message, logged from the XendDomainInfo.py testDeviceComplete() function, and this is where the crash occurs.

...
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/pciif.py", line 419, in migrate
    raise XendError('Migration not permitted with assigned PCI device.')
XendError: Migration not permitted with assigned PCI device.
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] DEBUG (XendDomainInfo:2330) XendDomainInfo.resumeDomain(6)
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] INFO (XendDomainInfo:2454) Dev 51712 still active, looping...
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] INFO (XendDomainInfo:2454) Dev 51712 still active, looping...
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] INFO (XendDomainInfo:2454) Dev 51712 still active, looping...
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] WARNING (XendDomainInfo:1222) Domain has crashed: name=migrating-rhel5-pv-x84_64 id=6.
[2010-07-23 19:50:56 xend.XendDomainInfo 10121] INFO (XendDomainInfo:1229) Starting automatic crash dump
[2010-07-23 19:51:01 xend.XendDomainInfo 10121] DEBUG (XendDomainInfo:2341) XendDomainInfo.resumeDomain: devices released
...

The issue may be caused by the backend being reconnected while it is not yet disconnected, which could result in a guest crash, since we would be disconnecting a disk from a currently running guest. Obviously PV guests don't wait and immediately kernel panic or similar, resulting in the guest crash.
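For context, a hedged sketch of the polling pattern behind those "Dev %s still active, looping..." lines; the real loop in testDeviceComplete() checks xenstore, which is abstracted here as a callable:

    # Illustrative wait loop: keep polling until the frontend devices have
    # been released by the backend, or give up after a timeout.
    import time

    def wait_for_devices_released(list_active_devs, timeout=30.0, step=0.1):
        deadline = time.time() + timeout
        while time.time() < deadline:
            active = list_active_devs()
            if not active:
                return True          # all devices disconnected
            for dev in active:
                print "Dev %s still active, looping..." % dev
            time.sleep(step)
        return False                 # caller decides what a timeout means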

Michal

Comment 27 Chris Lalancette 2010-07-23 13:33:04 UTC
Andrew is correct here; for doing PV passthrough, iommu=1 is not needed (and indeed is ignored; PV passthrough doesn't go through the IOMMU, which is why it is unsafe to use in general).  So the bug has to be elsewhere.

Chris Lalancette

Comment 28 Michal Novotny 2010-07-23 14:02:32 UTC
Created attachment 433961 [details]
Bash script to unbind Intel 82541PI and bind to pciback

Well, I found out that the issue was with resume being called before the save actually happened. This way it couldn't work, since it was unable to resume a guest whose memory was still allocated, so the resume failed on mapping start_info, which resulted in the PV guest crash. I already have the patch, but I need to test it further for HVM guests now.
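In other words, the resume must only run once the save path has actually suspended the domain. A minimal sketch of that control flow, with hypothetical helper names (not the actual functions from the patch):

    # Hedged sketch of the fix: resume only a domain that really went
    # through suspend. All helpers are illustrative stand-ins.
    class XendError(Exception):
        pass

    def check_pci(dominfo):
        if dominfo.get("pci"):
            raise XendError("Migration not permitted with assigned PCI device.")

    def save_domain(dominfo, checkpoint_file):
        suspended = False
        try:
            check_pci(dominfo)        # fails early, guest memory still mapped
            # ... suspend the domain and write out its memory here ...
            suspended = True
        except Exception:
            # Resuming a never-suspended domain tried to remap start_info
            # over still-allocated memory and crashed the PV guest.
            if suspended:
                pass                  # dominfo.resumeDomain() in real code
            raise

    # Usage: saving a guest with a PCI device now fails cleanly.
    try:
        save_domain({"pci": ["0000:03:00.0"]}, "PvDomain.save")
    except XendError, e:
        print "Error:", e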

Michal

Comment 29 Michal Novotny 2010-07-23 14:19:50 UTC
Created attachment 433968 [details]
Patch to implement correct resume handling on failed save

Hi,
this is the patch to call domain_resume only when appropriate, i.e. not before the domain has been saved and its memory deallocated. The patch has been tested on an Intel(R) Xeon(R) CPU X5550 with a RHEL-5 PV guest, both with and without a PCI device assigned. When saving the PV guest without a PCI device assigned to a location with enough space, everything went fine for both save and restore; when I tried to save the guest to a location with insufficient disk space, the save failed but the guest resumed fine. Finally, for the PV guest with a PCI device assigned, the save failed with the "Migration not permitted with assigned PCI device" error, but the guest was still working fine. When testing with HVM guests I saw no regressive behaviour, i.e. they behave as before the patch was applied.

Before my patch, it failed to resume the guest because its memory could not be mapped, with the domain_resume function of libxc returning "Couldn't map start_info".

The device attached to the guest was the "37:04.0" device, which was dependent on "37:09.0". The script I used to add the devices to the pciback driver is attached in the previous comment.

Michal

Comment 30 Michal Novotny 2010-07-23 14:26:46 UTC
Oh, just one more update: I was wrong. For PV guests, only modprobing pciback is required; neither iommu=1 nor pci_pt_e820_access=on is required on the hypervisor command line.

Michal

Comment 34 Linqing Lu 2010-08-24 10:04:01 UTC
Created attachment 440612 [details]
log info for the "xm save" operation

Host: xen-3.0.3-115.el5, kernel-xen-2.6.18-212.el5
Guest: RHEL-Server-5.4-64-pv

Test with a PCI device (Intel 82576 VF) attached.

When doing "xm save" on the guest domain, the operation failed with the following output:
>> [root@dhcp-65-129 ~]# xm save 1 save_file
>> Error: Migration not permitted with assigned PCI device.
>> Usage: xm save <Domain> <CheckpointFile>
>> 
>> Save a domain state to restore later.

The target guest ran well after the save operation failed, and so did the PCI device.
Bug verified.

Comment 39 errata-xmlrpc 2011-01-13 22:19:47 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0031.html