Bug 613892

Summary: [SR-IOV]VF device can not start on 32bit Windows2008 SP2
Product: Red Hat Enterprise Linux 6
Reporter: Hao, Xudong <xudong.hao>
Component: qemu-kvm
Assignee: john cooper <john.cooper>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: low
Version: 6.0
CC: alex.williamson, ashish.n.shah, ddutile, hui.xiao, jane.lv, jiajun.xu, john.cooper, jvillalo, jwilleford, kcao, llim, luyu, mkenneth, nobody, tburke, virt-maint, xin.li, yang.z.zhang, yongkang.you
Target Milestone: rc
Target Release: 6.0
Hardware: x86_64
OS: Linux
Fixed In Version: qemu-kvm-0.12.1.2-2.107.el6
Doc Type: Bug Fix
Clone Of:
: 654208 (view as bug list) Environment:
Last Closed: 2010-11-10 21:26:31 UTC
Bug Blocks: 654208, 656751    
Attachments:
vf device can not start on guest (flags: none)
cannot find enough free resources (flags: none)

Description Hao, Xudong 2010-07-13 06:27:14 UTC
Description of problem:
With RHEL6 Beta2, a Kawela (82576) VF can be assigned to a 32-bit Windows 2008 SP2
guest from the qemu-kvm command line, but Device Manager shows a "yellow bang" on the device and reports "This device cannot start. (Code 10)". The VF does not work on 32-bit Windows 2008 SP2.

Version-Release number of selected component (if applicable):
rhel6-beta2 2.6.32-37.el6.x86_64 


How reproducible:
Always

Steps to Reproduce:
1. Install 32bit Windows 2008 SP2 and assign Kawela VF to it
2. Install driver from
http://downloadcenter.intel.com/T8Clearance.aspx?sType=&agr=Y&ProductID=&DwnldID=18720&url=/18720/a08/PROWin32.exe&PrdMap=&strOSs=&OSFullName=&lang=eng
3. Reboot guest and check network connection


Actual results:
The VF cannot obtain an IP address.

Expected results:
The VF works on 32-bit Windows 2008 SP2.


Additional info:

Comment 1 Hao, Xudong 2010-07-13 06:29:30 UTC
Created attachment 431354
vf device can not start on guest

Comment 3 Don Dutile (Red Hat) 2010-07-13 14:25:20 UTC
Please provide:

-- host kernel version

-- guest startup command line (if you use qemu-kvm directly) or XML spec (if you use virsh)

Comment 4 Alex Williamson 2010-07-13 19:18:13 UTC
Created attachment 431573
cannot find enough free resources

I get a different error message; see the attached image.  The same error happens with both current rhel6 bits and upstream kvm.  The log files don't seem to suggest a PCI BAR resource issue.  Can we get some help understanding what the driver is looking for that it can't find resources for?

Comment 5 Ashish Shah 2010-07-13 21:07:18 UTC
When we debugged this here, we could see no error in the driver. It seems the OS is having trouble obtaining resources required by the device (such as MSI-X). The same driver works correctly on Xen Server, so we are not certain what difference between the two systems would cause this.

Comment 6 Hao, Xudong 2010-07-14 01:08:05 UTC
(In reply to comment #3)
> -- host kernel version
> 
kernel 2.6.32-37.el6.x86_64.
qemu-kvm-0.12.1.2

> -- guest startup command line (if you use qemu-kvm directly) or XML spec (if you use virsh)
qemu-kvm command line: 
/usr/libexec/qemu-kvm -m 1024 -smp 2 -net none -hda /var/lib/libvirt/images/win2k8.img  -pcidevice host=01:10.0

Comment 7 Alex Williamson 2010-07-14 20:59:34 UTC
Can you provide the output of both 'lspci -vvv' and
'ls -l /sys/bus/pci/devices/0000:00:xx.y/' from a Linux guest running on
xenserver with the device assigned to it?  Maybe we can spot something
different in the features or config space they're exposing.

Comment 8 Alex Williamson 2010-07-25 14:12:57 UTC
I've noticed that 32bit Windows on kvm typically does not use MSI interrupts.  However, if I boot the guest with '-cpu host' MSI will be used and the 82576 VF works.  Is Windows looking for specific processor flags to enable MSI interrupt support?

Comment 9 Alex Williamson 2010-07-25 15:57:25 UTC
Looks like 32bit Windows doesn't enable MSI support until family 6, model 13 processor revisions, so if you boot using -cpu qemu64,model=13 the VF works as expected.
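
The threshold described in this comment can be written down as a small check. This is only a sketch of the empirical observation, not of the logic Windows actually uses, which is not known in this report. The function name `meets_msi_threshold` is made up for illustration, and families other than 6 were not tested here, so the sketch returns None for them rather than guessing.

```python
from typing import Optional

# Empirical threshold from this bug: 32-bit Windows appears to enable MSI
# only on family 6 CPUs whose model is at least 13. This is an observation
# from testing, not a documented Windows rule.
MSI_FAMILY = 6
MSI_MIN_MODEL = 13


def meets_msi_threshold(family: int, model: int) -> Optional[bool]:
    """Return True/False for family 6, or None where this report has no data."""
    if family != MSI_FAMILY:
        return None  # other families were not tested in this bug
    return model >= MSI_MIN_MODEL


if __name__ == "__main__":
    # 6:2 is the value the qemu64 model used at the time; 6:13 is the workaround.
    for family, model in [(6, 2), (6, 13), (6, 15)]:
        print(f"family={family} model={model}: {meets_msi_threshold(family, model)}")
```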

Comment 10 yang 2010-07-26 01:50:47 UTC
(In reply to comment #9)
> Looks like 32bit Windows doesn't enable MSI support until family 6, model 13
> processor revisions, so if you boot using -cpu qemu64,model=13 the VF works as
> expected.    
Yes, with the argument "-cpu qemu64,model=13" the VF works well.
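
For reference, the command line from comment 6 with the "-cpu" workaround added could be assembled as follows. This is a sketch only: the image path and host PCI address are the ones quoted in comment 6 and will differ on other hosts, and the helper name `build_qemu_argv` is hypothetical.

```python
import shlex


def build_qemu_argv(image: str, host_bdf: str, cpu: str = "qemu64,model=13") -> list:
    """Build the qemu-kvm argument list from comment 6 plus the -cpu workaround."""
    return [
        "/usr/libexec/qemu-kvm",
        "-m", "1024",
        "-smp", "2",
        "-net", "none",
        "-hda", image,
        "-pcidevice", f"host={host_bdf}",  # VF assigned from the host
        "-cpu", cpu,                       # override the CPUID model seen by the guest
    ]


if __name__ == "__main__":
    argv = build_qemu_argv("/var/lib/libvirt/images/win2k8.img", "01:10.0")
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(arg) for arg in argv))
```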

Comment 11 Dor Laor 2010-07-26 12:15:19 UTC
Can you retest with one of the following options (whichever best suits the host):
-cpu Penryn or Nehalem or Conroe? They should have family=6, model=15

Comment 12 Dor Laor 2010-07-26 12:16:47 UTC
(In reply to comment #11)
> Can you retest with one of the following options (whichever best suits the host):
> -cpu Penryn or Nehalem or Conroe? They should have family=6, model=15

Oops, the AMD models have model=15 while the ones above have model 6.
We should change that.

Comment 13 john cooper 2010-07-26 19:43:01 UTC
These fields inspire headaches.  Currently for the new
models we use the Intel and AMD reviewed values of:

AMD Opteron_G1/G2/G3:

   family = "15"
   model = "6"

Intel Conroe/Penryn/Nehalem + qemu64:

   family = "6"
   model = "2"

So the CPUID "model" fields of the Intel cpu models, and
that of qemu64, are the ones that would need the proposed
change.  Awaiting input from Intel on this.

Comment 14 yang 2010-07-27 02:18:07 UTC
(In reply to comment #11)
> Can you retest with one of the following options (whichever best suits the host):
> -cpu Penryn or Nehalem or Conroe? They should have family=6, model=15

I have retested with those args. Unfortunately, the VF does not work with any of them.

Comment 15 john cooper 2010-07-28 02:02:10 UTC
Feedback from Intel (mail attached for reference).
The recommendation summary for cpuid "model" is:

    Conroe: 15
    Penryn: 23
    Nehalem: 26

Concerning qemu64, empirically the model value needs
to be at least 13, with a rhel5-equivalent legacy
model added for compatibility.
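
As background on why these values need care: the model a guest sees is not a single CPUID field. In CPUID leaf 1, EAX holds a 4-bit model and a separate 4-bit extended model, and under Intel's documented layout the displayed model for family 6 is (extended model << 4) + model. Values such as 23 and 26 therefore only fit when the extended field is set. The sketch below packs and unpacks that layout; the function names are made up for illustration and this is not the qemu-kvm implementation.

```python
def encode_signature(family: int, model: int, stepping: int = 0) -> int:
    """Pack displayed family/model/stepping into the CPUID leaf 1 EAX layout."""
    if family < 0 or not 0 <= model <= 0xFF or not 0 <= stepping <= 0xF:
        raise ValueError("family, model or stepping out of range")
    # Families above 15 are expressed as base family 15 plus an extended part.
    if family < 0xF:
        base_family, ext_family = family, 0
    else:
        base_family, ext_family = 0xF, family - 0xF
    if ext_family > 0xFF:
        raise ValueError("family out of range")
    # The extended model field only counts for base family 6 or 15.
    if model > 0xF and base_family not in (0x6, 0xF):
        raise ValueError("extended model requires family 6 or 15 and above")
    return (
        (ext_family << 20)
        | ((model >> 4) << 16)
        | (base_family << 8)
        | ((model & 0xF) << 4)
        | stepping
    )


def decode_signature(eax: int) -> tuple:
    """Return (family, model, stepping) as displayed, from CPUID leaf 1 EAX."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    if family in (0x6, 0xF):
        model |= ((eax >> 16) & 0xF) << 4   # fold in the extended model
    if family == 0xF:
        family += (eax >> 20) & 0xFF        # fold in the extended family
    return family, model, stepping


if __name__ == "__main__":
    # Model values recommended by Intel in this comment.
    for name, model in [("Conroe", 15), ("Penryn", 23), ("Nehalem", 26)]:
        eax = encode_signature(6, model)
        print(f"{name}: EAX={eax:#x} -> {decode_signature(eax)}")
```

A model of 13 or 15 fits in the base field alone, while 23 and 26 rely on the extended model bits being reported as well.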

Comment 16 john cooper 2010-07-28 03:32:31 UTC
From: "Dugger, Donald D" <donald.d.dugger>
To: john cooper <john.cooper>
CC: Bill Burns <bburns>, "Nakajima, Jun" <jun.nakajima>,
        "Yu, Wilfred" <wilfred.yu>
Date: Mon, 26 Jul 2010 18:43:33 -0700
Subject: RE: Need Intel input on CPUID model..

John-

Yeah, this whole issue of virtualizing the family/model (which we have to
do) and how it exposes unexpected issues is pretty icky (note, I would
contend that the Windows code is wrong, there's no relationship between
MSI support and the CPU model but the reality is we have to get the Windows
guest to work).

We talked this issue over in our team meeting today and the bottom line
is that using Family 6, Model 13 when identifying a virtual CPU with either
Conroe or Penryn or Nehalem capabilities should work just fine.  Those 3
CPUs all have model numbers greater than 13 (Conroe = 15, Penryn = 23,
Nehalem = 26) so 13 will certainly work as a least common denominator for
them.


--
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
Ph: 303/443-3786

-----Original Message-----
From: john cooper [john.cooper]
Sent: Monday, July 26, 2010 1:10 PM
To: Dugger, Donald D
Cc: john cooper; Bill Burns
Subject: Need Intel input on CPUID model..

Don,
    We have a bug with SR-IOV where the CPUID
family:model must be at least 6:13 for a 32bit
windows guest to enable MSI.  So we're considering
making such a change for the new Conroe/Penryn/Nehalem
models we've discussed with you folks.

However currently we're using 6:2 for family:model
as advised by Intel for a least-common-denominator
in the respective classes.  As such we're a bit
hesitant to make a change without feedback either
way from you.

Comment 17 You, Yongkang 2010-07-29 08:19:14 UTC
Remove the NeedInfo request for xudong.

Comment 22 yang 2010-08-17 07:11:51 UTC
Verified this bug with rhel6 snap10, and PASSED.

libvirt-0.8.1-21.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.108.el6.x86_64
qemu-kvm-0.12.1.2-2.108.el6.x86_64
kernel-2.6.32-59.el6.x86_64

Comment 23 Cao, Chen 2010-08-17 07:25:44 UTC
(In reply to comment #22)
> Verified this bug with rhel6 snap10, and PASSED.
> 
> libvirt-0.8.1-21.el6.x86_64
> qemu-kvm-tools-0.12.1.2-2.108.el6.x86_64
> qemu-kvm-0.12.1.2-2.108.el6.x86_64
> kernel-2.6.32-59.el6.x86_64

Comment 24 releng-rhel@redhat.com 2010-11-10 21:26:31 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.