Bug 1019666 - windows 2012r2 BSOD while installing intel 82599 driver
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Alex Williamson
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2013-10-16 04:33 EDT by Chao Yang
Modified: 2014-05-13 17:29 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-13 17:29:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Chao Yang 2013-10-16 04:33:51 EDT
Description of problem:
While testing device assignment, I assigned both PFs of an Intel dual-port 82599 to a Windows 2012 R2 guest. The guest hit a BSOD as soon as I tried to install the driver downloaded from the Intel website.

Version-Release number of selected component (if applicable):
2.6.32-423.el6.x86_64
qemu-kvm-0.12.1.2-2.412.el6.x86_64
virtio-win-prewhql-0.1-72

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
CLI:
/usr/libexec/qemu-kvm -name win8_64_amd-1 -M rhel6.5.0 -cpu host -enable-kvm -m 4096 -realtime mlock=off -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/home/en_windows_server_2012_r2_datacenter_preview_x64_dvd_2358570.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/home/win2012r2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:48:cd,bus=pci.0 -spice port=5900,disable-ticketing -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device virtio-balloon-pci,id=balloon0,bus=pci.0 -monitor stdio -boot menu=on -device pci-assign,host=05:00.0,id=PF-1 -device pci-assign,host=05:00.1,id=PF-2



Microsoft (R) Windows Debugger Version 6.2.9200.20512 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Bitmap Dump File: Full address space is available

Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows 8 Kernel Version 9431 UP Free x64
Product: Server, suite: TerminalServer DataCenter SingleUserTS
Built by: 9431.0.amd64fre.winmain_bluemp.130615-1214
Machine Name:
Kernel base = 0xfffff801`cfc77000 PsLoadedModuleList = 0xfffff801`cff41990
Debug session time: Wed Oct 16 03:34:11.205 2013 (UTC - 7:00)
System Uptime: 0 days 0:08:10.963
Loading Kernel Symbols
...............................................................
................................................................
.
Loading User Symbols
.............................................
Loading unloaded module list
......
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 1A, {1233, 54445, 1, 0}

*** ERROR: Module load completed but symbols could not be loaded for iqvw64e.sys
*** ERROR: Module load completed but symbols could not be loaded for NcsColib.dll
Probably caused by : iqvw64e.sys ( iqvw64e+29fa )

Followup: MachineOwner
---------

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

MEMORY_MANAGEMENT (1a)
    # Any other values for parameter 1 must be individually examined.
Arguments:
Arg1: 0000000000001233, The subtype of the bugcheck.
Arg2: 0000000000054445
Arg3: 0000000000000001
Arg4: 0000000000000000

Debugging Details:
------------------


BUGCHECK_STR:  0x1a_1233

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

PROCESS_NAME:  ncs2prov.exe

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from fffff801cfddbf5e to fffff801cfdc7da0

STACK_TEXT:  
ffffd000`21a597e8 fffff801`cfddbf5e : 00000000`0000001a 00000000`00001233 00000000`00054445 00000000`00000001 : nt!KeBugCheckEx
ffffd000`21a597f0 fffff801`cfc9bc6c : 00000000`00000000 00000000`00000000 e0000127`76908957 00000000`c86a8004 : nt! ?? ::FNODOBFM::`string'+0x391e
ffffd000`21a598f0 fffff800`02a529fa : 00000000`00000000 00000000`80862007 00000000`00cfcd10 00000000`00000001 : nt!MmMapIoSpace+0xc
ffffd000`21a59920 fffff800`02a518cb : 00000000`00000001 ffffd000`21a59cc0 ffffe000`00eeb240 00000000`00cfa1f0 : iqvw64e+0x29fa
ffffd000`21a59950 fffff800`02a511a7 : 00000000`00000003 ffffe000`012776c0 00000000`00000001 ffffe000`00eeb240 : iqvw64e+0x18cb
ffffd000`21a59990 fffff801`d0050bb3 : 00000000`00000001 ffffd000`21a59cc0 00000000`00000001 00000000`00000001 : iqvw64e+0x11a7
ffffd000`21a599c0 fffff801`d0051daa : 00000000`00000001 00000000`00000000 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x8c3
ffffd000`21a59b60 fffff801`cfdd36b3 : 00000000`00000001 00000000`00cfa708 fffff801`cfc5b900 ffffd000`21a59cc0 : nt!NtDeviceIoControlFile+0x56
ffffd000`21a59bd0 00007ffc`34a4b12a : 00007ffc`32192f83 00000000`80862007 00800103`00000000 00000000`9d000895 : nt!KiSystemServiceCopyEnd+0x13
00000000`00cfa0d8 00007ffc`32192f83 : 00000000`80862007 00800103`00000000 00000000`9d000895 00000000`00000895 : ntdll!NtDeviceIoControlFile+0xa
00000000`00cfa0e0 00007ffc`33fe14f0 : 00000000`80862007 00000000`9d000895 00000000`00cfa3b8 00000000`00000000 : KERNELBASE!DeviceIoControl+0x73
00000000`00cfa150 00000000`012620fa : 00000000`00000108 00000000`00000008 00000000`00cfc9e0 00000000`00cfa878 : KERNEL32!DeviceIoControlImplementation+0x74
00000000`00cfa1a0 00000000`00000108 : 00000000`00000008 00000000`00cfc9e0 00000000`00cfa878 00000000`00000000 : NcsColib+0xa20fa
00000000`00cfa1a8 00000000`00000008 : 00000000`00cfc9e0 00000000`00cfa878 00000000`00000000 00000000`00000000 : 0x108
00000000`00cfa1b0 00000000`00cfc9e0 : 00000000`00cfa878 00000000`00000000 00000000`00000000 00000000`00cfa428 : 0x8
00000000`00cfa1b8 00000000`00cfa878 : 00000000`00000000 00000000`00000000 00000000`00cfa428 00000000`00000000 : 0xcfc9e0
00000000`00cfa1c0 00000000`00000000 : 00000000`00000000 00000000`00cfa428 00000000`00000000 00000000`00000028 : 0xcfa878


STACK_COMMAND:  kb

FOLLOWUP_IP: 
iqvw64e+29fa
fffff800`02a529fa 488d0d8f260000  lea     rcx,[iqvw64e+0x5090 (fffff800`02a55090)]

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  iqvw64e+29fa

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: iqvw64e

IMAGE_NAME:  iqvw64e.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  508659cf

FAILURE_BUCKET_ID:  0x1a_1233_iqvw64e+29fa

BUCKET_ID:  0x1a_1233_iqvw64e+29fa

Followup: MachineOwner
---------
Comment 1 Chao Yang 2013-10-16 04:41:58 EDT
There is a similar bug, Bug 947791, which was fixed in virtio-win-prewhql-0.1-68. However, this issue is also reproducible with -68.

With VFs of Intel dual port 82576 and virtio-win-prewhql-72, this issue is not reproducible.
Comment 9 langfang 2013-10-25 07:13:24 EDT
I tested this bug on the latest version and on the GA version and hit the same problem (BSOD).

Scenario 1) RHEL6.5

Host: 
kernel-2.6.32-425.el6.x86_64.rpm 
qemu-kvm-0.12.1.2-2.414.el6.x86_64.rpm 

Guest:win2012r2

Scenario 2) RHEL6.4-GA

Host:
2.6.32-358.el6.x86_64
qemu-kvm-0.12.1.2-2.355.el6.x86_64

Guest:win2012r2


So this bug is not a regression.
Comment 10 Ronen Hod 2013-10-25 19:49:31 EDT
Since it is about device assignment, and it is not really a regression, and we are out of time, we will have to defer it.
I also removed the blocker.
Yan, do you see anything in the dump that can give Alex a hint?
Comment 11 Yan Vugenfirer 2013-10-27 11:56:14 EDT
I cannot download the dump file. Can you please compress it and re-upload?

Thanks.
Comment 12 Chao Yang 2013-10-27 21:23:41 EDT
(In reply to Yan Vugenfirer from comment #11)
> I cannot download the dump file. Can you please compress it and re-upload?
> 
Please retry.

> Thanks.
Comment 15 Yan Vugenfirer 2013-10-29 12:50:14 EDT
The crash occurs when the driver tries to map an I/O range with MmMapIoSpace:

http://msdn.microsoft.com/en-us/library/windows/hardware/ff554618(v=vs.85).aspx

According to the parameters of the BSOD:
MEMORY_MANAGEMENT (1a) ( http://msdn.microsoft.com/en-us/library/windows/hardware/ff557391(v=vs.85).aspx )

Arg1: 0000000000001233: A driver tried to map a physical memory page that was not locked. This is illegal because the contents or attributes of the page can change at any time. This is a bug in the code that made the mapping call. Parameter 2 is the page frame number of the physical page that the driver attempted to map.

Arg2: 0000000000054445 - pfn


Looking at the pfn:

kd> !pfn 0000000000054445
    PFN 00054445 at address FFFFFA8000FCCCF0
    flink       00000000  blink / share count 00000000  pteaddress 00000000
    reference count 0000    used entry count  0000      Cached    color 0   Priority 0
    restore pte 00000000  containing page        FFFFFFFFE  Free               
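
As a rough illustration of the arithmetic behind the arguments above, the following Python sketch parses the bugcheck summary line and converts the page frame number in Arg2 to a physical address. It assumes 4 KiB pages (the usual small page size on x64); the helper names `parse_bugcheck` and `pfn_to_phys` are made up for this example and are not part of any debugger tooling.

```python
import re

# Assumption: 4 KiB pages, the standard small page size on x64.
PAGE_SIZE = 0x1000


def parse_bugcheck(line):
    """Split a kd summary line such as 'BugCheck 1A, {1233, 54445, 1, 0}'.

    All numbers on that line are hexadecimal without a 0x prefix.
    """
    m = re.match(r"\s*BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


def pfn_to_phys(pfn, page_size=PAGE_SIZE):
    """Return the physical address of the first byte of the given page frame."""
    return pfn * page_size


code, args = parse_bugcheck("BugCheck 1A, {1233, 54445, 1, 0}")
phys = pfn_to_phys(args[1])
print("bugcheck 0x%x, subtype 0x%x" % (code, args[0]))
print("PFN 0x%x -> physical address 0x%x" % (args[1], phys))
```

Under that page-size assumption, PFN 0x54445 corresponds to physical address 0x54445000, which can then be compared against the resource ranges reported for the assigned devices.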



In case it helps, below is a WinDbg trace showing the resource allocations for the VFs:

kd> !pcitree
Bus 0x0 (FDO Ext ffffe00000535ae0)
  (d=0,  f=0) 80861237 devext 0xffffe0000052a9d0 devstack 0xffffe0000052a880 0600 Bridge/HOST to PCI
  (d=1,  f=0) 80867000 devext 0xffffe000005291b0 devstack 0xffffe00000529060 0601 Bridge/PCI to ISA
  (d=1,  f=1) 80867010 devext 0xffffe000005299d0 devstack 0xffffe00000529880 0101 Mass Storage Controller/IDE
  (d=1,  f=2) 80867020 devext 0xffffe000005181b0 devstack 0xffffe00000518060 0c03 Serial Bus Controller/USB
  (d=2,  f=0) 1b360100 devext 0xffffe0000056d1b0 devstack 0xffffe0000056d060 0300 Display Controller/VGA
>> Red Hat virtio devices:
  (d=3,  f=0) 1af41001 devext 0xffffe0000056d9d0 devstack 0xffffe0000056d880 0100 Mass Storage Controller/SCSI
  (d=4,  f=0) 1af41000 devext 0xffffe0000056c1b0 devstack 0xffffe0000056c060 0200 Network Controller/Ethernet
  (d=5,  f=0) 1af41003 devext 0xffffe0000056c9d0 devstack 0xffffe0000056c880 0780 Simple Serial Communications Controller/'Other'
  (d=6,  f=0) 1af41002 devext 0xffffe0000056b1b0 devstack 0xffffe0000056b060 0500 Memory Controller/RAM
>> Intel VFs
  (d=7,  f=0) 808610fb devext 0xffffe0000056b9d0 devstack 0xffffe0000056b880 0200 Network Controller/Ethernet
  (d=8,  f=0) 808610fb devext 0xffffe0000056a1b0 devstack 0xffffe0000056a060 0200 Network Controller/Ethernet
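
For readers unfamiliar with the `!pcitree` layout, here is a small Python sketch that splits one of the lines above into its fields. It assumes the 8-digit identifier is the vendor ID followed by the device ID, which is consistent with the `!devext` output in this comment (Vendor ID 8086, Device ID 10FB); the function name `parse_pcitree_line` and the regular expression are illustrative only.

```python
import re

# Pattern for one device line of the !pcitree output shown above.
# Assumption: the 8 hex digits are vendor ID (4) followed by device ID (4),
# and the 4 hex digits before the description are base class / subclass.
LINE_RE = re.compile(
    r"\(d=([0-9a-fA-F]+),\s*f=([0-9a-fA-F]+)\)\s+"
    r"([0-9a-fA-F]{4})([0-9a-fA-F]{4})\s+"
    r"devext\s+\S+\s+devstack\s+\S+\s+"
    r"([0-9a-fA-F]{4})\s+(.*)"
)


def parse_pcitree_line(line):
    """Return a dict of fields for one device line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    dev, fn, vendor, device, cls, desc = m.groups()
    return {
        "device": int(dev, 16),
        "function": int(fn, 16),
        "vendor_id": int(vendor, 16),
        "device_id": int(device, 16),
        "class_code": int(cls, 16),
        "description": desc.strip(),
    }


sample = ("  (d=7,  f=0) 808610fb devext 0xffffe0000056b9d0 "
          "devstack 0xffffe0000056b880 0200 Network Controller/Ethernet")
info = parse_pcitree_line(sample)
print(info)
```

Filtering the parsed entries on `vendor_id == 0x8086` and `class_code == 0x0200` would pick out the two assigned Intel network functions at devices 7 and 8.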


Two Intel VFs:

kd> !devext 0xffffe0000056b9d0 pci
PDO Extension, Bus 0x0, Device 7, Function 0.
  DevObj 0xffffe0000056b880  Parent FDO DevExt 0xffffe00000535ae0
  Device State = PciStarted
  Vendor ID 8086 (INTEL)  Device ID 10FB
  Subsystem Vendor ID 8086 (INTEL)  Subsystem ID 7A11
  Header Type 0, Class Base/Sub 02/00  (Network Controller/Ethernet)
  Programming Interface: 00, Revision: 01, IntPin: 01, RawLine 00
  Possible Decodes ((cmd & 7) = 7): BMI
  Capabilities: Ptr=e0, power msi msix express 
  Express capabilities: (BIOS controlled) 
  Logical Device Power State: D0
  Device Wake Level:          Unspecified
  WaitWakeIrp:                <none>
  Requirements:     Alignment Length    Minimum          Maximum
    BAR0    Mem:    00080000  00080000  0000000000000000 00000000ffffffff
    BAR2     Io:    00000020  00000020  0000000000000000 00000000ffffffff
    BAR4    Mem:    00004000  00004000  0000000000000000 00000000ffffffff
      ROM BAR:      00080000  00080000  0000000000000000 00000000ffffffff
    VF BAR0 Mem:    00080000  00080000  0000000000000000 00000000ffffffff
  Resources:        Start            Length
    BAR0    Mem:    00000000f4080000 00080000
    BAR4    Mem:    00000000f4100000 00004000
  Interrupt Requirement:
    Line Based - Min Vector = 0x0, Max Vector = 0xffffffff
    Message Based: Type - Msi-X, 0x40 messages requested
  Interrupt Resource:    Type - MSI-X, 0x13 Messages Granted


kd> !devext 0xffffe0000056a1b0  pci
PDO Extension, Bus 0x0, Device 8, Function 0.
  DevObj 0xffffe0000056a060  Parent FDO DevExt 0xffffe00000535ae0
  Device State = PciStarted
  Vendor ID 8086 (INTEL)  Device ID 10FB
  Subsystem Vendor ID 8086 (INTEL)  Subsystem ID 7A11
  Header Type 0, Class Base/Sub 02/00  (Network Controller/Ethernet)
  Programming Interface: 00, Revision: 01, IntPin: 02, RawLine 00
  Possible Decodes ((cmd & 7) = 7): BMI
  Capabilities: Ptr=e0, power msi msix express 
  Express capabilities: (BIOS controlled) 
  Logical Device Power State: D0
  Device Wake Level:          Unspecified
  WaitWakeIrp:                <none>
  Requirements:     Alignment Length    Minimum          Maximum
    BAR0    Mem:    00080000  00080000  0000000000000000 00000000ffffffff
    BAR2     Io:    00000020  00000020  0000000000000000 00000000ffffffff
    BAR4    Mem:    00004000  00004000  0000000000000000 00000000ffffffff
      ROM BAR:      00080000  00080000  0000000000000000 00000000ffffffff
    VF BAR0 Mem:    00080000  00080000  0000000000000000 00000000ffffffff
  Resources:        Start            Length
    BAR0    Mem:    00000000f4200000 00080000
    BAR4    Mem:    00000000f4280000 00004000
  Interrupt Requirement:
    Line Based - Min Vector = 0x0, Max Vector = 0xffffffff
    Message Based: Type - Msi-X, 0x40 messages requested
  Interrupt Resource:    Type - MSI-X, 0x13 Messages Granted
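
One check that follows from the data above is whether the page named in Arg2 lies inside any of the memory BARs granted to the two assigned functions. The Python sketch below does only that range arithmetic, using the `Resources` values from the two `!devext` listings and assuming 4 KiB pages; the names `BARS` and `containing_bar` are hypothetical.

```python
# Granted memory resources copied from the two !devext listings above:
# (label, start address, length in bytes).
BARS = [
    ("device 7 BAR0", 0xF4080000, 0x00080000),
    ("device 7 BAR4", 0xF4100000, 0x00004000),
    ("device 8 BAR0", 0xF4200000, 0x00080000),
    ("device 8 BAR4", 0xF4280000, 0x00004000),
]


def containing_bar(addr, bars=BARS):
    """Return the label of the BAR whose range contains addr, or None."""
    for label, start, length in bars:
        if start <= addr < start + length:
            return label
    return None


# Assumption: 4 KiB pages, so PFN 0x54445 starts at 0x54445 * 0x1000.
faulting_addr = 0x54445 * 0x1000
print(hex(faulting_addr), containing_bar(faulting_addr))
```

With these inputs the address 0x54445000 is not inside any of the four granted ranges. That is only a range comparison and does not by itself identify the root cause.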
Comment 17 Alex Williamson 2014-05-12 17:39:45 EDT
(In reply to Yan Vugenfirer from comment #15)
> The crash is when the driver is trying to map IO range with MmMapIoSpace
> 
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff554618(v=vs.85).
> aspx
> 
> According to the parameters of the BSOD:
> MEMORY_MANAGEMENT (1a) (
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff557391(v=vs.85).
> aspx )
> 
> Arg1: 0000000000001233: A driver tried to map a physical memory page that
> was not locked. This is illegal because the contents or attributes of the
> page can change at any time. This is a bug in the code that made the mapping
> call. Parameter 2 is the page frame number of the physical page that the
> driver attempted to map.

This sounds like a driver bug.  Intel has told us in the past that they don't support assignment of PFs that support SR-IOV.  Does the BSOD go away if only function 0 of the PF is assigned, or if both functions are assigned with the guest function number matching the host function?  For example:

-device pci-assign,host=05:00.0,multifunction=on,addr=6.0,id=PF-1 \
-device pci-assign,host=05:00.1,addr=6.1,id=PF-2
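
The pattern in the two options above (same guest slot, guest function number equal to the host function number, `multifunction=on` on function 0) can be generated mechanically. The Python sketch below is a hypothetical helper, not part of qemu-kvm; it uses only the `pci-assign` properties that appear in this comment (`host`, `multifunction`, `addr`, `id`).

```python
def pci_assign_args(host_functions, guest_slot):
    """Build -device arguments that keep guest function numbers equal to host ones.

    host_functions: host addresses such as ["05:00.0", "05:00.1"].
    guest_slot: guest PCI slot number shared by all functions (6 in the example above).
    """
    args = []
    for index, host in enumerate(host_functions, start=1):
        # The digit after the final '.' is the host function number (0-7).
        function = int(host.rsplit(".", 1)[1], 16)
        opts = ["pci-assign", "host=%s" % host]
        # Function 0 must advertise multifunction when more functions share the slot.
        if function == 0 and len(host_functions) > 1:
            opts.append("multifunction=on")
        opts.append("addr=%x.%d" % (guest_slot, function))
        opts.append("id=PF-%d" % index)
        args += ["-device", ",".join(opts)]
    return args


cmdline = pci_assign_args(["05:00.0", "05:00.1"], guest_slot=6)
print(" ".join(cmdline))
```

For the two host functions in this report and guest slot 6, the helper reproduces the two `-device` options shown above.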
Comment 18 Chao Yang 2014-05-13 05:59:15 EDT
I am not able to reproduce this bug on an Intel system with an Intel Corporation 82599ES 10-Gigabit adapter, using the latest qemu-kvm, kernel, and Windows driver for the 82599. The only difference is that the guest Device Manager displays the devices as "Intel(R) Ethernet Server Adapter X520-2" and "Intel(R) Ethernet Server Adapter X520-2 #2".

Packages tested:
qemu-kvm-0.12.1.2-2.425.el6.x86_64
2.6.32-464.el6.x86_64

Driver version:
Operating Systems: Windows Server 2012 R2*
Date: 2014/04/10
Version: 19.1 

CLI:
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu host -enable-kvm -m 4096 -realtime mlock=off -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -nodefaults -drive file=en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=win2012r2.qcow2,if=none,id=drive-ide-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide-disk0,id=ide-disk0,bootindex=1 -netdev tap,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:1a:4a:42:48:cd,bus=pci.0 -k en-us -vga cirrus -vnc :1 -monitor stdio -boot menu=on -device pci-assign,host=05:00.0,id=pf-1 -device pci-assign,host=05:00.1,id=pf-2
Comment 19 Alex Williamson 2014-05-13 17:29:06 EDT
Marking this closed, then; the fix might have come from the Intel driver.  X520 is the new Intel marketing name for the 82599.
