Bug 1083873

Summary: virtio(netkvm.sys) BSOD when try to Hibernate on winxp guest
Product: Red Hat Enterprise Linux 7
Reporter: Lulin Fan <f_ella>
Component: virtio-win
Assignee: Yvugenfi <yvugenfi>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 7.0
CC: acathrow, bcao, bsarathy, f_ella, juzhang, lijin, mdeng, michen, rhod, virt-maint, vrozenfe, yvugenfi
Target Milestone: rc
Target Release: 7.0
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version: virtio-win-prewwhql-0.1-79
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-07-25 07:37:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                        Flags
memory dump                        none
memory dump2                       none
BSOD when resume from hibernate    none

Description Lulin Fan 2014-04-03 06:51:09 UTC
Description of problem:

A Windows XP guest with netkvm installed hits a BSOD when it tries to hibernate.
A Windows 7 guest works correctly.

Version-Release number of selected component (if applicable):

kernel: 2.6.32-431.5.1.el6.x86_64
qemu-kvm: 1.4.2
virtio-driver: spice-guest-tools-0.74.exe from spice website, driver version is 62.65.104.7400(2013-1-20)

How reproducible:
100%

Steps to Reproduce:
1. Install an XP guest with the virtio network driver.
2. Try to hibernate the guest.
3. The guest hits a BSOD.

Actual results:
The guest hits a BSOD.

Expected results:
The guest hibernates without error.

Additional info:
0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: ba9df971, The address that the exception occurred at
Arg3: bace7c84, Exception Record Address
Arg4: bace7980, Context Record Address

Debugging Details:
------------------

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************

ADDITIONAL_DEBUG_TEXT:  
Use '!findthebuild' command to search for the target build information.
If the build information is available, run '!findthebuild -s ; .reload' to set symbol path and load symbols.

MODULE_NAME: netkvm

FAULTING_MODULE: 804d8000 nt

DEBUG_FLR_IMAGE_TIMESTAMP:  528c7566

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - "0x%08lx"

FAULTING_IP: 
netkvm+7971
ba9df971 8b4804          mov     ecx,dword ptr [eax+4]

EXCEPTION_RECORD:  bace7c84 -- (.exr 0xffffffffbace7c84)
ExceptionAddress: ba9df971 (netkvm+0x00007971)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 00000004
Attempt to read from address 00000004

CONTEXT:  bace7980 -- (.cxr 0xffffffffbace7980)
eax=00000000 ebx=806e7900 ecx=00000000 edx=00000000 esi=896b1cd4 edi=896b1a20
eip=ba9df971 esp=bace7d4c ebp=bace7d58 iopl=0         nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010246
netkvm+0x7971:
ba9df971 8b4804          mov     ecx,dword ptr [eax+4] ds:0023:00000004=????????
Resetting default scope

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0x7E

LAST_CONTROL_TRANSFER:  from ba9da91e to ba9df971

STACK_TEXT:  
WARNING: Stack unwind information not available. Following frames may be wrong.
bace7d58 ba9da91e 896b1a20 89635fd8 80565820 netkvm+0x7971
bace7d6c ba5fdbfe 89635fd0 89635fd0 bace7dac netkvm+0x291e
bace7d7c 8053976d 89635fd0 00000000 89a314a8 NDIS!NdisFreeToBlockPool+0x1658
bace7dac 805d0f64 89635fd0 00000000 00000000 nt!ExQueueWorkItem+0x1a3
bace7ddc 805470de 8053967e 00000000 00000000 nt!PsRemoveCreateThreadNotifyRoutine+0x214
00000000 00000000 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x72e


FOLLOWUP_IP: 
netkvm+7971
ba9df971 8b4804          mov     ecx,dword ptr [eax+4]

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  netkvm+7971

FOLLOWUP_NAME:  MachineOwner

IMAGE_NAME:  netkvm.sys

STACK_COMMAND:  .cxr 0xffffffffbace7980 ; kb

BUCKET_ID:  WRONG_SYMBOLS

Followup: MachineOwner
---------

Comment 2 Mike Cao 2014-04-03 06:58:54 UTC
The officially supported netkvm driver for winxp is virtio-win-prewhql-49. Please retest with virtio-win-prewhql-49 or the latest virtio-win package.

Mike

Comment 3 Lulin Fan 2014-04-03 07:26:10 UTC
From http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/ I can only get 0.1-74, which also has this issue.

One correction to my original post: the driver version is 51.65.104.7400, dated 11/20/2013.

(In reply to Mike Cao from comment #2)
> The official netkvm driver support for winxp is virtio-win-prewhql-49 ,Pls
> retest it w/ virtio-win-prewhql-49 or latest virtio-win package.
> 
> Mike

Comment 4 Yvugenfi@redhat.com 2014-04-03 08:00:50 UTC
Hi,

Can you upload a kernel memory dump somewhere? Please archive it before uploading.

Thanks,
Yan.

Comment 5 Lulin Fan 2014-04-03 12:33:12 UTC
Created attachment 882249 [details]
memory dump

minidump of windows xp

Comment 6 Yvugenfi@redhat.com 2014-04-03 14:45:01 UTC
(In reply to fanlulin from comment #5)
> Created attachment 882249 [details]
> memory dump
> 
> minidump of windows xp

Looks like the file was not uploaded correctly.

Comment 7 Lulin Fan 2014-04-04 01:15:59 UTC
Created attachment 882491 [details]
memory dump2

XP dump again.
------------------
I didn't notice the first one was broken.

Comment 8 Yvugenfi@redhat.com 2014-04-07 16:08:45 UTC
The crash is in the function ParaNdis_PowerOff(PARANDIS_ADAPTER *pContext), when we try to shut down the control queue. It looks like the control queue was never initialised in the first place.
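
The register state in the original dump can be checked against this reading. The following is only a minimal sketch in Python that redoes the address arithmetic from the values printed by `!analyze -v`; the helper names are made up for illustration, and it does not touch the driver source or prove which structure field was being read.

```python
# Values copied from the CONTEXT / EXCEPTION_RECORD / STACK_TEXT sections
# of the dump in the description. Helper names are hypothetical.
EAX = 0x00000000           # base register of the faulting load
DISPLACEMENT = 0x4         # from "mov ecx,dword ptr [eax+4]"
FAULT_IP = 0xBA9DF971      # ExceptionAddress, shown as netkvm+0x7971
FAULT_OFFSET = 0x7971
CALLER_IP = 0xBA9DA91E     # return address, shown as netkvm+0x291e
CALLER_OFFSET = 0x291E


def effective_address(base: int, displacement: int) -> int:
    """Address read by a 32-bit `[base + displacement]` memory operand."""
    return (base + displacement) & 0xFFFFFFFF


def module_base(instruction_pointer: int, offset: int) -> int:
    """Load address of a module, given an address inside it and its offset."""
    return (instruction_pointer - offset) & 0xFFFFFFFF


fault_address = effective_address(EAX, DISPLACEMENT)
netkvm_base = module_base(FAULT_IP, FAULT_OFFSET)

print(f"faulting read address: {fault_address:#010x}")
print(f"netkvm.sys load base:  {netkvm_base:#010x}")
print("base pointer is NULL:", EAX == 0)
print("caller frame in same module:",
      module_base(CALLER_IP, CALLER_OFFSET) == netkvm_base)
```

The computed read address matches the "Attempt to read from address 00000004" line, and both netkvm frames resolve to the same load base. That is consistent with a NULL pointer being dereferenced at a field offset of 4 inside netkvm.sys, which is what an uninitialised control queue pointer would produce.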

Comment 10 Yvugenfi@redhat.com 2014-04-07 16:35:19 UTC
Hi,

Can you try this engineering build and see if it solves the problem?

https://www.dropbox.com/s/wkfamji1tt6u6wv/XP.zip

Thanks,
Yan.

Comment 11 Lulin Fan 2014-04-08 02:10:21 UTC
Hi,
I have tried this build and can confirm that the original issue is fixed.
However, in 4 attempts I got a BSOD once when resuming from hibernation, and I am not sure whether it comes from the new driver.

I also noticed that the NIC speed is shown as 1.4Gbps after installing the new driver.

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KERNEL_STACK_INPAGE_ERROR (77)
The requested page of kernel data could not be read in.  Caused by
bad block in paging file or disk controller error.
In the case when the first arguments is 0 or 1, the stack signature
in the kernel stack was not found.  Again, bad hardware.
An I/O status of c000009c (STATUS_DEVICE_DATA_ERROR) or
C000016AL (STATUS_DISK_OPERATION_FAILED)  normally indicates
the data could not be read from the disk due to a bad
block.  Upon reboot autocheck will run and attempt to map out the bad
sector.  If the status is C0000185 (STATUS_IO_DEVICE_ERROR) and the paging
file is on a SCSI disk device, then the cabling and termination should be
checked.  See the knowledge base article on SCSI termination.
Arguments:
Arg1: c000000e, status code
Arg2: c000000e, i/o status code
Arg3: 00000000, page file number
Arg4: 007c2000, offset into page file

Debugging Details:
------------------


ERROR_CODE: (NTSTATUS) 0xc000000e - <Unable to get error code text>

DISK_HARDWARE_ERROR: There was error with disk hardware

BUGCHECK_STR:  0x77_c000000e

DEFAULT_BUCKET_ID:  DRIVER_FAULT

PROCESS_NAME:  System

LAST_CONTROL_TRANSFER:  from 80512f79 to 804faf33

STACK_TEXT:  
bad23cdc 80512f79 00000077 c000000e c000000e nt!KeBugCheckEx+0x1b
bad23d50 80513deb c0588610 000169a1 00000001 nt!MiMakeOutswappedPageResident+0x4f5
bad23d8c 80540d76 00789a90 00000000 89a28b30 nt!MmInPageKernelStack+0x149
bad23da4 80541246 896bbe08 805d0f64 00000000 nt!KiInSwapKernelStacks+0x16
bad23dac 805d0f64 00000000 00000000 00000000 nt!KeSwapProcessOrStack+0x7c
bad23ddc 805470de 805411ca 00000000 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!MiMakeOutswappedPageResident+4f5
80512f79 cc              int     3

SYMBOL_STACK_INDEX:  1

SYMBOL_NAME:  nt!MiMakeOutswappedPageResident+4f5

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

DEBUG_FLR_IMAGE_TIMESTAMP:  4802516a

IMAGE_NAME:  memory_corruption

FAILURE_BUCKET_ID:  0x77_c000000e_nt!MiMakeOutswappedPageResident+4f5

BUCKET_ID:  0x77_c000000e_nt!MiMakeOutswappedPageResident+4f5

Followup: MachineOwner
---------
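
The debugger did not print text for status 0xc000000e above. As a rough aid, here is a small Python sketch that splits an NTSTATUS value into its fields, assuming the standard layout (2-bit severity, customer bit, reserved bit, 12-bit facility, 16-bit code). The lookup table is a hand-written subset for this bug, not an API: the disk-related names are the ones quoted in the bugcheck text, and the other two are the usual ntstatus.h names for those values.

```python
from typing import NamedTuple

SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# Hand-written subset covering only the values that appear in this bug.
KNOWN_STATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000E: "STATUS_NO_SUCH_DEVICE",
    0xC000009C: "STATUS_DEVICE_DATA_ERROR",
    0xC000016A: "STATUS_DISK_OPERATION_FAILED",
    0xC0000185: "STATUS_IO_DEVICE_ERROR",
}


class NtStatus(NamedTuple):
    severity: str
    customer: bool
    facility: int
    code: int


def decode_ntstatus(value: int) -> NtStatus:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    value &= 0xFFFFFFFF
    return NtStatus(
        severity=SEVERITY_NAMES[value >> 30],   # bits 31-30
        customer=bool((value >> 29) & 1),       # bit 29
        facility=(value >> 16) & 0xFFF,         # bits 27-16
        code=value & 0xFFFF,                    # bits 15-0
    )


for status in (0xC0000005, 0xC000000E):
    name = KNOWN_STATUS_NAMES.get(status, "<unknown>")
    print(f"{status:#010x} {name}: {decode_ntstatus(status)}")
```

Read this way, the I/O status in the 0x77 bugcheck is an error reported while paging a kernel stack back in from the page file, which suggests the disk path rather than netkvm.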



(In reply to Yan Vugenfirer from comment #10)
> Hi,
> 
> Can you try this engineering build and see if solves the problem?
> 
> https://www.dropbox.com/s/wkfamji1tt6u6wv/XP.zip
> 
> Thanks,
> Yan.

Comment 12 Lulin Fan 2014-04-08 02:13:22 UTC
Created attachment 883819 [details]
BSOD when resume from hibernate

memory dump of BSOD when resume from hibernate

Comment 13 Yvugenfi@redhat.com 2014-04-08 06:55:29 UTC
Related bug: BZ #957507

Comment 14 Mike Cao 2014-04-08 06:57:11 UTC
(In reply to Yan Vugenfirer from comment #13)
> Related bug: BZ #957507

Hi, Yan
Do you plan to add the patch to the prewhql build?

Thanks,
Mike

Comment 15 Yvugenfi@redhat.com 2014-04-08 06:58:29 UTC
1. The new crash dump is not related to the network driver. It is worth opening a new bug (it could be a storage controller issue).


2. You say the speed is now 1.4Gbps. What was it before? Are you talking about actual NIC throughput or the number you see in the GUI?

Thanks,
Yan.

Comment 17 Lulin Fan 2014-04-08 09:06:51 UTC
1. I will do more testing to confirm it.

2. Only the Windows GUI shows the speed as 1.4Gbps; before it showed 1Gbps. I didn't test the actual NIC throughput.

Thanks
Fan

(In reply to Yan Vugenfirer from comment #15)
> 1. The new crash dump is not related to network driver. It is worth to open
> a new bug (could be some storage controller issue).
> 
> 
> 2. You are saying the speed now is 1.4Gbps. What was it before? Are you
> talking about actual NIC throughput or number you see in the GUI?
> 
> Thanks,
> Yan.

Comment 26 Mike Cao 2014-07-22 01:32:10 UTC
lijin, please help verify this bug.

Comment 29 lijin 2014-07-25 07:31:33 UTC
Reproduced this issue on virtio-win-prewhql-74
Verified this issue on virtio-win-prewhql-87

kernel-2.6.32-491.el6.x86_64
qemu-kvm: 1.4.2

Steps:
1. Boot the winxp guest with:
qemu-system-x86_64 \
    -drive file=winxp.qcow2,if=none,cache=unsafe,media=disk,format=qcow2,id=drive-ide0-0-1 \
    -device ide-drive,id=ide1,drive=drive-ide0-0-1,bus=ide.1 \
    -monitor stdio \
    -usb -device usb-tablet \
    -boot menu=on \
    -chardev file,path=/root/console.log,id=serial1 \
    -device isa-serial,chardev=serial1,id=s1 \
    -cpu Nehalem,hv_relaxed \
    -smp 2 -m 2G \
    -enable-kvm \
    -global PIIX4_PM.disable_s3=0 \
    -global PIIX4_PM.disable_s4=0 \
    -netdev tap,script=/etc/qemu-ifup,id=tap1 \
    -device virtio-net-pci,netdev=tap1,id=net1 \
    -cdrom virtio-win-0.1-74.iso \
    -vga cirrus \
    -vnc :2
2. Do S4 (hibernate) in the guest.

Actual Results:
On virtio-win-prewhql-74, the guest hits a BSOD with code "7e".
On virtio-win-prewhql-87, the guest can hibernate and resume correctly.

Based on the above, this issue has already been fixed.

Comment 30 Mike Cao 2014-07-25 07:37:11 UTC
Need to highlight that we can *not* reproduce this issue on RHEL6 qemu-kvm.
We used upstream v1.4.2 for bug verification.

Since RHEL users will not be affected, closing this bug.

Comment 31 lijin 2014-11-06 03:33:49 UTC
A winxp guest can do S4 correctly on a rhel7.1 host.

package info:
qemu-kvm-rhev-2.1.2-5.el7.x86_64
kernel-3.10.0-196.el7.x86_64
virtio-win-1.7.2-1.el7
seabios-1.7.5-5.el7.x86_64