Bug 1432567 - [virtio-win][vioscsi] Crash dump not generated with num_queues=4
Summary: [virtio-win][vioscsi] Crash dump not generated with num_queues=4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ladi Prosek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-15 17:03 UTC by Ladi Prosek
Modified: 2019-02-19 10:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Too many queues are allocated in dump mode. Consequence: This results in attempts to allocate an excessive amount of memory and in VioScsiFindAdapter failing. Fix: Allocate only one virtqueue in dump mode. Result: A memory dump can be generated.
Clone Of:
Environment:
Last Closed: 2017-08-01 12:58:08 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHBA-2017:2341
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: virtio-win bug fix and enhancement update
Last Updated: 2017-08-01 16:52:38 UTC

Description Ladi Prosek 2017-03-15 17:03:16 UTC
Description of problem:
Vioscsi.sys allocates memory for all available queues even when running in crash dump mode where memory is extremely scarce.

Version-Release number of selected component (if applicable):
virtio-win pre-whql 134

How reproducible:
100%

Steps to Reproduce:
1. Launch a Win7 64-bit VM with
-smp 4,sockets=1,cores=4,threads=1
and
-device virtio-scsi-pci,num_queues=4
and the system disk attached to it.

2. Cause a BSOD, for example with the NotMyFault tool.

Actual results:
Crash dump is not generated.

Expected results:
Crash dump is generated.

Comment 2 Vadim Rozenfeld 2017-03-16 03:00:18 UTC
Looks like regression introduced by
commit 4bf912c46cf3519534ed83a7f22d8c6a4adb211f
Author: Julius Rus <iuliur>
Date:   Wed Feb 1 14:22:40 2017 -0800
    Fix bluescreen when num_queues > num_cpus.

Can you please confirm that build 131 has no such problem?

Thanks,
Vadim.

Comment 4 Ladi Prosek 2017-03-16 08:01:56 UTC
(In reply to Vadim Rozenfeld from comment #2)
> Looks like regression introduced by
> commit 4bf912c46cf3519534ed83a7f22d8c6a4adb211f
> Author: Julius Rus <iuliur>
> Date:   Wed Feb 1 14:22:40 2017 -0800
>     Fix bluescreen when num_queues > num_cpus.
> 
> Can you please confirm that build 131 has no such problem?

131 has the same problem. This was introduced in

  commit 7cfc971d62a361796d497adbb5c7c3bb7cac635a
  Author: Vadim Rozenfeld <vrozenfe>
  Date:   Tue Jun 28 16:31:55 2016 +1000

      [vioscsi] fix regression in Hot-Add Device HCK test


So it is a 7.2 -> 7.3 regression.

Comment 6 Peixiu Hou 2017-03-20 03:14:37 UTC
Hi Ladi,

Could you please check my reproduction steps? I cannot reproduce this issue with virtio-win-prewhql-134 or 131; the crash dump is generated.

Steps:
1. Launch a Win7 64-bit VM with qemu cli:

/usr/libexec/qemu-kvm -name win7-64 -enable-kvm -m 3G \
-smp 4,sockets=1,cores=4,threads=1 -cpu SandyBridge \
-uuid ea78071a-f6e4-4347-8077-9cb9f7953a84 \
-nodefconfig --nodefaults -boot order=cd,menu=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=isa_serial0 \
-device usb-tablet,id=input0 \
-device virtio-scsi-pci,id=scsi0,num_queues=4 \
-drive file=133QSRWIN764BWA,if=none,id=drive-scsi-disk0,format=raw,serial=22,cache=none \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi-disk0,id=scsi-disk0 \
-drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,media=cdrom,id=cdrom,if=none \
-device ide-drive,drive=cdrom,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,vhostforce=off \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:83:66:77:88:66,bus=pci.0,addr=0x3,status=on \
-vnc 0.0.0.0:1 -vga std -monitor stdio -qmp tcp:0:4446,server,nowait

2. Cause a BSOD with the NotMyFault tool.

Used version:
kernel-3.10.0-606.el7.x86_64
qemu-kvm-rhev-2.8.0-6.el7.x86_64
seabios-1.10.2-1.el7.x86_64
virtio-win-prewhql-134 & 131


Best Regards~
Peixiu

Comment 7 Ladi Prosek 2017-03-20 12:39:47 UTC
Hi Peixiu Hou,

(In reply to Peixiu Hou from comment #6)
> Hi Ladi,
> 
> Could you please help check my reproduced steps? I cannot reproduce this
> issue with virtio-win-prewhql-134 & 131. The Crash dump is generated.

Apologies for incomplete/incorrect repro steps. Turns out my test VM was running with disable-modern=true and modern virtio needs much less physically contiguous memory.

Can you please try it with:

-device virtio-scsi-pci,id=scsi0,num_queues=12
(and 12 vcpus)

or

-device virtio-scsi-pci,id=scsi0,num_queues=4,disable-modern=true
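
The two configurations above can be sketched as a small helper that builds the relevant QEMU arguments. This is only an illustration: the function name virtio_scsi_repro_args is hypothetical, the option strings are copied from this report, and the rest of the command line (disk, network, display) is assumed to come from elsewhere, for example comment #6.

```python
def virtio_scsi_repro_args(num_queues, vcpus, disable_modern=False):
    """Build the -smp and virtio-scsi -device arguments used in this report.

    Hypothetical helper: it only formats option strings quoted in the bug;
    it does not start QEMU or validate the values.
    """
    device = f"virtio-scsi-pci,id=scsi0,num_queues={num_queues}"
    if disable_modern:
        # Force the legacy (pre-1.0) virtio transport, as in the second variant.
        device += ",disable-modern=true"
    return [
        "-smp", f"{vcpus},sockets=1,cores={vcpus},threads=1",
        "-device", device,
    ]


# Variant 1: 12 queues with 12 vCPUs.
variant_many_queues = virtio_scsi_repro_args(num_queues=12, vcpus=12)

# Variant 2: 4 queues with 4 vCPUs, legacy virtio forced.
variant_legacy = virtio_scsi_repro_args(num_queues=4, vcpus=4, disable_modern=True)
```

Either list can be appended to an existing argument list before launching the VM.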

Thanks!
Ladi

Comment 8 Peixiu Hou 2017-03-21 07:39:35 UTC
Hi Ladi,

Tried with "-device virtio-scsi-pci,id=scsi0,num_queues=12 (and 12 vcpus)", reproduced this issue.
Tried with "-device virtio-scsi-pci,id=scsi0,num_queues=4,disable-modern=true", reproduced this issue.

Thanks a lot~
Peixiu Hou

Comment 9 peliu@redhat.com 2017-04-01 08:50:36 UTC
Reproduced this issue on virtio-win-prewhql-131 and 134.
Verified the fix on virtio-win-prewhql-135.

Steps: same as comment #7.
Actual results:
On virtio-win-prewhql-134 and 131 (unfixed versions), the crash dump is not generated.
On virtio-win-prewhql-135 (fixed version), the crash dump is generated (expected result).

So this issue has been fixed, thanks.

Version-Release number of selected component 
kernel-3.10.0-634.el7.x86_6
qemu-kvm-rhev-2.8.0-6.el7.x86_64
seabios-1.10.2-1.el7.x86_64

Comment 10 lijin 2017-04-01 08:56:36 UTC
Changing status to VERIFIED according to comment #9.

Comment 13 errata-xmlrpc 2017-08-01 12:58:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2341

Comment 14 Peixiu Hou 2019-02-18 06:51:18 UTC
Hi Vadim,

We're reviewing the test case, and I have some questions about this issue.
We have a test case for this issue, with the following steps:

1. Start VM with virtio-scsi-pci (system disk)
-object iothread,id=iothread0 \
-device virtio-scsi-pci,id=scsi0,iothread=iothread0 \
-drive file=OS.raw,if=none,id=drive-scsi-disk0,format=raw,serial=22,cache=none \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi-disk0,id=scsi-disk0,share-rw=on \

2. Check whether the verifier is enabled for vioscsi.sys in the guest:
    #verifier /querysettings (run as administrator)
    If not, enable the verifier for vioscsi.sys:
    #verifier.exe /standard /driver vioscsi.sys  (run as administrator)
    then reboot the guest and recheck:
    #verifier /querysettings

3. Cause a BSOD with the NotMyFault tool or an NMI.
1). For NotMyFault: open the tool and click any crash type to trigger a BSOD.
2). For NMI: create an NMICrashDump DWORD value under the registry key

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashContro
Set NMICrashDump = 1, then reboot the vm

Open QMP monitor 
# telnet $HostIP $Port
{"execute":"qmp_capabilities"}
{"execute":"inject-nmi"}

4. Quit the VM and boot it again, then check for the dump file (MEMORY.DMP by default) in the C:\Windows directory
5. Shut down the VM and restart it with virtio-scsi-pci (system disk) and num_queues=12
CLI:
-smp 12,sockets=1,cores=12,threads=1 \
-device virtio-scsi-pci,id=scsi0,num_queues=12 -drive file=OS.raw,if=none,id=drive-scsi-disk0,format=raw,serial=22,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi-disk0,id=scsi-disk0
6. Repeat steps 3-4.
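
The QMP exchange in step 3 can also be scripted instead of typed into telnet. The following is a minimal sketch, assuming the VM exposes QMP on a TCP socket as in comment #6 (-qmp tcp:0:4446,server,nowait). The function names are hypothetical; the only QMP commands used are the two shown in the steps above.

```python
import json
import socket


def qmp_command(sock, reader, name):
    """Send one QMP command and return its "return" payload.

    Asynchronous events that arrive before the reply are skipped.
    """
    sock.sendall((json.dumps({"execute": name}) + "\n").encode("utf-8"))
    while True:
        line = reader.readline()
        if line == "":
            raise ConnectionError("QMP connection closed before a reply arrived")
        if not line.strip():
            continue
        reply = json.loads(line)
        if "event" in reply:
            continue  # not the answer to our command
        if "error" in reply:
            raise RuntimeError(reply["error"].get("desc", "QMP error"))
        return reply["return"]


def inject_nmi_over(sock):
    """Run the greeting, qmp_capabilities and inject-nmi on a connected socket."""
    reader = sock.makefile("r", encoding="utf-8")
    greeting = json.loads(reader.readline())
    if "QMP" not in greeting:
        raise RuntimeError("unexpected QMP greeting")
    qmp_command(sock, reader, "qmp_capabilities")  # leave capabilities negotiation mode
    qmp_command(sock, reader, "inject-nmi")        # deliver the NMI to the guest


def inject_nmi(host, port, timeout=10.0):
    """Connect to a QMP TCP socket and inject an NMI."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        inject_nmi_over(sock)
```

With the command line from comment #6, a call such as inject_nmi("127.0.0.1", 4446) run on the host would replace the manual telnet session; host and port are whatever the VM was started with.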

The num_queues=12 in step 5 comes from comment #7. I am confused about why num_queues=12 was chosen to reproduce this issue: does the value have any special meaning? Do the case steps need to be adjusted? If so, could you please give some advice?

Thanks a lot~
Peixiu

Comment 15 Vadim Rozenfeld 2019-02-19 03:08:05 UTC
(In reply to Peixiu Hou from comment #14)

Windows needs to pre-allocate a certain amount of physically contiguous memory to build
the virtio queues on. The problem is that it has to do this not only at boot
time but also when a crash happens. The number of pages that Windows can allocate as a
contiguous region is quite limited. In build 134 the vioscsi driver tried to
allocate memory for, and create, as many virtio queues as specified in num_queues even when
running in the dump stack. 12 queues need a fairly large physically contiguous memory region,
which is not always available. When the vioscsi driver fails to allocate enough memory it crashes
itself, which makes it impossible to create a valid dump file. IIRC the problem described in
this bug was a temporary regression, and it should be fixed now.
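
To give a rough feel for the sizes involved, here is a sketch that applies the legacy split-virtqueue size formula from the virtio specification to a virtio-scsi device, which has a control queue, an event queue, and num_queues request queues. The queue size of 128 entries is an assumption, not a value taken from this bug, and the driver allocates further per-queue data, so treat the numbers as an illustration of how the requirement scales rather than as the driver's actual allocation.

```python
DESC_SIZE = 16        # bytes per descriptor table entry
USED_ELEM_SIZE = 8    # bytes per used ring element
PAGE_SIZE = 4096      # legacy virtqueue alignment


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def legacy_virtq_size(queue_size, alignment=PAGE_SIZE):
    """Bytes of contiguous memory for one legacy split virtqueue."""
    # Descriptor table plus available ring (flags, idx, ring[], used_event).
    desc_and_avail = DESC_SIZE * queue_size + 2 * (3 + queue_size)
    # Used ring (flags, idx, avail_event) plus its elements.
    used = 2 * 3 + USED_ELEM_SIZE * queue_size
    return align_up(desc_and_avail, alignment) + align_up(used, alignment)


def vioscsi_ring_bytes(num_queues, queue_size=128):
    """Ring memory for control + event + num_queues request queues.

    queue_size=128 is an assumed value for illustration only.
    """
    return (2 + num_queues) * legacy_virtq_size(queue_size)


for queues in (1, 4, 12):
    print(queues, "request queue(s):", vioscsi_ring_bytes(queues), "bytes")
```

Under these assumptions, 12 request queues need 114688 bytes of ring memory, 4 need 49152 bytes, and a single request queue needs 24576 bytes.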

Best regards,
Vadim.

Comment 16 Peixiu Hou 2019-02-19 10:23:18 UTC
(In reply to Vadim Rozenfeld from comment #15)

Got it, thanks a lot~

