Bug 1083860

Summary: kernel panic when virtscsi_init fails
Product: Red Hat Enterprise Linux 7 Reporter: FuXiangChun <xfu>
Component: kernel Assignee: Fam Zheng <famz>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1 CC: areis, bsarathy, famz, hhuang, juzhang, knoel, mazhang, michen, mkenneth, pbonzini, qzhang, rbalakri, sluo, virt-bugs, virt-maint, vrozenfe, xfu
Target Milestone: pre-dev-freeze   
Target Release: 7.1   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-3.10.0-152.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-05 11:48:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description FuXiangChun 2014-04-03 06:09:20 UTC
Description of problem:
Booting a RHEL7 guest with a virtio-scsi-pci controller and the num_queues=2 option causes a guest kernel panic. With num_queues=1 the guest works well.
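
To check whether a given qemu-kvm command line uses the triggering configuration, the sketch below pulls num_queues out of every virtio-scsi-pci -device argument. The function name is hypothetical, and falling back to 1 when the property is absent is an assumption about the QEMU default, not something stated in this report.

```python
import shlex


def virtio_scsi_queue_counts(cmdline):
    """Return {device id: num_queues} for each virtio-scsi-pci device.

    Hypothetical helper for illustration only. A missing num_queues
    property is treated as 1 (assumed default).
    """
    args = shlex.split(cmdline)
    counts = {}
    # Walk adjacent (flag, value) pairs and keep only "-device ..." ones.
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        driver, _, rest = value.partition(",")
        if driver != "virtio-scsi-pci":
            continue
        # Properties are comma-separated key=value pairs.
        props = dict(p.split("=", 1) for p in rest.split(",") if "=" in p)
        counts[props.get("id", "")] = int(props.get("num_queues", 1))
    return counts


if __name__ == "__main__":
    cmd = ("/usr/libexec/qemu-kvm -smp 2 "
           "-device virtio-scsi-pci,id=scsi0,addr=0x13,num_queues=2")
    print(virtio_scsi_queue_counts(cmd))
```

Any controller reported with a value greater than 1 matches the configuration described here.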

Version-Release number of selected component (if applicable):
host:
2.6.32-452.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with num_queues=2:

/usr/libexec/qemu-kvm -smp 2 -drive file=/dev/sdb,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,bootindex=1,physical_block_size=512,logical_block_size=512,multifunction=on -vnc :2 -drive file=/home/ide-disk,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop,copy-on-read=off,serial=fux-ide,media=disk -device ide-drive,drive=drive-data-disk,id=system-disk,wwn=0x5000c50015ea71ad,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,ver=fuxc-ver,bus=ide.0,unit=0 -monitor stdio -drive file=/home/virtio-scsi-disk,if=none,id=drive-scsi-disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x13,vectors=512,indirect_desc=on,event_idx=off,hotplug=on,param_change=off,multifunction=on,rombar=64,num_queues=2 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk2

Actual results:

[  OK  ] Reached target System Initialization.
         Starting Show Plymouth Boot Screen...
[    1.362811] Floppy drive(s): fd0 is 1.44M
[    1.374408] ACPI: bus type ATA registered
[    1.376299] FDC 0 is a S82078B
[    1.392182] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
[    1.394437] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[    1.396708] [drm] Initialized drm 1.1.0 20060810
[    1.404291] 8139cp 0000:00:03.0 eth0: RTL-8139C+ at 0xffffc90000032000, 52:54:00:12:34:56, IRQ 11
[    1.422995] scsi0 : ata_piix
[    1.424998] scsi1 : ata_piix
[    1.426439] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc000 irq 14
[    1.428509] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc008 irq 15
[    1.450258] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[    1.453550] IP: [<ffffffffa011ddb0>] __virtscsi_set_affinity+0x60/0x140 [virtio_scsi]
[    1.453550] PGD 0 
[    1.453550] Oops: 0000 [#1] SMP 
[    1.453550] Modules linked in: sysfillrect(+) virtio_scsi(+) sysimgblt virtio_blk(+) drm_kms_helper ttm ata_piix 8139cp virtio_pci virtio_ring mii virtio drm i2c_core libata floppy dm_mirror dm_region_hash dm_log dm_mod
[    1.453550] CPU: 1 PID: 218 Comm: systemd-udevd Not tainted 3.10.0-118.el7.x86_64 #1
[    1.453550] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[    1.453550] task: ffff8800039adb00 ti: ffff880003a14000 task.ti: ffff880003a14000
[    1.453550] RIP: 0010:[<ffffffffa011ddb0>]  [<ffffffffa011ddb0>] __virtscsi_set_affinity+0x60/0x140 [virtio_scsi]
[    1.453550] RSP: 0018:ffff880003a15b38  EFLAGS: 00010216
[    1.453550] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000002
[    1.453550] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
[    1.453550] RBP: ffff880003a15b58 R08: 0000000000000002 R09: 0000000000000000
[    1.453550] R10: ffffffff819f40a0 R11: ffffffffa011e50c R12: ffff8800001ab730
[    1.453550] R13: ffffffff819f40a0 R14: 0000000000000000 R15: ffff8800074f0ec0
[    1.453550] FS:  00007f9b142ff880(0000) GS:ffff880007d00000(0000) knlGS:0000000000000000
[    1.453550] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.453550] CR2: 0000000000000020 CR3: 0000000003adc000 CR4: 00000000000006e0
[    1.453550] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.453550] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.453550] Stack:
[    1.453550]  ffff8800039be400 ffff8800001ab730 0000000000000004 ffff8800074f0ee0
[    1.453550]  ffff880003a15b78 ffffffffa011debc ffff8800001ab730 ffff8800039be400
[    1.453550]  ffff880003a15bd8 ffffffffa011e514 ffff8800074f0f00 00000000fffffffe
[    1.453550] Call Trace:
[    1.453550]  [<ffffffffa011debc>] virtscsi_remove_vqs+0x2c/0x50 [virtio_scsi]
[    1.453550]  [<ffffffffa011e514>] virtscsi_init+0x134/0x2a0 [virtio_scsi]
[    1.453550]  [<ffffffffa011e7ef>] virtscsi_probe+0xef/0x27c [virtio_scsi]
[    1.453550]  [<ffffffffa01177c0>] ? vp_reset+0x90/0x90 [virtio_pci]
[    1.453550]  [<ffffffffa00221d2>] virtio_dev_probe+0xe2/0x150 [virtio]
[    1.453550]  [<ffffffff813b6997>] driver_probe_device+0x87/0x390
[    1.453550]  [<ffffffff813b6d73>] __driver_attach+0x93/0xa0
[    1.453550]  [<ffffffff813b6ce0>] ? __device_attach+0x40/0x40
[    1.453550]  [<ffffffff813b4723>] bus_for_each_dev+0x73/0xc0
[    1.453550]  [<ffffffff813b63ee>] driver_attach+0x1e/0x20
[    1.453550]  [<ffffffff813b5f40>] bus_add_driver+0x200/0x2d0
[    1.453550]  [<ffffffffa0123000>] ? 0xffffffffa0122fff
[    1.453550]  [<ffffffff813b73f4>] driver_register+0x64/0xf0
[    1.453550]  [<ffffffffa0123000>] ? 0xffffffffa0122fff
[    1.453550]  [<ffffffffa0022540>] register_virtio_driver+0x20/0x30 [virtio]
[    1.453550]  [<ffffffffa0123085>] init+0x85/0x1000 [virtio_scsi]
[    1.453550]  [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
[    1.453550]  [<ffffffff810ca73b>] load_module+0x129b/0x1a90
[    1.453550]  [<ffffffff812da460>] ? ddebug_proc_write+0xf0/0xf0
[    1.453550]  [<ffffffff810c7073>] ? copy_module_from_fd.isra.43+0x53/0x150
[    1.453550]  [<ffffffff810cb0e6>] SyS_finit_module+0xa6/0xd0
[    1.453550]  [<ffffffff815fc899>] system_call_fastpath+0x16/0x1b
[    1.453550] Code: e1 39 c3 74 7e 45 84 f6 75 61 41 8b 84 24 c8 01 00 00 31 db 85 c0 74 3b 0f 1f 00 48 63 c3 48 83 c0 20 48 c1 e0 04 49 8b 7c 04 10 <48> 8b 47 20 48 8b 80 a8 02 00 00 48 8b 40 50 48 85 c0 74 07 be 
[    1.453550] RIP  [<ffffffffa011ddb0>] __virtscsi_set_affinity+0x60/0x140 [virtio_scsi]
[    1.453550]  RSP <ffff880003a15b38>
[    1.453550] CR2: 0000000000000020
[    1.580940] ---[ end trace d28de2f963f9e7de ]---
[    1.582716] Kernel panic - not syncing: Fatal exception
[    1.584417] ata1.01: NODEV after polling detection
[    1.584872] ata1.00: ATA-7: QEMU HARDDISK, fuxc-ver, max UDMA/100
[    1.584874] ata1.00: 4194304 sectors, multi 16: LBA48 
[    1.585589] ata1.00: configured for MWDMA2
[    1.585720] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    fuxc PQ: 0 ANSI: 5
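
The faulting function and the chain of callers can be pulled out of an oops like the one above mechanically. The following is a minimal sketch, assuming the frame layout seen in this log (`[<address>] symbol+0xoff/0xlen [module]`); the function name and regular expression are illustrative and not part of any kernel tooling.

```python
import re

# Matches symbolized frames such as
#   [<ffffffffa011debc>] virtscsi_remove_vqs+0x2c/0x50 [virtio_scsi]
# Group 1 is set when the frame carries the "?" (unreliable) marker.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_oops(lines):
    """Return (faulting_symbol, reliable_call_trace) from oops log lines."""
    fault = None
    frames = []
    in_trace = False
    for line in lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if "Code:" in line:  # the Code: line follows the call trace
            in_trace = False
            continue
        m = FRAME_RE.search(line)
        if m is None:
            continue
        unreliable, symbol = m.group(1), m.group(2)
        if in_trace:
            if not unreliable:  # skip "?" frames left over on the stack
                frames.append(symbol)
        elif fault is None and "IP:" in line:
            fault = symbol
    return fault, frames
```

Applied to the log above, this yields `__virtscsi_set_affinity` as the faulting symbol, with `virtscsi_remove_vqs`, `virtscsi_init` and `virtscsi_probe` as the first three callers. That matches the summary: the crash happens on the cleanup path taken when virtscsi_init fails.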


Expected results:
The guest should not be affected, or a warning message should be shown if RHEL6 doesn't support this option.

Additional info:
QE knows RHEL6 doesn't support this function, but it shouldn't cause a guest kernel panic.

Comment 1 juzhang 2014-04-03 06:17:25 UTC
> Additional info:
> QE know RHEL6 don't support this function, but it shouldn't cause guest
> kernel panic.

QE knows we do not support virtio-scsi multi-queue in rhel6.x, but we might need to provide friendlier behaviour instead of letting the guest panic directly.

Comment 8 Fam Zheng 2014-04-11 07:33:06 UTC
Posted a fix to upstream:

https://www.mail-archive.com/kvm@vger.kernel.org/msg101086.html

Comment 9 Fam Zheng 2014-05-07 07:58:45 UTC
I've built a kernel package with the upstreamed fix included:

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7424793

Xiangchun, could you test with the above kernel in the guest?

Thanks a lot,
Fam

Comment 12 Sibiao Luo 2014-05-14 07:24:45 UTC
(In reply to juzhang from comment #11)
> Hi Sluo,
I can reproduce it using a rhel7 guest on a rhel6 host with the same steps as comment #0.
host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
# uname -r
2.6.32-448.el6.x86_64

My qemu-kvm command line:
# /usr/libexec/qemu-kvm -smp 2 -drive file=/home/test,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,bootindex=1,physical_block_size=512,logical_block_size=512,multifunction=on -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:11,bus=pci.0,addr=0x5 -vnc :2 -monitor stdio -drive file=/home/my-data-disk1.qcow2,if=none,id=drive-scsi-disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x13,vectors=512,indirect_desc=on,event_idx=off,hotplug=on,param_change=off,multifunction=on,rombar=64,num_queues=2 -device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk1 -serial unix:/tmp/ttyS0,server,nowait

> Since xiangchun is on pto, would you please give a help doing the following
> testing?
> 
> 1. Test this bz by using rhel7.0 guest(about kernel, please use fam's build)
> according to comment0 on rhel6.6 host?
The rhel7 guest with Fam's private build boots up successfully on the rhel6 host without any Call Trace.
host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
3.10.0-123.el7.test.x86_64

> 2. Test this bz by using rhel7.0 guest(about kernel, please use fam's build)
> according to comment0 on rhel7.0 host?
The rhel7 guest with Fam's private build also boots up successfully on the rhel7 host without any Call Trace.
host info:
# uname -r && rpm -q qemu-kvm
3.10.0-121.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64
guest info:
3.10.0-123.el7.test.x86_64

> Plus
> 
> Had better have a try rhel6.6 guest on according to comment0 on rhel6.6 host
> as well?
> 
Tested a rhel6.6 guest on the rhel6 host according to comment #0; it boots up successfully without any Call Trace.
host info:
# uname -r && rpm -q qemu-kvm-rhev
2.6.32-448.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.424.el6.x86_64
guest info:
# uname -r
2.6.32-448.el6.x86_64

Best Regards,
sluo

Comment 13 Jarod Wilson 2014-09-05 17:13:42 UTC
Patch(es) available on kernel-3.10.0-152.el7
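
To tell whether a RHEL 7 guest kernel is at or past this build, the numeric part of the release string can be compared. The sketch below is a simplified comparison, not a full RPM version comparison; the function names are hypothetical, and the check is only meaningful for stock 3.10.0 el7 kernels. A private test build such as 3.10.0-123.el7.test cannot be judged this way.

```python
import re

# Build named in this report as the first one carrying the patch.
FIXED_IN = "3.10.0-152.el7"


def release_key(kernel_release):
    """Turn '3.10.0-152.el7.x86_64' into the tuple (3, 10, 0, 152)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", kernel_release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % kernel_release)
    return tuple(int(part) for part in m.groups())


def has_fix(kernel_release, fixed_in=FIXED_IN):
    """True if the release sorts at or after the fixed build (assumption:
    stock el7 kernels only, compared on the numeric fields)."""
    return release_key(kernel_release) >= release_key(fixed_in)


if __name__ == "__main__":
    for rel in ("3.10.0-118.el7.x86_64", "3.10.0-197.el7.x86_64"):
        print(rel, has_fix(rel))
```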

Comment 16 mazhang 2014-11-05 09:56:04 UTC
Reproduced this bug on a rhel6 host.

Host:
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
qemu-kvm-rhev-debuginfo-0.12.1.2-2.448.el6.x86_64
gpxe-roms-qemu-0.9.7-6.12.el6.noarch
qemu-img-rhev-0.12.1.2-2.448.el6.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.448.el6.x86_64
kernel-2.6.32-497.el6.x86_64

Guest:
kernel-3.10.0-123.el7.x86_64

Cli:
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-m 2G \
-smp 2 \
-enable-kvm \
-name rhel7 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-monitor unix:/tmp/monitor2,server,nowait \
-vga std \
-vnc :0 \
-usb \
-device usb-tablet,id=tablet0 \
-netdev tap,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:B6:40:21 \
-drive file=/home/rhel7-64.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,bootindex=1,physical_block_size=512,logical_block_size=512,multifunction=on \
-drive file=/home/storage0.qcow2,if=none,id=drive-scsi-disk,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-scsi-pci,id=scsi0,addr=0x13,vectors=512,indirect_desc=on,event_idx=off,hotplug=on,param_change=off,multifunction=on,rombar=64,num_queues=2 \
-device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,scsi-id=0,lun=0,id=data-disk2

Result:
Guest kernel panic.


After updating the guest kernel to kernel-3.10.0-197.el7.x86_64 and re-testing, the guest works well.

So this bug has been fixed.

Comment 17 juzhang 2014-11-11 03:57:23 UTC
According to comment 16, setting this issue as verified.

Comment 19 errata-xmlrpc 2015-03-05 11:48:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0290.html