Bug 1664619
| Field | Value |
|---|---|
| Summary | virtio-blk: guest kernel panic when boot vm with disk over 8k sector size |
| Product | Red Hat Enterprise Linux 8 |
| Component | kernel |
| kernel sub component | KVM |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | low |
| Version | 8.0 |
| Keywords | Triaged |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Han Han <hhan> |
| Assignee | Maxim Levitsky <mlevitsk> |
| QA Contact | qing.wang <qinwang> |
| CC | chayang, coli, dyuan, juzhang, knoel, minlei, mlevitsk, ngu, qzhang, stefanha, virt-maint, xuwei, xuzhang |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| Last Closed | 2021-02-01 07:31:48 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |

Description
Han Han
2019-01-09 10:19:07 UTC
This looks like a guest driver bug. Note that this is not about 4k sector size, but about 8k, which makes it rather low priority in my opinion.

This issue can be reproduced on both RHEL7 and RHEL8 guests.

Host:
kernel-4.18.0-56.el8.x86_64
qemu-kvm-core-3.1.0-3.module+el8+2638+e43dad09.x86_64

Guest:
RHEL7: kernel-3.10.0-957.el7.x86_64
RHEL8: kernel-4.18.0-58.el8.x86_64

Steps:
Boot the guest with the command line below:

/usr/libexec/qemu-kvm \
    -S \
    -name 'test' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -device qxl-vga \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -object iothread,id=iothread2 \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,node-name=file_win1,filename=/root/rhel80-64-virtio-scsi.qcow2 \
    -blockdev driver=qcow2,node-name=drive_win1,file=file_win1 \
    -device virtio-blk-pci,id=image2,drive=drive_win1,iothread=iothread0 \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,node-name=file_stg1,filename=/home/chai/disk1.qcow2 \
    -blockdev driver=qcow2,node-name=drive_stg1,file=file_stg1 \
    -device virtio-blk-pci,id=image3,drive=drive_stg1,iothread=iothread1,logical_block_size=8192,physical_block_size=8192 \
    -chardev file,path=/home/chai/serial.log,id=serial_id_serial0 \
    -device isa-serial,chardev=serial_id_serial0 \
    -device virtio-net-pci,mac=6c:ae:8b:20:80:70,id=iddd,vectors=4,netdev=idttt \
    -netdev tap,id=idttt,vhost=on \
    -m 4G \
    -smp 12,maxcpus=12,cores=6,threads=1,sockets=2 \
    -cpu 'SandyBridge' \
    -rtc base=utc,clock=host,driftfix=slew \
    -enable-kvm \
    -monitor stdio \
    -device qemu-xhci,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -qmp tcp:0:4441,server,nowait \
    -vnc :1

P.S. The detailed logs (serial-rhel7.log and serial-rhel8.log) are attached.

Created attachment 1521419
serial-rhel8.log

Created attachment 1521420
serial-rhel7.log

To be honest, the Linux kernel doesn't support block devices with sector size > PAGE_SIZE. This is a very hard limitation, based on the assumption that most of the I/O goes through the page cache, so you should be allowed to read/write a single page.

In theory a block driver can emulate a 4K block size by RMW (read-modify-write), and there is even a 'deprecated' pktcdvd driver which emulates a 4K block device on top of a CD/DVD-RW drive which has a 64K/32K block size. I researched this a very long time ago.

What _is_ strange here is that we get an oops instead of a clear error message, and I can look at fixing this.

PS: Actually this brings back memories from my childhood, when I had just switched to Linux and was trying to evaluate the options for supporting packet writing on CD/DVD media, a feature that I missed so much from Windows. (We didn't have flash drives yet, so reading/writing files 'normally' instead of using a burner program on a CD-RW was a big deal back then.) This was kind of my first encounter with the Linux kernel; I actually had to understand both the filesystem and memory subsystems very well to reach this conclusion.

Reproduced upstream with roughly the same backtrace.

[ 1.859335] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1.859629] [drm] number of cap sets: 2
[ 1.860424] #PF: supervisor write access in kernel mode
[ 1.860424] #PF: error_code(0x0002) - not-present page
[ 1.860425] PGD 0 P4D 0
[ 1.860427] Oops: 0002 [#1] SMP
[ 1.860428] CPU: 21 PID: 639 Comm: systemd-udevd Not tainted 5.8.0-rc4.stable #27
[ 1.860429] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 1.860432] RIP: 0010:create_empty_buffers+0x21/0x110
[ 1.860434] Code: 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 ba 01 00 00 00 41 54 53 48 89 fb e8 e2 f5 ff ff 49 89 c4 <4c> 09 28 48 89 c2 48 8b 40 08 48 85 c0 75 f1 4c 89 62 08 48 8b 43
[ 1.861070] [drm] cap set 0: id 1, max-version 1, max-size 308
[ 1.861581] RSP: 0018:ffffc90000abb5f8 EFLAGS: 00010286
[ 1.861582] RAX: 0000000000000000 RBX: ffffea00213f9740 RCX: 000000000000000d
[ 1.861583] RDX: 00006077a0004558 RSI: ffff888854715000 RDI: ffffea00213f9740
[ 1.861583] RBP: ffffc90000abb610 R08: ffff888854715000 R09: 000000000074e1ad
[ 1.861584] R10: 0000000000000001 R11: ffff888850388758 R12: 0000000000000000
[ 1.861584] R13: 0000000000000000 R14: ffffea00213f9740 R15: 0000000000000000
[ 1.861586] FS: 00007f34edb4bb80(0000) GS:ffff88885f340000(0000) knlGS:0000000000000000
[ 1.861587] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.861587] CR2: 0000000000000000 CR3: 000000084fdea000 CR4: 0000000000340ee0
[ 1.861588] Call Trace:
[ 1.861592] create_page_buffers+0x55/0x60
[ 1.862255] [drm] cap set 1: id 2, max-version 2, max-size 688
[ 1.862544] block_read_full_page+0x4f/0x3e0
[ 1.863064] [drm] Initialized virtio_gpu 0.1.0 0 for virtio3 on minor 0
[ 1.863824] ? bdev_evict_inode+0xe0/0xe0
[ 1.863826] ? __add_to_page_cache_locked+0x11c/0x330
[ 1.863828] ? scan_shadow_nodes+0x30/0x30
[ 1.863829] blkdev_readpage+0x18/0x20
[ 1.863830] do_read_cache_page+0x2a6/0x390
[ 1.863831] read_cache_page+0x12/0x20
[ 1.863833] read_part_sector+0x37/0xc8
[ 1.863833] read_lba+0x11a/0x1e0
[ 1.863835] ? kmem_cache_alloc_trace+0x153/0x220
[ 1.863836] efi_partition+0x1d9/0x81d
[ 1.863837] ? vsnprintf+0x2d4/0x470
[ 1.863838] ? snprintf+0x49/0x60
[ 1.863839] blk_add_partitions+0x145/0x390
[ 1.863841] ? blk_drop_partitions+0x9c/0xd0
[ 1.884870] bdev_disk_changed+0x73/0xe0
[ 1.884872] __blkdev_get+0x3cb/0x540
[ 1.884873] blkdev_get+0x3d/0x160
[ 1.884874] __device_add_disk+0x336/0x4a0
[ 1.884875] device_add_disk+0x13/0x20
[ 1.884878] virtblk_probe+0x4d3/0x7d4 [virtio_blk]
[ 1.884880] virtio_dev_probe+0x14d/0x1e0 [virtio]
[ 1.884884] really_probe+0x171/0x420
[ 1.888786] driver_probe_device+0xe9/0x160
[ 1.888788] device_driver_attach+0xab/0xb0
[ 1.888789] __driver_attach+0x8c/0x150
[ 1.888790] ? device_driver_attach+0xb0/0xb0
[ 1.888790] bus_for_each_dev+0x7c/0xc0
[ 1.888791] driver_attach+0x1e/0x20
[ 1.888792] bus_add_driver+0x135/0x1f0
[ 1.888793] fbcon: Deferring console take-over
[ 1.888795] driver_register+0x91/0xf0
[ 1.888797] register_virtio_driver+0x20/0x30 [virtio]
[ 1.889317] virtio_gpu virtio3: fb0: virtio_gpudrmfb frame buffer device
[ 1.889788] init+0x54/0x1000 [virtio_blk]
[ 1.894847] ? 0xffffffffa0188000
[ 1.894849] do_one_initcall+0x48/0x1f0
[ 1.894850] ? _cond_resched+0x1a/0x50
[ 1.894851] ? kmem_cache_alloc_trace+0x153/0x220
[ 1.894853] ? do_init_module+0x28/0x270
[ 1.894855] do_init_module+0x62/0x270
[ 1.897652] load_module+0x2a3e/0x2c80
[ 1.897654] __do_sys_finit_module+0xbe/0x120
[ 1.897656] __x64_sys_finit_module+0x1a/0x20
[ 1.897657] do_syscall_64+0x46/0xc0
[ 1.897658] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1.897659] RIP: 0033:0x7f34eec9743d
[ 1.897660] Code: Bad RIP value.
[ 1.900892] RSP: 002b:00007ffd3bb159b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 1.900893] RAX: ffffffffffffffda RBX: 00005645ca6b1710 RCX: 00007f34eec9743d
[ 1.900894] RDX: 0000000000000000 RSI: 00007f34ee8f795d RDI: 0000000000000006
[ 1.900894] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007
[ 1.900894] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
[ 1.900895] R13: 00007f34ee8f795d R14: 00005645ca68a360 R15: 00005645ca689520
[ 1.900896] Modules linked in: virtio_gpu(+) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm virtio_net net_failover i2c_core failover virtio_scsi virtio_blk(+) crc32c_intel xhci_pci virtio_pci virtio_ring xhci_hcd virtio dm_mirror dm_region_hash dm_log fuse ipv6 autofs4
[ 1.908874] CR2: 0000000000000000
[ 1.909270] ---[ end trace 33dbeb0325d38ff9 ]---
[ 1.909800] RIP: 0010:create_empty_buffers+0x21/0x110
[ 1.910382] Code: 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 ba 01 00 00 00 41 54 53 48 89 fb e8 e2 f5 ff ff 49 89 c4 <4c> 09 28 48 89 c2 48 8b 40 08 48 85 c0 75 f1 4c 89 62 08 48 8b 43
[ 1.912502] RSP: 0018:ffffc90000abb5f8 EFLAGS: 00010286
[ 1.913099] RAX: 0000000000000000 RBX: ffffea00213f9740 RCX: 000000000000000d
[ 1.913913] RDX: 00006077a0004558 RSI: ffff888854715000 RDI: ffffea00213f9740
[ 1.914729] RBP: ffffc90000abb610 R08: ffff888854715000 R09: 000000000074e1ad
[ 1.915551] R10: 0000000000000001 R11: ffff888850388758 R12: 0000000000000000
[ 1.916414] R13: 0000000000000000 R14: ffffea00213f9740 R15: 0000000000000000
[ 1.917324] FS: 00007f34edb4bb80(0000) GS:ffff88885f340000(0000) knlGS:0000000000000000
[ 1.917324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.917325] CR2: 0000000000000000 CR3: 000000084fdea000 CR4: 0000000000340ee0
[ 1.917326] Kernel panic - not syncing: Fatal exception
[ 1.918370] Kernel Offset: disabled
[ 1.920849] Rebooting in 10 seconds..
[ 11.921756] ACPI MEMORY or I/O RESET_REG.

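One plausible reading of the backtrace above, offered as an assumption rather than a confirmed root cause: the page cache path carves each page into block-sized buffers, and when the block size (8192) is larger than the page size, no buffer fits, so the code that follows ends up writing through a NULL head pointer (the trace shows a write fault at address 0 in create_empty_buffers with RAX = 0). The Python sketch below is only a simplified userspace model of that idea. The function names, the countdown loop and the 4096-byte page size are illustrative assumptions and are not taken from the kernel source.

```python
PAGE_SIZE = 4096  # assumption: page size of the x86_64 guest


def alloc_page_buffers(block_size, page_size=PAGE_SIZE):
    """Hypothetical model: offsets of the block-sized buffers that fit in one page."""
    offsets = []
    offset = page_size
    # Step backwards through the page one block at a time.
    while True:
        offset -= block_size
        if offset < 0:
            break
        offsets.append(offset)
    return sorted(offsets)


def first_buffer(block_size, page_size=PAGE_SIZE):
    """Return the first buffer offset, or None when no buffer fits in the page."""
    buffers = alloc_page_buffers(block_size, page_size)
    return buffers[0] if buffers else None


if __name__ == "__main__":
    for size in (512, 4096, 8192):
        print(size, alloc_page_buffers(size), first_buffer(size))
```

In this model a block size of 8192 produces an empty buffer list, so a caller that touches the first buffer unconditionally has nothing valid to touch. That would match the symptom in the trace, but it remains a model, not a diagnosis.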
The virtio-scsi kernel driver, on the other hand, is immune to this issue:

[ 1.963928] sd 1:0:0:0: Power-on or device reset occurred
[ 1.963934] sd 0:0:0:0: Power-on or device reset occurred
[ 1.964753] sd 1:0:0:0: [sdb] 314572800 512-byte logical blocks: (161 GB/150 GiB)
[ 1.965337] sd 0:0:0:0: [sda] Unsupported sector size 8192.
[ 1.966210] sd 1:0:0:0: [sdb] Write Protect is off
[ 1.966972] sd 0:0:0:0: [sda] 0 512-byte logical blocks: (0 B/0 B)
[ 1.967499] sd 1:0:0:0: [sdb] Mode Sense: 63 00 00 08
[ 1.968214] sd 0:0:0:0: [sda] 8192-byte physical blocks
[ 1.968240] sd 0:0:0:0: [sda] Write Protect is off
[ 1.968241] sd 0:0:0:0: [sda] Mode Sense: 63 00 00 08
[ 1.968282] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.968974] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.970112] sd 0:0:0:0: [sda] Unsupported sector size 8192.
[ 1.972661] sd 0:0:0:0: [sda] Attached SCSI disk

Patch posted upstream: https://lkml.org/lkml/2020/7/15/421

Can we make this bug public, since IMHO it doesn't contain anything really private? I didn't notice that it was private and added a link to it in the patch.

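The "Unsupported sector size 8192." message from the sd driver shows the behaviour one would also want from virtio-blk: a clear rejection instead of an oops. As a rough illustration of that kind of check, here is a minimal Python sketch. It is not the code of the upstream patch. The helper name `block_size_error`, the 512-byte lower bound, the power-of-two rule and the default 4096-byte page size are all assumptions made for this example.

```python
from typing import Optional

MIN_BLOCK_SIZE = 512  # assumption: smallest sector size considered valid here


def block_size_error(size: int, page_size: int = 4096) -> Optional[str]:
    """Hypothetical validator: return an error message, or None if the size is acceptable."""
    if size < MIN_BLOCK_SIZE:
        return f"Sector size {size} is below {MIN_BLOCK_SIZE}."
    if size & (size - 1):
        # A power of two has exactly one bit set, so size & (size - 1) is 0.
        return f"Sector size {size} is not a power of two."
    if size > page_size:
        # The limitation discussed in this bug: sector size must not exceed PAGE_SIZE.
        return f"Unsupported sector size {size}."
    return None


if __name__ == "__main__":
    for size in (512, 4096, 8192):
        print(size, block_size_error(size) or "ok")
```

With the assumed 4096-byte page size, 512 and 4096 pass and 8192 is rejected with a message, which is the outcome the reporter's disk configuration should ideally have produced in the guest.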
Reproduced this issue on guest kernel-4.18.0-214.el8.x86_64

/usr/libexec/qemu-kvm \
    -name 'test-vm1' \
    -machine q35 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1 \
    -device pvpanic,ioport=0x505,id=idZcGD6F \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -device pcie-root-port,id=pcie.0-root-port-8,slot=8,chassis=8,addr=0x8,bus=pcie.0 \
    -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-3,addr=0x0,iothread=iothread0 \
    -device virtio-scsi-pci,id=scsi2,bus=pcie.0-root-port-4,addr=0x0 \
    \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/kvm_autotest_root/images/rhel830-64-virtio.qcow2,node-name=os_img \
    -blockdev driver=qcow2,node-name=os_drive,file=os_img \
    -device virtio-blk-pci,drive=os_drive,id=os_disk,bus=pcie.0-root-port-6 \
    \
    -blockdev driver=file,cache.direct=on,cache.no-flush=off,node-name=file_stg1,filename=/home/kvm_autotest_root/images/stg1.qcow2 \
    -blockdev driver=qcow2,node-name=drive_stg1,file=file_stg1 \
    -device virtio-blk-pci,drive=drive_stg1,id=data,bus=pcie.0-root-port-7,addr=0x0,iothread=iothread0,logical_block_size=8192,physical_block_size=8192 \
    \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:55:56:57:58:59,id=id18Xcuo,netdev=idGRsMas,bus=pcie.0-root-port-5,addr=0x0 \
    -netdev tap,id=idGRsMas,vhost=on \
    -m 8G \
    -vnc :5 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -qmp tcp:0:5955,server,nowait \

Updated version posted: https://lkml.org/lkml/2020/7/21/381

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

QE agrees to close.