Bug 2156536
| Summary: | [RHEL9.2] nvme_user_space_driver Bus error (core dumped) when install guest | | | |
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Zhenyu Zhang <zhenyzha> | |
| Component: | qemu-kvm | Assignee: | Eric Auger <eric.auger> | |
| qemu-kvm sub component: | NVMe | QA Contact: | Zhenyu Zhang <zhenyzha> | |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | | |
| Severity: | low | | | |
| Priority: | low | CC: | alexander.lougovski, alex.williamson, aliang, cohuck, coli, eric.auger, gshan, jinzhao, juzhang, kwolf, lijin, sgarzare, stefanha, vgoyal, virt-maint, xuwei, yanghliu, yihyu, zhguo | |
| Version: | 9.2 | Keywords: | Triaged | |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | |
| Target Release: | --- | | | |
| Hardware: | aarch64 | | | |
| OS: | Linux | | | |
| Whiteboard: | | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | | | | |
| : | 2158686 (view as bug list) | Environment: | | |
| Last Closed: | 2023-02-09 08:37:27 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | | |
| Verified Versions: | | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | | |
| Cloudforms Team: | --- | Target Upstream Version: | | |
| Embargoed: | | | | |
| Bug Depends On: | | | | |
| Bug Blocks: | 2158686 | | | |
(Just out of interest, this is not high priority.)

SIGBUS from this line:

```c
cap = le64_to_cpu(regs->cap);
```

This 64-bit field is 4 KB aligned:

```
regs = 0xffff87629000
```

```c
typedef struct QEMU_PACKED NvmeBar {
    uint64_t cap;
    ...
```

The mapping function call looks okay to me as well:

```c
regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
                             PROT_READ | PROT_WRITE, errp);
```

I'm not sure why there is a SIGBUS.

Eric: Any ideas how the VFIO BAR mapping could raise a SIGBUS on a memory load?
(In reply to Stefan Hajnoczi from comment #18)
> (Just out of interest, this is not high priority.)
>
> SIGBUS from this line:
>
>     cap = le64_to_cpu(regs->cap);
>
> This 64-bit field is 4 KB aligned:
>
>     regs = 0xffff87629000
>
>     typedef struct QEMU_PACKED NvmeBar {
>         uint64_t cap;
>         ...
>
> The mapping function call looks okay to me as well:
>
>     regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, sizeof(NvmeBar),
>                                  PROT_READ | PROT_WRITE, errp);
>
> I'm not sure why there is a SIGBUS.
>
> Eric: Any ideas how the VFIO BAR mapping could raise a SIGBUS on a memory load?

Stefan, one case that comes to my mind is if memory access is disabled on the PCI device. We hit that case in the past on Arm, but I fail to remember the exact circumstances.

https://lists.archive.carbon60.com/linux/kernel/3714001

Hi Zhenyu, sure, please close it. The best would be to run this kind of test as part of our VFIO test plan. Thanks, Eric

OK, we will keep this test scenario on the aarch64 platform.

Closing this bug as CURRENTRELEASE according to comments 24-26.
Description of problem:
Installing a guest on an NVMe device through the NVMe userspace driver fails: qemu-kvm hits "Bus error (core dumped)".

Version-Release number of selected component (if applicable):
Host Distro:  RHEL-9.2.0-20221226.0
Host Kernel:  5.14.0-226.el9.aarch64
qemu-kvm:     qemu-kvm-7.2.0-2.el9.aarch64
Guest Kernel: 5.14.0-226.el9.aarch64
edk2:         edk2-aarch64-20221207gitfff6d81270b5-1.el9.noarch

How reproducible:
100%

Steps to Reproduce:
1. Bind the host NVMe device to vfio-pci:

```
[root@ampere-hr350a-07 ~]# modprobe vfio-pci
[root@ampere-hr350a-07 ~]# modinfo vfio-pci
filename:       /lib/modules/5.14.0-226.el9.aarch64/kernel/drivers/vfio/pci/vfio-pci.ko.xz
description:    VFIO PCI - User Level meta-driver
author:         Alex Williamson <alex.williamson>
license:        GPL v2
rhelversion:    9.2
srcversion:     A326814B1A0356BEFB22207
alias:          vfio_pci:v*d*sv*sd*bc*sc*i*
depends:        vfio-pci-core
intree:         Y
name:           vfio_pci
vermagic:       5.14.0-226.el9.aarch64 SMP preempt mod_unload modversions aarch64
sig_id:         PKCS#7
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        0D:4E:DB:4A:00:71:20:EC:E4:2A:9C:E4:91:88:40:D5:1A:DB:FC:A3
sig_hashalgo:   sha256
signature:      9B:4F:5A:25:FC:7F:37:2E:C0:20:19:1B:A5:E7:06:DF:D2:3C:89:B4:
                04:73:44:5C:F4:6B:0B:F4:A7:74:3B:E8:24:E3:AB:E2:0A:C1:34:B4:
                E1:25:F9:64:51:94:74:AA:EE:64:50:B2:3B:09:FF:4D:DA:1C:9D:AF:
                58:2B:84:96:03:05:8F:91:F8:94:9F:79:4B:09:A9:E5:79:5E:B2:08:
                B9:AF:BF:F9:61:C2:F9:EC:A3:38:F3:9E:30:BE:96:04:40:1B:08:8C:
                0C:AF:22:1A:EE:FF:9A:AC:5E:90:E4:AC:97:37:06:DB:E2:A2:2F:38:
                32:2E:09:0A:3A:DE:82:86:AE:10:2D:63:35:95:9B:E4:3C:FA:DB:C7:
                4E:E8:9D:96:D0:47:25:2F:8C:5A:78:05:6F:77:3D:84:7D:D0:02:7D:
                18:6F:84:1B:95:3B:2E:AE:FC:B7:FF:55:6E:12:11:AC:5C:3E:D3:B6:
                D9:B1:E2:CB:C2:02:37:AA:B8:F2:47:4F:83:11:13:94:5B:6D:5D:F5:
                88:7B:96:34:41:F1:07:B3:FC:F1:E7:FF:52:A0:6B:85:19:CA:DE:E5:
                0C:EE:77:F6:BB:B7:6E:70:11:A6:AF:A8:EE:64:6B:B6:A1:61:1A:BA:
                4D:D1:D4:DC:F8:98:B1:88:7B:B5:8B:5D:A2:17:99:B6:D1:CC:7A:49:
                19:71:F7:7F:2C:F7:1D:9C:6B:65:F7:37:87:D0:31:0A:2A:52:26:50:
                59:31:8C:13:9A:9F:E0:75:0F:BA:45:54:AD:AE:65:D8:04:55:88:FA:
                19:0D:94:E2:6F:94:65:8A:57:AB:97:D7:E9:5A:AE:B4:64:37:17:3D:
                C7:2F:FC:E4:B9:CE:FF:B4:E0:47:99:51:53:22:AC:82:5C:80:BB:6B:
                D3:D0:41:B5:37:F9:3D:D6:57:AC:95:A6:BD:34:71:FD:C1:8D:EB:07:
                87:8E:AC:67:9B:69:57:EF:94:4F:81:7C:93:F0:DC:82:31:D1:79:43:
                BD:FA:21:6B
parm:           ids:Initial PCI IDs to add to the vfio driver, format is "vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]" and multiple comma separated entries can be specified (string)
parm:           nointxmask:Disable support for PCI 2.3 style INTx masking. If this resolves problems for specific devices, report lspci -vvvxxx to linux-pci.org so the device can be fixed automatically via the broken_intx_masking flag. (bool)
parm:           disable_idle_d3:Disable using the PCI D3 low power state for idle, unused devices (bool)
parm:           enable_sriov:Enable support for SR-IOV configuration. Enabling SR-IOV on a PF typically requires support of the userspace PF driver, enabling VFs without such support may result in non-functional VFs or PF. (bool)
parm:           disable_denylist:Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users. (bool)
# lspci | grep -i 'Non-Volatile'
0004:01:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
0005:01:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
# lspci -n -s 0005:01:00.0
0005:01:00.0 0108: 8086:0a54
# echo 0005:01:00.0 > /sys/bus/pci/devices/0005:01:00.0/driver/unbind
# echo 8086 0a54 > /sys/bus/pci/drivers/vfio-pci/new_id
```

2.
Create a qcow2 image on the NVMe namespace:

```
[root@ampere-hr350a-07 ~]# /usr/bin/qemu-img create -f qcow2 nvme://0005:01:00.0/1 20G
Formatting 'nvme://0005:01:00.0/1', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=21474836480 lazy_refcounts=off refcount_bits=16
[root@ampere-hr350a-07 ~]# qemu-img info -U nvme://0005:01:00.0/1
image: nvme://0005:01:00.0/1
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
```

3. Install a guest on the image:

```
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
  -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
  -blockdev node-name=file_aavmf_vars,driver=file,filename=/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel920-aarch64_nvme_direct_raw_VARS.fd,auto-read-only=on,discard=unmap \
  -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
  -machine virt,gic-version=host,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
  -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
  -nodefaults \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
  -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
  -m 253952 \
  -object '{"qom-type": "memory-backend-ram", "size": 266287972352, "id": "mem-machine_mem"}' \
  -smp 16,maxcpus=16,cores=8,threads=1,clusters=1,sockets=2 \
  -cpu 'host' \
  -chardev socket,wait=off,server=on,path=/var/tmp/monitor-qmpmonitor1,id=qmp_id_qmpmonitor1 \
  -mon chardev=qmp_id_qmpmonitor1,mode=control \
  -chardev socket,wait=off,server=on,path=/var/tmp/monitor-catch_monitor,id=qmp_id_catch_monitor \
  -mon chardev=qmp_id_catch_monitor,mode=control \
  -serial unix:'/var/tmp/serial-serial0',server=on,wait=off \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -blockdev '{"node-name": "nvme_image1", "driver": "nvme", "auto-read-only": true, "discard": "unmap", "device": "0005:01:00.0", "namespace": 1, "cache": {"direct": false, "no-flush": false}}' \
  -blockdev '{"node-name": "drive_image1", "driver": "raw", "read-only": false, "cache": {"direct": false, "no-flush": false}, "file": "nvme_image1"}' \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
  -device '{"driver": "virtio-blk-pci", "id": "image1", "drive": "drive_image1", "bootindex": 1, "write-cache": "off", "bus": "pcie-root-port-3", "addr": "0x0"}' \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
  -device virtio-net-pci,mac=9a:91:18:a0:16:a9,rombar=0,id=idsTPBvn,netdev=idsuVgf9,bus=pcie-root-port-4,addr=0x0 \
  -netdev tap,id=idsuVgf9,vhost=on \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \
  -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-5", "addr": "0x0"}' \
  -blockdev '{"node-name": "file_cd1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/iso/linux/RHEL-9.2.0-20221227.0-aarch64-dvd1.iso", "cache": {"direct": true, "no-flush": false}}' \
  -blockdev '{"node-name": "drive_cd1", "driver": "raw", "read-only": true, "cache": {"direct": true, "no-flush": false}, "file": "file_cd1"}' \
  -device '{"driver": "scsi-cd", "id": "cd1", "drive": "drive_cd1", "bootindex": 2, "write-cache": "on"}' \
  -blockdev '{"node-name": "file_unattended", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel920-aarch64/ks.iso", "cache": {"direct": true, "no-flush": false}}' \
  -blockdev '{"node-name": "drive_unattended", "driver": "raw", "read-only": true, "cache": {"direct": true, "no-flush": false}, "file": "file_unattended"}' \
  -device '{"driver": "scsi-cd", "id": "unattended", "drive": "drive_unattended", "bootindex": 3, "write-cache": "on"}' \
  -kernel '/home/kvm_autotest_root/images/rhel920-aarch64/vmlinuz' \
  -append 'inst.sshd inst.repo=cdrom inst.ks=cdrom:/ks.cfg net.ifnames=0 console=ttyAMA0,38400' \
  -initrd '/home/kvm_autotest_root/images/rhel920-aarch64/initrd.img' \
  -vnc :20 \
  -rtc base=utc,clock=host,driftfix=slew \
  -no-shutdown \
  -enable-kvm \
  -monitor stdio
```

Actual results:
The guest hangs during installation:

```
2022-12-27 02:07:47: 192.168.122.209 Dec 27 06:10:45 localhost.localdomain kernel: INFO: task dd:2693 blocked for more than 122 seconds.
Dec 27 06:10:45 localhost.localdomain kernel: Not tainted 5.14.0-226.el9.aarch64 #1
Dec 27 06:10:45 localhost.localdomain kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
```
```
Dec 27 06:10:45 localhost.localdomain kernel: task:dd state:D stack: 0 pid: 2693 ppid: 2686 flags:0x00000204
Dec 27 06:10:45 localhost.localdomain kernel: Call trace:
Dec 27 06:10:45 localhost.localdomain kernel:  __switch_to+0xc8/0x110
Dec 27 06:10:45 localhost.localdomain kernel:  __schedule+0x220/0x520
Dec 27 06:10:45 localhost.localdomain kernel:  schedule+0x60/0xe0
Dec 27 06:10:45 localhost.localdomain kernel:  io_schedule+0x40/0x60
Dec 27 06:10:45 localhost.localdomain kernel:  bit_wait_io+0x1c/0x6c
Dec 27 06:10:45 localhost.localdomain kernel:  __wait_on_bit+0x9c/0x190
Dec 27 06:10:45 localhost.localdomain kernel:  out_of_line_wait_on_bit+0x84/0xb0
Dec 27 06:10:45 localhost.localdomain kernel:  __block_write_begin_int+0x218/0x4cc
Dec 27 06:10:45 localhost.localdomain kernel:  block_write_begin+0x54/0xa0
Dec 27 06:10:45 localhost.localdomain kernel:  blkdev_write_begin+0x30/0x40
Dec 27 06:10:45 localhost.localdomain kernel:  generic_perform_write+0xac/0x1e0
Dec 27 06:10:45 localhost.localdomain kernel:  __generic_file_write_iter+0xe4/0x1a0
Dec 27 06:10:45 localhost.localdomain kernel:  blkdev_write_iter+0xfc/0x180
Dec 27 06:10:45 localhost.localdomain kernel:  new_sync_write+0xd4/0x160
Dec 27 06:10:45 localhost.localdomain kernel:  vfs_write+0x1a0/0x1ec
Dec 27 06:10:45 localhost.localdomain kernel:  ksys_write+0x6c/0xf4
Dec 27 06:10:45 localhost.localdomain kernel:  __arm64_sys_write+0x20/0x2c
Dec 27 06:10:45 localhost.localdomain kernel:  invoke_syscall.constprop.0+0x7c/0xd0
Dec 27 06:10:45 localhost.localdomain kernel:  el0_svc_common.constprop.0+0x154/0x160
Dec 27 06:10:45 localhost.localdomain kernel:  do_el0_svc+0x2c/0xb0
Dec 27 06:10:45 localhost.localdomain kernel:  el0_svc+0x38/0x190
Dec 27 06:10:45 localhost.localdomain kernel:  el0t_64_sync_handler+0x9c/0x120
Dec 27 06:10:45 localhost.localdomain kernel:  el0t_64_sync+0x174/0x178
```

On the host, qemu-img hits "Bus error (core dumped)":

```
[root@ampere-hr350a-07 home]# echo "/home/core" > /proc/sys/kernel/core_pattern
[root@ampere-hr350a-07 home]# ulimit -c unlimited
[root@ampere-hr350a-07 home]# qemu-img info nvme://0005:01:00.0/1
Bus error (core dumped)
```