Bug 1965145 - [RHEL9] BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] on AMD host
Summary: [RHEL9] BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: 9.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Vitaly Kuznetsov
QA Contact: liunana
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-27 01:46 UTC by liunana
Modified: 2022-05-17 15:42 UTC (History)
7 users (show)

Fixed In Version: kernel-5.14.0-14.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-17 15:38:18 UTC
Type: ---
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Gitlab redhat/centos-stream/src/kernel centos-stream-9 merge_requests 37 0 None None None 2021-09-29 08:10:51 UTC
Red Hat Product Errata RHBA-2022:3907 0 None None None 2022-05-17 15:38:38 UTC

Description liunana 2021-05-27 01:46:27 UTC
Description of problem:
[RHEL9] BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] on AMD Milan (Zen3) host 


Version-Release number of selected component (if applicable):
  amd-daytona-08.khw1.lab.eng.bos.redhat.com
  kernel-5.12.0-1.el9.x86_64+debug
  qemu-kvm-6.0.0-1.el9.x86_64



How reproducible:
3/3


Steps to Reproduce:
1. Install a guest
2.
3.

Actual results:
The host prints the error log in the console but can still run some jobs.

Expected results:
The host works without error logs.

Additional info:
Will add it in the comments.

Comment 1 liunana 2021-05-27 01:50:46 UTC
dmesg log:

[root@amd-daytona-08 ~]# [  806.530983] FS-Cache: Loaded
[  806.698971] FS-Cache: Netfs 'nfs' registered for caching
[  806.727147] Key type dns_resolver registered
[  807.013959] NFS: Registering the id_resolver key type
[  807.019062] Key type id_resolver registered
[  807.023265] Key type id_legacy registered
[  807.779459] mount.nfs (4334) used greatest stack depth: 21776 bytes left
[  829.319478] Bluetooth: Core ver 2.22
[  829.323285] NET: Registered protocol family 31
[  829.327752] Bluetooth: HCI device and connection manager initialized
[  829.334233] Bluetooth: HCI socket layer initialized
[  829.339130] Bluetooth: L2CAP socket layer initialized
[  829.344254] Bluetooth: SCO socket layer initialized
[  844.813454] tun: Universal TUN/TAP device driver, 1.6
[  844.828593] switch: port 2(t0-HuClnA) entered blocking state
[  844.834288] switch: port 2(t0-HuClnA) entered disabled state
[  844.840440] device t0-HuClnA entered promiscuous mode
[  844.858586] switch: port 2(t0-HuClnA) entered blocking state
[  844.864296] switch: port 2(t0-HuClnA) entered forwarding state
[  919.564489] ==================================================================
[  919.572015] BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  919.580491] Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798
[  919.587279] 
[  919.588780] CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G               X --------- ---  5.12.0-1.el9.x86_64+debug #1
[  919.599116] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020
[  919.607203] Call Trace:
[  919.609660]  dump_stack+0xa5/0xe6
[  919.612986]  print_address_description.constprop.0+0x18/0x130
[  919.618738]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  919.624433]  __kasan_report.cold+0x7f/0x114
[  919.628623]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  919.634316]  kasan_report+0x38/0x50
[  919.637806]  kasan_check_range+0xf5/0x1d0
[  919.641819]  kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  919.647349]  kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm]
[  919.653217]  ? kvm_arch_exit+0x110/0x110 [kvm]
[  919.657694]  ? sched_clock+0x5/0x10
[  919.661196]  ioapic_write_indirect+0x59f/0x9e0 [kvm]
[  919.666196]  ? static_obj+0xc0/0xc0
[  919.669692]  ? __lock_acquired+0x1d2/0x8c0
[  919.673790]  ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]
[  919.679422]  ? __lock_contended+0x910/0x910
[  919.683608]  ? do_raw_spin_trylock+0xb5/0x180
[  919.687974]  ? ioapic_mmio_write+0xe9/0x1e0 [kvm]
[  919.692712]  ioapic_mmio_write+0xff/0x1e0 [kvm]
[  919.697280]  __kvm_io_bus_write+0x1d1/0x450 [kvm]
[  919.702018]  ? check_prev_add+0x20f0/0x20f0
[  919.706207]  kvm_io_bus_write+0x105/0x1f0 [kvm]
[  919.710772]  ? kvm_stat_data_get+0x380/0x380 [kvm]
[  919.715600]  ? __lock_acquire+0xb69/0x18e0
[  919.719705]  write_mmio+0x13b/0x3a0 [kvm]
[  919.723761]  emulator_read_write_onepage+0x168/0x470 [kvm]
[  919.729275]  ? vcpu_mmio_gva_to_gpa+0x5e0/0x5e0 [kvm]
[  919.734364]  ? decode_imm+0x7d0/0x7d0 [kvm]
[  919.738585]  emulator_read_write+0x157/0x550 [kvm]
[  919.743409]  ? decode_operand+0xb68/0x2cf0 [kvm]
[  919.748059]  segmented_write.isra.0+0xc9/0x110 [kvm]
[  919.753055]  ? segmented_read.isra.0+0x330/0x330 [kvm]
[  919.758228]  writeback+0x6a7/0x8b0 [kvm]
[  919.762181]  ? emulator_task_switch+0x2b0/0x2b0 [kvm]
[  919.767261]  ? em_loop+0x530/0x530 [kvm]
[  919.771222]  ? mmio_info_in_cache+0x32c/0x410 [kvm]
[  919.776139]  x86_emulate_insn+0x1a0c/0x3cf0 [kvm]
[  919.780876]  ? kvm_mmu_reset_context+0x20/0x20 [kvm]
[  919.785874]  ? rcu_read_unlock+0x40/0x40
[  919.789832]  x86_emulate_instruction+0x5e5/0x1180 [kvm]
[  919.795096]  vcpu_enter_guest+0x1ae5/0x39c0 [kvm]
[  919.799829]  ? lock_acquire+0x1ca/0x490
[  919.803671]  ? kvm_vcpu_reload_apic_access_page+0x60/0x60 [kvm]
[  919.809631]  ? rcu_read_unlock+0x40/0x40
[  919.813557]  ? mark_lock_irq+0x1d00/0x1d00
[  919.817656]  ? kvm_vcpu_ioctl+0x153/0xac0 [kvm]
[  919.822227]  ? kvm_get_linear_rip+0x12c/0x260 [kvm]
[  919.827142]  ? vcpu_run+0x144/0x7f0 [kvm]
[  919.831185]  vcpu_run+0x144/0x7f0 [kvm]
[  919.835057]  kvm_arch_vcpu_ioctl_run+0x23b/0xd10 [kvm]
[  919.840226]  kvm_vcpu_ioctl+0x384/0xac0 [kvm]
[  919.844616]  ? __lock_release+0x494/0xa40
[  919.848632]  ? install_new_memslots+0x270/0x270 [kvm]
[  919.853718]  ? generic_block_fiemap+0x60/0x60
[  919.858085]  ? insert_inode_locked+0x1de/0x4f0
[  919.862532]  ? selinux_inode_getsecctx+0x80/0x80
[  919.867162]  ? __fget_files+0x1bf/0x2d0
[  919.871007]  __x64_sys_ioctl+0x127/0x190
[  919.874935]  do_syscall_64+0x33/0x40
[  919.878522]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  919.883583] RIP: 0033:0x7fbf581dd21b
[  919.887164] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 25 bc 0c 00 f7 d8 64 89 01 48
[  919.905906] RSP: 002b:00007fbf54a04588 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  919.913475] RAX: ffffffffffffffda RBX: 0000560e95142520 RCX: 00007fbf581dd21b
[  919.920605] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
[  919.927738] RBP: 00007fbf58b22000 R08: 0000560e94830b90 R09: 00000000000000ff
[  919.934871] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
[  919.942006] R13: 0000000000000001 R14: 00000000000003f9 R15: 0000000000000000
[  919.949150] 
[  919.950646] 
[  919.952144] addr ffffc9001364f638 is located in stack of task qemu-kvm/4798 at offset 40 in frame:
[  919.961097]  ioapic_write_indirect+0x0/0x9e0 [kvm]
[  919.965924] 
[  919.967424] this frame has 2 objects:
[  919.971091]  [32, 40) 'vcpu_bitmap'
[  919.971094]  [64, 88) 'irq'
[  919.974582] 
[  919.978872] Memory state around the buggy address:
[  919.983665]  ffffc9001364f500: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f3
[  919.990884]  ffffc9001364f580: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  919.998104] >ffffc9001364f600: 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 f3 f3 f3
[  920.005323]                                         ^
[  920.010376]  ffffc9001364f680: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  920.017604]  ffffc9001364f700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  920.024848] ==================================================================
[  920.032068] Disabling lock debugging due to kernel taint

Comment 2 John Ferlan 2021-07-08 16:35:06 UTC
Assigned to Amnon for initial triage per the BZ process, given the age of the bug (created or assigned to virt-maint without triage).

Comment 4 Bandan Das 2021-07-28 18:18:42 UTC
I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on dell-per6525-01.dell2.lab.eng.bos.redhat.com, which (I think) is a Zen3.
I see no trace when launching a guest. Is there a specific guest config that causes this? Could you also try a more recent build?

Comment 5 liunana 2021-07-30 09:45:52 UTC
(In reply to Bandan Das from comment #4)
> I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on
> dell-per6525-01.dell2.lab.eng.bos.redhat.com which (I think) is a Zen3. 
> I see no trace launching a guest.

It occurs during normal use of the machine; I didn't hit the issue at first while running my tests.
But once I hit it, it was easy to reproduce.


> Is there a specific guest config that causes this? Could you also try a more
> recent build?

OK, I will try it again with a recent build and update the result.




Best regards
Liu Nana

Comment 6 Dr. David Alan Gilbert 2021-08-02 18:55:06 UTC
(In reply to liunana from comment #5)
> (In reply to Bandan Das from comment #4)
> > I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on
> > dell-per6525-01.dell2.lab.eng.bos.redhat.com which (I think) is a Zen3. 
> > I see no trace launching a guest.
> 
> It occurs in the machine's use, I didn't meet the issue at first while doing
> my test.
> But when I met it once, it is easy to reproduce.
> 
> 
> Is there a specific guest config that
> > causes this ? Could you also try a more recent build ?
> 
> Ok, I will try it again using the recent build, and will update the result.
> 
> 
> 
> 
> Best regards
> Liu Nana

Liu: Just a guess, but how big are your VMs? What's the command line you're using for qemu?

Comment 7 liunana 2021-08-04 03:33:21 UTC
(In reply to Dr. David Alan Gilbert from comment #6)
> (In reply to liunana from comment #5)
> > (In reply to Bandan Das from comment #4)
> > > I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on
> > > dell-per6525-01.dell2.lab.eng.bos.redhat.com which (I think) is a Zen3. 
> > > I see no trace launching a guest.
> > 
> > It occurs in the machine's use, I didn't meet the issue at first while doing
> > my test.
> > But when I met it once, it is easy to reproduce.
> > 
> > 
> > Is there a specific guest config that
> > > causes this ? Could you also try a more recent build ?
> > 
> > Ok, I will try it again using the recent build, and will update the result.
> > 
> > 
> > 
> > 
> > Best regards
> > Liu Nana
> 
> Liu: Just a guess, but how big are your VMs?  What's the command line you're
> using for the qemu?


About 6 VMs, including Windows guests and RHEL guests; I installed them automatically with avocado.

And I can still reproduce this bug with the latest kernel during the first VM installation (RHEL 9.0) this time. Please see 'QEMU command line [1]' below; the guest installed successfully.


Test Environments:
    amd-milan-07.khw1.lab.eng.bos.redhat.com
    5.14.0-0.rc4.35.el9.x86_64+debug
    qemu-kvm-6.0.0-10.el9.x86_64


QEMU command line [1]
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc,memory-backend=mem-machine_mem  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -m 105472 \
    -object memory-backend-ram,size=105472M,id=mem-machine_mem  \
    -smp 128,maxcpus=128,cores=64,threads=1,dies=1,sockets=2  \
    -cpu 'EPYC-Milan',+kvm_pv_unhalt \
    -chardev socket,path=/tmp/avocado_w2tu9r_3/monitor-qmpmonitor1-20210803-225107-89L3F5Lz,wait=off,server=on,id=qmp_id_qmpmonitor1  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,path=/tmp/avocado_w2tu9r_3/monitor-catch_monitor-20210803-225107-89L3F5Lz,wait=off,server=on,id=qmp_id_catch_monitor  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id6fT7W7 \
    -chardev socket,path=/tmp/avocado_w2tu9r_3/serial-serial0-20210803-225107-89L3F5Lz,wait=off,server=on,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20210803-225107-89L3F5Lz,path=/tmp/avocado_w2tu9r_3/seabios-20210803-225107-89L3F5Lz,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20210803-225107-89L3F5Lz,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.0x7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0x0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.0x2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.0x4,firstport=4,bus=pci.0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device virtio-net-pci,mac=9a:3a:9a:e5:09:d4,id=idEsbHMX,netdev=idznZlxd,bus=pci.0,addr=0x4  \
    -netdev tap,id=idznZlxd,vhost=on,vhostfd=18,fd=15 \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/linux/RHEL-9.0.0-20210707.2-x86_64-dvd1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
    -blockdev node-name=file_unattended,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64/ks.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_unattended,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_unattended \
    -device scsi-cd,id=unattended,drive=drive_unattended,write-cache=on  \
    -kernel '/home/kvm_autotest_root/images/rhel900-64/vmlinuz'  \
    -append 'inst.sshd ksdevice=link inst.repo=cdrom inst.ks=cdrom:/ks.cfg nicdelay=60 biosdevname=0 net.ifnames=0 console=ttyS0,115200 console=tty0'  \
    -initrd '/home/kvm_autotest_root/images/rhel900-64/initrd.img'  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=d,strict=off  \
    -no-shutdown \
    -enable-kvm



error log:

[  671.917354] ==================================================================
[  671.924778] BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  671.933976] Read of size 8 at addr ffffc90010fe75e0 by task qemu-kvm/4844
[  671.940759] 
[  671.942262] CPU: 58 PID: 4844 Comm: qemu-kvm Not tainted 5.14.0-0.rc4.35.el9.x86_64+debug #1
[  671.950700] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0092C 11/03/2020
[  671.958787] Call Trace:
[  671.961241]  dump_stack_lvl+0x57/0x7d
[  671.965922]  print_address_description.constprop.0+0x1f/0x140
[  671.971674]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  671.977369]  __kasan_report.cold+0x7f/0x11e
[  671.981559]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  671.987251]  kasan_report+0x38/0x50
[  671.990741]  kasan_check_range+0xf5/0x1d0
[  671.994754]  kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
[  672.001289]  kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm]
[  672.007165]  ? inject_pending_event+0x1080/0x1080 [kvm]
[  672.012421]  ioapic_write_indirect+0x59f/0x9e0 [kvm]
[  672.017414]  ? static_obj+0x40/0xc0
[  672.020911]  ? __lock_acquired+0x1d2/0x8c0
[  672.025009]  ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]
[  672.031612]  ? __lock_contended+0x910/0x910
[  672.035798]  ? do_raw_spin_trylock+0xb5/0x180
[  672.040163]  ? ioapic_mmio_write+0xe9/0x1e0 [kvm]
[  672.044902]  ioapic_mmio_write+0xff/0x1e0 [kvm]
[  672.049468]  __kvm_io_bus_write+0x1d1/0x450 [kvm]
[  672.054203]  kvm_io_bus_write+0xfe/0x1d0 [kvm]
[  672.058677]  ? check_prev_add+0x20f0/0x20f0
[  672.063549]  ? __bpf_trace_kvm_test_age_hva+0xb0/0xb0 [kvm]
[  672.069166]  write_mmio+0x13b/0x3a0 [kvm]
[  672.073218]  emulator_read_write_onepage+0x167/0x4b0 [kvm]
[  672.078736]  ? vcpu_mmio_gva_to_gpa+0x5b0/0x5b0 [kvm]
[  672.083811]  ? decode_register+0xf1/0x400 [kvm]
[  672.088369]  ? fetch_possible_mmx_operand.part.0+0x120/0x120 [kvm]
[  672.095501]  emulator_read_write+0x157/0x550 [kvm]
[  672.100331]  ? decode_operand+0x9a9/0x2920 [kvm]
[  672.104996]  segmented_write.isra.0+0xc9/0x110 [kvm]
[  672.109993]  ? segmented_read.isra.0+0x380/0x380 [kvm]
[  672.115165]  writeback+0x6a5/0x8c0 [kvm]
[  672.119119]  ? emulator_task_switch+0x2b0/0x2b0 [kvm]
[  672.124196]  ? em_rdmsr+0x420/0x420 [kvm]
[  672.129147]  x86_emulate_insn+0x1a0c/0x3cf0 [kvm]
[  672.133888]  ? ept_invlpg+0xc0/0xc0 [kvm]
[  672.137932]  ? rcu_read_unlock+0x40/0x40
[  672.141864]  x86_emulate_instruction+0x5e5/0x1190 [kvm]
[  672.147129]  vcpu_enter_guest+0x1af3/0x3ac0 [kvm]
[  672.151861]  ? lock_acquire+0x1ca/0x570
[  672.155702]  ? kvm_vcpu_reload_apic_access_page+0x50/0x50 [kvm]
[  672.162530]  ? rcu_read_unlock+0x40/0x40
[  672.166453]  ? mark_lock_irq+0xda0/0xda0
[  672.170371]  ? __mutex_lock+0xb77/0x1170
[  672.174298]  ? mark_lock+0xd3/0xae0
[  672.177794]  ? kvm_get_linear_rip+0x12c/0x260 [kvm]
[  672.182710]  ? vcpu_run+0x144/0x7f0 [kvm]
[  672.186751]  vcpu_run+0x144/0x7f0 [kvm]
[  672.191567]  kvm_arch_vcpu_ioctl_run+0x23d/0xf40 [kvm]
[  672.196854]  kvm_vcpu_ioctl+0x42c/0xb20 [kvm]
[  672.201243]  ? __bpf_trace_kvm_age_hva+0xe0/0xe0 [kvm]
[  672.206410]  ? __lock_release+0x494/0xa40
[  672.210424]  ? lock_downgrade+0x110/0x110
[  672.214432]  ? __lock_contended+0x4de/0x910
[  672.218620]  ? selinux_inode_getsecctx+0x80/0x80
[  672.224286]  ? lock_acquire+0x80/0x570
[  672.228044]  ? __fget_files+0x189/0x2f0
[  672.231887]  ? security_file_ioctl+0x50/0x90
[  672.236164]  __x64_sys_ioctl+0x127/0x190
[  672.240090]  do_syscall_64+0x3b/0x90
[  672.243666]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  672.248719] RIP: 0033:0x7f648c3253eb
[  672.252297] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 2a 0f 00 f7 d8 64 89 01 48
[  672.271989] RSP: 002b:00007f64889f3548 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  672.279555] RAX: ffffffffffffffda RBX: 00005613af6071e0 RCX: 00007f648c3253eb
[  672.286688] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
[  672.294719] RBP: 00007f648cc06000 R08: 00005613acedd210 R09: 00000000000000ff
[  672.301852] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
[  672.308984] R13: 0000000000000001 R14: 00000000000003f9 R15: 0000000000000000
[  672.316124] 
[  672.317616] 
[  672.319109] addr ffffc90010fe75e0 is located in stack of task qemu-kvm/4844 at offset 40 in frame:
[  672.328781]  ioapic_write_indirect+0x0/0x9e0 [kvm]
[  672.333609] 
[  672.335106] this frame has 2 objects:
[  672.338774]  [32, 40) 'vcpu_bitmap'
[  672.338776]  [64, 88) 'irq'
[  672.342265] 
[  672.346547] Memory state around the buggy address:
[  672.351341]  ffffc90010fe7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1
[  672.359382]  ffffc90010fe7500: f1 f1 f1 00 f3 f3 f3 00 00 00 00 00 00 00 00 00
[  672.366600] >ffffc90010fe7580: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00
[  672.373821]                                                        ^
[  672.380171]  ffffc90010fe7600: 00 00 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
[  672.388164]  ffffc90010fe7680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  672.395381] ==================================================================
[  672.402595] Disabling lock debugging due to kernel taint
[  673.811651] hrtimer: interrupt took 722589 ns



Host kernel line info:
# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-0.rc4.35.el9.x86_64+debug root=/dev/mapper/rhel_amd--milan--07-root ro resume=/dev/mapper/rhel_amd--milan--07-swap rd.lvm.lv=rhel_amd-milan-07/root rd.lvm.lv=rhel_amd-milan-07/swap console=ttyS0,115200n81 crashkernel=auto



Hi, would you please help to check this? Thanks.



Best regards
Liu Nana

Comment 8 Bandan Das 2021-08-04 22:19:07 UTC
(In reply to liunana from comment #7)
> (In reply to Dr. David Alan Gilbert from comment #6)
> > (In reply to liunana from comment #5)
> > > (In reply to Bandan Das from comment #4)
> > > > I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on
> > > > dell-per6525-01.dell2.lab.eng.bos.redhat.com which (I think) is a Zen3. 
> > > > I see no trace launching a guest.
> > > 
> > > It occurs in the machine's use, I didn't meet the issue at first while doing
> > > my test.
> > > But when I met it once, it is easy to reproduce.
> > > 
> > > 
> > > Is there a specific guest config that
> > > > causes this ? Could you also try a more recent build ?
> > > 
> > > Ok, I will try it again using the recent build, and will update the result.
> > > 
> > > 
> > > 
> > > 
> > > Best regards
> > > Liu Nana
> > 
> > Liu: Just a guess, but how big are your VMs?  What's the command line you're
> > using for the qemu?
> 
> 
> About 6 VMs, including windows guests and RHEL guests. I installed them with
> avocado automatically.
> 
Thanks for confirming that it's still there.
I need to reproduce this on my setup.

Can you either:
- give me instructions on how you set this up using avocado, or
- give me setup instructions using qemu? I know you posted the qemu command line, but is that
enough to reproduce? Do I have to run 6 guests to reproduce this? Should the guests be idle, or should I run
something for the trace to occur?


Dave, you asked about large guests. Are you aware of any known issue with Zen3 and the address
sanitizer?

> And I still can reproduce this bug with the latest kernel at the first VM
> installation (RHEL.9.0) this time. Please check the command line follows
> 'QEMU command line [1]'. And the guest is installed successfully.
> 
> 
> Test Environments:
>     amd-milan-07.khw1.lab.eng.bos.redhat.com
>     5.14.0-0.rc4.35.el9.x86_64+debug
>     qemu-kvm-6.0.0-10.el9.x86_64
> 
> 
> QEMU command line [1]
> /usr/libexec/qemu-kvm \
>     -S  \
>     -name 'avocado-vt-vm1'  \
>     -sandbox on  \
>     -machine pc,memory-backend=mem-machine_mem  \
>     -nodefaults \
>     -device VGA,bus=pci.0,addr=0x2 \
>     -m 105472 \
>     -object memory-backend-ram,size=105472M,id=mem-machine_mem  \
>     -smp 128,maxcpus=128,cores=64,threads=1,dies=1,sockets=2  \
>     -cpu 'EPYC-Milan',+kvm_pv_unhalt \
>     -chardev
> socket,path=/tmp/avocado_w2tu9r_3/monitor-qmpmonitor1-20210803-225107-
> 89L3F5Lz,wait=off,server=on,id=qmp_id_qmpmonitor1  \
>     -mon chardev=qmp_id_qmpmonitor1,mode=control \
>     -chardev
> socket,path=/tmp/avocado_w2tu9r_3/monitor-catch_monitor-20210803-225107-
> 89L3F5Lz,wait=off,server=on,id=qmp_id_catch_monitor  \
>     -mon chardev=qmp_id_catch_monitor,mode=control \
>     -device pvpanic,ioport=0x505,id=id6fT7W7 \
>     -chardev
> socket,path=/tmp/avocado_w2tu9r_3/serial-serial0-20210803-225107-89L3F5Lz,
> wait=off,server=on,id=chardev_serial0 \
>     -device isa-serial,id=serial0,chardev=chardev_serial0  \
>     -chardev
> socket,id=seabioslog_id_20210803-225107-89L3F5Lz,path=/tmp/avocado_w2tu9r_3/
> seabios-20210803-225107-89L3F5Lz,server=on,wait=off \
>     -device
> isa-debugcon,chardev=seabioslog_id_20210803-225107-89L3F5Lz,iobase=0x402 \
>     -device ich9-usb-ehci1,id=usb1,addr=0x1d.0x7,multifunction=on,bus=pci.0 \
>     -device
> ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0x0,
> firstport=0,bus=pci.0 \
>     -device
> ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.0x2,
> firstport=2,bus=pci.0 \
>     -device
> ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.0x4,
> firstport=4,bus=pci.0 \
>     -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
>     -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 \
>     -blockdev
> node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,
> aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.
> qcow2,cache.direct=on,cache.no-flush=off \
>     -blockdev
> node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-
> flush=off,file=file_image1 \
>     -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
>     -device
> virtio-net-pci,mac=9a:3a:9a:e5:09:d4,id=idEsbHMX,netdev=idznZlxd,bus=pci.0,
> addr=0x4  \
>     -netdev tap,id=idznZlxd,vhost=on,vhostfd=18,fd=15 \
>     -blockdev
> node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,
> filename=/home/kvm_autotest_root/iso/linux/RHEL-9.0.0-20210707.2-x86_64-dvd1.
> iso,cache.direct=on,cache.no-flush=off \
>     -blockdev
> node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-
> flush=off,file=file_cd1 \
>     -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
>     -blockdev
> node-name=file_unattended,driver=file,auto-read-only=on,discard=unmap,
> aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64/ks.iso,cache.
> direct=on,cache.no-flush=off \
>     -blockdev
> node-name=drive_unattended,driver=raw,read-only=on,cache.direct=on,cache.no-
> flush=off,file=file_unattended \
>     -device scsi-cd,id=unattended,drive=drive_unattended,write-cache=on  \
>     -kernel '/home/kvm_autotest_root/images/rhel900-64/vmlinuz'  \
>     -append 'inst.sshd ksdevice=link inst.repo=cdrom inst.ks=cdrom:/ks.cfg
> nicdelay=60 biosdevname=0 net.ifnames=0 console=ttyS0,115200 console=tty0'  \
>     -initrd '/home/kvm_autotest_root/images/rhel900-64/initrd.img'  \
>     -vnc :0  \
>     -rtc base=utc,clock=host,driftfix=slew  \
>     -boot menu=off,order=cdn,once=d,strict=off  \
>     -no-shutdown \
>     -enable-kvm
> 
> 
> 
> error log:
> 
> [  671.917354]
> ==================================================================
> [  671.924778] BUG: KASAN: stack-out-of-bounds in
> kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
> [  671.933976] Read of size 8 at addr ffffc90010fe75e0 by task qemu-kvm/4844
> [  671.940759] 
> [  671.942262] CPU: 58 PID: 4844 Comm: qemu-kvm Not tainted
> 5.14.0-0.rc4.35.el9.x86_64+debug #1
> [  671.950700] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS
> RYM0092C 11/03/2020
> [  671.958787] Call Trace:
> [  671.961241]  dump_stack_lvl+0x57/0x7d
> [  671.965922]  print_address_description.constprop.0+0x1f/0x140
> [  671.971674]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
> [  671.977369]  __kasan_report.cold+0x7f/0x11e
> [  671.981559]  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
> [  671.987251]  kasan_report+0x38/0x50
> [  671.990741]  kasan_check_range+0xf5/0x1d0
> [  671.994754]  kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
> [  672.001289]  kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm]
> [  672.007165]  ? inject_pending_event+0x1080/0x1080 [kvm]
> [  672.012421]  ioapic_write_indirect+0x59f/0x9e0 [kvm]
> [  672.017414]  ? static_obj+0x40/0xc0
> [  672.020911]  ? __lock_acquired+0x1d2/0x8c0
> [  672.025009]  ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]
> [  672.031612]  ? __lock_contended+0x910/0x910
> [  672.035798]  ? do_raw_spin_trylock+0xb5/0x180
> [  672.040163]  ? ioapic_mmio_write+0xe9/0x1e0 [kvm]
> [  672.044902]  ioapic_mmio_write+0xff/0x1e0 [kvm]
> [  672.049468]  __kvm_io_bus_write+0x1d1/0x450 [kvm]
> [  672.054203]  kvm_io_bus_write+0xfe/0x1d0 [kvm]
> [  672.058677]  ? check_prev_add+0x20f0/0x20f0
> [  672.063549]  ? __bpf_trace_kvm_test_age_hva+0xb0/0xb0 [kvm]
> [  672.069166]  write_mmio+0x13b/0x3a0 [kvm]
> [  672.073218]  emulator_read_write_onepage+0x167/0x4b0 [kvm]
> [  672.078736]  ? vcpu_mmio_gva_to_gpa+0x5b0/0x5b0 [kvm]
> [  672.083811]  ? decode_register+0xf1/0x400 [kvm]
> [  672.088369]  ? fetch_possible_mmx_operand.part.0+0x120/0x120 [kvm]
> [  672.095501]  emulator_read_write+0x157/0x550 [kvm]
> [  672.100331]  ? decode_operand+0x9a9/0x2920 [kvm]
> [  672.104996]  segmented_write.isra.0+0xc9/0x110 [kvm]
> [  672.109993]  ? segmented_read.isra.0+0x380/0x380 [kvm]
> [  672.115165]  writeback+0x6a5/0x8c0 [kvm]
> [  672.119119]  ? emulator_task_switch+0x2b0/0x2b0 [kvm]
> [  672.124196]  ? em_rdmsr+0x420/0x420 [kvm]
> [  672.129147]  x86_emulate_insn+0x1a0c/0x3cf0 [kvm]
> [  672.133888]  ? ept_invlpg+0xc0/0xc0 [kvm]
> [  672.137932]  ? rcu_read_unlock+0x40/0x40
> [  672.141864]  x86_emulate_instruction+0x5e5/0x1190 [kvm]
> [  672.147129]  vcpu_enter_guest+0x1af3/0x3ac0 [kvm]
> [  672.151861]  ? lock_acquire+0x1ca/0x570
> [  672.155702]  ? kvm_vcpu_reload_apic_access_page+0x50/0x50 [kvm]
> [  672.162530]  ? rcu_read_unlock+0x40/0x40
> [  672.166453]  ? mark_lock_irq+0xda0/0xda0
> [  672.170371]  ? __mutex_lock+0xb77/0x1170
> [  672.174298]  ? mark_lock+0xd3/0xae0
> [  672.177794]  ? kvm_get_linear_rip+0x12c/0x260 [kvm]
> [  672.182710]  ? vcpu_run+0x144/0x7f0 [kvm]
> [  672.186751]  vcpu_run+0x144/0x7f0 [kvm]
> [  672.191567]  kvm_arch_vcpu_ioctl_run+0x23d/0xf40 [kvm]
> [  672.196854]  kvm_vcpu_ioctl+0x42c/0xb20 [kvm]
> [  672.201243]  ? __bpf_trace_kvm_age_hva+0xe0/0xe0 [kvm]
> [  672.206410]  ? __lock_release+0x494/0xa40
> [  672.210424]  ? lock_downgrade+0x110/0x110
> [  672.214432]  ? __lock_contended+0x4de/0x910
> [  672.218620]  ? selinux_inode_getsecctx+0x80/0x80
> [  672.224286]  ? lock_acquire+0x80/0x570
> [  672.228044]  ? __fget_files+0x189/0x2f0
> [  672.231887]  ? security_file_ioctl+0x50/0x90
> [  672.236164]  __x64_sys_ioctl+0x127/0x190
> [  672.240090]  do_syscall_64+0x3b/0x90
> [  672.243666]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  672.248719] RIP: 0033:0x7f648c3253eb
> [  672.252297] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89
> e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48>
> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 2a 0f 00 f7 d8 64 89 01 48
> [  672.271989] RSP: 002b:00007f64889f3548 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [  672.279555] RAX: ffffffffffffffda RBX: 00005613af6071e0 RCX:
> 00007f648c3253eb
> [  672.286688] RDX: 0000000000000000 RSI: 000000000000ae80 RDI:
> 0000000000000019
> [  672.294719] RBP: 00007f648cc06000 R08: 00005613acedd210 R09:
> 00000000000000ff
> [  672.301852] R10: 0000000000000001 R11: 0000000000000246 R12:
> 0000000000000001
> [  672.308984] R13: 0000000000000001 R14: 00000000000003f9 R15:
> 0000000000000000
> [  672.316124] 
> [  672.317616] 
> [  672.319109] addr ffffc90010fe75e0 is located in stack of task
> qemu-kvm/4844 at offset 40 in frame:
> [  672.328781]  ioapic_write_indirect+0x0/0x9e0 [kvm]
> [  672.333609] 
> [  672.335106] this frame has 2 objects:
> [  672.338774]  [32, 40) 'vcpu_bitmap'
> [  672.338776]  [64, 88) 'irq'
> [  672.342265] 
> [  672.346547] Memory state around the buggy address:
> [  672.351341]  ffffc90010fe7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 f1
> [  672.359382]  ffffc90010fe7500: f1 f1 f1 00 f3 f3 f3 00 00 00 00 00 00 00
> 00 00
> [  672.366600] >ffffc90010fe7580: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2
> f2 00
> [  672.373821]                                                        ^
> [  672.380171]  ffffc90010fe7600: 00 00 f3 f3 f3 f3 f3 00 00 00 00 00 00 00
> 00 00
> [  672.388164]  ffffc90010fe7680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> [  672.395381]
> ==================================================================
> [  672.402595] Disabling lock debugging due to kernel taint
> [  673.811651] hrtimer: interrupt took 722589 ns
> 
> 
> 
> Host kernel command line info:
> # cat /proc/cmdline 
> BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-0.rc4.35.el9.x86_64+debug
> root=/dev/mapper/rhel_amd--milan--07-root ro
> resume=/dev/mapper/rhel_amd--milan--07-swap rd.lvm.lv=rhel_amd-milan-07/root
> rd.lvm.lv=rhel_amd-milan-07/swap console=ttyS0,115200n81 crashkernel=auto
> 
> 
> 
> Hi, would you please help to check this? Thanks.
> 
> 
> 
> Best regards
> Liu Nana

Comment 9 Dr. David Alan Gilbert 2021-08-05 08:48:53 UTC
(In reply to Bandan Das from comment #8)

> Dave, you asked about large guests. Are you aware of any known issue with
> Zen3 with the address
> sanitizer ?

No I'm not; but since the backtrace showed 'kvm_make_vcpus_request_mask' I wondered if
that mask was dependent on number of vcpus and something was going wrong in the size
calculations or numbering there.

Dave

Comment 10 liunana 2021-08-18 07:05:08 UTC
(In reply to Bandan Das from comment #8)
> (In reply to liunana from comment #7)
> > (In reply to Dr. David Alan Gilbert from comment #6)
> > > (In reply to liunana from comment #5)
> > > > (In reply to Bandan Das from comment #4)
> > > > > I tried this with kernel-5.14.0-0.rc2.23.el9.x86_64 on
> > > > > dell-per6525-01.dell2.lab.eng.bos.redhat.com which (I think) is a Zen3. 
> > > > > I see no trace launching a guest.
> > > > 
> > > > It occurred during normal use of the machine; I didn't hit the issue at
> > > > first while running my tests, but once I had hit it, it was easy to
> > > > reproduce.
> > > > 
> > > > 
> > > > > Is there a specific guest config that
> > > > > causes this? Could you also try a more recent build?
> > > > 
> > > > OK, I will try it again using a recent build and will update the result.
> > > > 
> > > > 
> > > > 
> > > > 
> > > > Best regards
> > > > Liu Nana
> > > 
> > > Liu: Just a guess, but how big are your VMs?  What's the command line you're
> > > using for the qemu?
> > 
> > 
> > About 6 VMs, including Windows and RHEL guests. I installed them
> > automatically with avocado.
> > 
> Thanks for confirming that it's still there.
> I need to reproduce this on my setup. 
> 

Hi, sorry for the late reply; the Milan machine has been in constant use.


> Can you either 
> - give me instructions on how you set this up using avocado ?
> Or
> - Give me setup instructions using qemu ? I know you posted the qemu command
> line but is that 
> enough to reproduce ? 

Would you please try the debug kernel packages? It is easy to reproduce this bug with them, and it seems I can only reproduce it with the debug kernel.


Test Env:
# rpm -qa | grep kernel
kernel-tools-libs-5.14.0-0.rc4.35.el9.x86_64
kernel-core-5.14.0-0.rc4.35.el9.x86_64
kernel-modules-5.14.0-0.rc4.35.el9.x86_64
kernel-5.14.0-0.rc4.35.el9.x86_64
kernel-tools-5.14.0-0.rc4.35.el9.x86_64
kernel-headers-5.14.0-0.rc4.35.el9.x86_64
kernel-srpm-macros-1.0-7.el9.noarch
kernel-debug-core-5.14.0-0.rc4.35.el9.x86_64
kernel-debug-modules-5.14.0-0.rc4.35.el9.x86_64
kernel-debug-devel-5.14.0-0.rc4.35.el9.x86_64
kernel-debug-5.14.0-0.rc4.35.el9.x86_64



Reproduction steps:
1. Install the kernel-debug packages, then reboot and choose the kernel '5.14.0-0.rc4.35.el9.x86_64+debug'.

2. Boot a guest; you will soon see the call trace on the console, or you can check the dmesg log. I guess you can also reproduce this bug with your own boot script.

This is a simple qemu command line:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc \
    -nodefaults \
    -m 104448 \
    -smp 128,maxcpus=128,cores=64,threads=1,dies=1,sockets=2  \
    -cpu 'EPYC-Milan',+kvm_pv_unhalt \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x3 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -vnc :0  \
    -enable-kvm \
    -monitor stdio




> Do I have to run 6 guests to reproduce this?

No need; I can reproduce this bug easily at the guest's first boot with the new kernel.


> Should the guests be idle or should I run
> something for the trace to occur ?

No need. I just boot a guest after rebooting the host.

Would you please check whether these steps reproduce the bug? Thanks.



Best regards
Liu Nana

Comment 11 Dr. David Alan Gilbert 2021-08-18 13:12:05 UTC
Yep, I can reproduce that on another box (amd-epyc3-milan-7713-2s.tpb):

/usr/libexec/qemu-kvm -sandbox on -machine pc -nodefaults -m 200G -smp 128,maxcpus=128,cores=64,threads=1,dies=1,sockets=2  -cpu 'EPYC-Milan',+kvm_pv_unhalt -drive if=virtio,file=./rhel-guest-image-8.6-157.x86_64.qcow2 -nographic -enable-kvm

For me it's fine at -smp 32 or 64, but 128 triggers it.

Comment 12 Dr. David Alan Gilbert 2021-08-18 16:24:18 UTC
I'm pretty sure the problem here is that kvm_make_vcpus_request_mask is being called with vcpu_bitmap being a single long on the stack 
of ioapic_write_indirect, and kvm_make_vcpus_request_mask has a:

        kvm_for_each_vcpu(i, vcpu, kvm) {
                if ((vcpu_bitmap && !test_bit(i, vcpu_bitmap)) ||
                    vcpu == except)
                        continue;

which dutifully tests all 128 bits of the 64-bit vcpu_bitmap;
I can see the 'i' index go over 100 in pr_info.

Comment 13 Vitaly Kuznetsov 2021-08-19 08:43:02 UTC
While it seems that ioapic_write_indirect() can't set bits above 64, it is
indeed illegal to call kvm_make_vcpus_request_mask() with a truncated vcpu mask,
as we don't pass a 'length' parameter (so KVM_MAX_VCPUS bits are assumed). The
following (compile- and smoke-tested only) should help, I believe:

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index ff005fe738a4..58829358224c 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -319,7 +319,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
        unsigned index;
        bool mask_before, mask_after;
        union kvm_ioapic_redirect_entry *e;
-       unsigned long vcpu_bitmap;
+       unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
        int old_remote_irr, old_delivery_status, old_dest_id, old_dest_mode;
 
        switch (ioapic->ioregsel) {
@@ -384,9 +384,9 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
                        irq.shorthand = APIC_DEST_NOSHORT;
                        irq.dest_id = e->fields.dest_id;
                        irq.msi_redir_hint = false;
-                       bitmap_zero(&vcpu_bitmap, 16);
+                       bitmap_zero(vcpu_bitmap, 16);
                        kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq,
-                                                &vcpu_bitmap);
+                                                vcpu_bitmap);
                        if (old_dest_mode != e->fields.dest_mode ||
                            old_dest_id != e->fields.dest_id) {
                                /*
@@ -399,10 +399,10 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
                                    kvm_lapic_irq_dest_mode(
                                        !!e->fields.dest_mode);
                                kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq,
-                                                        &vcpu_bitmap);
+                                                        vcpu_bitmap);
                        }
                        kvm_make_scan_ioapic_request_mask(ioapic->kvm,
-                                                         &vcpu_bitmap);
+                                                         vcpu_bitmap);
                } else {
                        kvm_make_scan_ioapic_request(ioapic->kvm);
                }

Comment 14 Dr. David Alan Gilbert 2021-08-19 09:48:40 UTC
Hmm that'll get quite big these days I guess.
Is there a reason that bitmap_zero doesn't have to cover the whole of the bitmap?

Comment 15 Vitaly Kuznetsov 2021-08-19 12:26:09 UTC
(In reply to Dr. David Alan Gilbert from comment #14)
> Hmm that'll get quite big these days I guess.

We need 1 bit per vCPU and KVM_MAX_VCPUS is 288 upstream and 2048 downstream
so we will need 256 bytes max. We can live with that I guess.

> Is there a reason that bitmap_zero doesn't have to cover the whole of the
> bitmap?

Nitesh (Cc:) should know exactly:

commit 7ee30bc132c683d06a6d9e360e39e483e3990708
Author: Nitesh Narayan Lal <nitesh>
Date:   Thu Nov 7 07:53:43 2019 -0500

    KVM: x86: deliver KVM IOAPIC scan request to target vCPUs

commit 9a2ae9f6b6bbd3ef05d5e5977ace854e9b8f04b5
Author: Nitesh Narayan Lal <nitesh>
Date:   Wed Nov 20 07:12:24 2019 -0500

    KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap

Comment 16 Nitesh Narayan Lal 2021-08-19 14:12:11 UTC
(In reply to Vitaly Kuznetsov from comment #15)
> (In reply to Dr. David Alan Gilbert from comment #14)
> > Hmm that'll get quite big these days I guess.
> 
> We need 1 bit per vCPU and KVM_MAX_VCPUS is 288 upstream and 2048 downstream
> so we will need 256 bytes max. We can live with that I guess.
> 
> > Is there a reason that bitmap_zero doesn't have to cover the whole of the
> > bitmap?
> 
> Nitesh (Cc:) should know exactly:
> 

Thanks, Vitaly for adding me.
I agree, it looks like the code should have been using KVM_MAX_VCPUS in the
first place. I will dig further into the code to see if I can share any
other useful information.

Comment 17 Vitaly Kuznetsov 2021-08-19 14:32:27 UTC
Thanks, Nitesh! I suspect kvm_bitmap_or_dest_vcpus() has the same issue as it passes
unsigned long bitmap to kvm_apic_map_get_dest_lapic. I think we should enlarge it too.
(I didn't spend much time on the code yet though, may be wrong)

Comment 19 Dr. David Alan Gilbert 2021-08-19 15:37:47 UTC
I reproduced it on an older host; as expected, it also triggers there, so it's not Milan-specific.
(Not even sure it's AMD-specific.)

Comment 20 Nitesh Narayan Lal 2021-08-19 20:47:56 UTC
(In reply to Vitaly Kuznetsov from comment #18)
> Thanks, Nitesh! I suspect kvm_bitmap_or_dest_vcpus() has the same issue as
> it passes
> unsigned long bitmap to kvm_apic_map_get_dest_lapic. I think we should
> enlarge it too.
> (I didn't spend much time on the code yet though, may be wrong)

Yes, that should also be fixed along with its first for_each_set_bit loop.

I think the same change should also be made in kvm_irq_delivery_to_apic_fast
and kvm_intr_is_single_vcpu_fast, isn't it?

Thanks!

Comment 21 Vitaly Kuznetsov 2021-08-20 08:24:37 UTC
(In reply to Nitesh Narayan Lal from comment #20)
> 
> I think the same change should also be made in kvm_irq_delivery_to_apic_fast
> and kvm_intr_is_single_vcpu_fast, isn't it?

kvm_apic_map_get_logical_dest() doesn't seem to set more than 16 bits in 'bitmap'
(and its interface says "u16 *bitmap"); the same goes for kvm_apic_map_get_dest_lapic(),
so kvm_irq_delivery_to_apic_fast() and kvm_intr_is_single_vcpu_fast() should be 
safe, I believe. I may have missed something, of course...

Comment 22 Vitaly Kuznetsov 2021-08-20 12:46:07 UTC
I've sent https://lore.kernel.org/kvm/20210820124354.582222-2-vkuznets@redhat.com/ upstream.

Comment 28 liunana 2021-11-08 05:50:21 UTC
Hi Vitaly,


I can't reproduce this bug without the debug kernel, so it seems I need to pre-verify this with the debug kernel.

But I didn't see the debug kernel packages in the above MR links (Comment 24 & Comment 25); could you help check this?

Thanks.



Best regards
Liu Nana

Comment 36 liunana 2021-11-17 03:29:04 UTC
Test Env:
   kernel-5.14.0-14.el9.x86_64+debug
   qemu-kvm-6.1.0-6.el9.x86_64
   amd-milan-04.khw1.lab.eng.bos.redhat.com


Test scenarios:
1. Installation of RHEL9/Win11/Win2022 guests: PASS
2. sanity test of cpu model: PASS


Existing bugs: Bug 1959421 - [RHEL9]Host hung with log "BUG: soft lockup - CPU#29 stuck for 22s! [systemd:1]" on Milan machine



Hi Vitaly,


Would you please help take a look at the existing bug? 
This is the latest comment, which was reproduced while testing the current bug: https://bugzilla.redhat.com/show_bug.cgi?id=1959421#c14
If that's unrelated to this bug, I think we can move this bug to VERIFIED.
Thanks!



Best regards
Liu Nana

Comment 37 liunana 2021-11-17 08:29:52 UTC
There is another existing issue: Bug 2024063 - [RHEL.9.0.0] Host outputs Call Trace messages while booting windows guests on Milan host.


Besides, there is a new KASAN error now. 

I booted a Windows 2019 guest with huge memory and then rebooted it.
The VM then got stuck and the host reported a call trace.

But I have only met this issue once.


After sending a quit command to the VM, qemu reports an error log:
(qemu) q
qemu:cpus_kick_thread: Invalid argument[60276.403712] switch: port 2(tap0) entered disabled state


This seems to be a new issue, so I created a bug to track it: Bug 2024058 - [RHEL9] KASAN: null-ptr-deref in range [0x0000000000000130-0x0000000000000137] 

Could you please also help check whether the above two bugs are related to the current one?
If not, we can move this bug to VERIFIED.

Thanks!


Best
Liu Nana

Comment 38 Dr. David Alan Gilbert 2021-11-17 09:17:48 UTC
This one is always in kvm_make_vcpus_request_mask, whereas 2024058 is a scarier one in gup_pte_range.

Both 2024058 and 2024063 complain a lot about 'cacheline tracking' - the first time I've seen that.

Comment 39 Vitaly Kuznetsov 2021-11-18 12:54:16 UTC
(In reply to liunana from comment #36)
> 
> Would you please help to take a look of the existed bug? 
> This is the latest comment Which is reproduced while testing the current
> bug: https://bugzilla.redhat.com/show_bug.cgi?id=1959421#c14


(In reply to liunana from comment #37)
> 
> Seems this is a new issue, I create one bug to track this: Bug 2024058 -
> [RHEL9] KASAN: null-ptr-deref in range
> [0x0000000000000130-0x0000000000000137] 
> 

Thanks for reporting these! From skimming through them I don't think they're 
directly related to this BZ. It would also be great to retest with the KVM rebase
(https://bugzilla.redhat.com/show_bug.cgi?id=2009338).

Comment 44 liunana 2021-11-22 08:46:03 UTC
Moving this BZ to VERIFIED according to Comment 43. Thanks.

Comment 47 errata-xmlrpc 2022-05-17 15:38:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: kernel), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:3907

