Bug 2053526 - Guest hit call trace during reboot after hotplug vdisks + migration
Summary: Guest hit call trace during reboot after hotplug vdisks + migration
Keywords:
Status: CLOSED DUPLICATE of bug 2053584
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-11 13:24 UTC by Li Xiaohui
Modified: 2022-03-30 01:16 UTC
CC List: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-23 16:40:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
Red Hat Issue Tracker RHELPLAN-112032 (last updated 2022-02-11 13:32:39 UTC)

Description Li Xiaohui 2022-02-11 13:24:45 UTC
Description of problem:
Hotplug two vdisks on the source host, then migrate the guest. After migration, the guest hits a call trace on reboot and finally fails to reboot.


Version-Release number of selected component (if applicable):
hosts info: kernel-5.14.0-55.el9.x86_64 & qemu-kvm-6.2.0-7.el9.x86_64
guest info: kernel-4.18.0-364.el8.x86_64


How reproducible:
3/3


Steps to Reproduce:
1. Boot a guest with the QEMU command line below [1].
2. Hot add two disks to the guest:
2.1 Create two images on the src host:
# qemu-img create -f qcow2 /mnt/nfs/data-disk0.qcow2 10G
# qemu-img create -f qcow2 /mnt/nfs/data-disk1.qcow2 20G
2.2 Hot plug the above disks
{"execute": "blockdev-add", "arguments": {"node-name": "drive_d0", "driver": "file", "filename": "//mnt/nfs/data-disk0.qcow2"}, "id": "mq3QNZ1Z"}
{"return": {}, "id": "mq3QNZ1Z"}
{"execute": "blockdev-add", "arguments": {"node-name": "drive_d00", "driver": "qcow2", "file": "drive_d0"}, "id": "CoT6x2aF"}
{"return": {}, "id": "CoT6x2aF"}
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "drive": "drive_d00", "id": "data_d0", "bus": "extra_root1"}, "id": "rytH1O52"}
{"return": {}, "id": "rytH1O52"}
{"execute": "blockdev-add", "arguments": {"node-name": "drive_d1", "driver": "file", "filename": "//mnt/nfs/data-disk1.qcow2"}, "id": "QxilkKL2"}
{"return": {}, "id": "QxilkKL2"}
{"execute": "blockdev-add", "arguments": {"node-name": "drive_d11", "driver": "qcow2", "file": "drive_d1"}, "id": "4bsH4oTs"}
{"return": {}, "id": "4bsH4oTs"}
{"execute": "device_add", "arguments": {"driver": "virtio-scsi-pci", "id": "virtio_scsi_pci1", "bus": "extra_root2"}, "id": "D7vw4TPp"}
{"return": {}, "id": "D7vw4TPp"}
{"execute": "device_add", "arguments": {"driver": "scsi-hd", "drive": "drive_d11", "id": "data_d1", "bus": "virtio_scsi_pci1.0", "channel": "0", "scsi-id": "0", "lun": "1"}, "id": "lLNbK3RU"}
{"return": {}, "id": "lLNbK3RU"}
2.3 Check the hot-plugged disks on the src guest through {"execute": "query-block"}; they were hot-plugged successfully.
3. Boot the guest in '-incoming' mode on the dst host, with the disks added in step 2 included on the command line:
-device virtio-blk-pci,drive=drive_d00,id=data_d0,bus=extra_root1 \
-device virtio-scsi-pci,id=virtio_scsi_pci1,bus=extra_root2 \
-device scsi-hd,drive=drive_d11,id=data_d1,bus=virtio_scsi_pci1.0,channel=0,scsi-id=0,lun=1 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=//mnt/nfs/data-disk0.qcow2,node-name=drive_d0 \
-blockdev driver=qcow2,node-name=drive_d00,file=drive_d0 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=//mnt/nfs/data-disk1.qcow2,node-name=drive_d1 \
-blockdev driver=qcow2,node-name=drive_d11,file=drive_d1 \
4. Migrate the guest from the src host to the dst host; migration succeeds (a sketch of the QMP commands assumed for this step follows the list).
5. Check the vdisks on the dst host through {"execute": "query-block"}; they are present in the guest.
6. Reboot the guest on the dst host.
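
(The exact migration commands for step 4 are not recorded above; below is a minimal sketch of the assumed QMP flow. The destination address and listen port are placeholders, not values from this report: the dst QEMU from step 3 is assumed to listen with e.g. "-incoming tcp:0:4000", and on the src host's QMP monitor:)
{"execute": "migrate", "arguments": {"uri": "tcp:<dst-host-ip>:4000"}}
{"execute": "query-migrate"}
(query-migrate is polled until "status" reports "completed" before continuing with step 5.)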


Actual results:
Guest hits a call trace and fails to reboot after step 6:
2022-02-08-09:10:55: [  *** ] A stop job is running for Monitorin…gress polling (4min 36s / 6min 2s)
2022-02-08-09:10:55: [  368.523176] INFO: task lvm:2975 blocked for more than 120 seconds.
2022-02-08-09:10:55: [  368.525398]       Tainted: G               X --------- -  - 4.18.0-364.el8.x86_64 #1
2022-02-08-09:10:55: [  368.526741] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2022-02-08-09:10:55: [  368.528052] task:lvm             state:D stack:    0 pid: 2975 ppid:     1 flags:0x00004084
2022-02-08-09:10:55: [  368.529246] Call Trace:
2022-02-08-09:10:55: [  368.529847]  __schedule+0x2d1/0x830
2022-02-08-09:10:55: [  368.530775]  ? __switch_to_asm+0x35/0x70
2022-02-08-09:10:55: [  368.531729]  ? __switch_to_asm+0x41/0x70
2022-02-08-09:10:55: [  368.532681]  schedule+0x35/0xa0
2022-02-08-09:10:55: [  368.533496]  schedule_timeout+0x274/0x300
2022-02-08-09:10:55: [  368.534331]  ? __schedule+0x2d9/0x830
2022-02-08-09:10:55: [  368.535009]  ? check_preempt_curr+0x62/0x90
2022-02-08-09:10:55: [  368.535966]  wait_for_completion+0x96/0x100
2022-02-08-09:10:55: [  368.536899]  flush_work+0x11d/0x1c0
2022-02-08-09:10:55: [  368.537776]  ? worker_attach_to_pool+0xb0/0xb0
2022-02-08-09:10:55: [  368.538711]  ? __virtscsi_add_cmd+0x148/0x270 [virtio_scsi]
2022-02-08-09:10:55: [  368.539721]  ? work_busy+0x80/0x80
2022-02-08-09:10:55: [  368.540538]  __cancel_work_timer+0x105/0x190
2022-02-08-09:10:55: [  368.541338]  ? xas_load+0x8/0x80
2022-02-08-09:10:55: [  368.541925]  ? xa_load+0x48/0x90
2022-02-08-09:10:55: [  368.542691]  disk_block_events+0x6f/0x90
2022-02-08-09:10:55: [  368.543523]  blkdev_get+0x187/0x2c0
2022-02-08-09:10:55: [  368.544175]  ? igrab+0x31/0x50
2022-02-08-09:10:55: [  368.544726]  ? bd_acquire+0xd0/0xd0
2022-02-08-09:10:55: [  368.545480]  do_dentry_open+0x132/0x340
2022-02-08-09:10:55: [  368.546092]  path_openat+0x53e/0x14f0
2022-02-08-09:10:55: [  368.546689]  ? blk_mq_request_issue_directly+0x4e/0xb0
2022-02-08-09:10:55: [  368.547575]  do_filp_open+0x93/0x100
2022-02-08-09:10:55: [  368.548244]  ? getname_flags+0x4a/0x1e0
2022-02-08-09:10:55: [  368.548859]  ? __check_object_size+0xa8/0x16b
2022-02-08-09:10:55: [  368.549643]  do_sys_open+0x184/0x220
2022-02-08-09:10:55: [  368.550304]  do_syscall_64+0x5b/0x1a0
2022-02-08-09:10:55: [  368.550866]  entry_SYSCALL_64_after_hwframe+0x65/0xca
2022-02-08-09:10:55: [  368.551699] RIP: 0033:0x7f4608b9e272
2022-02-08-09:10:55: [  368.552378] Code: Unable to access opcode bytes at RIP 0x7f4608b9e248.
2022-02-08-09:10:55: [  368.553255] RSP: 002b:00007ffeb91476b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
2022-02-08-09:10:55: [  368.554208] RAX: ffffffffffffffda RBX: 000055ead6896b80 RCX: 00007f4608b9e272
2022-02-08-09:10:55: [  368.555063] RDX: 0000000000044000 RSI: 000055ead6897768 RDI: 00000000ffffff9c
2022-02-08-09:10:55: [  368.555913] RBP: 00000000ffffffff R08: 000055ead6b0d000 R09: 000055ead6900ce0
2022-02-08-09:10:55: [  368.556930] R10: 0000000000000000 R11: 0000000000000246 R12: 000055ead6897768
2022-02-08-09:10:55: [  368.557915] R13: 0000000000044000 R14: 000055ead6aac720 R15: 00007ffeb9148a00


Expected results:
Guest works well after migration; reboot succeeds.


Additional info:
1. The guest call trace is not hit without migration: hotplug two vdisks on the src host, then reboot the guest on the src host; the guest reboots successfully.
2. Will confirm whether a RHEL 9 guest hits this bug and whether it is a regression.

[1] QEMU command line:
/usr/libexec/qemu-kvm  \
-name "mouse-vm" \
-sandbox on \
-machine q35,memory-backend=pc.ram \
-cpu Icelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,waitpkg=on,rdpid=on,cldemote=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,avx-vnni=on,avx512-bf16=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,kvm_pv_unhalt=on \
-nodefaults  \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,server=on,wait=off \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor,server=on,wait=off \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pcie-root-port,port=0x10,chassis=1,id=root0,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=root1,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=root2,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=root3,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=root4,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=root5,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=root6,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=0x17,chassis=8,id=root7,bus=pcie.0,addr=0x2.0x7 \
-device pcie-root-port,port=0x20,chassis=21,id=extra_root0,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=0x21,chassis=22,id=extra_root1,bus=pcie.0,addr=0x3.0x1 \
-device pcie-root-port,port=0x22,chassis=23,id=extra_root2,bus=pcie.0,addr=0x3.0x2 \
-device nec-usb-xhci,id=usb1,bus=root0,addr=0x0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=root1,addr=0x0 \
-device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0,write-cache=on \
-device virtio-net-pci,mac=9a:8a:8b:8c:8d:8e,id=net0,netdev=tap0,bus=root2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device virtio-balloon-pci,id=balloon0,bus=root3,addr=0x0 \
-device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 \
-blockdev driver=file,auto-read-only=on,discard=unmap,aio=threads,cache.direct=on,cache.no-flush=off,filename=/mnt/nfs/rhel860-64-virtio-scsi.qcow2,node-name=drive_sys1 \
-blockdev driver=qcow2,node-name=drive_image1,read-only=off,cache.direct=on,cache.no-flush=off,file=drive_sys1 \
-netdev tap,id=tap0,vhost=on \
-m 4096 \
-object memory-backend-ram,id=pc.ram,size=4096M \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-vnc :10 \
-rtc base=utc,clock=host \
-boot menu=off,strict=off,order=cdn,once=c \
-enable-kvm  \
-qmp tcp:0:3333,server=on,wait=off \
-qmp tcp:0:9999,server=on,wait=off \
-qmp tcp:0:9888,server=on,wait=off \
-serial tcp:0:4444,server=on,wait=off \
-monitor stdio \
-msg timestamp=on \

Comment 1 Dr. David Alan Gilbert 2022-02-14 11:01:16 UTC
Can you please provide the output of the HMP command 'info pci' from both the source (after hotplugging) and the destination?
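(For reference, 'info pci' can be typed directly at the HMP monitor started by "-monitor stdio" in the command line above, or wrapped through one of the QMP sockets with the standard human-monitor-command; a minimal sketch:)
{"execute": "human-monitor-command", "arguments": {"command-line": "info pci"}}
(Capturing this on the src host after hotplug and on the dst host after migration, then diffing the two dumps, gives the requested comparison.)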

Comment 4 Li Xiaohui 2022-02-16 03:22:06 UTC
The diff of the 'info pci' output on the source (after hotplugging) versus the destination host (after migration):
<       BAR1: 32 bit memory at 0xfd600000 [0xfd600fff].
<       BAR4: 64 bit prefetchable memory at 0xfb200000 [0xfb203fff].
---
>       BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
>       BAR4: 64 bit prefetchable memory at 0xffffffffffffffff [0x00003ffe].
172,173c172,173
<       BAR1: 32 bit memory at 0xfd400000 [0xfd400fff].
<       BAR4: 64 bit prefetchable memory at 0xfb000000 [0xfb003fff].
---
>       BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
>       BAR4: 64 bit prefetchable memory at 0xffffffffffffffff [0x00003ffe].

Comment 5 Igor Mammedov 2022-02-23 16:40:07 UTC

*** This bug has been marked as a duplicate of bug 2053584 ***

