Bug 2099934

Summary:           Guest reboot on destination host after postcopy migration completed
Product:           Red Hat Enterprise Linux 9
Component:         qemu-kvm
Sub component:     Live Migration
Version:           9.1
Target Milestone:  rc
Target Release:    9.1
Hardware:          s390x
OS:                Linux
Status:            CLOSED ERRATA
Severity:          high
Priority:          high
Keywords:          Regression, Triaged
Fixed In Version:  qemu-kvm-7.0.0-10.el9
Doc Type:          No Doc Update
Reporter:          Li Xiaohui <xiaohli>
Assignee:          Thomas Huth <thuth>
QA Contact:        bfu <bfu>
CC:                bfu, chayang, coli, dgilbert, dhorak, fjin, jinzhao, jomiller, juzhang, lijin, mdeng, peterx, quintela, smitterl, thuth, tstaudt, virt-maint, xiaohli
Type:              Bug
Last Closed:       2022-11-15 09:54:42 UTC
Bug Blocks:        2066305
Description
Li Xiaohui
2022-06-22 03:06:50 UTC
(In reply to Li Xiaohui from comment #0)

> Description of problem:
> The guest appears to crash on the destination host after postcopy migration completes.
> I could not get useful console output through automation; I will try to collect it manually.
>
> Version-Release number of selected component (if applicable):
> hosts: kernel-5.14.0-114.el9.s390x & qemu-kvm-7.0.0-6.el9.s390x
> guest: kernel-5.14.0-114.el9.s390x
>
> How reproducible:
> 1/10
>
> Steps to Reproduce:
> 1. Boot a guest on the source host.
> 2. Boot a guest on the destination host with '-incoming defer' appended.
> 3. In the guest, run a program that produces dirty pages:
>    # cd /home && gcc migration_dirtypage_2.c -o dirty_page
>    # cd /home && ./dirty_page > /dev/null &
> 4. Enable the postcopy capability on both the src and dst QEMU, start the migration, and switch to postcopy while the migration is active.
> 5. After the postcopy migration completes, check the guest on the destination host.
>
> Actual results:
> The guest appears to have crashed after postcopy migration completed, since 'dmesg' can no longer be executed in the guest.
>
> Expected results:
> The guest works well on the destination host after postcopy migration completes.
>
> Additional info:
> 1. Will collect guest crash info through the console manually.
> 2. Will check later whether this bug is a regression.
> 3. Will check whether other architectures can reproduce this bug.

Thank you for this, Li! Could you share some more details about which commands you use to run the migration and to start the VM? For CTC1, we tested several live migration scenarios with postcopy and they passed.

(In reply to smitterl from comment #1)
Note, we used the following component versions for the migration testing:
libvirt-8.4.0-2.el9.s390x
qemu-kvm-7.0.0-6.el9.s390x
kernel-5.14.0-106.el9.s390x

I checked the linked Polarion test case; IIUC, you're using hugepage memory backing during the live migration, is that correct? I have not found coverage for that in our test suite. Fangge, do you happen to know if we have a test scenario for this at the libvirt level?

There is a libvirt-level test case: VIRT-17395 - [Migration] Migrate vm with hugepage memory + numa. However, that case has no "produce dirty pages in the vm" step.

Hi Sebastian, I didn't provide the QEMU command line and didn't mention huge pages because I suspected we could reproduce this bug without huge pages.
It turns out my guess was correct. In summary, I can also reproduce this bug with a plain postcopy migration, without huge pages, as in the Description; the reproduction rate is roughly 1/50, and higher (1/10) with hugepages configured.

The QEMU command line is:

/usr/libexec/qemu-kvm \
 -name "mouse-vm" \
 -sandbox on \
 -machine s390-ccw-virtio,memory-backend=mem-machine_mem \
 -cpu host \
 -nodefaults \
 -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,server=on,wait=off \
 -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor,server=on,wait=off \
 -mon chardev=qmp_id_qmpmonitor1,mode=control \
 -mon chardev=qmp_id_catch_monitor,mode=control \
 -device virtio-scsi-ccw,id=virtio_scsi_ccw0,devno=fe.0.0000 \
 -device scsi-hd,id=image1,drive=drive_image1,write-cache=on,bus=virtio_scsi_ccw0.0 \
 -device virtio-net-ccw,mac=9a:fc:6e:3c:a6:f0,id=idE6xm3i,netdev=idnKxBqM,devno=fe.0.0001 \
 -device virtio-mouse-ccw,id=input_mouse1,devno=fe.0.0002 \
 -device virtio-keyboard-ccw,id=input_keyboard1,devno=fe.0.0003 \
 -device virtio-balloon-ccw,id=balloon0,devno=fe.0.0004 \
 -blockdev driver=file,auto-read-only=on,discard=unmap,aio=threads,cache.direct=on,cache.no-flush=off,filename=/mnt/xiaohli/rhel910-s390x-virtio-scsi.qcow2,node-name=drive_sys1 \
 -blockdev driver=qcow2,node-name=drive_image1,read-only=off,cache.direct=on,cache.no-flush=off,file=drive_sys1 \
 -netdev tap,id=idnKxBqM,vhost=on \
 -m 4096 \
 -object memory-backend-ram,id=mem-machine_mem,size=4096M \
 -smp 1,maxcpus=1,cores=1,threads=1,sockets=1 \
 -nographic \
 -rtc base=utc,clock=host,driftfix=slew \
 -boot strict=on \
 -enable-kvm \
 -qmp tcp:0:3333,server=on,wait=off \
 -qmp tcp:0:9999,server=on,wait=off \
 -qmp tcp:0:9888,server=on,wait=off \
 -serial tcp:0:4444,server=on,wait=off \
 -monitor stdio

And migration_dirtypage_2.c is as below:

# cat migration_dirtypage_2.c
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

/* stop dirtying memory after the alarm fires */
void wakeup() { exit(0); }

int main()
{
    signal(SIGALRM, wakeup);
    alarm(120);
    /* 40960 * 4096 bytes = 160 MiB working set */
    char *buf = (char *) calloc(40960, 4096);
    while (1) {
        int i;
        /* touch the buffer with a 1 KiB stride to keep its pages dirty */
        for (i = 0; i < 40960 * 4; i++) {
            buf[i * 4096 / 4]++;
        }
        printf(".");
    }
}

I have tried this case more than 100 times on x86 (RHEL 9.1.0, kernel-5.14.0-110.el9.x86_64 & qemu-kvm-7.0.0-6.el9.x86_64); all runs pass and the bug does not reproduce. So it seems to be an s390x-only issue.

OK, so we need to know whether this is a regression in either qemu or the kernel, or whether it has always been there.

I tried 100 times to reproduce this bug on RHEL 8.7.0 (kernel-4.18.0-402.el8.s390x & qemu-kvm-6.2.0-15.module+el8.7.0+15644+189a21f6.s390x) with the same RHEL 9.1.0 guest, but didn't hit the issue. I did hit a new call trace, at a 1/100 rate, when booting the guest on the source host with huge pages configured. Do we need to open a new bug?

2022-06-23-10:34:30:
[ 243.884523] Call Trace:
[ 243.884534]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.884584]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.884586]  [<000000001ab2f362>] rwsem_down_read_slowpath+0x182/0x400
[ 243.884589]  [<000003ff80283ada>] xlog_cil_commit+0x4a/0x180 [xfs]
[ 243.885033]  [<000003ff8027b56c>] __xfs_trans_commit+0x9c/0x330 [xfs]
[ 243.885106]  [<000003ff8026d966>] xfs_remove+0x276/0x400 [xfs]
[ 243.885178]  [<000003ff802679aa>] xfs_vn_unlink+0x5a/0xb0 [xfs]
[ 243.885248]  [<000000001a57260c>] vfs_rmdir+0x9c/0x200
[ 243.885268]  [<000000001a578d74>] do_rmdir+0x164/0x1d0
[ 243.885271]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.885282]  [<000000001ab330a8>] system_call+0x78/0xa0
[ 243.885292] INFO: task systemd:971 blocked for more than 122 seconds.
[ 243.885294]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.885295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.885298] task:systemd         state:D stack:    0 pid:  971 ppid:     0 flags:0x00000000
[ 243.885301] Call Trace:
[ 243.885302]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.885305]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.885306]  [<000000001ab2f362>] rwsem_down_read_slowpath+0x182/0x400
[ 243.885309]  [<000003ff80283ada>] xlog_cil_commit+0x4a/0x180 [xfs]
[ 243.885383]  [<000003ff8027b56c>] __xfs_trans_commit+0x9c/0x330 [xfs]
[ 243.885469]  [<000003ff8026d966>] xfs_remove+0x276/0x400 [xfs]
[ 243.885540]  [<000003ff802679aa>] xfs_vn_unlink+0x5a/0xb0 [xfs]
[ 243.885611]  [<000000001a57260c>] vfs_rmdir+0x9c/0x200
[ 243.885614]  [<000000001a578d74>] do_rmdir+0x164/0x1d0
[ 243.885615]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.885618]  [<000000001ab330a8>] system_call+0x78/0xa0
[ 243.885623] INFO: task kworker/u2:0:8 blocked for more than 122 seconds.
[ 243.885625]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.885626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.885628] task:kworker/u2:0    state:D stack:    0 pid:    8 ppid:     2 flags:0x00000000
[ 243.885633] Workqueue: writeback wb_workfn (flush-253:0)
[ 243.885653] Call Trace:
[ 243.885656]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.885658]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.885660]  [<000000001ab32222>] schedule_timeout+0x122/0x170
[ 243.885663]  [<000000001ab2d5ce>] __wait_for_common+0xbe/0x250
[ 243.885665]  [<000003ff80251bb4>] xfs_buf_iowait+0x34/0xf0 [xfs]
[ 243.885735]  [<000003ff80254104>] __xfs_buf_submit+0x114/0x1f0 [xfs]
[ 243.885811]  [<000003ff80254a74>] xfs_buf_read_map+0xe4/0x210 [xfs]
[ 243.885887]  [<000003ff80293856>] xfs_trans_read_buf_map+0x126/0x2e0 [xfs]
[ 243.885977]  [<000003ff801fef6e>] xfs_alloc_read_agfl+0x8e/0xd0 [xfs]
[ 243.886052]  [<000003ff802018a6>] xfs_alloc_fix_freelist+0x1a6/0x410 [xfs]
[ 243.886122]  [<000003ff80201cb0>] xfs_alloc_vextent+0x130/0x560 [xfs]
[ 243.886190]  [<000003ff80211b6a>] xfs_bmap_btalloc+0x18a/0x790 [xfs]
[ 243.886257]  [<000003ff80212234>] xfs_bmapi_allocate+0xc4/0x2f0 [xfs]
[ 243.886325]  [<000003ff8021335a>] xfs_bmapi_convert_delalloc+0x25a/0x490 [xfs]
[ 243.886392]  [<000003ff8024c490>] xfs_map_blocks+0x200/0x420 [xfs]
[ 243.886462]  [<000000001a5f09ca>] iomap_writepage_map+0x14a/0x540
[ 243.886483]  [<000000001a5f1c9e>] iomap_do_writepage+0x8e/0x230
[ 243.886486]  [<000000001a48c370>] write_cache_pages+0x1e0/0x430
[ 243.886520]  [<000000001a5ee706>] iomap_writepages+0x36/0x60
[ 243.886523]  [<000003ff8024c9f2>] xfs_vm_writepages+0x92/0xd0 [xfs]
[ 243.886786]  [<000000001a48ee62>] do_writepages+0x62/0x1f0
[ 243.886789]  [<000000001a5a664c>] __writeback_single_inode+0x4c/0x3e0
[ 243.886791]  [<000000001a5a6ed6>] writeback_sb_inodes+0x246/0x5b0
[ 243.886794]  [<000000001a5a729a>] __writeback_inodes_wb+0x5a/0x130
[ 243.886796]  [<000000001a5a75de>] wb_writeback+0x26e/0x3a0
[ 243.886801]  [<000000001a5a89a0>] wb_workfn+0x3a0/0x610
[ 243.886803]  [<000000001a2c7146>] process_one_work+0x216/0x4a0
[ 243.886819]  [<000000001a2c7904>] worker_thread+0x64/0x4a0
[ 243.886821]  [<000000001a2d044c>] kthread+0x17c/0x190
[ 243.886824]  [<000000001a2575dc>] __ret_from_fork+0x3c/0x60
[ 243.886839]  [<000000001ab330da>] ret_from_fork+0xa/0x30
[ 243.886844] INFO: task kworker/0:1:17 blocked for more than 122 seconds.
[ 243.886846]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.886848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.886850] task:kworker/0:1     state:D stack:    0 pid:   17 ppid:     2 flags:0x00000000
[ 243.886856] Workqueue: xfs-sync/dm-0 xfs_log_worker [xfs]
[ 243.886931] Call Trace:
[ 243.886934]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.886936]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.886938]  [<000000001ab32222>] schedule_timeout+0x122/0x170
[ 243.886941]  [<000000001ab2d5ce>] __wait_for_common+0xbe/0x250
[ 243.886943]  [<000000001a2c5e9a>] __flush_work.isra.0+0x17a/0x2b0
[ 243.886946]  [<000003ff80283c8c>] xlog_cil_force_seq+0x7c/0x240 [xfs]
[ 243.887020]  [<000003ff80280ede>] xfs_log_force+0x8e/0x270 [xfs]
[ 243.887093]  [<000003ff80281116>] xfs_log_worker+0x56/0xa0 [xfs]
[ 243.887166]  [<000000001a2c7146>] process_one_work+0x216/0x4a0
[ 243.887169]  [<000000001a2c7904>] worker_thread+0x64/0x4a0
[ 243.887171]  [<000000001a2d044c>] kthread+0x17c/0x190
[ 243.887173]  [<000000001a2575dc>] __ret_from_fork+0x3c/0x60
[ 243.887178]  [<000000001ab330da>] ret_from_fork+0xa/0x30
[ 243.887185] INFO: task kworker/u2:1:35 blocked for more than 122 seconds.
[ 243.887187]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.887188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.887190] task:kworker/u2:1    state:D stack:    0 pid:   35 ppid:     2 flags:0x00000000
[ 243.887194] Workqueue: writeback wb_workfn (flush-253:0)
[ 243.887196] Call Trace:
[ 243.887200]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.887202]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.887204]  [<000000001ab2ca12>] io_schedule+0x52/0x80
[ 243.887206]  [<000000001a703350>] rq_qos_wait+0xc0/0x130
[ 243.887223]  [<000000001a725a24>] wbt_wait+0x94/0x110
[ 243.887231]  [<000000001a702e3e>] __rq_qos_throttle+0x3e/0x60
[ 243.887234]  [<000000001a6ed86a>] blk_mq_get_new_requests+0xda/0x1a0
[ 243.887243]  [<000000001a6f2eec>] blk_mq_submit_bio+0x31c/0x4c0
[ 243.887246]  [<000000001a6e204c>] __submit_bio_noacct+0x6c/0x1e0
[ 243.887260]  [<000000001a5ee68e>] iomap_submit_ioend+0x7e/0xc0
[ 243.887262]  [<000000001a5f0b52>] iomap_writepage_map+0x2d2/0x540
[ 243.887264]  [<000000001a5f1c9e>] iomap_do_writepage+0x8e/0x230
[ 243.887266]  [<000000001a48c370>] write_cache_pages+0x1e0/0x430
[ 243.887269]  [<000000001a5ee706>] iomap_writepages+0x36/0x60
[ 243.887271]  [<000003ff8024c9f2>] xfs_vm_writepages+0x92/0xd0 [xfs]
[ 243.887344]  [<000000001a48ee62>] do_writepages+0x62/0x1f0
[ 243.887346]  [<000000001a5a664c>] __writeback_single_inode+0x4c/0x3e0
[ 243.887349]  [<000000001a5a6ed6>] writeback_sb_inodes+0x246/0x5b0
[ 243.887351]  [<000000001a5a729a>] __writeback_inodes_wb+0x5a/0x130
[ 243.887353]  [<000000001a5a75de>] wb_writeback+0x26e/0x3a0
[ 243.887355]  [<000000001a5a89a0>] wb_workfn+0x3a0/0x610
[ 243.887358]  [<000000001a2c7146>] process_one_work+0x216/0x4a0
[ 243.887360]  [<000000001a2c7904>] worker_thread+0x64/0x4a0
[ 243.887363]  [<000000001a2d044c>] kthread+0x17c/0x190
[ 243.887394]  [<000000001a2575dc>] __ret_from_fork+0x3c/0x60
[ 243.887396]  [<000000001ab330da>] ret_from_fork+0xa/0x30
[ 243.887403] INFO: task kworker/u2:3:177 blocked for more than 122 seconds.
[ 243.887405]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.887407] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.887408] task:kworker/u2:3    state:D stack:    0 pid:  177 ppid:     2 flags:0x00000000
[ 243.887416] Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
[ 243.887490] Call Trace:
[ 243.887494]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.887496]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.887498]  [<000000001ab2ca12>] io_schedule+0x52/0x80
[ 243.887500]  [<000000001a703350>] rq_qos_wait+0xc0/0x130
[ 243.887503]  [<000000001a725a24>] wbt_wait+0x94/0x110
[ 243.887505]  [<000000001a702e3e>] __rq_qos_throttle+0x3e/0x60
[ 243.887507]  [<000000001a6ed86a>] blk_mq_get_new_requests+0xda/0x1a0
[ 243.887510]  [<000000001a6f2eec>] blk_mq_submit_bio+0x31c/0x4c0
[ 243.887513]  [<000000001a6e204c>] __submit_bio_noacct+0x6c/0x1e0
[ 243.887516]  [<000003ff8028336e>] xlog_cil_push_work+0x1fe/0x790 [xfs]
[ 243.887595]  [<000000001a2c7146>] process_one_work+0x216/0x4a0
[ 243.887597]  [<000000001a2c7af0>] worker_thread+0x250/0x4a0
[ 243.887599]  [<000000001a2d044c>] kthread+0x17c/0x190
[ 243.887601]  [<000000001a2575dc>] __ret_from_fork+0x3c/0x60
[ 243.887604]  [<000000001ab330da>] ret_from_fork+0xa/0x30
[ 243.887607] INFO: task plymouthd:511 blocked for more than 122 seconds.
[ 243.887609]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.887611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.887612] task:plymouthd       state:D stack:    0 pid:  511 ppid:     1 flags:0x00000000
[ 243.887615] Call Trace:
[ 243.887619]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.887621]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.887623]  [<000000001ab2fa98>] rwsem_down_write_slowpath+0x258/0x540
[ 243.887626]  [<000003ff80267dac>] xfs_vn_update_time+0xbc/0x1c0 [xfs]
[ 243.887701]  [<000000001a587d06>] file_update_time+0xd6/0x140
[ 243.887708]  [<000003ff8025a1be>] xfs_file_write_checks+0x24e/0x2f0 [xfs]
[ 243.887782]  [<000003ff8025a8ec>] xfs_file_buffered_write+0x7c/0x2c0 [xfs]
[ 243.887853]  [<000000001a55ffa0>] new_sync_write+0x100/0x190
[ 243.887860]  [<000000001a562c58>] vfs_write+0x1e8/0x280
[ 243.887863]  [<000000001a562f68>] ksys_write+0x68/0x100
[ 243.887868]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.887870]  [<000000001ab330a8>] system_call+0x78/0xa0
[ 243.887882] INFO: task auditd:900 blocked for more than 122 seconds.
[ 243.887884]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.887885] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.887887] task:auditd          state:D stack:    0 pid:  900 ppid:     1 flags:0x00000000
[ 243.887890] Call Trace:
[ 243.887891]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.887894]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.887896]  [<000000001ab2f362>] rwsem_down_read_slowpath+0x182/0x400
[ 243.887898]  [<000003ff80283ada>] xlog_cil_commit+0x4a/0x180 [xfs]
[ 243.887972]  [<000003ff8027b56c>] __xfs_trans_commit+0x9c/0x330 [xfs]
[ 243.888044]  [<000003ff80267e10>] xfs_vn_update_time+0x120/0x1c0 [xfs]
[ 243.888116]  [<000000001a587d06>] file_update_time+0xd6/0x140
[ 243.888118]  [<000003ff8025a1be>] xfs_file_write_checks+0x24e/0x2f0 [xfs]
[ 243.888189]  [<000003ff8025a8ec>] xfs_file_buffered_write+0x7c/0x2c0 [xfs]
[ 243.888267]  [<000000001a55ffa0>] new_sync_write+0x100/0x190
[ 243.888270]  [<000000001a562c58>] vfs_write+0x1e8/0x280
[ 243.888272]  [<000000001a562f68>] ksys_write+0x68/0x100
[ 243.888274]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.888277]  [<000000001ab330a8>] system_call+0x78/0xa0
[ 243.888299] INFO: task rs:main Q:Reg:937 blocked for more than 122 seconds.
[ 243.888301]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.888302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.888304] task:rs:main Q:Reg   state:D stack:    0 pid:  937 ppid:     1 flags:0x00000000
[ 243.888307] Call Trace:
[ 243.888312]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.888314]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.888316]  [<000000001ab32222>] schedule_timeout+0x122/0x170
[ 243.888319]  [<000000001ab2ecce>] __down+0x9e/0x110
[ 243.888324]  [<000000001a313e8c>] down+0x8c/0x90
[ 243.888337]  [<000003ff802531f6>] xfs_buf_lock+0x46/0x100 [xfs]
[ 243.888407]  [<000003ff802534da>] xfs_buf_find+0x22a/0x3d0 [xfs]
[ 243.888477]  [<000003ff80253860>] xfs_buf_get_map+0x50/0x350 [xfs]
[ 243.888548]  [<000003ff802549da>] xfs_buf_read_map+0x4a/0x210 [xfs]
[ 243.888625]  [<000003ff80293856>] xfs_trans_read_buf_map+0x126/0x2e0 [xfs]
[ 243.888698]  [<000003ff80217780>] xfs_btree_read_buf_block.constprop.0+0xa0/0xe0 [xfs]
[ 243.888765]  [<000003ff80219590>] xfs_btree_lookup_get_block+0xa0/0x190 [xfs]
[ 243.888833]  [<000003ff8021deec>] xfs_btree_visit_blocks+0x9c/0x150 [xfs]
[ 243.888900]  [<000003ff8020fb7e>] xfs_iread_extents+0xbe/0x150 [xfs]
[ 243.888967]  [<000003ff80266b46>] xfs_buffered_write_iomap_begin+0xf6/0x920 [xfs]
[ 243.889039]  [<000000001a5ee078>] iomap_apply+0x68/0x2c0
[ 243.889041]  [<000000001a5ee460>] iomap_file_buffered_write+0x80/0xb0
[ 243.889043]  [<000003ff8025a926>] xfs_file_buffered_write+0xb6/0x2c0 [xfs]
[ 243.889114]  [<000000001a55ffa0>] new_sync_write+0x100/0x190
[ 243.889116]  [<000000001a562c58>] vfs_write+0x1e8/0x280
[ 243.889121]  [<000000001a562f68>] ksys_write+0x68/0x100
[ 243.889124]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.889126]  [<000000001ab330a8>] system_call+0x78/0xa0
[ 243.889130] INFO: task firewalld:962 blocked for more than 122 seconds.
[ 243.889132]       Not tainted 5.14.0-114.el9.s390x #1
[ 243.889134] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.889135] task:firewalld       state:D stack:    0 pid:  962 ppid:     1 flags:0x00000002
[ 243.889138] Call Trace:
[ 243.889143]  [<000000001ab2c4c2>] __schedule+0x2c2/0x6b0
[ 243.889149]  [<000000001ab2c904>] schedule+0x54/0xf0
[ 243.889150]  [<000000001ab32222>] schedule_timeout+0x122/0x170
[ 243.889153]  [<000000001ab2d5ce>] __wait_for_common+0xbe/0x250
[ 243.889155]  [<000003ff80251bb4>] xfs_buf_iowait+0x34/0xf0 [xfs]
[ 243.889226]  [<000003ff80254104>] __xfs_buf_submit+0x114/0x1f0 [xfs]
[ 243.889296]  [<000003ff80254a74>] xfs_buf_read_map+0xe4/0x210 [xfs]
[ 243.889367]  [<000003ff80293856>] xfs_trans_read_buf_map+0x126/0x2e0 [xfs]
[ 243.889441]  [<000003ff8023ad06>] xfs_imap_to_bp+0x66/0x80 [xfs]
[ 243.889510]  [<000003ff8025fc7c>] xfs_iget_cache_miss+0x1ac/0x350 [xfs]
[ 243.889580]  [<000003ff802600e0>] xfs_iget+0x1d0/0x2e0 [xfs]
[ 243.889651]  [<000003ff8026c1e0>] xfs_lookup+0xe0/0x110 [xfs]
[ 243.889723]  [<000003ff80267b22>] xfs_vn_lookup+0x62/0xa0 [xfs]
[ 243.889799]  [<000000001a57198a>] __lookup_slow+0x9a/0x170
[ 243.889802]  [<000000001a575f6a>] walk_component+0x14a/0x1e0
[ 243.889805]  [<000000001a5761f6>] link_path_walk.part.0.constprop.0+0x1f6/0x350
[ 243.889808]  [<000000001a57645c>] path_lookupat+0x4c/0x1f0
[ 243.889810]  [<000000001a577ffa>] filename_lookup+0xaa/0x1a0
[ 243.889813]  [<000000001a578212>] user_path_at_empty+0x72/0xa0
[ 243.889816]  [<000000001a569872>] vfs_statx+0x82/0x130
[ 243.889818]  [<000000001a569b94>] __do_sys_newfstatat+0x34/0x60
[ 243.889820]  [<000000001ab25518>] __do_syscall+0x1b8/0x1e0
[ 243.889822]  [<000000001ab330a8>] system_call+0x78/0xa

I hit this when I triggered the stable guest ABI test on s390x as well. After migrating the guest back to the src side, the guest appears to have crashed.

Test log:
2022-06-24-03:12:48: Host(10.0.160.25) Sending qmp command: {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "postcopy-ram", "state": true}]}, "id": "BCLFqkWC"}
2022-06-24-03:12:49: Host(10.0.160.25) Responding qmp command: {"return": {}, "id": "BCLFqkWC"}
2022-06-24-03:12:49: Host(10.0.160.25) Sending qmp command: {"execute": "query-migrate-capabilities", "id": "bHD83ePi"}
2022-06-24-03:12:49: Host(10.0.160.25) Responding qmp command: {"return": [{"state": false, "capability": "xbzrle"}, {"state": false, "capability": "rdma-pin-all"}, {"state": false, "capability": "auto-converge"}, {"state": false, "capability": "zero-blocks"}, {"state": false, "capability": "compress"}, {"state": false, "capability": "events"}, {"state": true, "capability": "postcopy-ram"}, {"state": false, "capability": "x-colo"}, {"state": false, "capability": "release-ram"}, {"state": false, "capability": "return-path"}, {"state": false, "capability": "pause-before-switchover"}, {"state": false, "capability": "multifd"}, {"state": false, "capability": "dirty-bitmaps"}, {"state": false, "capability": "postcopy-blocktime"}, {"state": false, "capability": "late-block-activate"}, {"state": false, "capability": "x-ignore-shared"}, {"state": false, "capability": "validate-uuid"}, {"state": false, "capability": "background-snapshot"}], "id": "bHD83ePi"}
2022-06-24-03:12:49: Succeed to set migration capabilities: postcopy-ram
2022-06-24-03:12:49: Host(10.0.160.24) Sending qmp command: {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "postcopy-ram", "state": true}]}, "id": "HJpFNe7F"}
2022-06-24-03:12:49: Host(10.0.160.24) Responding qmp command: {"return": {}, "id": "HJpFNe7F"}
2022-06-24-03:12:49: Host(10.0.160.24) Sending qmp command: {"execute": "query-migrate-capabilities", "id": "rHY0W1i5"}
2022-06-24-03:12:49: Host(10.0.160.24) Responding qmp command: {"return": [{"state": false, "capability": "xbzrle"}, {"state": false, "capability": "rdma-pin-all"}, {"state": false, "capability": "auto-converge"}, {"state": false, "capability": "zero-blocks"}, {"state": false, "capability": "compress"}, {"state": false, "capability": "events"}, {"state": true, "capability": "postcopy-ram"}, {"state": false, "capability": "x-colo"}, {"state": false, "capability": "release-ram"}, {"state": false, "capability": "return-path"}, {"state": false, "capability": "pause-before-switchover"}, {"state": false, "capability": "multifd"}, {"state": false, "capability": "dirty-bitmaps"}, {"state": false, "capability": "postcopy-blocktime"}, {"state": false, "capability": "late-block-activate"}, {"state": false, "capability": "x-ignore-shared"}, {"state": false, "capability": "validate-uuid"}, {"state": false, "capability": "background-snapshot"}], "id": "rHY0W1i5"}
2022-06-24-03:12:49: Succeed to set migration capabilities: postcopy-ram
2022-06-24-03:12:49: Host(10.0.160.24) Sending qmp command: {"execute": "migrate-incoming", "arguments": {"uri": "tcp:[::]:4000"}, "id": "sPk3duNY"}
2022-06-24-03:12:50: Host(10.0.160.24) Responding qmp command: {"return": {}, "id": "sPk3duNY"}
2022-06-24-03:12:50: Host(10.0.160.25) Sending qmp command: {"execute": "migrate", "arguments": {"uri": "tcp:10.0.160.24:4000"}, "id": "YCJ4a63J"}
2022-06-24-03:12:50: Host(10.0.160.25) Responding qmp command: {"return": {}, "id": "YCJ4a63J"}
2022-06-24-03:12:50: Host(10.0.160.25) Sending qmp command: {"execute": "query-migrate", "id": "z4IBS1ZW"}
2022-06-24-03:12:50: Host(10.0.160.25) Responding qmp command: {"return": {"status": "setup"}, "id": "z4IBS1ZW"}
2022-06-24-03:12:50: Host(10.0.160.25) Sending qmp command: {"execute": "query-migrate", "id": "WNYwVDit"}
2022-06-24-03:12:50: Host(10.0.160.25) Responding qmp command: {"return": {"status": "setup"}, "id": "WNYwVDit"}
2022-06-24-03:12:50: Host(10.0.160.25) Sending qmp command: {"execute": "query-migrate", "id": "knhb2XR9"}
2022-06-24-03:12:50: Host(10.0.160.25) Responding qmp command: {"return": {"expected-downtime": 300, "status": "active", "setup-time": 418, "total-time": 424, "ram": {"total": 4294967296, "postcopy-requests": 0, "dirty-sync-count": 1, "multifd-bytes": 0, "pages-per-second": 0, "downtime-bytes": 0, "page-size": 4096, "remaining": 4286066688, "postcopy-bytes": 0, "mbps": 0, "transferred": 8795157, "precopy-bytes": 8795157, "duplicate": 29, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 8777728, "normal": 2143}}, "id": "knhb2XR9"}
2022-06-24-03:12:55: Host(10.0.160.25) Sending qmp command: {"execute": "query-migrate", "id": "KIrihPs8"}
2022-06-24-03:12:55: Host(10.0.160.25) Responding qmp command: {"return": {"expected-downtime": 300, "status": "active", "setup-time": 418, "total-time": 5427, "ram": {"total": 4294967296, "postcopy-requests": 0, "dirty-sync-count": 1, "multifd-bytes": 0, "pages-per-second": 32710, "downtime-bytes": 0, "page-size": 4096, "remaining": 1616863232, "postcopy-bytes": 0, "mbps": 1073.93616, "transferred": 594242276, "precopy-bytes": 594242276, "duplicate": 510156, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 588500992, "normal": 143677}}, "id": "KIrihPs8"}
2022-06-24-03:13:00: Host(10.0.160.25) Sending qmp
command : {"execute": "query-migrate", "id": "3CpRTBtr"} 2022-06-24-03:13:00: Host(10.0.160.25) Responding qmp command: {"return": {"expected-downtime": 1851, "status": "active", "setup-time": 418, "total-time": 10435, "ram": {"total": 4294967296, "postcopy-requests": 0, "dirty-sync-count": 2, "multifd-bytes": 0, "pages-per-second": 32710, "downtime-bytes": 0, "page-size": 4096, "remaining": 40591360, "postcopy-bytes": 0, "mbps": 1073.93616, "transferred": 1218196374, "precopy-bytes": 1218196374, "duplicate": 800398, "dirty-pages-rate": 6274, "skipped": 0, "normal-bytes": 1208631296, "normal": 295076}}, "id": "3CpRTBtr"} 2022-06-24-03:13:00: Host(10.0.160.25) Sending qmp command : {"execute": "migrate-start-postcopy", "id": "BqYu0cZm"} 2022-06-24-03:13:00: Host(10.0.160.25) Responding qmp command: {"return": {}, "id": "BqYu0cZm"} 2022-06-24-03:13:00: Host(10.0.160.25) Sending qmp command : {"execute": "query-migrate", "id": "fGu3bqI7"} 2022-06-24-03:13:00: Host(10.0.160.25) Responding qmp command: {"return": {"expected-downtime": 1851, "status": "active", "setup-time": 418, "total-time": 10436, "ram": {"total": 4294967296, "postcopy-requests": 0, "dirty-sync-count": 2, "multifd-bytes": 0, "pages-per-second": 32710, "downtime-bytes": 0, "page-size": 4096, "remaining": 40591360, "postcopy-bytes": 0, "mbps": 1073.93616, "transferred": 1218196374, "precopy-bytes": 1218196374, "duplicate": 800398, "dirty-pages-rate": 6274, "skipped": 0, "normal-bytes": 1208631296, "normal": 295076}}, "id": "fGu3bqI7"} 2022-06-24-03:13:05: Host(10.0.160.25) Sending qmp command : {"execute": "query-migrate", "id": "MXGGD6WW"} 2022-06-24-03:13:06: Host(10.0.160.25) Responding qmp command: {"return": {"status": "completed", "setup-time": 418, "downtime": 372, "total-time": 11429, "ram": {"total": 4294967296, "postcopy-requests": 8, "dirty-sync-count": 3, "multifd-bytes": 0, "pages-per-second": 46451, "downtime-bytes": 0, "page-size": 4096, "remaining": 0, "postcopy-bytes": 192621379, 
"mbps": 1025.2246626101171, "transferred": 1410817753, "precopy-bytes": 1218196374, "duplicate": 800409, "dirty-pages-rate": 0, "skipped": 0, "normal-bytes": 1400877056, "normal": 342011}}, "id": "MXGGD6WW"} 2022-06-24-03:13:08: ======= Step 8.Repeat step6 ======= 2022-06-24-03:13:08: Connect to monitor(10.0.160.24:4444) successfully. 2022-06-24-03:13:08: Closed the monitor(10.0.160.24:4444). 2022-06-24-03:13:08: [root@guest ~]# dmesg Traceback (most recent call last): File "/home/ipa/runner.py", line 212, in _run getattr(self._case_dict[case], "run_case")(self._params) File "/home/ipa/virtkvmqe/stable_guest_abi_for_s390x_test_plan/test_scenarios/rhel_292442.py", line 213, in run_case src_serial.serial_check_dmesg() File "/home/ipa/monitor.py", line 753, in serial_check_dmesg getattr(self, '_check_dmesg_%s' % ( File "/home/ipa/monitor.py", line 745, in _check_dmesg_linux output = self.serial_cmd_output(cmd=cmd, timeout=timeout) File "/home/ipa/monitor.py", line 633, in serial_cmd_output output = RemoteMonitor.recv_data_timeout( File "/home/ipa/monitor.py", line 265, in recv_data_timeout RemoteMonitor.test_error(self, err_info) File "/home/ipa/vm.py", line 93, in test_error raise usr_exceptions.Error(info) usr_exceptions.Error: Failed to run "dmesg" under 60 sec, error info: "[ 0.782221] Loading iSCSI transport class v2.0-870. Didn't reproduce this bug on qemu-kvm-6.2.0-13.el9.s390x.rpm with same kernel version as Description: try 60 times with huge pages configured, all cases pass. So seems it's a regression bug in qemu (In reply to Li Xiaohui from comment #10) > Didn't reproduce this bug on qemu-kvm-6.2.0-13.el9.s390x.rpm with same > kernel version as Description: try 60 times with huge pages configured, all > cases pass. > > So seems it's a regression bug in qemu I just wanted to ask whether it's reproducible in an older version - thanks for reading my mind :-) ... this will make it a lot easier to find out where the problem comes from. 
(In reply to Li Xiaohui from comment #8)
> Tried 100 times to reproduce this bug on RHEL 8.7.0
> (kernel-4.18.0-402.el8.s390x &
> qemu-kvm-6.2.0-15.module+el8.7.0+15644+189a21f6.s390x) with the same RHEL 9.1.0
> guest, but didn't hit the issue; hit a new call trace at a 1/100 rate when
> booting the guest on the source host with huge pages configured. Do we need
> to open a new bug?

Yes, please open a new BZ, since this seems to be an unrelated issue. Thanks!

FWIW, I tried to reproduce this the whole day, and so far I've only seen the guest crashing once, with:

[ 61.492895] User process fault: interruption code 0010 ilc:3 in libc.so.6[3ffa5880000+1c8000]
[ 61.492923] Failing address: 0000000000000000 TEID: 0000000000000800
[ 61.492925] Fault in primary space mode while using user ASCE.
[ 61.492928] AS:0000000082fc41c7 R3:0000000080bfc007 S:0000000000000020
[ 61.492934] CPU: 0 PID: 1217 Comm: migration_dirty Kdump: loaded Not tainted 5.14.0-114.el9.s390x #1
[ 61.492939] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
[ 61.492941] User PSW : 0705200180000000 000003ffa590c062
[ 61.492944] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
[ 61.492948] User GPRS: 0000000000000003 000003ffa590c050 000000000000002e 000000000000002e
[ 61.492950] 00000000fbad2a84 0000000000000001 0000000001001df0 000003ffdcd7f368
[ 61.492952] 00000000010007e0 0000000000000000 0000000000000001 000003ffdcd7efc8
[ 61.492954] 000003ffa5b2ef78 000003ffa5b2f018 0000000001000868 000003ffdcd7ef28
[ 61.492962] User Code: 000003ffa590c050: eb6ff0300024 stmg %r6,%r15,48(%r15)
[ 61.492962] 000003ffa590c056: c498000a06b5 lgrl %r9,000003ffa5a4cdc0
[ 61.492962] #000003ffa590c05c: e3f0ff60ff71 lay %r15,-160(%r15)
[ 61.492962] >000003ffa590c062: e3b090000004 lg %r11,0(%r9)
[ 61.492962] 000003ffa590c068: b90400a2 lgr %r10,%r2
[ 61.492962] 000003ffa590c06c: 9180b002 tm 2(%r11),128
[ 61.492962] 000003ffa590c070: a7740043 brc 7,000003ffa590c0f6
[ 61.492962] 000003ffa590c074: b24f0080 ear %r8,%a0
[ 61.492981] Last Breaking-Event-Address:
[ 61.492982] [<0000000000000001>] 0x1
[ 61.494193] Core dump to |/usr/lib/systemd/systemd-coredump pipe failed
[ 61.494399] list_del corruption. next->prev should be 0000000083492f98, but was 0000000000000000
[ 61.494416] ------------[ cut here ]------------
[ 61.494417] kernel BUG at lib/list_debug.c:54!
[ 61.494489] monitor event: 0040 ilc:2 [#1] SMP
[ 61.494493] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink vfio_ccw mdev vfio_iommu_type1 vfio drm fb fuse font i2c_core drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 sd_mod t10_pi sg des_s390 libdes sha3_512_s390 virtio_scsi virtio_net net_failover failover sha3_256_s390 pkey zcrypt
[ 61.494518] CPU: 0 PID: 579 Comm: auditd Kdump: loaded Not tainted 5.14.0-114.el9.s390x #1
[ 61.494520] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
[ 61.494521] Krnl PSW : 0704c00180000000 000000006d754012 (__list_del_entry_valid+0x82/0xb0)
[ 61.494531] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 61.494533] Krnl GPRS: c0000000ffffefff 0000000000000027 0000000000000054 00000000ffffefff
[ 61.494534] 000003800077b6a8 000003800077b6a0 0000000085235a00 ffffffffffffff7f
[ 61.494536] ffffffff00000000 000000006e221150 0000000083492f98 0000000083492e80
[ 61.494537] 0000000080b4e900 000003800077bdb0 000000006d75400e 000003800077b8c8
[ 61.494541] Krnl Code: 000000006d754002: c020003095a2 larl %r2,000000006dd66b46
[ 61.494541] 000000006d754008: c0e5001ed240 brasl %r14,000000006db2e488
[ 61.494541] #000000006d75400e: af000000 mc 0,0
[ 61.494541] >000000006d754012: b9040043 lgr %r4,%r3
[ 61.494541] 000000006d754016: b9040032 lgr %r3,%r2
[ 61.494541] 000000006d75401a: c02000309577 larl %r2,000000006dd66b08
[ 61.494541] 000000006d754020: c0e5001ed234 brasl %r14,000000006db2e488
[ 61.494541] 000000006d754026: af000000 mc 0,0
[ 61.494551] Call Trace:
[ 61.494553] [<000000006d754012>] __list_del_entry_valid+0x82/0xb0
[ 61.494555] ([<000000006d75400e>] __list_del_en

Reproducibility is really low for me, even with huge pages activated ... do you see a way to increase the reproducibility, or could you provide me with some instructions for easy automation of this issue (so far I only ran it manually)?

------- Comment From jjherne.com 2022-07-18 13:51 EDT-------

So you tried to reproduce this bug with RHEL 9.1, everything default, except that you downgraded qemu-kvm from qemu-kvm-6.2.0-15 to qemu-kvm-6.2.0-13? And when you did this, you could not reproduce the problem? I want to ensure I'm understanding exactly what you changed to make the problem go away.

If this is true, can you run this environment for a longer period of time to be really sure that the crash won't happen?

(In reply to IBM Bug Proxy from comment #16)
> ------- Comment From jjherne.com 2022-07-18 13:51 EDT-------
>
> So you tried to reproduce this bug with RHEL9.1, everything default, except
> that you downgraded qemu-kvm from qemu-kvm-6.2.0-15 to qemu-kvm-6.2.0-13?
> And when you did this, you could not reproduce the problem? I want to ensure
> I'm understanding exactly what you changed to make the problem go away.

That's not right. See the Description and Comment 10: on RHEL 9.1.0, I can reproduce the bug with qemu-kvm-7.0.0-6.el9.s390x, but can't reproduce it with qemu-kvm-6.2.0-13, so I think qemu-kvm-7.0 introduced this problem.

> If this is true, can you run this environment for a longer period of time to
> be really sure that the crash won't happen?

I would try more times on qemu-kvm-6.2.0-13 to ensure the crash won't happen.

(In reply to Li Xiaohui from comment #17)
> I would try more times on qemu-kvm-6.2.0-13 to ensure the crash won't happen.
I have now finally set up an environment for reproducing the problem as well (thanks to bfu), and I think I can confirm the findings so far:

I ran the test 15 times with qemu-kvm-7.0.0-8.el9.s390x on the target host, and 3 times it was failing.
Then I ran the test 15 times with qemu-img-6.2.0-13.el9.s390x on the target host, and all runs were successful in that case.

I'll try to reproduce the issue with upstream QEMU - if it also happens there, I should be able to bisect the issue.

------- Comment From jjherne.com 2022-07-27 10:55 EDT-------

(In reply to comment #8)
> (In reply to Li Xiaohui from comment #17)
> > I would try more times on qemu-kvm-6.2.0-13 to ensure the crash won't happen.
> I have now finally set up an environment for reproducing the problem as well
> (thanks to bfu), and I think I can confirm the findings so far:
> I ran the test 15 times with qemu-kvm-7.0.0-8.el9.s390x on the target host
> and 3 times it was failing.
> Then I ran the test 15 times with qemu-img-6.2.0-13.el9.s390x on the target
> host, and all runs were successful in that case.
> I'll try to reproduce the issue with upstream QEMU - if it also happens
> there, I should be able to bisect the issue.

Thomas, thanks for taking a look at this :). I was unable to quickly reproduce the bug. I'll hold off on reproducing the target environment since you seem to have the environment nailed down. But I'm happy to jump in if you need anything. Just keep me posted on your progress. Thanks again :)

FWIW, I think we definitely want this commit in downstream, too:

https://gitlab.com/qemu-project/qemu/-/commit/007e179ef0e97eafda4c9ff
"multifd: Copy pages before compressing them with zlib"

Not sure yet whether that's the only issue that we're currently facing, or whether there are more bugs lurking around ... I'm still testing...

The zlib patch was indeed not the culprit - it seems to break with one of the commits at the end of the merge request that has been merged with commit ID d90e6f665d3ac197f83d93.
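With a failure rate of roughly 3 in 15 runs, as observed above, one can estimate how many clean runs are needed before trusting a commit as "good" during a bisection of an intermittent bug. The probability of *missing* the bug in N runs is (1 - p)^N, so the retry count per bisect step can be chosen for a target confidence (the p = 0.2 figure below is taken from the 3/15 observation; it is a rough estimate, not a measured rate):

```python
import math

def runs_needed(p_fail, confidence):
    """Smallest N such that 1 - (1 - p_fail)**N >= confidence,
    i.e. the bug is seen at least once with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))

p = 0.2  # ~3 failures in 15 runs, per the observation above
for n in (15, 20, 50):
    # probability that a genuinely bad commit survives n clean runs
    print(f"{n:3d} runs: miss probability {(1 - p) ** n:.2e}")
print("runs for 99.9% detection:", runs_needed(p, 0.999))
```

This is why repeating each bisect step 20 or 50 times, rather than once, is needed here: at 20 runs a bad commit still slips through about 1% of the time, which is enough to derail a multi-step bisection.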
So far, two of the lengthy bisect runs point to this commit here being the first bad commit:

cfd66f30fb0f735df06ff4220e5000290a43dad3 is the first bad commit
commit cfd66f30fb0f735df06ff4220e5000290a43dad3
Author: Peter Xu <peterx>
Date:   Wed Jan 19 16:09:19 2022 +0800

    migration: Simplify unqueue_page()

Peter, David, Juan, could it be that this modification maybe had some unforeseen side effects on s390x?

Thomas & all, sorry to have caused such a problem.

What's the host (getpagesize()) / guest (TARGET_PAGE_SIZE) page size for s390x? That's the most suspicious thing; I'd also like to know whether it can reproduce without huge pages. I'd expect that for the whole process of remote page faults we should be using the host page size all along the way (e.g. UFFDIO_COPY uses qemu_ram_pagesize(), which AFAICT is the host page size, and IIUC that's the only way we can do atomic pgtable updates - we can only atomically update the pgtable in host page size), but maybe there's something I overlooked so they can differ.

FWIW, I've done another bisection run over the weekend, this time with even more retries for each commit (50 instead of 20), just to be sure that I ended up at the right place. And also this run showed that "migration: Simplify unqueue_page()" is the first bad commit, so the problem must have been introduced here indeed.
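Bisections like these, where the reproducer only fails intermittently, are easier to automate with a small wrapper suitable for `git bisect run`: the commit is marked bad on the first failure out of N attempts, and good only if all attempts pass. The sketch below is illustrative (the reproducer script name is a placeholder, not from this bug); `git bisect run` treats exit code 0 as good and 1-124 as bad.

```python
import subprocess

def bisect_verdict(cmd, attempts):
    """Exit-code helper for `git bisect run` with an intermittent bug.

    Runs the reproducer `cmd` up to `attempts` times; a single failure
    is enough to call the commit bad (return 1), otherwise good (0).
    """
    for _ in range(attempts):
        if subprocess.run(cmd).returncode != 0:
            return 1
    return 0
```

It would be invoked per bisect step, e.g. via a tiny script that calls `sys.exit(bisect_verdict(["./run_migration_test.sh"], 20))` (a hypothetical reproducer path), passed to `git bisect run`.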
> What's the host (getpagesize()) / guest (TARGET_PAGE_SIZE) page size for s390x?
Just to be sure, I added some debugging code in the unqueue_page() function, and all values are 4096: qemu_ram_pagesize(block) returns 4096, getpagesize() returns 4096, and TARGET_PAGE_SIZE is 4096.
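The host side of that check can also be reproduced from userspace without instrumenting QEMU; `resource.getpagesize()` wraps the same `getpagesize()` call, and `mmap.PAGESIZE` comes from the same `sysconf(_SC_PAGESIZE)` value (TARGET_PAGE_SIZE, by contrast, is a QEMU build-time constant and has to be read from the QEMU source for the target in question):

```python
import mmap
import resource

# Host page size as seen by userspace; on the s390x hosts in this bug
# both this and QEMU's getpagesize() reported 4096.
host_page = resource.getpagesize()
print("host page size:", host_page)

# Both values come from the same sysconf(_SC_PAGESIZE) query.
assert host_page == mmap.PAGESIZE
```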
I'd be tempted to:

1) Add back the tracing after unqueue_page that prints whether it's dirty or not. The other loops should still try again as far as I can tell.
2) When it hangs, is there anything left in the queue on the source?

Dave

(In reply to Dr. David Alan Gilbert from comment #24)
> I'd be tempted to
>
> 1) add back the tracing after unqueue_page that prints whether it's dirty
> or not.
> The other loops should still try again as far as I can tell

I checked out commit a1fe28df7547120b (the last working one) and added some test there (since I don't know how to get to the trace points in the complicated test harness), and indeed, I can see that there are 7 cases where "block" is not NULL and "dirty" is true.

> 2) When it hangs, is there anything left in the queue on the source?

It doesn't hang - the migration finishes "successfully" and then the guest sometimes crashes later.

Given the fact that this problem also occurs upstream, and QEMU is already in freeze mode there before the release, should we maybe simply revert the patch upstream first (and then downstream), and have another try with this optimization in the QEMU 7.2 release cycle instead?

 Thomas

(In reply to Thomas Huth from comment #25)
> (In reply to Dr. David Alan Gilbert from comment #24)
> > I'd be tempted to
> >
> > 1) add back the tracing after unqueue_page that prints whether it's dirty
> > or not.
> > The other loops should still try again as far as I can tell
>
> I checked out commit a1fe28df7547120b (the last working one) and added some
> test there (since I don't know how to get to the trace points in the
> complicated test harness), and indeed, I can see that there are 7 cases
> where "block" is not NULL and "dirty" is true.

What I'm curious about is whether, in your failing case, you're seeing any block!=NULL, dirty=false

> > 2) When it hangs, is there anything left in the queue on the source?
>
> It doesn't hang - the migration finishes "successfully" and then the guest
> sometimes crashes later.

But at the point the migration goes to complete on the source, is that queue empty?

> Given the fact that this problem also occurs in upstream, and QEMU is in
> freeze mode before the release already there, should we maybe simply revert
> the patch upstream first (and then downstream), and have another try with
> this optimization in the QEMU 7.2 release cycle instead?

Does it cleanly revert though? I thought there were loads of other patches on top that were reorganising that code?

(In reply to Dr. David Alan Gilbert from comment #26)
> (In reply to Thomas Huth from comment #25)
[...]
> What I'm curious about is whether, in your failing case, you're seeing any
> block!=NULL, dirty=false

Yes, there were also two cases with block!=NULL and dirty=false !

> > > 2) When it hangs, is there anything left in the queue on the source?
> >
> > It doesn't hang - the migration finishes "successfully" and then the guest
> > sometimes crashes later.
>
> But at the point the migration goes to complete on the source, is that queue
> empty?

Sorry, now you've lost me ... I thought we were talking about the QEMU on the destination side only here, since that's the one where it matters whether the bad commit is included or not?

Anyway, maybe it's easier if I provide you with access to the system ... ping me on IRC if you want to have a look on your own there.

> > Given the fact that this problem also occurs in upstream, and QEMU is in
> > freeze mode before the release already there, should we maybe simply revert
> > the patch upstream first (and then downstream), and have another try with
> > this optimization in the QEMU 7.2 release cycle instead?
>
> Does it cleanly revert though? I thought there were loads of other patches
> on top that were reorganising that code?

The bad commit is pretty much at the end of that merge request, so it's fortunately rather on top of the reorganising patches. As far as I can see, it reverts fine apart from the changes to the trace-events file, but that should be easy to fix up manually. I'll check whether a revert of the patch fixes the problem on QEMU's master branch...

(In reply to Thomas Huth from comment #27)
> (In reply to Dr. David Alan Gilbert from comment #26)
[...]
> Sorry, now you've lost me ... I thought we were talking about the QEMU on
> the destination side only here, since that's the one where it matters
> whether the bad commit is included or not?

That patch only touches sending side code!

Dave

> Anyway, maybe it's easier if I provide you with access to the system ...
> ping me on IRC if you want to have a look on your own there.
[...]

(In reply to Dr. David Alan Gilbert from comment #28)
> (In reply to Thomas Huth from comment #27)
[...]
> > Sorry, now you've lost me ... I thought we were talking about the QEMU on
> > the destination side only here, since that's the one where it matters
> > whether the bad commit is included or not?
>
> That patch only touches sending side code!

For the record: the problem is indeed occurring on the sending side - I missed the fact that the test script is doing a ping-pong migration, first from the source to the destination and then back to the source.

Reverting cfd66f30fb0f735df06ff4220e5000290a43dad3 on top of QEMU's master branch fixes the issue in upstream QEMU.
I did some test runs - with the clean, unmodified master branch it fails multiple times out of 5 runs already, but if I revert the patch, I get 49 successful runs out of 50, where the single failure was due to a bug in the test script instead.

Thus I'd recommend really reverting the patch upstream for QEMU 7.1, and maybe introducing it again in a fixed form later, once we understand what was really going wrong here. I'll send a patch.

Tested the VIRT-86251 case, repeating it 150 times, on qemu-img-7.0.0-9.el9._bz2099934v2.s390x; all runs pass, and I didn't hit any issues.

Tried 200 times for the VIRT_86251 cases on qemu-kvm-7.0.0-10.el9.s390x; they all pass. We can mark this bug verified per the test results.

RESULTS [RHEL-292436]:
==>TOTAL : 80
==>PASS : 60
1: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (13 min 10 sec)
2: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 45 sec)
3: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 36 sec)
4: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
5: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 9 sec)
6: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 17 sec)
7: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (13 min 3 sec)
8: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (13 min 22 sec)
9: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (13 min 12 sec)
10: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 10 sec)
11: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 13 sec)
12: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 9 sec)
13: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (12 min 49 sec)
14: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 21 sec)
15: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 34 sec)
16: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
17: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 9 sec)
18: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 13 sec)
19: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (12 min 26 sec)
20: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 41 sec)
21: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 20 sec)
22: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
23: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 5 sec)
24: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 5 sec)
25: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (12 min 29 sec)
26: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 21 sec)
27: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (13 min 22 sec)
28: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
29: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 9 sec)
30: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 9 sec)
31: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (12 min 37 sec)
32: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 15 sec)
33: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 21 sec)
34: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
35: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 9 sec)
36: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 13 sec)
37: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (10 min 42 sec)
38: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 16 sec)
39: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 41 sec)
40: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 5 sec)
41: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 5 sec)
42: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 9 sec)
43: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (12 min 9 sec)
44: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 11 sec)
45: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 11 sec)
46: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 9 sec)
47: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 9 sec)
48: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 9 sec)
49: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (13 min 10 sec)
50: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (12 min 5 sec)
51: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (12 min 1 sec)
52: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 5 sec)
53: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 23 sec)
54: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (9 min 20 sec)
55: RHEL-292438-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.4.0" machine type (11 min 19 sec)
56: RHEL-292441-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.5.0" machine type (11 min 22 sec)
57: RHEL-292443-[stable guest abi][s390x]forward and backward migration with "s390-ccw-virtio-rhel8.6.0" machine type (13 min 50 sec)
58: RHEL-292440-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.4.0" machine type (8 min 29 sec)
59: RHEL-292442-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.5.0" machine type (8 min 21 sec)
60: RHEL-292444-[stable guest abi][s390x]forward and backward postcopy migration with "s390-ccw-virtio-rhel8.6.0" machine type (8 min 22 sec)

Per the test results, setting this bz to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967