Description of problem:
Stress the guest and stop it, then do a live migration; the guest hits a call trace on the destination end. This issue can also be hit with local migration. Without "stress" in the guest, or without stopping the guest, the issue cannot be reproduced.

Version-Release number of selected component (if applicable):
Host:
4.18.0-40.el8.s390x
qemu-kvm-2.12.0-42.module+el8+2173+537e5cb5.s390x
Guest:
4.18.0-44.el8.s390x

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest on the src end
2. Boot a guest on the dst end in incoming mode
3. In the guest, execute "stress --cpu 2 --io 2 --vm 2 --vm-bytes 64M"
4. On src, stop the guest and then start live migration:
{"execute":"stop"}
{"execute": "migrate", "arguments": {"uri": "tcp:10.16.106.172:5801"}}
5. After migration completes, execute "cont" on the dst end:
{"execute":"cont"}

Actual results:
The guest hits the following call trace; "dd" and "reboot" work well.
[ 2667.765246] rcu: INFO: rcu_sched self-detected stall on CPU
[ 2667.765269] rcu: 1-...!: (1 ticks this GP) idle=3c2/1/0x4000000000000002 softirq=84260/84260 fqs=0
[ 2667.765271] rcu: (t=17921 jiffies g=20885 q=1)
[ 2667.765276] rcu: rcu_sched kthread starved for 17921 jiffies! g20885 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 2667.765278] rcu: RCU grace-period kthread stack dump:
[ 2667.765280] rcu_sched I 0 10 2 0x00000000
[ 2667.765284] Call Trace:
[ 2667.765294] ([<0000000000784066>] __schedule+0x2ce/0x8e8)
[ 2667.765296] [<00000000007846ca>] schedule+0x4a/0xb8
[ 2667.765299] [<0000000000788586>] schedule_timeout+0x1ce/0x458
[ 2667.765306] [<00000000001bf9fc>] rcu_gp_kthread+0x2fc/0x630
[ 2667.765310] [<00000000001696ac>] kthread+0x14c/0x168
[ 2667.765313] [<0000000000789672>] kernel_thread_starter+0xa/0x10
[ 2667.765315] [<0000000000789668>] kernel_thread_starter+0x0/0x10
[ 2667.765320] Task dump for CPU 1:
[ 2667.765321] stress R running task 0 6995 6989 0x02000000
[ 2667.765324] Call Trace:
[ 2667.765330] ([<000000007fff3700>] 0x7fff3700)
[ 2667.765331] [<0000000077123c70>] 0x77123c70
[ 2667.765335] ([<00000000002d2aa6>] clear_huge_page+0x1ce/0x320)
[ 2667.765340] [<0000000000314fdc>] do_huge_pmd_anonymous_page+0x1bc/0x6c0
[ 2667.765342] [<00000000002d228a>] __handle_mm_fault+0x9ea/0xa98
[ 2667.765344] [<00000000002d243a>] handle_mm_fault+0x102/0x1c0
[ 2667.765349] [<00000000001248b4>] do_dat_exception+0x194/0x3e0
[ 2667.765351] [<0000000000789818>] pgm_check_handler+0x1a0/0x1f4

Expected results:
The guest should not hit a call trace.
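For reference, steps 4 and 5 can be driven over the QMP socket programmatically. This is only a sketch: the source QMP address is assumed to match the -qmp tcp:0:8881,server,nowait option from the qemu cli below, the destination URI is the one from the steps above, and the helper names (qmp_encode, migrate_stopped_guest) are invented for illustration.

```python
# Sketch of steps 4-5 over QMP (JSON commands on a TCP socket).
import json
import socket

def qmp_encode(execute, arguments=None):
    """Build one newline-terminated QMP command as bytes."""
    cmd = {"execute": execute}
    if arguments is not None:
        cmd["arguments"] = arguments
    return (json.dumps(cmd) + "\n").encode()

def migrate_stopped_guest(src_qmp=("127.0.0.1", 8881),
                          dst_uri="tcp:10.16.106.172:5801"):
    # Step 4: on the source, stop the guest, then start migration.
    with socket.create_connection(src_qmp) as sock:
        f = sock.makefile("r")
        f.readline()                                  # QMP greeting banner
        sock.sendall(qmp_encode("qmp_capabilities"))  # leave capabilities mode
        f.readline()
        sock.sendall(qmp_encode("stop"))
        f.readline()
        sock.sendall(qmp_encode("migrate", {"uri": dst_uri}))
        f.readline()
    # Step 5 ({"execute":"cont"}) is issued on the destination's QMP
    # socket once "migrate" reports completion.
```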
Additional info:
1) I have tried this scenario several times on power9 (both cross-host migration and local migration) and this issue does not exist there.

qemu cli:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox off \
-machine s390-ccw-virtio-rhel7.6.0 \
-nodefaults \
-nographic \
-mon chardev=serial_id_serial0,mode=readline \
-chardev stdio,mux=on,id=serial_id_serial0,server,nowait,signal=off \
-device sclpconsole,id=serial111,chardev=serial_id_serial0 \
-object iothread,id=iothread0 \
-device virtio-scsi-ccw,id=virtio_scsi_ccw0,iothread=iothread0 \
-drive id=drive1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/xianwang/mount_point/rhel80-s390x-virtio-scsi.qcow2 \
-device scsi-hd,bus=virtio_scsi_ccw0.0,drive=drive1,id=hd1,bootindex=1 \
-m 2048 \
-smp 2 \
-cpu 'host' \
-vnc :1 \
-qmp tcp:0:8881,server,nowait \
-rtc base=utc,clock=host,driftfix=slew \
-boot menu=on \
-enable-kvm

power9 versions:
Host:
4.18.0-32.el8.ppc64le
qemu-kvm-2.12.0-41.el8+2104+3e32e6f8.ppc64le
SLOF-20170724-5.git89f519f.el8+5.ppc64le
Guest:
4.18.0-32.el8.ppc64le
Just to make sure, are you running the exact same kernel on the source and target hosts?
Another set of questions:
1. Can it be reproduced without the stop?
2. How long is the guest stopped? (time between stop and cont on destination)
I am pretty sure this is related to us not loading/restoring the actual TOD clock when stopping/starting the VM. PPC does this. If getting rid of the stop makes the issue go away, that would confirm my assumption.
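To illustrate what "loading/restoring the TOD clock when stopping/starting the VM" means conceptually, here is a toy Python model (a sketch only, not the actual QEMU patch; the class and method names are invented): the guest clock is modeled as host clock plus an offset, and cont() shifts the offset back by the paused duration so the guest never observes the time spent stopped.

```python
import time

class GuestTod:
    """Toy model of a guest TOD clock that is frozen while the vCPUs
    are stopped, so the guest never sees the wall-clock gap."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock       # injectable host clock (for testing)
        self.offset = 0.0        # guest TOD = host clock + offset
        self.stopped_at = None   # host clock value at "stop", or None

    def stop(self):
        # Remember when we paused; reads are frozen at this point.
        self.stopped_at = self.clock()

    def cont(self):
        # Push the offset back by the paused duration; without this
        # correction the guest observes a huge forward clock jump,
        # which is what makes RCU believe its kthread starved.
        self.offset -= self.clock() - self.stopped_at
        self.stopped_at = None

    def read(self):
        base = self.stopped_at if self.stopped_at is not None else self.clock()
        return base + self.offset
```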
> Another set of questions: > > 1. Can it be reproduced without the stop? No, if without "stop" it can't be hit; if without "stress", it also can't be hit. > 2. How long is the guest stopped? (time between stop and cont on destination) Although I didn't calculate the time, because there is "migration" between "stop" and "cont", I think maybe it is about 2 minutes. what's more, the kernel and qemu build between src and dst hosts are same; this issue is also reproduced on local migration(src host is dst host).
Okay, so I am pretty sure the problem is that we don't actually stop the guest clock (TOD) while not running the guest. I am working on a QEMU fix.

If so, the problem can be reproduced by not migrating but simply doing:

{"execute":"stop"}
... wait 2 minutes ...
{"execute":"cont"}

Nice finding! :)
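The stop/wait/cont sequence above can be scripted against the QMP socket. A minimal sketch, assuming the -qmp tcp:0:8881,server,nowait socket from the qemu cli in this report and a guest already running "stress"; the qmp_line helper and stop_wait_cont name are invented for illustration:

```python
# Local reproducer (no migration involved): stop, wait, cont over QMP.
import json
import socket
import time

def qmp_line(execute):
    """One QMP command, newline-terminated, as the QMP monitor expects."""
    return (json.dumps({"execute": execute}) + "\n").encode()

def stop_wait_cont(pause_seconds=120, qmp_addr=("127.0.0.1", 8881)):
    with socket.create_connection(qmp_addr) as sock:
        f = sock.makefile("r")
        f.readline()                          # QMP greeting banner
        sock.sendall(qmp_line("qmp_capabilities"))
        f.readline()
        sock.sendall(qmp_line("stop"))
        f.readline()
        time.sleep(pause_seconds)             # guest TOD keeps advancing here (the bug)
        sock.sendall(qmp_line("cont"))        # RCU stall fires shortly after resume
        f.readline()
```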
(In reply to David Hildenbrand from comment #6)
> Okay, so I am pretty sure the problem is that we don't actually stop the
> guest clock (TOD) while not running the guest. I am working on a QEMU fix.
>
> If so, the problem can be reproduced by not migrating but simply doing
>
> {"execute":"stop"}
> ... wait 2 minutes ...
> {"execute":"cont"}
>
> Nice finding! :)

Yes, you are right. I tried {"execute":"stop"} -- wait 7 minutes -- {"execute":"cont"} without migration and reproduced the issue as follows; a core file was also generated:

[ 885.086312] rcu: INFO: rcu_sched self-detected stall on CPU
[ 885.086335] rcu: 0-...!: (1 GPs behind) idle=a46/1/0x4000000000000002 softirq=19869/19872 fqs=0
[ 885.086337] rcu: (t=70414 jiffies g=3441 q=939)
[ 885.086345] rcu: rcu_sched kthread starved for 70414 jiffies! g3441 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 885.086346] rcu: RCU grace-period kthread stack dump:
[ 885.086349] rcu_sched I 0 10 2 0x00000000
[ 885.086353] Call Trace:
[ 885.086368] ([<0000000000784066>] __schedule+0x2ce/0x8e8)
[ 885.086370] [<00000000007846ca>] schedule+0x4a/0xb8
[ 885.086373] [<0000000000788586>] schedule_timeout+0x1ce/0x458
[ 885.086382] [<00000000001bf9fc>] rcu_gp_kthread+0x2fc/0x630
[ 885.086386] [<00000000001696ac>] kthread+0x14c/0x168
[ 885.086390] [<0000000000789672>] kernel_thread_starter+0xa/0x10
[ 885.086392] [<0000000000789668>] kernel_thread_starter+0x0/0x10
[ 885.086397] Task dump for CPU 0:
[ 885.086398] gdbus R running task 0 2102 2094 0x02000000
[ 885.086403] Call Trace:
[ 885.086408] ([<04000000007004c0>] 0x4000000007004c0)
[ 885.086409] Task dump for CPU 1:
[ 885.086411] gnome-shell R running task 0 2105 2098 0x02000000
[ 885.086414] Call Trace:
[ 885.086416] ([<000000007b753800>] 0x7b753800)
[ 889.892214] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
[ 889.892576] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
[ 889.894103] systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart.
[ 889.894173] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 2.
[ 889.894330] systemd[1]: Stopped Flush Journal to Persistent Storage.
[ 889.894357] systemd[1]: Stopping Flush Journal to Persistent Storage...
[ 889.894363] systemd[1]: Stopped Journal Service.
[ 889.897644] systemd[1]: Starting Journal Service...
[ 889.918854] systemd-journald[2133]: File /run/log/journal/42b29c3f22be43d1b910415752bf6f31/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 890.098641] systemd[1]: Started Journal Service.
[ 892.212559] systemd-coredump[2113]: MESSAGE=Process 757 (systemd-journal) of user 0 dumped core.
[ 892.212568] systemd-coredump[2113]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.c38ccab115aa4e07871f3e480d2f9742.757.1543392185000000.lz4
[ 892.212572] systemd-coredump[2113]: Stack trace of thread 757:
[ 892.212575] systemd-coredump[2113]: #0 0x000003ffbaeff7ee epoll_wait (libc.so.6)
[ 892.212578] systemd-coredump[2113]: #1 0x000003ffbb364604 sd_event_wait (libsystemd-shared-239.so)
[ 892.212581] systemd-coredump[2113]: #2 0x000003ffbb365628 sd_event_run (libsystemd-shared-239.so)
[ 892.212584] systemd-coredump[2113]: #3 0x000002aa33e8767a main (systemd-journald)
[ 892.212587] systemd-coredump[2113]: #4 0x000003ffbae24172 __libc_start_main (libc.so.6)
[ 892.212590] systemd-coredump[2113]: #5 0x000002aa33e8789c _start (systemd-journald)
Created attachment 1509410 [details] core file for comment7
Fix included in qemu-kvm-2.12.0-54.module+el8+2640+20cc81e1
Bug verification:
Host:
4.18.0-56.el8.s390x
qemu-kvm-2.12.0-55.module+el8+2645+07371192.s390x
Guest:
4.18.0-34.el8.s390x

Steps are the same as in the bug report.

Result: migration completed, the VM works well, and there is no call trace. So this bug is fixed.