Bug 1970337

Summary:	Fail to get migration failure immediately if yank under multifd migration
Product:	Red Hat Enterprise Linux 9	Reporter:	Li Xiaohui <xiaohli>
Component:	qemu-kvm	Assignee:	Leonardo Bras <leobras>
qemu-kvm sub component:	Live Migration	QA Contact:	Li Xiaohui <xiaohli>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	medium
Priority:	medium	CC:	chayang, dgilbert, fjin, jferlan, jinzhao, juzhang, leobras, mdean, mrezanin, quintela, qzhang, virt-maint, yfu, zixchen
Version:	9.0	Keywords:	Triaged
Target Milestone:	rc	Flags:	pm-rhel: mirror+
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	qemu-kvm-6.2.0-1.el9	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-05-17 12:23:27 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Li Xiaohui 2021-06-10 09:43:16 UTC

Description of problem:
Do multifd migration，inject firewall via drop on dst host when migration is active, then use yank to fail migration, but multifd migration doesn't fail immediately, always need wait more than 10 mins, then get failure:
{"execute":"query-migrate"}
{"return": {"blocked": false, "status": "failed", "error-desc": "Unable to write to socket: Connection timed out"}}


Version-Release number of selected component (if applicable):
hosts: kernel-4.18.0-310.el8.x86_64 & qemu-kvm-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64


How reproducible:
100%


Steps to Reproduce:
1.Boot a guest on src host, the qemu cmd lists below[1];
2.Boot a guest on dst host with same cmds as src host but append '-incoming defer';
3.Enable multifd capabilities on src and dst host.
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"multifd","state":true}]}}
4.Set migration incoming on dst host via qmp cmd;
{"execute":"migrate-incoming","arguments":{"uri":"tcp:[::]:1234"}}
4.Start migration on src host via qmp cmd;
{"execute": "migrate","arguments":{"uri": "tcp:$dst_host_ip:1234"}}
5.During migration is active, inject firewall via drop on dst host:
# iptables -A INPUT -p tcp --dport 1234 -j DROP
6.After migration hang(query migrate, only the total time of migration is increasing, other migration params stay unchanged), use yank cmd to fail migration on src host:
{ "execute": "query-yank" }
{"return": [{"type": "chardev", "id": "qmp_id_qmpmonitor1"}, {"type": "chardev", "id": "qmp_id_catch_monitor"}, {"type": "chardev", "id": "compat_monitor0"}, {"type": "chardev", "id": "serial0"}, {"type": "migration"}]}
{"execute":"yank","arguments":{"instances":[{"type":"migration"}]}}


Actual results:
After step 6, query migrate status, need wait more than 10 mins to get migration failed:
hmp prompt:
(qemu) qemu-kvm: multifd_send_pages: channel 0 has already quit!
qemu-kvm: Unable to write to socket: Connection timed out
qemu-kvm: Unable to write to socket: Connection timed out

query migrate status via qmp:
{"execute":"query-migrate"}
{"return": {"blocked": false, "status": "failed", "error-desc": "Unable to write to socket: Connection timed out"}}


Expected results:
Get migration failed immediately after executing 'yank' cmd


Additional info:
When do precopy migration like above, will get migration failed immediately.


The qemu cmds when boot vm:
/usr/libexec/qemu-kvm  \
-name "mouse-vm" \
-sandbox off \
-machine q35,memory-backend=pc.ram \
-cpu EPYC-IBPB \
-nodefaults  \
-vga std \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,server=on,wait=off \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor,server=on,wait=off \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pcie-root-port,port=0x10,chassis=1,id=root0,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=root1,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=root2,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=root3,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=root4,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=root5,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=root6,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=0x17,chassis=8,id=root7,bus=pcie.0,addr=0x2.0x7 \
-device nec-usb-xhci,id=usb1,bus=root0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=root1 \
-device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0 \
-device virtio-net-pci,mac=9a:8a:8b:8c:8d:8e,id=net0,vectors=4,netdev=tap0,bus=root2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/mnt/nfs/rhel900-64-virtio-scsi.qcow2,node-name=drive_sys1 \
-blockdev driver=qcow2,node-name=drive_image1,file=drive_sys1 \
-netdev tap,id=tap0,vhost=on \
-m 4096 \
-object memory-backend-ram,id=pc.ram,size=4294967296 \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-vnc :10 \
-rtc base=utc,clock=host \
-boot menu=off,strict=off,order=cdn,once=c \
-enable-kvm  \
-qmp tcp:0:3333,server=on,wait=off \
-serial tcp:0:4444,server=on,wait=off \
-monitor stdio \

Comment 1 Dr. David Alan Gilbert 2021-06-10 09:48:25 UTC

This sounds like the 'yank' hasn't done it's job and we're waiting for some type of tcp timeout.

Comment 2 John Ferlan 2021-06-22 14:22:16 UTC

Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Comment 3 Leonardo Bras 2021-07-28 19:58:16 UTC

This bug reproduces upstream.

So far, I could notice yank is correctly doing it's job and calling migration_yank_iochannel().

On the other hand, migration_yank_iochannel() calls qio_channel_shutdown() which is not enough to abort the migration in multifd case.

Currently I am trying to understand what should I call to abort in multifd case. 
(As a test, I called qio_channel_shutdown() in every multifd iochannel and yank worked just fine, but I could not retry migration, because it was still 'ongoing')

Comment 4 Leonardo Bras 2021-07-30 07:53:30 UTC

v1 posted upstream: 
http://patchwork.ozlabs.org/project/qemu-devel/patch/20210730074043.54260-1-leobras@redhat.com/

Comment 5 John Ferlan 2021-09-08 21:28:17 UTC

Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

Comment 6 Leonardo Bras 2021-09-13 17:38:36 UTC

Update:

Lukas Straub implemented a fix based in my previous patchset:

http://patchwork.ozlabs.org/project/qemu-devel/patch/20210901175857.0858efe1@gecko.fritz.box/

Waiting for upstream merge in order to start the backporting process.

Comment 7 John Ferlan 2021-12-20 14:58:15 UTC

Looks like the above was included in qemu-6.2 which was recently used to rebase into RHEL 9.0

I think this then can be moved along in the process, but want to make sure before actually doing it. I did set the ITR just to ensure it stays on the radar.

Be sure the Devel Whiteboard contains "Fixed in upstream qemu-6.2 commit <commit-id>"... 

May need to get Mirek to help move to ON_QA since this bug wouldn't have been included when he did the update and we'll need qa to set ITM

Comment 8 Leonardo Bras 2022-01-04 17:28:56 UTC

(In reply to John Ferlan from comment #7)
> Looks like the above was included in qemu-6.2 which was recently used to
> rebase into RHEL 9.0

Yes, it seems correct. 
The commit-id for this change is 20171ea8950c619f00dc5cfa6136fd489998ffc5, 
which is the same for upstream and rhel-9 branches. 

> 
> I think this then can be moved along in the process, but want to make sure
> before actually doing it. I did set the ITR just to ensure it stays on the
> radar.
> 
> Be sure the Devel Whiteboard contains "Fixed in upstream qemu-6.2 commit
> <commit-id>"... 

Could you please point the Devel Whiteboard? 
(I don't quite recall where to find it)

Comment 9 John Ferlan 2022-01-04 17:45:32 UTC

(In reply to Leonardo Bras from comment #8)
> (In reply to John Ferlan from comment #7)
> > Looks like the above was included in qemu-6.2 which was recently used to
> > rebase into RHEL 9.0
> 
> Yes, it seems correct. 
> The commit-id for this change is 20171ea8950c619f00dc5cfa6136fd489998ffc5, 
> which is the same for upstream and rhel-9 branches. 
> 
> > 
> > I think this then can be moved along in the process, but want to make sure
> > before actually doing it. I did set the ITR just to ensure it stays on the
> > radar.
> > 
> > Be sure the Devel Whiteboard contains "Fixed in upstream qemu-6.2 commit
> > <commit-id>"... 
> 
> Could you please point the Devel Whiteboard? 
> (I don't quite recall where to find it)

It's at the top (search in the bz window)...

I updated the bz, moved it to POST, set devel_ack+, and I set DTM=20 (hopefully avoid the noisy bots)... Will need ITM to be set in order to get release+ (it's all process stuff). The reality is it could already be tested with the rebase - we just have to follow process to get there...

Leaving a needinfo for Mirek since this was fixed by the qemu-6.2 rebase, but since we've already gone through the errata processes some system may need more massaging before moving to ON_QA.

Comment 12 Yanan Fu 2022-01-10 01:56:16 UTC

Gating test with qemu-kvm-6.2.0-1.el9 pass, add Verified:Tested, SanityOnly

Comment 14 Li Xiaohui 2022-01-13 04:05:46 UTC

Verify bug on rhel 9.0.0 (kernel-5.14.0-39.el9.x86_64 & qemu-kvm-6.2.0-1.el9.x86_64)

Multifd get failed immediately when inject firewall via drop on dst host, and multifd migration go on and succeed after close firewalld.


I have one question what should be the status of destination qemu when migration failed:
Won't qemu process on dst host automatic quit when migration fail?

When I test plain and multifd migration, I get two different results:
1.For plain migration, migration is always active on dst host, luckily I can stop the qemu process like { "execute": "quit" }. Qmp and hmp still work well;
2.For multifd migration, Qmp and hmp hang when I inject firewall on dst host even I enable qmp capability with 'OOB': '{"execute":"qmp_capabilities", "arguments":{"enable":["oob"]}}'. 
The only way I can stop dst qemu process is to kill process "kill -9 $dst_qemu_pid"


I'm not sure whether it's a bug, could someone help?

Comment 15 Dr. David Alan Gilbert 2022-01-13 09:23:53 UTC

This *might* be OK, adding Juan to see if he can see where the destination is blocked.
You're doing a 'yank' on the source; you should be able to do a 'yank' on the dest
to kill off any hung connections and recover quickly.

To do this you'll need a second qmp connection, and use the "exec-oob" command to execute the
yank rather than "execute", see section 2.3 and 2.3.1 of 
https://github.com/qemu/qemu/blob/master/docs/interop/qmp-spec.txt#L89

that should then recover your main monitor for you.
If it doesn't, then we have a bug with the recovery on the destination side.

Even if that does work, lets check with Juan if the current hang you're seeing is avoidable.

Comment 16 Li Xiaohui 2022-01-13 11:16:13 UTC

(In reply to Dr. David Alan Gilbert from comment #15)
> This *might* be OK, adding Juan to see if he can see where the destination
> is blocked.

Found QMP & HMP on dst host sometimes hang, but sometimes still be active as plain migration.
Maybe it's a problem.

> You're doing a 'yank' on the source; you should be able to do a 'yank' on
> the dest
> to kill off any hung connections and recover quickly.
> 
> To do this you'll need a second qmp connection, and use the "exec-oob"
> command to execute the
> yank rather than "execute", 

Thanks for the heads up.

When QMP on dst hang, I tried to execute yank under oob, but it still hang, what shall I do for the next steps?
{"execute":"qmp_capabilities", "arguments":{"enable":["oob"]}}
{"return": {}}
{ "exec-oob": "query-yank" }
{"return": [{"type": "chardev", "id": "qmp_id_qmpmonitor1"}, {"type": "chardev", "id": "qmp_id_catch_monitor"}, {"type": "chardev", "id": "compat_monitor0"}, {"type": "chardev", "id": "compat_monitor1"}, {"type": "chardev", "id": "compat_monitor2"}, {"type": "chardev", "id": "serial0"}, {"type": "migration"}]}
{"exec-oob":"yank","arguments":{"instances":[{"type":"migration"}]}}
{"return": {}}

> see section 2.3 and 2.3.1 of 
> https://github.com/qemu/qemu/blob/master/docs/interop/qmp-spec.txt#L89
> 
> that should then recover your main monitor for you.
> If it doesn't, then we have a bug with the recovery on the destination side.
> 
> Even if that does work, lets check with Juan if the current hang you're
> seeing is avoidable.

Comment 17 Dr. David Alan Gilbert 2022-01-13 14:16:36 UTC

Yeh if its still hanging after a yank I think it's one for juan to check the multifd code

Comment 18 Li Xiaohui 2022-01-14 03:38:20 UTC

Ok, I would mark this bug verified per Comment 14 & Comment 15.

We could go on track qemu on dst host hang issue for multifd migration (I think it's not same issue as bug). 
Juan, could you check according to Comment 14 & Comment 15, Comment 16?

Comment 19 Juan Quintela 2022-01-31 12:29:13 UTC

(In reply to Dr. David Alan Gilbert from comment #15)
> This *might* be OK, adding Juan to see if he can see where the destination
> is blocked.
> You're doing a 'yank' on the source; you should be able to do a 'yank' on
> the dest
> to kill off any hung connections and recover quickly.
> 
> To do this you'll need a second qmp connection, and use the "exec-oob"
> command to execute the
> yank rather than "execute", see section 2.3 and 2.3.1 of 
> https://github.com/qemu/qemu/blob/master/docs/interop/qmp-spec.txt#L89
> 
> that should then recover your main monitor for you.
> If it doesn't, then we have a bug with the recovery on the destination side.
> 
> Even if that does work, lets check with Juan if the current hang you're
> seeing is avoidable.

You need to do the yank on destination by hand.
See what happened:
- we start migration on both sides
- we cut the network cable
- we yank source and live is good there.
- but destination is still waiting for more data.  I can't see how it can detect that there has been a network cut when it is just reading.  It needs to wait whatever timeout is needed.  So I would say that things are ok.

Comment 21 errata-xmlrpc 2022-05-17 12:23:27 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: qemu-kvm), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2307