Bug 1869023 - Migration occasionally remains postcopy-active on dst host after waiting 30 mins with the network down
Summary: Migration occasionally remains postcopy-active on dst host after waiting 30 mins with the network down
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 9.2
Assignee: jason wang
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-15 14:32 UTC by Li Xiaohui
Modified: 2023-03-14 19:32 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-13 20:59:55 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Li Xiaohui 2020-08-15 14:32:03 UTC
Description of problem:
Migration occasionally remains postcopy-active on the dst host after waiting 30 mins with the network down


Version-Release number of selected component (if applicable):
host info: kernel-4.18.0-232.el8.x86_64 & qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901.x86_64
guest info: kernel-4.18.0-232.el8.x86_64


How reproducible:
1/5


Steps to Reproduce:
1. Set TCP timeouts on the src & dst hosts:
# cd /proc/sys/net/ipv4/
# echo 3 > tcp_keepalive_probes
# echo 3 > tcp_keepalive_intvl
# echo 1 > tcp_retries1
# echo 1 > tcp_retries2
# echo 2 > tcp_fin_timeout
2. Boot a guest on the src host.
3. Boot a guest with "-incoming defer" on the dst host.
4. Enable postcopy mode on both the src and dst hosts, and set the postcopy speed on the src host.
5. Migrate the guest from the src host to the dst host; during migration, switch the migration into the postcopy phase (see the QMP sketch after these steps).
6. As soon as step 5 finishes, immediately take the migration network down on the dst host:
# nmcli con down ens6f0
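
For steps 4-6, a minimal sketch of the QMP commands typically involved (the listen port 5555, the dst host IP placeholder, and the 4 MiB/s postcopy bandwidth are assumptions for illustration only; the monitors are the ones exposed by the command line in "Additional info", e.g. reachable with "nc <host> 3333"):

On the dst host (started with "-incoming defer"):
{"execute": "qmp_capabilities"}
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "postcopy-ram", "state": true}]}}
{"execute": "migrate-incoming", "arguments": {"uri": "tcp:0:5555"}}

On the src host:
{"execute": "qmp_capabilities"}
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "postcopy-ram", "state": true}]}}
{"execute": "migrate-set-parameters", "arguments": {"max-postcopy-bandwidth": 4194304}}
{"execute": "migrate", "arguments": {"uri": "tcp:<dst host ip>:5555"}}
... and once precopy is under way, switch to postcopy:
{"execute": "migrate-start-postcopy"}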


Actual results:
After step 6, qemu on the src host keeps working normally and its migration status is postcopy-paused, but the migration status on the dst host is still postcopy-active even after waiting for 30 mins.
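
(The status here refers to the "status" field reported by QMP query-migrate on each host, e.g. {"execute": "query-migrate"} returning {"return": {"status": "postcopy-paused", ...}}.)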


Expected results:
After step 6, the migration status is correct (postcopy-paused) on both the src and dst hosts.
After the network recovers, the migration can continue and finish successfully.


Additional info:
1. Also found this issue on qemu-img-5.0.0-0.module+el8.3.0+6620+5d5e1420.x86_64
2. QEMU command line used:
/usr/libexec/qemu-kvm  \
-name "mouse-vm",debug-threads=on \
-sandbox off \
-machine q35 \
-cpu EPYC \
-nodefaults  \
-device VGA \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,server,nowait \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pcie-root-port,port=0x10,chassis=1,id=root0,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=root1,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=root2,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=root3,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=root4,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=root5,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=root6,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=0x17,chassis=8,id=root7,bus=pcie.0,addr=0x2.0x7 \
-device nec-usb-xhci,id=usb1,bus=root0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=root1 \
-device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0 \
-device virtio-net-pci,mac=9a:8a:8b:8c:8d:8e,id=net0,vectors=4,netdev=tap0,bus=root2 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/mnt/nfs/rhel830-64-virtio-scsi.qcow2,node-name=drive_sys1 \
-blockdev driver=qcow2,node-name=drive_image1,file=drive_sys1 \
-netdev tap,id=tap0,vhost=on \
-m 4096 \
-smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
-vnc :10 \
-rtc base=utc,clock=host \
-boot menu=off,strict=off,order=cdn,once=c \
-enable-kvm  \
-qmp tcp:0:3333,server,nowait \
-serial tcp:0:4444,server,nowait \
-monitor stdio \

Comment 1 Peter Xu 2020-08-15 15:24:35 UTC
Current information might not be enough to know what has happened.

Xiaohui, would you please keep the destination host environment the next time it reproduces?  We need to know first whether the TCP session has been closed correctly on the dest host.

Comment 2 Li Xiaohui 2020-08-17 12:17:17 UTC
(In reply to Peter Xu from comment #1)
> Current information might not be enough to know what has happened.
> 
> Xiaohui, would you please keep the destination host environment the next
> time it reproduces?  We need to know first whether the TCP session has been
> closed correctly on the dest host.

Hi, 
I can reproduce this bz most of the time.
I have sent you the environment information and test steps via email; please check.

Comment 3 Peter Xu 2020-08-17 20:05:37 UTC
I can reproduce the bug by manually running "nmcli conn down xxx" very quickly, right after starting postcopy on the src.

It's not easy to dump the QEMU backtrace on the destination since it no longer responds to gdb signals, which makes gdb attach hang.  Strangely (but also luckily) it still responds to SIGABRT, so I finally saw where the migration incoming thread is hung:

Thread 7 (Thread 0x7f19f9fff700 (LWP 7334)):
#0  __libc_recvmsg (flags=1073741824, msg=0x7f19f9ffe2d0, fd=32) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
#1  __libc_recvmsg (fd=32, msg=0x7f19f9ffe2d0, flags=1073741824) at ../sysdeps/unix/sysv/linux/recvmsg.c:25
#2  0x000055db480ddf0c in qio_channel_socket_readv ()
#3  0x000055db480db7f6 in qio_channel_read ()
#4  0x000055db47ff53c1 in channel_get_buffer ()
#5  0x000055db47ff3fd4 in qemu_fill_buffer ()
#6  0x000055db47ff49d5 in qemu_peek_byte ()
#7  0x000055db47ff4c92 in qemu_get_be32 ()
#8  0x000055db47ff4d22 in qemu_get_be64 ()
#9  0x000055db47e12879 in ram_load ()
#10 0x000055db47fef5bb in qemu_loadvm_state_main ()
#11 0x000055db47ff022c in postcopy_ram_listen_thread ()
#12 0x000055db48134b64 in qemu_thread_start ()
#13 0x00007f1b1497f14a in start_thread (arg=<optimized out>) at pthread_create.c:479
#14 0x00007f1b146b0f23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This is kind of strange, because after "nmcli conn down xxx" I expected recvmsg() to return and fail gracefully and quickly (within 30 seconds at most).  It seems not...

I confirmed this by triggering the bug again and dumping the kernel stack; it is indeed hung in the TCP stack of recvmsg():

[<0>] wait_woken+0x43/0x80                                                                                                                        
[<0>] sk_wait_data+0x11f/0x130                                                                                                                    
[<0>] tcp_recvmsg+0x816/0xc60                                                                                                                     
[<0>] inet_recvmsg+0x5b/0xd0                                                                                                                      
[<0>] ____sys_recvmsg+0x91/0x150                                                                                                                  
[<0>] ___sys_recvmsg+0x7b/0xc0                                                                                                                    
[<0>] __sys_recvmsg+0x54/0xa0                                                                                                                     
[<0>] do_syscall_64+0x5b/0x1a0                                                                                                                    
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca                                                                                                    

I'm not sure whether I've got this right, but I do see that during schedule_timeout() in wait_woken() we ignore even signals (I guess that's why gdb attach doesn't work).  I did a quick trace on wait_woken() to dump the timeout; it seems to be a huge number (0x7fffffffffffffff), though I'm not 100% sure I'm dumping the right thing.
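
For reference, a kernel-side stack like the one above can also be grabbed without attaching gdb, e.g. as root, using the LWP of the hung thread from the gdb backtrace (7334 here):

# cat /proc/<qemu pid>/task/7334/stack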

It gives me the feeling that this could be a network issue.  Adding Jason to the loop too in case he has any input.

Comment 4 Dr. David Alan Gilbert 2020-08-18 09:56:14 UTC
I agree that I'd expect the TCP connection to time out; especially with those TCP settings - but I'll leave that to Jason.

However, can we recover the destination by issuing a 'migrate-pause' on an OOB QMP connection?

Comment 6 jason wang 2021-02-02 07:21:11 UTC
(In reply to Peter Xu from comment #3)
> The bug can be reproduced by me running the "nmcli conn down xxx" very
> quickly right after start postcopy on src, manually.
> 
> Not easy to dump the qemu backtrace on destination since it already does not
> respond to gdb signals, which seems to hang gdb attachings.  However
> strangely (but also luckily) it'll still respond to sigabrt, so finally I
> saw where the migration incoming thread hanged at:
> 
> Thread 7 (Thread 0x7f19f9fff700 (LWP 7334)):
> #0  __libc_recvmsg (flags=1073741824, msg=0x7f19f9ffe2d0, fd=32) at
> ../sysdeps/unix/sysv/linux/recvmsg.c:28
> #1  __libc_recvmsg (fd=32, msg=0x7f19f9ffe2d0, flags=1073741824) at
> ../sysdeps/unix/sysv/linux/recvmsg.c:25
> #2  0x000055db480ddf0c in qio_channel_socket_readv ()
> #3  0x000055db480db7f6 in qio_channel_read ()
> #4  0x000055db47ff53c1 in channel_get_buffer ()
> #5  0x000055db47ff3fd4 in qemu_fill_buffer ()
> #6  0x000055db47ff49d5 in qemu_peek_byte ()
> #7  0x000055db47ff4c92 in qemu_get_be32 ()
> #8  0x000055db47ff4d22 in qemu_get_be64 ()
> #9  0x000055db47e12879 in ram_load ()
> #10 0x000055db47fef5bb in qemu_loadvm_state_main ()
> #11 0x000055db47ff022c in postcopy_ram_listen_thread ()
> #12 0x000055db48134b64 in qemu_thread_start ()
> #13 0x00007f1b1497f14a in start_thread (arg=<optimized out>) at
> pthread_create.c:479
> #14 0x00007f1b146b0f23 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> This is kind of strange, because after "nmcli conn down xxx" I thought the
> recvmsg() should return and fail gracefully and quickly (at least in 30
> seconds).  It seems not...
> 
> I double confirmed this is the case by trigger again the bug and dump the
> kernel stack; it's indeed hanged in the tcp stack of recvmsg():
> 
> [<0>] wait_woken+0x43/0x80
> [<0>] sk_wait_data+0x11f/0x130
> [<0>] tcp_recvmsg+0x816/0xc60
> [<0>] inet_recvmsg+0x5b/0xd0
> [<0>] ____sys_recvmsg+0x91/0x150
> [<0>] ___sys_recvmsg+0x7b/0xc0
> [<0>] __sys_recvmsg+0x54/0xa0
> [<0>] do_syscall_64+0x5b/0x1a0
> [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
>
> I'm not sure whether I get it right, but I do see that during
> schedule_timeout() of wait_woken() we'll even ignore signals (I guess that's
> why gdb attach won't work).

It looks to me like sk_wait_event() is using an interruptible sleep?


> I did a quick trace on wait_woken() to dump the
> timeout, it seems to be a very huge number (0x7fffffffffffffff), not 100%
> sure I'm getting the right stuff dumped, though.

I suspect it was somehow related to changing the TCP timeouts via the sysfs operations.

Thanks

> 
> It gives me the feeling that this could be a network issue.  Adding Jason in
> the loop too in case there's any input.

Comment 7 jason wang 2021-02-02 07:22:15 UTC
Hi:

Some questions:

1) Can you reproduce the issue without tuning TCP (sysfs)?
2) If not, can you try to see which parameter causes the long wait?

Thanks

Comment 8 Li Xiaohui 2021-02-02 09:13:40 UTC
(In reply to jason wang from comment #7)
> Hi:
> 
> Some questions:
> 
> 1) Can you reproduce the issue without tuning TCP (sysfs)?
> 2) If not, can you try to see which parameter causes the long wait?
> 
> Thanks

I don't have machines available for this test right now, and it may take a few days.
I will try to borrow two machines and test it ASAP.

Comment 10 Li Xiaohui 2021-06-15 07:14:47 UTC
(In reply to jason wang from comment #7)
> Hi:
> 

Hi, sorry for the late reply.

> Some questions:
> 
> 1) Can you reproduce the issue without tuning TCP (sysfs)?

I can still reproduce this bz without tuning TCP (sysfs),
but not always; the reproduction rate is lower (maybe 1/10).

> 2) If not, can you try to see which parameter causes the long wait?

Given the answer to question 1, this bz isn't related to tuning TCP.

> 
> Thanks

Comment 11 Li Xiaohui 2021-06-15 14:24:11 UTC
(In reply to Li Xiaohui from comment #10)
> (In reply to jason wang from comment #7)
> > Hi:
> > 
> 
> Hi, sorry for the late reply.
> 
> > Some questions:
> > 
> > 1) Can you reproduce the issue without tuning TCP (sysfs)?
> 
> I can still reproduce this bz without tuning TCP (sysfs),
> but not always; the reproduction rate is lower (maybe 1/10).

Update: the accurate reproduction rate is 3/10.

> 
> > 2) If not, can you try to see which parameter causes the long wait?
> 
> Given the answer to question 1, this bz isn't related to tuning TCP.
> 
> > 
> > Thanks

Comment 12 John Ferlan 2021-09-08 21:27:57 UTC
Moving RHEL-AV bugs to RHEL 9. If it is necessary to resolve this in RHEL 8, clone to the current RHEL 8 release.

Comment 14 RHEL Program Management 2022-02-15 07:27:19 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 15 Li Xiaohui 2022-04-26 07:13:32 UTC
Reopening, as this bug was closed for being stale.  Will close as CURRENTRELEASE if it does not reproduce on RHEL 9.1.

Comment 17 Li Xiaohui 2022-06-12 09:16:13 UTC
I can still reproduce this bug on RHEL 9.1.0 hosts (kernel-5.14.0-105.el9.x86_64 & qemu-kvm-7.0.0-5.el9.x86_64).

Comment 18 Li Xiaohui 2022-06-13 09:33:09 UTC
(In reply to Dr. David Alan Gilbert from comment #4)
> I agree that I'd expect the TCP connection to time out; especially with
> those TCP settings - but I'll leave that to Jason.
> 
> However, can we recover the destination by issuing a 'migrate-pause' on an
> OOB QMP connection?

Sorry for missing this comment.

When I use {"exec-oob": "migrate-pause"} to pause migration on the destination host, it works well.
After the network recovers, the postcopy migration succeeds once migration is recovered, and the VM works on the destination host.
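
For reference, a minimal sketch of that recovery sequence (the listen port 5556 and the dst host IP placeholder are assumptions for illustration, and "exec-oob" requires the QMP connection to have negotiated the "oob" capability):

On the dst host, force the stuck incoming side into postcopy-paused, then, once the network is back, listen for the recovery connection:
{"exec-oob": "migrate-pause"}
{"execute": "migrate-recover", "arguments": {"uri": "tcp:0:5556"}}

On the src host, resume the paused migration towards the new URI:
{"execute": "migrate", "arguments": {"uri": "tcp:<dst host ip>:5556", "resume": true}}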

Comment 21 John Ferlan 2023-02-13 20:59:55 UTC
This aging bug was planned to be resolved within the 9.2.0 release timeframe; however, since it has not been addressed by ITM 24, we are closing as deferred. 

If this bug is reopened, there must be a downstream commit ready to be posted for merge or the bug must be part of some rebase. When reopening, add the release ITR and appropriate DTM.

Comment 22 Li Xiaohui 2023-02-14 01:54:46 UTC
Hi John,

Yesterday I tested this bug on the latest RHEL 9.2 (kernel-5.14.0-252.el9.x86_64 & qemu-kvm-7.2.0-8.el9.x86_64); all runs passed and I didn't hit any issues.


So maybe we can close this bug as currentrelease if you agree.

Comment 23 Li Xiaohui 2023-02-14 01:55:32 UTC
(In reply to Li Xiaohui from comment #22)
> Hi John,
> 
> Yesterday I tried to test this bug on the latest RHEL 9.2
> (kernel-5.14.0-252.el9.x86_64 & qemu-kvm-7.2.0-8.el9.x86_64), they all
> passed, didn't hit any issues.

Repeated 100 times; all passed.

> 
> 
> So maybe we can close this bug as currentrelease if you agree.

Comment 24 John Ferlan 2023-02-14 15:17:46 UTC
(In reply to Li Xiaohui from comment #22)
> Hi John,
> 
> Yesterday I tried to test this bug on the latest RHEL 9.2
> (kernel-5.14.0-252.el9.x86_64 & qemu-kvm-7.2.0-8.el9.x86_64), they all
> passed, didn't hit any issues.
> 
> 
> So maybe we can close this bug as currentrelease if you agree.

Sure, I can change it to CURRENTRELEASE given the outcome, although I have no way to attribute the resolution to any specific commit since I don't have that knowledge.  So I'll just update TargetRelease to 9.2 but not change the ITR (and I won't set any other fields or the keyword to TestOnly - feel free to do so if you wish, though).

Comment 25 Li Xiaohui 2023-02-15 03:10:22 UTC
(In reply to John Ferlan from comment #24)
> (In reply to Li Xiaohui from comment #22)
> > Hi John,
> > 
> > Yesterday I tried to test this bug on the latest RHEL 9.2
> > (kernel-5.14.0-252.el9.x86_64 & qemu-kvm-7.2.0-8.el9.x86_64), they all
> > passed, didn't hit any issues.
> > 
> > 
> > So maybe we can close this bug as currentrelease if you agree.
> 
> Sure I can change to currentrelease given the outcome, although I have no
> way to attribute the resolution to any specific commit as that knowledge so
> I'll just update TargetRelease to 9.2, but not change ITR (and set any other
> fields or keyword to TestOnly - feel free to do so if you wish though).

Thank you. I think the current status is ok.

