Bug 1722022 - rhel8.0.1 guest get stuck when do migration from dst to src host with postcopy mode
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-06-19 11:44 UTC by Li Xiaohui
Modified: 2019-09-19 08:26 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-19 08:26:16 UTC
Target Upstream Version:
Embargoed:



Description Li Xiaohui 2019-06-19 11:44:24 UTC
Description of problem:
Migrate a rhel8.0.1 guest in postcopy mode from the src host to the dst host; the migration finishes successfully.
Then migrate the guest in postcopy mode from the dst host back to the src host. In hmp the migration is reported as successful, but the guest is found to be stuck when viewed via remote-viewer.


Version-Release number of selected component (if applicable):
host info:
kernel-3.10.0-1056.el7.x86_64 & qemu-kvm-rhev-2.12.0-32.el7.x86_64
guest info:
kernel-4.18.0-80.1.2.el8_0.x86_64


How reproducible:
2/2


Steps to Reproduce:
1. Start the guest on the src host:
/usr/libexec/qemu-kvm \
-machine q35  \
-cpu host \
-enable-kvm \
-m 4G \
-smp 4 \
-nodefaults \
-rtc base=utc,clock=host,driftfix=slew \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-blockdev node-name=back_image,driver=file,cache.direct=on,cache.no-flush=off,filename=/mnt/nfs/rhel8-0-1-blk.qcow2,aio=threads \
-blockdev node-name=drive-virtio-disk0,driver=qcow2,cache.direct=on,cache.no-flush=off,file=back_image \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=disk0,bus=pcie.0-root-port-2 \
-device virtio-net-pci,mac=6c:0b:84:a4:53:f6,id=idLnLWR0,vectors=4,netdev=idINi0TE,bus=pcie.0-root-port-3,addr=0x0  \
-netdev tap,id=idINi0TE,vhost=on \
-spice port=5902,disable-ticketing \
-qmp tcp:0:4442,server,nowait \
-monitor stdio \
-device qxl-vga \
-boot menu=on
2. Run stressapptest in the guest:
# stressapptest -M 2000 -s 10000
3. Start the guest with "-incoming tcp:0:5800" on the dst host.
4. Enable the postcopy-ram capability in hmp on both the src and dst hosts.
5. Migrate the rhel8.0.1 guest from the src host to the dst host:
src host(qemu) migrate -d tcp:10.73.72.82:5800
6. Switch to postcopy mode before the migration finishes:
src host(qemu) migrate_start_postcopy
7. Once the migration from src to dst has finished, check the migration status and the guest status.
8. Repeat steps 2-7 to migrate the guest back to the src host.
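
For anyone scripting the source-side part of these steps over the QMP socket (the guest is started with "-qmp tcp:0:4442,server,nowait") rather than typing them in hmp, the sequence could be sketched roughly as below. This is only an illustration: the helper name postcopy_migration_commands is hypothetical, the code builds the JSON commands without connecting to anything, and a real script would also need to enable postcopy-ram on the destination and wait for the migration to become active before sending migrate-start-postcopy.

```python
import json


def postcopy_migration_commands(dest_uri):
    """Return the source-side QMP commands for a postcopy migration, in order.

    Hypothetical helper: it only builds the messages, it does not send them.
    """
    return [
        # QMP requires capability negotiation before any other command.
        {"execute": "qmp_capabilities"},
        # Equivalent of "migrate_set_capability postcopy-ram on" in hmp.
        {
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [{"capability": "postcopy-ram", "state": True}]
            },
        },
        # Equivalent of "migrate -d <uri>" in hmp.
        {"execute": "migrate", "arguments": {"uri": dest_uri}},
        # Equivalent of "migrate_start_postcopy" in hmp.
        {"execute": "migrate-start-postcopy"},
        # Equivalent of "info migrate" in hmp.
        {"execute": "query-migrate"},
    ]


if __name__ == "__main__":
    for command in postcopy_migration_commands("tcp:10.73.72.82:5800"):
        print(json.dumps(command))
```

Each printed line is one JSON message that could be written to the QMP socket; for the return trip in step 8 the same sequence would be sent from the other host with its peer's address.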


Actual results:
After step 7, the guest runs normally on the dst host after the postcopy migration. After step 8, however, the guest is stuck on the src host once the migration has finished, and it cannot be operated through remote-viewer. In both src hmp and dst hmp the guest status looks normal:
dst hmp:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off late-block-activate: off 
Migration status: completed
total time: 109789 milliseconds
downtime: 45 milliseconds
setup: 127 milliseconds
transferred ram: 5828744 kbytes
throughput: 434.93 mbps
remaining ram: 0 kbytes
total ram: 4326224 kbytes
duplicate: 159117 pages
skipped: 0 pages
normal: 1453994 pages
normal bytes: 5815976 kbytes
dirty sync count: 3
page size: 4 kbytes
postcopy request count: 162
(qemu) info status 
VM status: paused (postmigrate)

src hmp:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
(qemu) info status 
VM status: running
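
When comparing hmp output like the above across several runs, it can help to turn it into key/value pairs first. The following is a minimal sketch, assuming the output has been captured as plain text; parse_info_migrate is a hypothetical helper, not part of QEMU, and it simply splits each line at the first colon.

```python
def parse_info_migrate(text):
    """Parse captured "info migrate" text into a dict of field -> value.

    Hypothetical helper: each line is split at its first colon, so the
    "capabilities" line is kept as one unparsed string.
    """
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, monitor prompts, and anything without a colon.
        if not line or line.startswith("(qemu)") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = """(qemu) info migrate
globals:
store-global-state: on
Migration status: completed
total time: 109789 milliseconds
downtime: 45 milliseconds
postcopy request count: 162
"""
    parsed = parse_info_migrate(sample)
    print(parsed["Migration status"], parsed["downtime"])
```

Note that a "completed" status here only describes the migration stream; as this report shows, it says nothing about whether the guest is actually responsive afterwards.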


Expected results:
The guest runs normally during ping-pong migration in postcopy mode.


Additional info:
1. On a rhel7.7 host, ping-pong migration in postcopy mode succeeds when the guest is the latest rhel7.7.

Comment 2 Dr. David Alan Gilbert 2019-06-20 13:30:08 UTC
Does this still happen if you use something other than qxl and spice; what happens if you try VGA graphics with VNC ?

Comment 3 Li Xiaohui 2019-06-21 03:27:26 UTC
(In reply to Dr. David Alan Gilbert from comment #2)
> Does this still happen if you use something other than qxl and spice; what
> happens if you try VGA graphics with VNC ?

Hi Dave,
I can still reproduce this issue when booting the guest with "-vnc :10 -device VGA" instead of "-spice port=5902,disable-ticketing -device qxl-vga".
Thanks.

Comment 8 Dr. David Alan Gilbert 2019-09-05 19:08:33 UTC
Can you retest using a recent rhel 8.1 as *host* and tell me if it still happens please?

Is the guest running a GUI or just at a console?

Comment 9 Li Xiaohui 2019-09-08 16:03:55 UTC
Hi Dave,
I found that this test was run on two hosts with different CPUs (Skylake-Client-IBRS and Skylake-Server-IBRS), with the guest booted with "-cpu host" before migrating. Maybe it's not a bug; I need to confirm again after retesting on a rhel7.7 host.

I have tried again on rhel8.1-av using the above two hosts, following the steps in this bz:
1. When the guest is booted with "-cpu host", the issue reproduces;
2. When the guest is booted with "-cpu Skylake-Client", ping-pong migration succeeds.

Comment 10 Dr. David Alan Gilbert 2019-09-09 08:38:14 UTC
OK, as you say, using -cpu host is wrong with different CPUs, and it's fine with -cpu Skylake-Client.
So yes, please recheck with 7.7.

Dave

Comment 11 Dr. David Alan Gilbert 2019-09-18 18:39:28 UTC
Please confirm your retest on rhel 7.7 is OK as comment 9.

Comment 12 Li Xiaohui 2019-09-19 07:54:34 UTC
(In reply to Dr. David Alan Gilbert from comment #11)
> Please confirm your retest on rhel 7.7 is OK as comment 9.

Confirmed: on a rhel7.7 host with a rhel8.0.1 guest, I couldn't reproduce this bz.

Comment 13 Dr. David Alan Gilbert 2019-09-19 08:26:16 UTC
Not-a-bug as per comment 9 & 12; test was originally done on mismatched hosts (Skylake Client vs Skylake Server) with -cpu host.
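
A quick way to catch this kind of mismatch before testing with -cpu host is to compare the CPU flags of the two hosts. The sketch below assumes the contents of /proc/cpuinfo from each host have been saved as text; the helper names and the sample flag lists are illustrative only and are not taken from the hosts used in this report.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first "flags" line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def flag_mismatch(src_cpuinfo, dst_cpuinfo):
    """Return (flags only on src, flags only on dst), each sorted."""
    src = cpu_flags(src_cpuinfo)
    dst = cpu_flags(dst_cpuinfo)
    return sorted(src - dst), sorted(dst - src)


if __name__ == "__main__":
    # Illustrative, heavily shortened flag lists (assumed, not real host data).
    src_text = "model name\t: example client CPU\nflags\t\t: fpu sse2 avx2\n"
    dst_text = "model name\t: example server CPU\nflags\t\t: fpu sse2 avx2 avx512f\n"
    only_src, only_dst = flag_mismatch(src_text, dst_text)
    print("only on src:", only_src)
    print("only on dst:", only_dst)
```

If either list is non-empty, -cpu host exposes different features on each side, and a named model common to both hosts (as with -cpu Skylake-Client in comment 9) is the safer choice for migration tests.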

