Bug 1378006 - guest paused on target host sometimes when do migration during guest boot
Summary: guest paused on target host sometimes when do migration during guest boot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: seabios
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: FuXiangChun
URL:
Whiteboard:
Depends On:
Blocks: 1401400
Reported: 2016-09-21 09:44 UTC by yafu
Modified: 2017-08-01 17:44 UTC
CC List: 15 users

Fixed In Version: seabios-1.10.1-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 17:44:06 UTC
Target Upstream Version:
Embargoed:


Attachments
domain xml (3.83 KB, text/html), 2016-09-21 09:44 UTC, yafu
qemu log and libvirtd log both on source and target host (499.05 KB, application/x-gzip), 2016-09-21 09:46 UTC, yafu
host crash screen (1.89 MB, image/jpeg), 2016-11-29 05:21 UTC, yafu


Links
System: Red Hat Product Errata
ID: RHBA-2017:1855
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: seabios bug fix and enhancement update
Last Updated: 2017-08-01 18:03:30 UTC

Description yafu 2016-09-21 09:44:55 UTC
Created attachment 1203194 [details]
domain xml

Description of problem:
When a guest is migrated while it is still booting, it is sometimes left paused on the target host after the migration completes.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-9.el7.x86_64
qemu-kvm-rhev-2.6.0-25.el7.x86_64

How reproducible:
20%

Steps to Reproduce:
 
1. Start a guest on the source host:
# virsh start mig1

2. While the guest is booting, start the migration:
# virsh migrate mig1 --live qemu+ssh://10.66.14.148/system --verbose
Migration: [100 %]

3. Check the guest status on the target host:
# virsh list
 Id    Name                           State
----------------------------------------------------
 3     mig1                       paused

4. Check the guest status through the QEMU monitor (HMP):
# virsh qemu-monitor-command mig1 --hmp info status
VM status: paused (internal-error)


5. There is an error in the qemu log on the target host:
# cat /var/log/libvirt/qemu/mig1.log
KVM internal error. Suberror: 1
emulation failure
EAX=0000a0b5 EBX=ffffffff ECX=0002ffff EDX=000a0000
ESI=ffffffff EDI=ffffffff EBP=ffffffff ESP=000a8000
EIP=ffffffff EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     000f79b0 00000037
IDT=     000f79ee 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=5b 66 5e 66 c3 ea 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
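
A log like the one in step 5 can be checked mechanically. The following is a minimal sketch that assumes the log format shown above; `has_kvm_internal_error` and `extract_eip` are hypothetical helper names, not part of libvirt or QEMU.

```shell
#!/bin/bash
# Hypothetical helpers for scanning a QEMU log in the format shown above.

# Succeeds if the log text on stdin contains a KVM internal error.
has_kvm_internal_error() {
    grep -q '^KVM internal error'
}

# Prints the EIP value of each register dump read from stdin.
extract_eip() {
    sed -n 's/^EIP=\([0-9a-fA-F]*\) .*/\1/p'
}

# Example (path taken from step 5):
#   extract_eip < /var/log/libvirt/qemu/mig1.log
```

For the dump above, the extracted value is ffffffff, which lies outside the code bytes shown on the Code= line.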


Actual results:
The guest on the target host is paused after the migration completes.

Expected results:
The guest on the target host is running after the migration completes.
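
The difference between the actual and expected result can be tested from the HMP status line shown in step 4. This is a sketch only; `is_internal_error_pause` is a hypothetical helper, and it assumes the status line has exactly the form seen in this report.

```shell
#!/bin/bash
# Hypothetical helper: classify the line printed by
#   virsh qemu-monitor-command <domain> --hmp info status
# Exit status 0 means the guest is paused with an internal error.
is_internal_error_pause() {
    case "$1" in
        "VM status: paused (internal-error)"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   status=$(virsh qemu-monitor-command mig1 --hmp info status)
#   if is_internal_error_pause "$status"; then echo "bug reproduced"; fi
```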

Additional info:
1. See the guest XML in the attachment.

2. See the libvirt and QEMU logs from both the source and target hosts in the attachment.

3. The stack trace of the paused guest:
# gstack `pidof qemu-kvm`
#2  0x00007f28f2f48f7e in call_rcu_thread ()
#3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f28cd662700 (LWP 1375)):
#0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f28f2f3a669 in qemu_cond_wait ()
#2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
#3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f28cce61700 (LWP 1376)):
#0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f28f2f3a669 in qemu_cond_wait ()
#2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
#3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f28cc660700 (LWP 1377)):
#0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f28f2f3a669 in qemu_cond_wait ()
#2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
#3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f28cbe5f700 (LWP 1378)):
#0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f28f2f3a669 in qemu_cond_wait ()
#2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
#3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f2866fff700 (LWP 1380)):
#0  0x00007f28d9f4adfd in poll () from /lib64/libc.so.6
#1  0x00007f28dbc81327 in red_worker_main () from /lib64/libspice-server.so.1
#2  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f28667fe700 (LWP 1381)):
#0  0x00007f28d9f4adfd in poll () from /lib64/libc.so.6
#1  0x00007f28dbc81327 in red_worker_main () from /lib64/libspice-server.so.1
#2  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f28f2a08d00 (LWP 1362)):
#0  0x00007f28d9f4aebf in ppoll () from /lib64/libc.so.6
#1  0x00007f28f2ea8009 in qemu_poll_ns ()
#2  0x00007f28f2ea799c in main_loop_wait ()
#3  0x00007f28f2c7570f in main ()
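
To see at a glance where the threads are blocked, the innermost frame (#0) of each thread can be counted. A rough sketch, assuming gstack output in the layout above; `summarize_top_frames` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: count how many threads are blocked in each
# innermost function, reading gstack output from stdin.
# Field 4 of a "#0  <addr> in <function> ..." line is the function name.
summarize_top_frames() {
    awk '$1 == "#0" { print $4 }' | sort | uniq -c | sort -rn
}

# Example:
#   gstack "$(pidof qemu-kvm)" | summarize_top_frames
```

In the trace above, the threads running qemu_kvm_cpu_thread_fn all have pthread_cond_wait as their innermost frame, so they group together in this summary.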


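4. The script below reproduces the error easily by migrating the guest back and forth between two hosts:

#!/bin/bash
# Usage: <script> <domain> <host-a> <host-b>   (run on host A)
DOMAIN="$1"
HOSTA="$2"
HOSTB="$3"
i=0
while [ "$i" -le 1024 ]
do
        # Migrate from host A to host B, then list the domains on host B.
        virsh migrate "$DOMAIN" "qemu+ssh://$HOSTB/system" --live --verbose
        ssh "$HOSTB" virsh list
        sleep 5
        # Migrate back from host B to host A, then list the local domains.
        ssh "$HOSTB" virsh migrate "$DOMAIN" "qemu+ssh://$HOSTA/system" --live --verbose
        virsh list
        sleep 5
        i=$((i + 1))
done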

Comment 1 yafu 2016-09-21 09:46:34 UTC
Created attachment 1203195 [details]
qemu log and libvirtd log both on source and target host

Comment 9 Amit Shah 2016-09-29 08:18:38 UTC
Can you try with seabios from 7.2?  Paolo's guess is that if that works, it would be a problem with SMM.

Comment 10 yafu 2016-09-29 09:17:42 UTC
(In reply to Amit Shah from comment #9)
> Can you try with seabios from 7.2?  Paolo's guess is that if that works, it
> would be a problem with SMM.

Sorry, I did not understand. Do you mean test with rhel7.2 host or guest machine type using pc-i440fx-rhel7.2.0 ?

Comment 11 Amit Shah 2016-09-29 09:34:05 UTC
(In reply to yafu from comment #10)
> Sorry, I did not understand. Do you mean test with rhel7.2 host or guest
> machine type using pc-i440fx-rhel7.2.0 ?

Just use the seabios package from RHEL7.2 release.  Keep everything else the same as in your original test.

Comment 13 yafu 2016-10-08 04:11:07 UTC
(In reply to Amit Shah from comment #11)
> Just use the seabios package from RHEL7.2 release.  Keep everything else the
> same as in your original test.

1. I tested multiple times with seabios-bin-1.7.5-11.el7.noarch. Migration during guest boot works correctly.

2. I can still reproduce the issue with seabios-bin-1.9.1-5.el7.noarch and qemu-kvm-rhev-2.6.0-28.el7.x86_64.

Comment 14 Amit Shah 2016-10-12 10:04:45 UTC
Thank you for testing!

Upstream commit 9c1f8f4493e8355d0e48f7d1eebdf86893ba082d would likely resolve this.

Can you check if this qemu build fixes the problem?  (Check with all components from 7.3 - including seabios).

Comment 22 yafu 2016-11-29 05:21:31 UTC
Created attachment 1225605 [details]
host crash screen

Comment 37 Miroslav Rezanina 2017-02-03 10:46:42 UTC
Fix included in seabios-1.10.1-2.el7
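
A host can be checked against this fixed version with a simple comparison. This is a sketch under assumptions: `version_at_least` is a hypothetical helper, it relies on GNU `sort -V` rather than rpm's own comparison rules (the two may disagree in corner cases), and it assumes the seabios-bin package carries the same version-release as the fix.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if version-release $1 is at least $2.
# Uses GNU sort -V as an approximation; this is NOT rpm's own
# comparison algorithm.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example:
#   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' seabios-bin)
#   if version_at_least "$installed" "1.10.1-2.el7"; then
#       echo "fix present"
#   fi
```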

Comment 40 FuXiangChun 2017-06-15 10:44:59 UTC
Following comment 0, I tried to reproduce it with qemu-kvm-rhev-2.6.0-25.el7.x86_64 and seabios-1.9.1-5.el7.x86_64.

I ran 32 ping-pong migrations with your script but could not reproduce it. How many times did you test before?

Comment 41 yafu 2017-06-16 06:00:13 UTC
(In reply to FuXiangChun from comment #40)
> According to comment0, I try to reproduce it with
> qemu-kvm-rhev-2.6.0-25.el7.x86_64 & seabios-1.9.1-5.el7.x86_64
> 
> ping-pong 32 times via your script. But can not reproduce it yet. how many
> times did you test before?

The bug only happens while the guest OS is booting, so you need to repeat the migration during guest OS boot.
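
One way to keep every migration inside the boot window is to restart the guest before each attempt. The following is only a sketch: `migrate_during_boot`, `VIRSH`, and `BOOT_DELAY` are hypothetical names, the delay value is a guess (this report gives no timing), and it assumes the guest stays defined on the source host after each migration.

```shell
#!/bin/bash
# Sketch: start the guest and migrate it while it is still booting,
# stopping as soon as the guest ends up paused on the target.
# VIRSH, BOOT_DELAY and the function name are assumptions for illustration.
VIRSH="${VIRSH:-virsh}"
BOOT_DELAY="${BOOT_DELAY:-3}"   # seconds between start and migrate (a guess)

# Usage: migrate_during_boot <domain> <target-host> <attempts>
# Returns 0 if the paused state was hit, 1 if not, 2 if the guest failed to start.
migrate_during_boot() {
    local domain="$1" target="$2" attempts="$3" i=1 state
    while [ "$i" -le "$attempts" ]; do
        $VIRSH start "$domain" || return 2
        sleep "$BOOT_DELAY"
        $VIRSH migrate "$domain" "qemu+ssh://$target/system" --live --verbose
        state=$($VIRSH -c "qemu+ssh://$target/system" domstate "$domain")
        if [ "$state" = "paused" ]; then
            echo "reproduced on attempt $i"
            return 0
        fi
        # Stop the guest on the target so the next attempt boots from scratch.
        $VIRSH -c "qemu+ssh://$target/system" destroy "$domain"
        i=$((i + 1))
    done
    return 1
}
```

For example, `migrate_during_boot mig1 10.66.14.148 50` would make up to 50 attempts. The paused guest is left in place on the target when the function returns 0, so that its status and log can be inspected.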

Comment 43 FuXiangChun 2017-06-16 07:44:28 UTC
Based on this result, setting this bug as verified.

Comment 47 errata-xmlrpc 2017-08-01 17:44:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1855

