Bug 1378006
Field | Value
---|---
Summary | guest paused on target host sometimes when do migration during guest boot
Product | Red Hat Enterprise Linux 7
Component | seabios
Version | 7.3
Status | CLOSED ERRATA
Severity | unspecified
Priority | unspecified
Reporter | yafu <yafu>
Assignee | Dr. David Alan Gilbert <dgilbert>
QA Contact | FuXiangChun <xfu>
CC | chayang, dgilbert, dyuan, fjin, hhuang, juzhang, lersek, mrezanin, qizhu, quintela, qzhang, rjones, virt-maint, yafu, zpeng
Target Milestone | rc
Hardware | Unspecified
OS | Unspecified
Fixed In Version | seabios-1.10.1-2.el7
Last Closed | 2017-08-01 17:44:06 UTC
Type | Bug
Bug Blocks | 1401400

Attachments:
Created attachment 1203195: qemu log and libvirtd log from both the source and target hosts
Can you try with seabios from 7.2? Paolo's guess is that if that works, it would be a problem with SMM.

(In reply to Amit Shah from comment #9)
Sorry, I did not understand. Do you mean testing with a RHEL 7.2 host, or with a guest machine type of pc-i440fx-rhel7.2.0?

(In reply to yafu from comment #10)
Just use the seabios package from the RHEL 7.2 release. Keep everything else the same as in your original test.

(In reply to Amit Shah from comment #11)
1. I tested with seabios-bin-1.7.5-11.el7.noarch multiple times. The migration works well during guest boot.
2. I can still reproduce the issue with seabios-bin-1.9.1-5.el7.noarch and qemu-kvm-rhev-2.6.0-28.el7.x86_64.

Thank you for testing! Upstream commit 9c1f8f4493e8355d0e48f7d1eebdf86893ba082d would likely resolve this.

Can you check if this qemu build fixes the problem? (Check with all components from 7.3, including seabios.)

Created attachment 1225605: host crash screen

Fix included in seabios-1.10.1-2.el7

Following comment 0, I tried to reproduce this with qemu-kvm-rhev-2.6.0-25.el7.x86_64 and seabios-1.9.1-5.el7.x86_64, running 32 ping-pong migrations with your script, but could not reproduce it. How many times did you test before?

(In reply to FuXiangChun from comment #40)
The bug only happens while the guest OS is booting, so you need to migrate repeatedly during guest OS boot.

According to this result, set this bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:1855
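The comments above come down to which seabios build is installed: 1.7.5-11.el7 worked, 1.9.1-5.el7 reproduced the problem, and the fix is in 1.10.1-2.el7. A minimal sketch of checking a host against the fixed build follows. It assumes GNU `sort -V` is available and that its ordering is close enough to RPM's own version comparison for these particular strings, which is an approximation and not how rpm itself compares versions. The package name `seabios-bin` is taken from the comments; the function name `version_at_least` is hypothetical.

```shell
#!/bin/bash
# Build that carries the fix, per the "Fixed In Version" field of this bug.
FIXED_VERSION="1.10.1-2.el7"

# Succeeds when version $1 is >= version $2 under GNU sort's version ordering.
# Assumption: sort -V only approximates rpm's version comparison.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Query the installed package only when rpm is present on this machine.
if command -v rpm >/dev/null 2>&1; then
    if installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' seabios-bin 2>/dev/null); then
        if version_at_least "$installed" "$FIXED_VERSION"; then
            echo "seabios-bin $installed includes the fix"
        else
            echo "seabios-bin $installed predates $FIXED_VERSION"
        fi
    else
        echo "seabios-bin is not installed"
    fi
fi
```

Both the source and the target host would need checking, since the guest firmware image is loaded from the host where the guest was started.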
Created attachment 1203194: domain xml

Description of problem:
When a migration is performed during guest boot, the guest is sometimes paused on the target host after the migration completes.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-9.el7.x86_64
qemu-kvm-rhev-2.6.0-25.el7.x86_64

How reproducible:
20%

Steps to Reproduce:
1. Start a guest on the source host:
    # virsh start mig1

2. While the guest is booting, start the migration:
    # virsh migrate mig1 --live qemu+ssh://10.66.14.148/system --verbose
    Migration: [100 %]

3. Check the guest status on the target host:
    # virsh list
     Id    Name                           State
    ----------------------------------------------------
     3     mig1                           paused

4. Check the guest status through the QEMU monitor:
    # virsh qemu-monitor-command mig1 --hmp info status
    VM status: paused (internal-error)

5. The qemu log on the target host contains an error:
    # cat /var/log/libvirt/qemu/mig1.log
    KVM internal error. Suberror: 1
    emulation failure
    EAX=0000a0b5 EBX=ffffffff ECX=0002ffff EDX=000a0000
    ESI=ffffffff EDI=ffffffff EBP=ffffffff ESP=000a8000
    EIP=ffffffff EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
    ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
    CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
    SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
    DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
    FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
    GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
    LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
    TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
    GDT=     000f79b0 00000037
    IDT=     000f79ee 00000000
    CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
    DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
    DR6=00000000ffff0ff0 DR7=0000000000000400
    EFER=0000000000000000
    Code=5b 66 5e 66 c3 ea 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Actual results:
The guest on the target host is paused after the migration completes.

Expected results:
The guest on the target host is running after the migration completes.

Additional info:
1. Please see the guest xml in the attachment.
2. Please see the libvirt and qemu logs from both the source and target hosts in the attachment.
3. The stack trace of the paused guest:
    # gstack `pidof qemu-kvm`
    #2  0x00007f28f2f48f7e in call_rcu_thread ()
    #3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 7 (Thread 0x7f28cd662700 (LWP 1375)):
    #0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f28f2f3a669 in qemu_cond_wait ()
    #2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
    #3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 6 (Thread 0x7f28cce61700 (LWP 1376)):
    #0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f28f2f3a669 in qemu_cond_wait ()
    #2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
    #3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 5 (Thread 0x7f28cc660700 (LWP 1377)):
    #0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f28f2f3a669 in qemu_cond_wait ()
    #2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
    #3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 4 (Thread 0x7f28cbe5f700 (LWP 1378)):
    #0  0x00007f28da22a6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007f28f2f3a669 in qemu_cond_wait ()
    #2  0x00007f28f2ca45c3 in qemu_kvm_cpu_thread_fn ()
    #3  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #4  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 3 (Thread 0x7f2866fff700 (LWP 1380)):
    #0  0x00007f28d9f4adfd in poll () from /lib64/libc.so.6
    #1  0x00007f28dbc81327 in red_worker_main () from /lib64/libspice-server.so.1
    #2  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #3  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 2 (Thread 0x7f28667fe700 (LWP 1381)):
    #0  0x00007f28d9f4adfd in poll () from /lib64/libc.so.6
    #1  0x00007f28dbc81327 in red_worker_main () from /lib64/libspice-server.so.1
    #2  0x00007f28da226dc5 in start_thread () from /lib64/libpthread.so.0
    #3  0x00007f28d9f5573d in clone () from /lib64/libc.so.6
    Thread 1 (Thread 0x7f28f2a08d00 (LWP 1362)):
    #0  0x00007f28d9f4aebf in ppoll () from /lib64/libc.so.6
    #1  0x00007f28f2ea8009 in qemu_poll_ns ()
    #2  0x00007f28f2ea799c in main_loop_wait ()
    #3  0x00007f28f2c7570f in main ()

4. The script below reproduces the error easily:
    #!/bin/bash
    # Arguments: DOMAIN HOSTA HOSTB. The first migration uses the local virsh,
    # so the script is run on HOSTA while the guest is booting.
    DOMAIN="$1"
    HOSTA="$2"
    HOSTB="$3"

    i=0
    while [ "$i" -le 1024 ]
    do
        # Migrate from the local host (HOSTA) to HOSTB, then list guests there.
        virsh migrate "$DOMAIN" "qemu+ssh://$HOSTB/system" --live --verbose
        ssh "$HOSTB" virsh list
        sleep 5
        # Migrate back from HOSTB to HOSTA, then list guests locally.
        ssh "$HOSTB" virsh migrate "$DOMAIN" "qemu+ssh://$HOSTA/system" --live --verbose
        virsh list
        sleep 5
        i=$((i + 1))
    done
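The reproduction loop prints `virsh list` after every migration but does not stop when the guest ends up paused, so a failure can scroll past unnoticed. The sketch below is one possible way to pull the State column out of `virsh list` output so that a loop could break on `paused`. The helper name `domain_state` is hypothetical, and the sample text is the `virsh list` output from step 3 of this report; the parsing assumes the three-column Id/Name/State layout shown there.

```shell
#!/bin/bash
# Print the State column for the domain named $1, reading `virsh list`
# output on stdin. States with spaces (for example "shut off") are kept whole.
domain_state() {
    awk -v name="$1" '$2 == name {
        state = $3
        for (i = 4; i <= NF; i++) state = state " " $i
        print state
    }'
}

# Sample taken from step 3 of this report, used here instead of a live host.
sample=' Id    Name                           State
----------------------------------------------------
 3     mig1                           paused'

state=$(printf '%s\n' "$sample" | domain_state mig1)
if [ "$state" = "paused" ]; then
    echo "mig1 is paused"
fi
```

Inside the reproduction loop, the sample would be replaced by real output, for example `state=$(ssh "$HOSTB" virsh list | domain_state "$DOMAIN")`, followed by a `break` when the state is `paused` so the failing guest can be inspected before the next migration.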