Bug 632557
Summary: | Migration with STRESS caused guest hang
---|---
Product: | Red Hat Enterprise Linux 6
Reporter: | Keqin Hong <khong>
Component: | qemu-kvm
Assignee: | Juan Quintela <quintela>
Status: | CLOSED DUPLICATE
QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium
Priority: | low
Version: | 6.0
CC: | bcao, llim, michen, mkenneth, plyons, rwu, tburke, virt-maint
Target Milestone: | rc
Keywords: | RHELNAK
Target Release: | 6.1
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2011-02-04 12:47:23 UTC
Bug Blocks: | 580951
Description
Keqin Hong
2010-09-10 11:17:11 UTC
Thank you for your bug report. This issue was evaluated for inclusion in the current release of Red Hat Enterprise Linux. Unfortunately, we are unable to address this request in the current release. Because we are in the final stage of Red Hat Enterprise Linux 6 development, only significant, release-blocking issues involving serious regressions and data corruption can be considered.

If you believe this issue meets the release blocking criteria as defined and communicated to you by your Red Hat Support representative, please ask your representative to file this issue as a blocker for the current release. Otherwise, ask that it be evaluated for inclusion in the next minor release of Red Hat Enterprise Linux.

Created attachment 446491
kvm_stat log
What's the output of 5 consecutive "info migrate" commands at qemu monitor console, when the migration is stalled?

(qemu) info migrate
Migration status: active
transferred ram: 10860216 kbytes
remaining ram: 3079744 kbytes
total ram: 8405440 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 11212244 kbytes
remaining ram: 3058088 kbytes
total ram: 8405440 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 11686008 kbytes
remaining ram: 2987196 kbytes
total ram: 8405440 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 12123908 kbytes
remaining ram: 3125364 kbytes
total ram: 8405440 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 12189708 kbytes
remaining ram: 3125872 kbytes
total ram: 8405440 kbytes

Ok, so on the migration side, it does really seem that the reason is that we're dirtying pages faster than we transfer, and not some other mystical reason. Still don't have a theory on why it hangs after it is finished.

Would be good to rule out the dirty pages as a driver for this. Can you try migrating again, but now issuing, right before migration:

(qemu) migrate_set_speed 4G

This should transfer the pages, regardless of the memory pressure we're seeing...

I tried with
(qemu) migrate_set_speed 4G
right before migration. Migration first succeeded from A to B with no problem, but triggered guest hang after migration from B to A.

(qemu) migrate_set_speed 4G
(qemu) migrate -d tcp:10.66.86.26:5831
(qemu) info migrate
Migration status: active
transferred ram: 3670644 kbytes
remaining ram: 4734932 kbytes
total ram: 8405440 kbytes
(qemu) info migrate
Migration status: completed

Ok, let me get it straight:

You do set_speed from A -> B, and it works
You *DO NOT* do set_speed from B -> A, and then it hangs.

Is that correct?

(In reply to comment #8)
> Ok, let me get it straight:
>
> You do set_speed from A -> B, and it works
> You *DO NOT* do set_speed from B -> A, and then it hangs.
>
> Is that correct?
No, I did set_speed for both before migration.
From A -> B, migration finished, and the guest continued to work.
From B -> A, migration also completed, but the guest hung.

It might just show that under high memory stress, migration can still cause the guest to hang even with set_speed; it is just not 100% reproducible. Thanks.

This is a very good test case for live migration!

When the guest hangs, is there a message?
Can you still see the screen with vnc?
Without live migration, does a guest running stress ever hang?

(In reply to comment #10)
> This is a very good test case for live migration!
>
> When the guest hangs, is there a message?
No message that I observed.
> Can you still see the screen with vnc?
Yes, I can. But the guest hung: no network, and no mouse/keyboard input allowed.
> Without live migration, does a guest running stress ever hang?
No, it won't.

*** This bug has been marked as a duplicate of bug 643970 ***
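
For reference, the "dirtying pages faster than we transfer" reading of the five stalled "info migrate" samples earlier in this report can be checked with a small script. This is only a rough sketch and not part of qemu-kvm: the function names (parse_samples, summarize, is_converging) are made up for illustration, and the estimate "dirtied = sent + change in remaining" is an assumption that ignores pages dirtied more than once between samples.

```python
import re

# The five samples pasted in this report while the migration was stalled.
SAMPLE_OUTPUT = """
Migration status: active
transferred ram: 10860216 kbytes
remaining ram: 3079744 kbytes
total ram: 8405440 kbytes
Migration status: active
transferred ram: 11212244 kbytes
remaining ram: 3058088 kbytes
total ram: 8405440 kbytes
Migration status: active
transferred ram: 11686008 kbytes
remaining ram: 2987196 kbytes
total ram: 8405440 kbytes
Migration status: active
transferred ram: 12123908 kbytes
remaining ram: 3125364 kbytes
total ram: 8405440 kbytes
Migration status: active
transferred ram: 12189708 kbytes
remaining ram: 3125872 kbytes
total ram: 8405440 kbytes
"""


def parse_samples(text):
    """Return one (transferred_kb, remaining_kb) tuple per 'info migrate' sample."""
    transferred = [int(v) for v in re.findall(r"transferred ram:\s*(\d+)\s*kbytes", text)]
    remaining = [int(v) for v in re.findall(r"remaining ram:\s*(\d+)\s*kbytes", text)]
    return list(zip(transferred, remaining))


def summarize(samples):
    """For each consecutive pair of samples, return
    (kbytes sent, change in remaining kbytes, estimated kbytes newly dirtied).

    Assumption: remaining_new ~= remaining_old - sent + dirtied,
    so dirtied ~= sent + (remaining_new - remaining_old).
    """
    rows = []
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        sent = t1 - t0
        remaining_delta = r1 - r0
        rows.append((sent, remaining_delta, sent + remaining_delta))
    return rows


def is_converging(samples):
    """True if remaining RAM shrank between the first and the last sample."""
    return samples[-1][1] < samples[0][1]


if __name__ == "__main__":
    samples = parse_samples(SAMPLE_OUTPUT)
    for sent, delta, dirtied in summarize(samples):
        print(f"sent={sent} kB  remaining_delta={delta:+d} kB  est_dirtied={dirtied} kB")
    print("converging:", is_converging(samples))
```

On the samples above, remaining RAM is higher in the last sample than in the first even though transferred RAM already exceeds total RAM, which is consistent with the non-convergence diagnosis given in the comments. It says nothing about why the guest hangs after migration completes.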