| Summary: | [libvirt] When migrating domain libvirt start logging the error "virNetSocketReadWire:826 : Cannot recv data: Input/output error" ~8000 times a sec | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | David Naori <dnaori> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Priority: | urgent |
| Version: | 6.1 | CC: | berrange, dallan, danken, dnaori, dyuan, gren, hateya, mgoldboi, mzhan, nzhang, rwu, veillard, weizhan, ykaul |
| Target Milestone: | rc | Keywords: | Regression |
| Hardware: | Unspecified | OS: | Linux |
| Fixed In Version: | libvirt-0.9.3-5.el6 | Doc Type: | Bug Fix |
| Last Closed: | 2011-12-06 11:16:33 UTC | | |
Created attachment 512909
libvirtd log on the destination machine
I can reproduce the problem. It only occurs if performing peer-2-peer migration using a TLS enabled URI for the migration function. There are two fixes we need, one fixes it server side, the other client side:
commit 3cfdc57b8553cae95b8849bbcb7a4b227085cec1
Author: Daniel P. Berrange <berrange>
Date: Fri Jul 8 12:54:29 2011 +0100
Fix sending of reply to final RPC message
The dispatch for the CLOSE RPC call was invoking the method
virNetServerClientClose(). This caused the client connection
to be immediately terminated. This meant the reply to the
final RPC message was never sent. Prior to the RPC rewrite
we merely flagged the connection for closing, and actually
closed it when the next RPC call dispatch had completed.
* daemon/remote.c: Flag connection for a delayed close
* daemon/stream.c: Update to use new API for closing
failed connection
* src/rpc/virnetserverclient.c, src/rpc/virnetserverclient.h:
Add support for a delayed connection close. Rename the
virNetServerClientMarkClose method to virNetServerClientImmediateClose
to clarify its semantics
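The difference between an immediate and a delayed close described in this commit can be illustrated with a small model. This is only a sketch in Python, not libvirt's C code: the `Client` class, its methods, and `dispatch_close` are hypothetical names made up for illustration, and the real logic lives in daemon/remote.c and src/rpc/virnetserverclient.c.

```python
class Client:
    """Toy stand-in for a server-side client connection (hypothetical, not libvirt's API)."""

    def __init__(self):
        self.sent = []           # replies that actually reached the peer
        self.closed = False
        self.want_close = False  # set when a close has been requested but deferred

    def immediate_close(self):
        # Tear the connection down right away; anything not yet sent is lost.
        self.closed = True

    def delayed_close(self):
        # Only record the intent; the connection stays usable for the pending reply.
        self.want_close = True

    def send_reply(self, reply):
        if self.closed:
            return False         # the reply is dropped, the peer never sees it
        self.sent.append(reply)
        if self.want_close:
            self.closed = True   # honour the deferred close once the reply is out
        return True


def dispatch_close(client, use_delayed_close):
    """Handle a CLOSE call, then try to send its reply."""
    if use_delayed_close:
        client.delayed_close()
    else:
        client.immediate_close()
    return client.send_reply("CLOSE-OK")
```

With `use_delayed_close=False` the reply is never delivered, which corresponds to the behaviour the commit message describes; with `use_delayed_close=True` the reply goes out first and the connection is closed afterwards.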
commit afe8839f011c8c54c429f33ca0e6515fceb4e0fd
Author: Daniel P. Berrange <berrange>
Date: Fri Jul 8 12:41:06 2011 +0100
Fix leak of remote driver if final 'CLOSE' RPC call fails
When closing a remote connection we issue a (fairly pointless)
'CLOSE' RPC call to the daemon. If this fails we skip all the
cleanup of private data, but the virConnectPtr object still
gets released as normal. This causes a memory leak. Since the
CLOSE RPC call is pretty pointless, just carry on freeing the
remote driver if it fails.
* src/remote/remote_driver.c: Ignore failure to issue CLOSE
RPC call
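The client-side change can be pictured the same way: a failed CLOSE call should not prevent the cleanup from running. The following is a minimal Python sketch under that reading of the commit message; `RemoteDriver`, `close_old`, and `close_fixed` are hypothetical names, and the actual fix is in src/remote/remote_driver.c.

```python
class RemoteDriver:
    """Toy model of a remote driver holding private data (hypothetical names)."""

    def __init__(self, close_call_fails):
        self.close_call_fails = close_call_fails
        self.private_data_freed = False

    def _call_close_rpc(self):
        # Stand-in for issuing the final CLOSE RPC call to the daemon.
        if self.close_call_fails:
            raise ConnectionError("CLOSE RPC call failed")

    def close_old(self):
        # Old behaviour: a failing CLOSE call aborts before the cleanup runs.
        self._call_close_rpc()
        self.private_data_freed = True

    def close_fixed(self):
        # Fixed behaviour: ignore the failure and carry on freeing the driver.
        try:
            self._call_close_rpc()
        except ConnectionError:
            pass
        self.private_data_freed = True
```

In this model `close_old()` leaves `private_data_freed` at `False` when the call fails, which is the leak; `close_fixed()` always releases the private data.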
This additional patch would prevent the infinite loop from ever recurring in the event of similar bugs:
http://www.redhat.com/archives/libvir-list/2011-July/msg00946.html

Verification passed on:
kernel-2.6.32-166.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
libvirt-0.9.3-5.el6.x86_64
Steps:
1. Prepare a TLS-enabled URI environment.
2. Prepare an NFS server and mount the NFS share on both hosts.
3. Do a p2p migration using the TLS-enabled URI:
virsh migrate --live --p2p domain qemu+tls://{target ip}/system
4. Check /var/log/libvirt/libvirtd.log.
Result: there is no Input/output error and the migration succeeds.
The bug can be reproduced on libvirt-0.9.3-3.el6.x86_64.
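The log check in step 4 could be scripted rather than done by eye. Below is a small Python sketch that counts the error lines per second; the function name `recv_errors_per_second` is made up for illustration, and it assumes each log line starts with a timestamp whose fractional part follows the first `.`, as in the excerpts quoted in this report.

```python
from collections import Counter

ERROR_TEXT = "Cannot recv data: Input/output error"


def recv_errors_per_second(lines):
    """Count matching error lines, grouped by the timestamp truncated to seconds."""
    counts = Counter()
    for line in lines:
        if ERROR_TEXT in line:
            # Everything before the first '.' is the timestamp down to the second.
            second = line.split(".", 1)[0].strip()
            counts[second] += 1
    return counts
```

One way to use it would be `recv_errors_per_second(open("/var/log/libvirt/libvirtd.log"))`: an empty result matches the verified behaviour, while thousands of hits for a single second match the reported bug.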
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2011-1513.html
Created attachment 512908
source-log

Description of problem:
When migrating a domain, the source machine starts logging the errors below repeatedly, forever. Every migration adds ~8000 lines a second; when several VMs are migrated, the logs are flooded and fill up the disk.

16:37:48.141: 14332: debug : qemuMonitorEmitStop:897 : mon=0x7f8a80166100
16:37:48.141: 14332: debug : qemuProcessHandleStop:471 : Transitioned guest TOEXPORT-06 to paused state due to unknown event
16:37:48.141: 14332: debug : qemuProcessHandleStop:481 : Preserving lock state '(null)'
16:37:48.143: 14332: debug : virDomainFree:2092 : dom=0x1f76360, (VM: name=TOEXPORT-06, uuid=cb206c72-7d64-473d-b784-25e63b7fd055),
16:37:52.439: 14332: debug : qemuMonitorFree:209 : mon=0x7f8a80166100
16:37:54.356: 14332: debug : virDomainFree:2092 : dom=0x1f748e0, (VM: name=TOEXPORT-06, uuid=cb206c72-7d64-473d-b784-25e63b7fd055),
16:37:54.364: 14332: error : virNetSocketReadWire:826 : Cannot recv data: Input/output error
16:37:54.364: 14332: error : virNetSocketReadWire:826 : Cannot recv data: Input/output error

Version-Release number of selected component (if applicable):
libvirt-0.9.3-3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Migrate a domain and watch the logs.

Actual results:
No disk space left on the / directory.

Expected results:

Additional info:
Added the relevant part of the log from the source and destination hosts.