Bug 950416
| Summary: | Function virStreamRecv() always get failed | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Luwen Su <lsu> | ||||
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 7.0 | CC: | acathrow, berrange, chhu, cwei, dallan, david.pravec, dyuan, gsun, mjenner, mzhan, vbudikov, weizhan, ydu, zhwang | ||||
| Target Milestone: | rc | Keywords: | Regression | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-1.1.1-10.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1026136 (view as bug list) | Environment: | |||||
| Last Closed: | 2014-06-13 12:57:28 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description
Luwen Su
2013-04-10 08:33:17 UTC
i would like to confirm this problem, screenshots fail for me in libvirt 1.0.4 -- worked well in 1.0.3 virsh list |grep 5 5 migt10 running virsh screenshot 5 error: could not receive data from domain 5 error: stream had I/O failure error: Reconnected to the hypervisor This was an awful one to track down :-) https://www.redhat.com/archives/libvir-list/2013-May/msg00805.html Fixed by commit a2214c5257d3bd7b086ce04aca1648e8ff05ee96, move to POST. thanks! tried making screenshot and it works. *** Bug 954588 has been marked as a duplicate of this bug. *** *** Bug 958687 has been marked as a duplicate of this bug. *** Reproduced with the packages: libvirt-1.0.4-1.el7.x86_64 qemu-kvm-1.4.0-2.el7.x86_64 Verified with the packages: libvirt-1.0.6-1.el7.x86_64 qemu-kvm-1.5.0-2.el7.x86_64 Test steps: 1. define, build and start a dir_pool 2. check there is a volume in the dir_pool #virsh vol-list dir_pool Name Path ----------------------------------------- rh7-qcow2.img /var/lib/libvirt/test/rh7-qcow2.img 3. run vol-download command: #virsh vol-download --pool dir_pool rh7-qcow2.img /tmp/vol-download.img 4. define, and start a guest 5. check the guest is running # virsh list Id Name State ---------------------------------------------------- 3 r7-qcow2 running 6. run screenshot command: # virsh screenshot r7-qcow2 /tmp/screenshot.save Screenshot saved to /tmp/screenshot.save, with type of image/x-portable-pixmap Test results: --current commands work well Hi osier
I met a issue that fail to download the file from the vol while the vol format was raw, since this issue was similiar with this bug, so can you help me have a look about this issue ,thanks
pkg info
qemu-kvm-1.5.3-3.el7.x86_64
libvirt-1.1.1-5.el7.x86_64
kernel-3.10.0-20.el7.x86_64
steps
1 # cat vol-test.xml
<volume>
<name>vol-test</name>
<key>/var/lib/libvirt/images/vol-test</key>
<source>
</source>
<capacity>10737418240</capacity>
<allocation>3179753472</allocation>
<target>
<path>/var/lib/libvirt/images/vol-test</path>
<format type='raw'/>
<permissions>
<mode>0644</mode>
<owner>36</owner>
<group>36</group>
<label>system_u:object_r:nfs_t</label>
</permissions>
</target>
</volume>
2. create the vol
# virsh vol-create default vol-test.xml
# qemu-img info /var/lib/libvirt/images/vol-test
image: /var/lib/libvirt/images/vol-test
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 2.0G
3.Download file from the vol, will report the following error
# virsh vol-download --vol vol-test --pool default --file /home/test1/a
error: cannot receive data from volume vol-test
error: internal error: client socket is closed
error: One or more references were leaked after disconnect from the hypervisor
error: Reconnected to the hypervisor
check the libvirtd.log
2013-09-18 06:24:43.622+0000: 30539: error : virNetMessageEncodePayloadRaw:432 : Stream data too long to send (16777192 bytes needed, 16777188 bytes available)
2013-09-18 06:24:43.623+0000: 30539: error : virFDStreamCloseInt:315 : internal error: I/O helper exited abnormally
4.it works well for qcow2 format.
(In reply to zhenfeng wang from comment #10) > Hi osier > I met a issue that fail to download the file from the vol while the vol > format was raw, since this issue was similiar with this bug, so can you help > me have a look about this issue ,thanks > > pkg info > qemu-kvm-1.5.3-3.el7.x86_64 > libvirt-1.1.1-5.el7.x86_64 > kernel-3.10.0-20.el7.x86_64 > > steps > > 1 # cat vol-test.xml > <volume> > <name>vol-test</name> > <key>/var/lib/libvirt/images/vol-test</key> > <source> > </source> > <capacity>10737418240</capacity> > <allocation>3179753472</allocation> > <target> > <path>/var/lib/libvirt/images/vol-test</path> > <format type='raw'/> > <permissions> > <mode>0644</mode> > <owner>36</owner> > <group>36</group> > <label>system_u:object_r:nfs_t</label> > </permissions> > </target> > </volume> > > 2. create the vol > # virsh vol-create default vol-test.xml > > # qemu-img info /var/lib/libvirt/images/vol-test > image: /var/lib/libvirt/images/vol-test > file format: raw > virtual size: 10G (10737418240 bytes) > disk size: 2.0G > > 3.Download file from the vol, will report the following error > # virsh vol-download --vol vol-test --pool default --file /home/test1/a > error: cannot receive data from volume vol-test > error: internal error: client socket is closed > error: One or more references were leaked after disconnect from the > hypervisor > error: Reconnected to the hypervisor > > check the libvirtd.log > 2013-09-18 06:24:43.622+0000: 30539: error : > virNetMessageEncodePayloadRaw:432 : Stream data too long to send (16777192 > bytes needed, 16777188 bytes available) It exceeds the maximum message size of RPC. You can download the whole volume in several steps with "--offset" and "--length". (In reply to Osier Yang from comment #11) > (In reply to zhenfeng wang from comment #10) > > > > 3.Download file from the vol, will report the following error > > # virsh vol-download --vol vol-test --pool default --file /home/test1/a > > error: cannot receive data from volume vol-test > > error: internal error: client socket is closed > > error: One or more references were leaked after disconnect from the > > hypervisor > > error: Reconnected to the hypervisor > > > > check the libvirtd.log > > 2013-09-18 06:24:43.622+0000: 30539: error : > > virNetMessageEncodePayloadRaw:432 : Stream data too long to send (16777192 > > bytes needed, 16777188 bytes available) > > It exceeds the maximum message size of RPC. You can download the whole > volume in several steps with "--offset" and "--length". No, that is absolutely wrong. The whole point of the stream APIs is that the application is not constrained by the message size. The data is supposed to be automatically sent as a series of packets, each less than the message size. If this is not happening, we've broken something in the stream code. The steam length problem is a regression caused by
commit e914dcfdaa8b382bc4831b622d9686ade924520f
Author: Daniel Hansel <daniel.hansel.ibm.com>
Date: Tue May 7 13:22:00 2013 +0200
rpc: message related sizes enlarged
It added two checks when enlarging the message buffer
if ((msg->bufferLength - VIR_NET_MESSAGE_LEN_MAX) * 4 > VIR_NET_MESSAGE_MAX) {
virReportError(VIR_ERR_RPC, "%s", _("Unable to encode message payload"));
goto error;
}
if ((msg->bufferOffset + len) > VIR_NET_MESSAGE_MAX) {
virReportError(VIR_ERR_RPC,
_("Stream data too long to send (%zu bytes needed, %zu bytes available)"),
len, (VIR_NET_MESSAGE_MAX - msg->bufferOffset));
return -1;
}
Both of these checks are wrong, because VIR_NET_MESSAGE_MAX does not include the 4 byte length packet (VIR_NET_MESSAGE_LEN_MAX).
Hence we have an off-by-4 in the stream packet calculation
This patch doesn't fix the broken size check, but it avoids triggering it https://www.redhat.com/archives/libvir-list/2013-September/msg01769.html (In reply to Daniel Berrange from comment #14) > This patch doesn't fix the broken size check, but it avoids triggering it > > https://www.redhat.com/archives/libvir-list/2013-September/msg01769.html I'm assuming the broken size check should also be fixed, is that right? Yes, but the broken size check is almost impossible to trigger now. Test with:
libvirt-1.1.1-10.el7.x86_64
qemu-kvm-1.5.3-10.el7.x86_64
The comments 10's steps are okay for me , also test the raw img.
But , the vol-download's speed will down to too slow when the image large enough (1G in my environment)
#time virsh vol-download --vol vol-test.img --pool default --file /home/test1/a
real 1m56.311s
user 1m24.267s
sys 0m2.743s
And the 500M's volumes time:
real 0m19.033s
virsh # vol-dumpxml vol-test.img --pool default
<volume>
<name>vol-test.img</name>
<key>/var/lib/libvirt/images/vol-test.img</key>
<source>
</source>
<capacity unit='bytes'>1048576000</capacity>
<allocation unit='bytes'>1048580096</allocation>
<target>
<path>/var/lib/libvirt/images/vol-test.img</path>
<format type='raw'/>
<permissions>
<mode>0600</mode>
<owner>0</owner>
<group>0</group>
<label>system_u:object_r:virt_image_t:s0</label>
</permissions>
<timestamps>
<atime>1382683904.221608140</atime>
<mtime>1382683904.144609566</mtime>
<ctime>1382683904.145609548</ctime>
</timestamps>
</target>
</volume>
ll /home/test1/
total 63484
-rw-r--r--. 1 root root 65005760 Oct 25 14:53 a
# ll /home/test1/a
-rw-r--r--. 1 root root 133943320 Oct 25 14:53 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 152029600 Oct 25 14:53 /home/test1/a
.........
-rw-r--r--. 1 root root 853136008 Oct 25 14:54 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 853987976 Oct 25 14:54 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 854643336 Oct 25 14:54 /home/test1/a
Is it related libvirt ?
BTW , also reproduced it on rhel6.
1.
And in the second time , the speed can be accepted if the target is same .
# time virsh vol-download --vol test.img --pool default --file /home/a
real 3m38.059s
user 3m32.044s
sys 0m2.489s
# time virsh vol-download --vol test.img --pool default --file /home/a
real 0m10.085s
user 0m1.719s
sys 0m2.323s
# time virsh vol-download --vol test.img --pool default --file /home/b ;;cancel
^C
real 1m49.476s
user 1m45.474s
sys 0m1.921s
2.
During the process , virsh will use almost 100% CPU , watched via top
KiB Mem: 7364840 total, 4982532 used, 2382308 free, 126120 buffers
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 28663 root 20 0 628020 312468 4784 R 98.3 4.2 0:48.99 virsh
/*HERE*/
(In reply to time.su from comment #19) > Test with: > libvirt-1.1.1-10.el7.x86_64 > qemu-kvm-1.5.3-10.el7.x86_64 > > > The comments 10's steps are okay for me , also test the raw img. > But , the vol-download's speed will down to too slow when the image large > enough (1G in my environment) > > #time virsh vol-download --vol vol-test.img --pool default --file > /home/test1/a > > real 1m56.311s > user 1m24.267s > sys 0m2.743s > > And the 500M's volumes time: > real 0m19.033s > > virsh # vol-dumpxml vol-test.img --pool default > <volume> > <name>vol-test.img</name> > <key>/var/lib/libvirt/images/vol-test.img</key> > <source> > </source> > <capacity unit='bytes'>1048576000</capacity> > <allocation unit='bytes'>1048580096</allocation> > <target> > <path>/var/lib/libvirt/images/vol-test.img</path> > <format type='raw'/> > <permissions> > <mode>0600</mode> > <owner>0</owner> > <group>0</group> > <label>system_u:object_r:virt_image_t:s0</label> > </permissions> > <timestamps> > <atime>1382683904.221608140</atime> > <mtime>1382683904.144609566</mtime> > <ctime>1382683904.145609548</ctime> > </timestamps> > </target> > </volume> > > > ll /home/test1/ > total 63484 > -rw-r--r--. 1 root root 65005760 Oct 25 14:53 a > # ll /home/test1/a > -rw-r--r--. 1 root root 133943320 Oct 25 14:53 /home/test1/a > # ll /home/test1/a > -rw-r--r--. 1 root root 152029600 Oct 25 14:53 /home/test1/a > ......... > -rw-r--r--. 1 root root 853136008 Oct 25 14:54 /home/test1/a > # ll /home/test1/a > -rw-r--r--. 1 root root 853987976 Oct 25 14:54 /home/test1/a > # ll /home/test1/a > -rw-r--r--. 1 root root 854643336 Oct 25 14:54 /home/test1/a > > > > Is it related libvirt ? I think it's not related with libvirt, but I also have no idea why it becomes slow in the end of the transfering (when we talked f2f, time said that for a 1G volume, it becomes slow about at the point of 800M, before that it just looks fine), Dan, I think you are the more appropriate person for this question? Since the patches which fixed the bug not related the speed down issue , i verified this bug per comment 19 , and will open a new one to track the new problem. Hi DB && Osier, When I test it on perl-Sys-Virt with script #!/usr/bin/perl use warnings; use strict; use Sys::Virt; my $uri = "qemu:///system"; my $domname = "sendtest"; my $con = Sys::Virt->new(address => $uri, readonly => 0); my $dom = $con->get_domain_by_name($domname); my $st = $con->new_stream(); $dom->screenshot($st, 0); $st->finish(); It will report error 2013-11-07 09:38:45.373+0000: 907: error : virFDStreamCloseInt:315 : internal error: I/O helper exited abnormally 2013-11-07 09:38:45.374+0000: 907: error : virFDStreamUpdateCallback:133 : internal error: stream is not open Is it related to this bug? Created attachment 820974 [details]
libvirtd.log in level 1
(In reply to weizhang from comment #23) > Hi DB && Osier, > > When I test it on perl-Sys-Virt with script > #!/usr/bin/perl > use warnings; > use strict; > use Sys::Virt; > > my $uri = "qemu:///system"; > my $domname = "sendtest"; > my $con = Sys::Virt->new(address => $uri, readonly => 0); > my $dom = $con->get_domain_by_name($domname); > my $st = $con->new_stream(); > $dom->screenshot($st, 0); > $st->finish(); > > It will report error > 2013-11-07 09:38:45.373+0000: 907: error : virFDStreamCloseInt:315 : > internal error: I/O helper exited abnormally > 2013-11-07 09:38:45.374+0000: 907: error : virFDStreamUpdateCallback:133 : > internal error: stream is not open > > Is it related to this bug? I don't think it's related with this bug, this bug is about the RPC payload length problem, however, the problem above says the I/O helper (libvirt_iohelper) process exited abnormally, it's complete too different cases. please file new bug with more info, if you can reproduce the problem with C API, file against libvirt, otherwise, file against perl-Sys-Virt. Thanks, file a new bug 1029726 This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request. |