Bug 1146007
| Summary: | Input/output error during conversion of esx guest. | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | tingting zheng <tzheng> |
| Component: | libguestfs | Assignee: | Richard W.M. Jones <rjones> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.1 | CC: | dyuan, jbuchta, juzhou, mbooth, mzhan, ptoscano, rjones |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | V2V | | |
| Fixed In Version: | libguestfs-1.28.1-1.41.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1235812 | Environment: | |
| Last Closed: | 2015-11-19 06:57:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1235812, 1235813 | | |
| Attachments: | | | |

My tests were not conclusive. I think the ESX server is just too far away (physically) for me to be able to test this. I was mainly just seeing extremely slow data transfers and timeouts.

*** Bug 1194338 has been marked as a duplicate of this bug. ***

Created attachment 1043238: virt-v2v.log.gz

I managed to reproduce the error, using the server at 10.66.111.25.
I enabled curl debugging in qemu to see if I could get more details
when the problem happens.
Eventually after running the test several times, I did capture a
full debug output from the bug. This is attached. The relevant
part is very close to the end:
CURL: Just reading 16344 bytes
CURL: Just reading 16344 byt* Operation timed out after 599821 milliseconds with
12403968 out of 69206016 bytes received
#1
* Closing connection 1
qemu-img: error while reading sector 20344832: Input/output error
es
CURL: Just reading 16344 bytes
CURL: Just reading 16344 bytes
The "#1" in the output comes from a debug message which I
manually added to qemu so I could tell the difference between the
two places where block/curl.c returns EIO:
@@ -305,6 +305,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
continue;
}
+ fprintf (stderr, "#1\n");
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
state->acb[i] = NULL;
@@ -653,6 +654,7 @@ static void curl_readv_bh_cb(void *p)
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
if (!state) {
+ fprintf (stderr, "#2\n");
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
return;
This is possibly a qemu bug, but I'm now in a good place to
investigate this further.
(This is all with upstream virt-v2v and qemu from git, but it appears
to be the same bug)
A few things to note:
(1) It's a timeout issue. In virt-v2v we set the timeout to 600 seconds,
and in the qemu driver the timeout is firing after 599,821 milliseconds,
which is as close as makes no difference to 600 seconds. Therefore a
simple virt-v2v patch to lengthen the timeout should fix it:
--- a/v2v/input_libvirt_vcenter_https.ml
+++ b/v2v/input_libvirt_vcenter_https.ml
@@ -250,7 +250,8 @@ let map_source_to_uri ?readahead password uri scheme server path =
let json_params = [
"file.driver", JSON.String "https";
"file.url", JSON.String url;
- "file.timeout", JSON.Int 600;
+ (* https://bugzilla.redhat.com/show_bug.cgi?id=1146007#c10 *)
+ "file.timeout", JSON.Int 2000;
] in
let json_params =
(2) qemu loses the real underlying error, and replaces it with EIO,
hence the user sees just "Input/output error" which is wrong and
confusing. Therefore we can change qemu so it doesn't lose the real
error.
Unfortunately the options for this are not very good. We could
map the CURLE_* errors to errno, and/or call error_report. What we
cannot do is call error_setg (because the infrastructure doesn't
exist in qemu yet).
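
As a rough illustration of the first option in point (2), mapping CURLE_* codes to errno, a minimal sketch in C might look like the following. It is not the qemu patch and does not reflect how block/curl.c is actually structured. The helper name `sketch_curl_code_to_errno` and the `SKETCH_CURLE_*` constants are hypothetical; the constants are local stand-ins that mirror the numeric values of the corresponding libcurl `CURLE_*` codes so the sketch compiles without `<curl/curl.h>`. Which errno each code should map to is also an assumption made for illustration.

```c
#include <errno.h>

/* Local stand-ins for a few libcurl CURLcode values (assumption: real code
 * would include <curl/curl.h> and use the CURLE_* names directly). */
enum sketch_curl_code {
    SKETCH_CURLE_OK = 0,
    SKETCH_CURLE_OUT_OF_MEMORY = 27,
    SKETCH_CURLE_OPERATION_TIMEDOUT = 28
};

/* Hypothetical helper: translate a curl result code into a negative errno
 * value. Anything not listed keeps today's behaviour and becomes -EIO. */
int sketch_curl_code_to_errno(int code)
{
    switch (code) {
    case SKETCH_CURLE_OK:
        return 0;
    case SKETCH_CURLE_OPERATION_TIMEDOUT:
        /* The case seen in this bug: report the timeout as a timeout. */
        return -ETIMEDOUT;
    case SKETCH_CURLE_OUT_OF_MEMORY:
        return -ENOMEM;
    default:
        return -EIO;
    }
}
```

With a mapping along these lines, the completion callback could be passed the translated value instead of a hard-coded -EIO, so a timeout would be reported to the user as a timeout error rather than as "Input/output error".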
libguestfs patch posted: https://www.redhat.com/archives/libguestfs/2015-June/msg00278.html

I'll just note what I actually fixed here: The bug happens because we exceed a 10 minute timeout during the copy phase. I have increased the timeout to make it much larger. However, if we still hit the larger timeout, you'll still see the strange "Input/output error". The timeout error message is lost, but that's a separate fix in qemu (bug 1235812, bug 1235813).

Tried to verify this bug with the following packages:
libvirt-1.2.17-2.el7.x86_64
libguestfs-1.28.1-1.47.el7.x86_64
virt-v2v-1.28.1-1.47.el7.x86_64
qemu-kvm-rhev-2.3.0-13.el7.x86_64
steps:
# virt-v2v -ic vpx://root.4.103/tzheng-demo/10.66.106.63/?no_verify=1 --password-file /tmp/passwd2 esx5.5-win7-x86_64
[ 0.0] Opening the source -i libvirt -ic vpx://root.4.103/tzheng-demo/10.66.106.63/?no_verify=1 esx5.5-win7-x86_64
[ 25.0] Creating an overlay to protect the source from being modified
[ 26.0] Opening the overlay
[ 39.0] Initializing the target -o libvirt -os default
[ 39.0] Inspecting the overlay
[ 94.0] Checking for sufficient free disk space in the guest
[ 94.0] Estimating space required on target for each disk
[ 94.0] Converting Windows 7 Ultimate to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 108.0] Mapping filesystem data to avoid copying unused and blank areas
[ 110.0] Closing the overlay
[ 110.0] Checking if the guest needs BIOS or UEFI to boot
[ 110.0] Copying disk 1/1 to /var/lib/libvirt/images/esx5.5-win7-x86_64-sda (raw)
(100.00/100%)
[ 864.0] Creating output metadata
Pool default refreshed
Domain esx5.5-win7-x86_64 defined from /tmp/v2vlibvirt9ac9d6.xml
[ 865.0] Finishing off
Result: The conversion finished without error, and the guest boots. Since acceptance testing has also passed for this virt-v2v version, moving this bug from ON_QA to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2183.html

Created attachment 940729: Detailed log file

Description
Input/output error during conversion of esx guest.

Version:
virt-v2v-1.27.53-1.1.el7.x86_64
libguestfs-winsupport-7.1-4.el7.x86_64
libguestfs-1.27.53-1.1.el7.x86_64
virtio-win-1.7.2-1.el7.noarch

How reproducible:
50%

Steps to Reproduce:
1. Use virt-v2v -v -x to convert a guest running on an ESX server.
# virt-v2v -ic vpx://administrator.7.125/tzheng-test/10.66.71.84/?no_verify=1 esx5.1-rhel6.4-x86_64
[ 0.0] Opening the source -i libvirt -ic vpx://administrator.7.125/tzheng-test/10.66.71.84/?no_verify=1 esx5.1-rhel6.4-x86_64
Enter administrator's password for 10.66.7.125:
Enter host password for user 'administrator':
[ 56.0] Creating an overlay to protect the source from being modified
[ 57.0] Opening the overlay
[ 72.0] Initializing the target -o libvirt -os default
[ 72.0] Inspecting the overlay
[ 195.0] Checking for sufficient free disk space in the guest
[ 195.0] Estimating space required on target for each disk
[ 195.0] Converting Red Hat Enterprise Linux Server release 6.4 (Santiago) to run on KVM
This guest has virtio drivers installed.
[1198.0] Mapping filesystem data to avoid copying unused and blank areas
[1203.0] Closing the overlay
[1204.0] Copying disk 1/1 to /var/lib/libvirt/images/esx5.1-rhel6.4-x86_64-sda (raw)
qemu-img: error while reading sector 378880: Input/output error
virt-v2v: error: qemu-img command failed, see earlier errors
If reporting bugs, run virt-v2v with debugging enabled and include the
complete output:
virt-v2v -v -x [...]

Actual results:
As described.

Expected results:
No I/O error occurs during conversion of ESX guests.

Additional info:
I cannot reproduce this bug every time, and each time I do reproduce it, it fails at a different sector. The log file is attached.