Bug 1146007

Summary: Input/output error during conversion of esx guest.
Product: Red Hat Enterprise Linux 7 Reporter: tingting zheng <tzheng>
Component: libguestfs Assignee: Richard W.M. Jones <rjones>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1 CC: dyuan, jbuchta, juzhou, mbooth, mzhan, ptoscano, rjones
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: V2V
Fixed In Version: libguestfs-1.28.1-1.41.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1235812 (view as bug list) Environment:
Last Closed: 2015-11-19 06:57:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1235812, 1235813    
Attachments:
Description          Flags
Detailed log file    none
virt-v2v.log.gz      none

Description tingting zheng 2014-09-24 09:43:46 UTC
Created attachment 940729 [details]
Detailed log file

Description
Input/output error during conversion of esx guest.

Version:
virt-v2v-1.27.53-1.1.el7.x86_64
libguestfs-winsupport-7.1-4.el7.x86_64
libguestfs-1.27.53-1.1.el7.x86_64
virtio-win-1.7.2-1.el7.noarch

How reproducible:
50%

Steps to Reproduce:
1. Use virt-v2v -v -x to convert a guest running on an ESX server.
# virt-v2v -ic vpx://administrator@10.66.7.125/tzheng-test/10.66.71.84/?no_verify=1 esx5.1-rhel6.4-x86_64
[   0.0] Opening the source -i libvirt -ic vpx://administrator@10.66.7.125/tzheng-test/10.66.71.84/?no_verify=1 esx5.1-rhel6.4-x86_64
Enter administrator's password for 10.66.7.125: 
Enter host password for user 'administrator':
[  56.0] Creating an overlay to protect the source from being modified
[  57.0] Opening the overlay
[  72.0] Initializing the target -o libvirt -os default
[  72.0] Inspecting the overlay
[ 195.0] Checking for sufficient free disk space in the guest
[ 195.0] Estimating space required on target for each disk
[ 195.0] Converting Red Hat Enterprise Linux Server release 6.4 (Santiago) to run on KVM
This guest has virtio drivers installed.
[1198.0] Mapping filesystem data to avoid copying unused and blank areas
[1203.0] Closing the overlay
[1204.0] Copying disk 1/1 to /var/lib/libvirt/images/esx5.1-rhel6.4-x86_64-sda (raw)
qemu-img: error while reading sector 378880: Input/output error

virt-v2v: error: qemu-img command failed, see earlier errors

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

Actual results:
As described.

Expected results:
No I/O error occurs during conversion of ESX guests.

Additional info:
I cannot reproduce this bug every time, and each time I do reproduce it, it fails at a different sector.
The log file is attached.

Comment 4 Richard W.M. Jones 2014-09-25 16:39:14 UTC
My tests were not conclusive.  I think the ESX server is
just too far away (physically) for me to be able to test this.
I was mainly just seeing extremely slow data transfers and
timeouts.

Comment 5 Richard W.M. Jones 2015-02-24 11:43:20 UTC
*** Bug 1194338 has been marked as a duplicate of this bug. ***

Comment 9 Richard W.M. Jones 2015-06-25 18:45:33 UTC
Created attachment 1043238 [details]
virt-v2v.log.gz

I managed to reproduce the error, using the server at 10.66.111.25.
I enabled curl debugging in qemu to see if I could get more details
when the problem happens.

Eventually after running the test several times, I did capture a
full debug output from the bug.  This is attached.  The relevant
part is very close to the end:

  CURL: Just reading 16344 bytes
  CURL: Just reading 16344 byt* Operation timed out after 599821 milliseconds with
 12403968 out of 69206016 bytes received
  #1
  * Closing connection 1
  qemu-img: error while reading sector 20344832: Input/output error
  es
  CURL: Just reading 16344 bytes
  CURL: Just reading 16344 bytes

The "#1" in the output comes from a debug message which I
manually added to qemu so I could tell the difference between the
two places where block/curl.c returns EIO:

  @@ -305,6 +305,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
                         continue;
                     }
 
  +                  fprintf (stderr, "#1\n");
                     acb->common.cb(acb->common.opaque, -EIO);
                     qemu_aio_unref(acb);
                     state->acb[i] = NULL;
  @@ -653,6 +654,7 @@ static void curl_readv_bh_cb(void *p)
     // No cache found, so let's start a new request
     state = curl_init_state(acb->common.bs, s);
     if (!state) {
  +    fprintf (stderr, "#2\n");
         acb->common.cb(acb->common.opaque, -EIO);
         qemu_aio_unref(acb);
         return;

This is possibly a qemu bug, but I'm now in a good place to
investigate this further.

(This is all with upstream virt-v2v and qemu from git, but it appears
to be the same bug)
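
The hand-numbered markers in the patch above can be generalised so that
each error return identifies its own location.  A minimal sketch, assuming
a hypothetical helper sketch_fail_at and macro SKETCH_FAIL (neither exists
in qemu; both names are made up for illustration):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper, not part of qemu: log which call site is about
 * to fail, then return the negative errno value to pass on. */
static int sketch_fail_at(FILE *log, const char *site, int err)
{
    assert(log != NULL && site != NULL && err > 0);
    fprintf(log, "error return at %s: errno %d\n", site, err);
    return -err;
}

/* Two-step stringification so __LINE__ expands to its number. */
#define SKETCH_STR2(x) #x
#define SKETCH_STR(x) SKETCH_STR2(x)

/* Expands to the negative errno, tagged with the current file:line. */
#define SKETCH_FAIL(err) \
    sketch_fail_at(stderr, __FILE__ ":" SKETCH_STR(__LINE__), (err))
```

With something like this, a call such as
acb->common.cb(acb->common.opaque, SKETCH_FAIL(EIO)) would print its own
file and line, so the two EIO sites would not need separate "#1" and "#2"
messages.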

Comment 10 Richard W.M. Jones 2015-06-25 19:12:15 UTC
A few things to note:

(1) It's a timeout issue.  In virt-v2v we set the timeout to 600 seconds,
and in the qemu driver the timeout is firing after 599,821 milliseconds,
which is as close as makes no difference to 600 seconds.  Therefore a
simple virt-v2v patch to lengthen the timeout should fix it:

--- a/v2v/input_libvirt_vcenter_https.ml
+++ b/v2v/input_libvirt_vcenter_https.ml
@@ -250,7 +250,8 @@ let map_source_to_uri ?readahead password uri scheme server path =
     let json_params = [
       "file.driver", JSON.String "https";
       "file.url", JSON.String url;
-      "file.timeout", JSON.Int 600;
+      (* https://bugzilla.redhat.com/show_bug.cgi?id=1146007#c10 *)
+      "file.timeout", JSON.Int 2000;
     ] in
 
     let json_params =

(2) qemu loses the real underlying error, and replaces it with EIO,
hence the user sees just "Input/output error" which is wrong and
confusing.  Therefore we can change qemu so it doesn't lose the real
error.

Unfortunately the options for this are not very good.  We could
map the CURLE_* errors to errno, and/or call error_report.  What we
cannot do is call error_setg (because the infrastructure doesn't
exist in qemu yet).
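
To make option (2) concrete, here is a rough sketch of mapping curl
result codes to errno and printing curl's own message before returning.
It is only an illustration under stated assumptions: the enum is a
stand-in so the code compiles without <curl/curl.h> (real code would use
CURLcode and the CURLE_* constants directly), the errno choices are
guesses at the closest match, fprintf stands in for error_report, and all
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Stand-in for libcurl's CURLcode so the sketch builds without libcurl.
 * The numeric values mirror the corresponding CURLE_* constants. */
enum sketch_curl_code {
    SKETCH_CURLE_OK = 0,
    SKETCH_CURLE_COULDNT_RESOLVE_HOST = 6,
    SKETCH_CURLE_COULDNT_CONNECT = 7,
    SKETCH_CURLE_OUT_OF_MEMORY = 27,
    SKETCH_CURLE_OPERATION_TIMEDOUT = 28
};

/* Hypothetical mapping, not taken from qemu: pick the closest errno for
 * a few transfer failures and fall back to EIO for everything else. */
static int sketch_curl_code_to_errno(int code)
{
    switch (code) {
    case SKETCH_CURLE_OK:                   return 0;
    case SKETCH_CURLE_COULDNT_RESOLVE_HOST: return EHOSTUNREACH;
    case SKETCH_CURLE_COULDNT_CONNECT:      return ECONNREFUSED;
    case SKETCH_CURLE_OUT_OF_MEMORY:        return ENOMEM;
    case SKETCH_CURLE_OPERATION_TIMEDOUT:   return ETIMEDOUT;
    default:                                return EIO;
    }
}

/* Print curl's message (the text that is currently lost) and return the
 * negative errno that a completion callback would receive. */
static int sketch_report_curl_error(FILE *out, int code, const char *curl_msg)
{
    assert(out != NULL && curl_msg != NULL);
    fprintf(out, "curl: %s\n", curl_msg);
    return -sketch_curl_code_to_errno(code);
}
```

Under this mapping the timeout seen in comment 9 would reach the user as
ETIMEDOUT ("Connection timed out") rather than EIO.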

Comment 11 Richard W.M. Jones 2015-06-25 19:55:36 UTC
libguestfs patch posted:

https://www.redhat.com/archives/libguestfs/2015-June/msg00278.html

Comment 13 Richard W.M. Jones 2015-06-30 13:55:11 UTC
I'll just note what I actually fixed here:

The bug happens because we exceed a 10 minute timeout during
the copy phase.  I have increased the timeout to make it much
larger.

However if we still hit the larger timeout, you'll still see
the strange "Input/output error".  The timeout error message
is lost, but that's a separate fix in qemu (bug 1235812, bug 1235813).
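
For sizing a timeout, the figures in the curl debug output from comment 9
allow a back-of-the-envelope estimate.  The sketch below is an
illustration only: the function name is hypothetical, and it assumes the
transfer keeps the average rate observed before the timeout fired, which
a stalled connection would not.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate how many seconds a request of total_bytes would need if it
 * kept the average rate seen so far (received_bytes in elapsed_ms).
 * Returns -1.0 when no rate can be derived from the inputs. */
static double sketch_seconds_needed(uint64_t received_bytes,
                                    uint64_t total_bytes,
                                    uint64_t elapsed_ms)
{
    assert(total_bytes >= received_bytes);
    if (received_bytes == 0 || elapsed_ms == 0)
        return -1.0;

    double elapsed_sec = (double)elapsed_ms / 1000.0;
    double bytes_per_sec = (double)received_bytes / elapsed_sec;
    return (double)total_bytes / bytes_per_sec;
}
```

Calling sketch_seconds_needed(12403968, 69206016, 599821) with the values
from that log gives about 3,347 seconds (roughly 56 minutes) for the one
request that timed out.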

Comment 14 zhoujunqin 2015-07-28 03:51:09 UTC
Tried to verify this bug with the following packages:
libvirt-1.2.17-2.el7.x86_64
libguestfs-1.28.1-1.47.el7.x86_64
virt-v2v-1.28.1-1.47.el7.x86_64
qemu-kvm-rhev-2.3.0-13.el7.x86_64

Steps:
# virt-v2v -ic vpx://root.4.103/tzheng-demo/10.66.106.63/?no_verify=1 --password-file /tmp/passwd2 esx5.5-win7-x86_64
[   0.0] Opening the source -i libvirt -ic vpx://root.4.103/tzheng-demo/10.66.106.63/?no_verify=1 esx5.5-win7-x86_64
[  25.0] Creating an overlay to protect the source from being modified
[  26.0] Opening the overlay
[  39.0] Initializing the target -o libvirt -os default
[  39.0] Inspecting the overlay
[  94.0] Checking for sufficient free disk space in the guest
[  94.0] Estimating space required on target for each disk
[  94.0] Converting Windows 7 Ultimate to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 108.0] Mapping filesystem data to avoid copying unused and blank areas
[ 110.0] Closing the overlay
[ 110.0] Checking if the guest needs BIOS or UEFI to boot
[ 110.0] Copying disk 1/1 to /var/lib/libvirt/images/esx5.5-win7-x86_64-sda (raw)
    (100.00/100%)
[ 864.0] Creating output metadata
Pool default refreshed

Domain esx5.5-win7-x86_64 defined from /tmp/v2vlibvirt9ac9d6.xml

[ 865.0] Finishing off

Result: Conversion finished without errors, and the guest boots. Since acceptance testing also passed for this virt-v2v version, moving this bug from ON_QA to VERIFIED.

Comment 16 errata-xmlrpc 2015-11-19 06:57:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2183.html