Bug 1905772
| Summary: | RFE: Make the error clearer when vddk-thumbprint is wrong | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | liuzi <zili> |
| Component: | nbdkit | Assignee: | Laszlo Ersek <lersek> |
| Status: | CLOSED ERRATA | QA Contact: | Vera <vwu> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | unspecified | CC: | eblake, juzhou, lersek, mxie, ptoscano, rjones, tyan, tzheng, virt-maint, xiaodwan |
| Target Milestone: | beta | Keywords: | FutureFeature, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | V2V | | |
| Fixed In Version: | nbdkit-1.30.8-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-15 09:50:17 UTC | Type: | Feature Request |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
liuzi
2020-12-09 03:07:44 UTC
Here's a shorter reproducer:

$ virt-v2v -ic vpx://root.198.169/data/10.73.199.217/?no_verify=1 esx7.0-win2019-x86_64 -it vddk -io vddk-libdir=/home/rjones/tmp/vddk/vmware-vix-disklib-distrib -io vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75 -ip /tmp/passwd -o null
[ 0.0] Opening the source -i libvirt -ic vpx://root.198.169/data/10.73.199.217/?no_verify=1 esx7.0-win2019-x86_64 -it vddk -io vddk-libdir=/home/rjones/tmp/vddk/vmware-vix-disklib-distrib -io vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75
[ 10.6] Creating an overlay to protect the source from being modified
nbdkit: vddk[1]: error: VixDiskLib_Open: [esx7.0-matrix] esx7.0-win2019-x86_64/esx7.0-win2019-x86_64.vmdk: Unknown error
qemu-img: /home/rjones/d/virt-v2v/tmp/v2vovla529dd.qcow2: Requested export not available
Could not open backing image.
virt-v2v: error: qemu-img command failed, see earlier errors

Compare this to the case where the password is wrong, and the error is clearer (because the error in this case comes from libvirt):

$ virt-v2v -ic vpx://root.198.169/data/10.73.199.217/?no_verify=1 esx7.0-win2019-x86_64 -it vddk -io vddk-libdir=/home/rjones/tmp/vddk/vmware-vix-disklib-distrib -io vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75 -o null
[ 0.0] Opening the source -i libvirt -ic vpx://root.198.169/data/10.73.199.217/?no_verify=1 esx7.0-win2019-x86_64 -it vddk -io vddk-libdir=/home/rjones/tmp/vddk/vmware-vix-disklib-distrib -io vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75
Enter root's password for 10.73.198.169:
virt-v2v: error: exception: libvirt: VIR_ERR_INTERNAL_ERROR: VIR_FROM_ESX: internal error: HTTP response code 500 for call to 'Login'. Fault: ServerFaultCode - Cannot complete login due to an incorrect user name or password.
The only way to find the true failure in the first case is to use virt-v2v -vx, where you will see this within the huge debug log:

nbdkit: vddk[1]: debug: VixDiskLib: VixDiskLib_OpenEx: Failed to start session. Other error encountered: SSL Exception: Verification parameters:
nbdkit: vddk[1]: debug: PeerThumbprint: B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78
nbdkit: vddk[1]: debug: ExpectedThumbprint: c2:99:4e:b8:87:75:e8:41:71:6b:38:da:07:c4:6b:0e:66:18:c0:75

The basic problem that makes this hard is that the error happens in the nbdkit subprocess, and even worse VDDK only sends this information to its debug channel - it doesn't produce a useful error message at all in this case (literally VDDK's error is "Unknown error"). The only real solution to this is going to involve somehow parsing out the free text VDDK debug messages.

I don't think we should go to extreme lengths here. Previously I've made the point that presenting crypto errors understandably to the user is a lost cause <https://bugzilla.redhat.com/show_bug.cgi?id=1778090#c13>. Instead, we should check for VixDiskLib_Open() returning VIX_E_FAIL ("Unknown error"), and then suggest that the user verify the thumbprint. It's probably better yet if we distinguish VIX_E_FAIL in our VDDK_ERROR() macro.

I've been musing on if we can send the error message over the NBD protocol itself. The protocol supports a thing called "structured replies" where in response to an NBD_CMD_PREAD (only?) call you can send error message strings. However I don't think you can send error strings for the initial connection. In any case nbdkit doesn't support structured replies at all. You can only send a limited set of error codes (success, EPERM, EIO, ENOMEM, EINVAL, ENOSPC, EOVERFLOW, ENOTSUP and ESHUTDOWN) so I guess you could encode the VIX_E_FAIL into one of those.

(In reply to Richard W.M. Jones from comment #8)
> You can only send a limited set of error codes (success, EPERM, EIO,
> ENOMEM, EINVAL, ENOSPC, EOVERFLOW, ENOTSUP and ESHUTDOWN) so I guess you
> could encode the VIX_E_FAIL into one of those.

I don't understand -- the "open" plugin method <https://libguestfs.org/nbdkit-plugin.3.html#open> does not seem to support returning error codes at all. And vddk_open() calls nbdkit_error() already, via VDDK_ERROR(). At <https://libguestfs.org/nbdkit-plugin.3.html#Error-handling>, the documentation says, "additionally, if the callback is involved in serving data, the plugin should call nbdkit_set_error to influence the error code that will be sent to the client" -- but "open" does not read or write data. The documentation continues, "nbdkit_set_error can be called at any time, but only has an impact during callbacks for serving data".

Yes I think you're right and I was confused again about connection vs data commands. I think it would be possible to send one of the error codes NBD_REP_ERR_* listed here: https://gitlab.com/nbdkit/nbdkit/-/blob/7eb356719376c4d0b2379cea5d39c81602d2d304/common/protocol/nbd-protocol.h#L128 nbdkit doesn't provide any way to control this from the plugin, although maybe it should. (Note those error codes are not related to errno, they're a separate set of values for failed connections only.)

But but! I wasn't imagining it after all ... There *is* a way to send back an error string in one of those replies. Note in the NBD protocol here it says: "All error replies MAY have some data set, in which case that data is an error message string suitable for display to the user." https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#option-reply-types Anyway this isn't available through nbdkit, although I think it would be a pretty nice addition.

For virt-v2v's purposes though, nbdkit runs locally, and nbdkit's stderr messages (from nbdkit_error()) are nicely interleaved with virt-v2v's messages.
This effectively means that whatever we log from VDDK_ERROR() reaches the user (or the log file, if redirected) fine. I understand that "data in error replies" would be a generic feature replacing "see earlier errors" in comment#0, but that's larger than what I'd like to bite off for this BZ :)

Agreed, that's by far the easiest way to fix this bug. If at some future time we want to split out copying into a separate supervisor process, we can try the nice (but hard) fix.

[nbdkit PATCH] vddk: advise user on obscure thumbprint mismatch error condition
Message-Id: <20220517080228.8713-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-May/028866.html

(Setting component to nbdkit since the fix must be added to that component)

(In reply to Laszlo Ersek from comment #13)
> [nbdkit PATCH] vddk: advise user on obscure thumbprint mismatch error condition
> Message-Id: <20220517080228.8713-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-May/028866.html

Upstream commit ce6d20eee4f3.

I didn't notice but this bug was fixed already by an earlier rebase.

Verified with the versions:
virt-v2v-2.0.7-1.el9.x86_64
nbdkit-1.30.6-3.el9.x86_64

Steps:
Convert the VM with the wrong vddk-thumbprint and check the output error.
# virt-v2v -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -o rhv-upload -of raw -os nfs_data -oo rhv-cluster=NFS -oa preallocated -oo rhv-verifypeer=true --mac 00:50:56:83:8e:f6:network:ovirtmgmt -it vddk -io vddk-libdir=/home/vddk7.0.3 -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA esx7.0-win2022-x86_64 -ip /v2v-ops/esxpw -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -op /v2v-ops/rhvpasswd
[ 0.0] Setting up the source: -i libvirt -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk esx7.0-win2022-x86_64
[ 1.9] Opening the source
nbdkit: vddk[1]: error: VixDiskLib_Open: [esx7.0-matrix] esx7.0-win2022-x86_64_1/esx7.0-win2022-x86_64.vmdk: Unknown error
nbdkit: vddk[1]: error: Please verify whether the "thumbprint" parameter (1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA) matches the SHA1 fingerprint of the remote VMware server. Refer to nbdkit-vddk-plugin(1) section "THUMBPRINTS" for details.
virt-v2v: error: libguestfs error: could not create appliance through libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while connecting to monitor: 2022-07-12T02:40:21.824905Z qemu-kvm: -blockdev {"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.7y6c5r/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}: Requested export not available [code=1 int1=-1]

If reporting bugs, run virt-v2v with debugging enabled and include the complete output:

virt-v2v -v -x [...]

Moving to Verified.

Hi Laszlo,
I think the error info below looks a little confusing. Is it possible to hide it as part of the fix for this bug?
> Try running qemu directly without libvirt using this environment variable:
> export LIBGUESTFS_BACKEND=direct
>
> Original error from libvirt: internal error: process exited while
> connecting to monitor: 2022-07-12T02:40:21.824905Z qemu-kvm: -blockdev
> {"driver":"nbd","server":{"type":"unix","path":"/tmp/v2v.7y6c5r/in0"},"node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}:
> Requested export not available [code=1 int1=-1]
>
> If reporting bugs, run virt-v2v with debugging enabled and include the
> complete output:
I don't think we should hide these errors; they might provide more information when troubleshooting. The message we print about "Please verify whether the "thumbprint" parameter" is only a hint; the real problem might be something else.

What Rich said. A wrong thumbprint is one common cause for "Unknown error", but by no means the only one. The thumbprint hint is basically a guess, it can easily be wrong, and the cause of "Unknown error" could be something different. That's the nature of "Unknown error" :/ Really bad vddk API behavior there.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (nbdkit bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:7945 |
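Because the thumbprint hint is only a guess, one way to confirm or rule out a mismatch is to look for the PeerThumbprint and ExpectedThumbprint lines in the virt-v2v -v -x debug output, as quoted earlier in this report. The following is a minimal Python sketch of that check, not part of nbdkit or virt-v2v. The function names are hypothetical, and the log format is assumed to match the debug lines shown in this report; other VDDK versions may print something different.

```python
import re

# Assumption: VDDK prints the thumbprints to its debug channel in the form
# quoted in this bug report, e.g.
#   nbdkit: vddk[1]: debug: PeerThumbprint: B5:52:...:78
# A SHA1 thumbprint is 20 colon-separated hex bytes.
_THUMBPRINT_RE = re.compile(
    r"debug:\s*(PeerThumbprint|ExpectedThumbprint):\s*"
    r"([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){19})"
)


def find_thumbprints(log_text):
    """Return a dict of the thumbprints found in log_text.

    Keys are 'PeerThumbprint' and/or 'ExpectedThumbprint'. Values are
    upper-cased, because the report shows the expected value printed in
    lower case and the peer value in upper case.
    """
    found = {}
    for name, value in _THUMBPRINT_RE.findall(log_text):
        found[name] = value.upper()
    return found


def thumbprint_mismatch(log_text):
    """Return True or False if both thumbprints are present, else None."""
    found = find_thumbprints(log_text)
    if len(found) < 2:
        return None  # log does not contain both lines; cause is elsewhere
    return found["PeerThumbprint"] != found["ExpectedThumbprint"]


# The two debug lines quoted in this report.
SAMPLE = (
    "nbdkit: vddk[1]: debug: PeerThumbprint: "
    "B5:52:1F:B4:21:09:45:24:51:32:56:F6:63:6A:93:5D:54:08:2D:78\n"
    "nbdkit: vddk[1]: debug: ExpectedThumbprint: "
    "c2:99:4e:b8:87:75:e8:41:71:6b:38:da:07:c4:6b:0e:66:18:c0:75\n"
)

print(thumbprint_mismatch(SAMPLE))  # prints True
```

A result of None means the log does not contain both lines, in which case "Unknown error" most likely has a different cause and the rest of the debug output is the place to look.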