Bug 1301593 - Windows inspection fails with: guestfsd: error: readdir: Invalid or incomplete multibyte or wide character
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libguestfs-winsupport
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: 7.4
Assignee: Richard W.M. Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard: V2V P2V
Depends On:
Blocks:
Reported: 2016-01-25 13:17 UTC by Richard W.M. Jones
Modified: 2017-08-01 16:52 UTC (History)

Fixed In Version: libguestfs-winsupport-7.2-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 16:52:00 UTC
Target Upstream Version:
Embargoed:


Attachments
virt-v2v conversion log (89.53 KB, text/plain)
2016-01-25 13:17 UTC, Richard W.M. Jones
test1.img.xz (101.28 KB, application/x-xz)
2016-01-25 14:39 UTC, Richard W.M. Jones


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:1979 0 normal SHIPPED_LIVE libguestfs-winsupport bug fix update 2017-08-01 17:58:08 UTC

Description Richard W.M. Jones 2016-01-25 13:17:57 UTC
Created attachment 1117985 [details]
virt-v2v conversion log

Description of problem:

During a virt-p2v conversion, Windows inspection fails with an ntfs-3g error:

libguestfs: trace: case_sensitive_path "/WINDOWS/system32/config"
guestfsd: main_loop: proc 38 (is_dir) took 0.00 seconds
guestfsd: main_loop: new request, len 0x44
guestfsd: error: readdir: Invalid or incomplete multibyte or wide character
libguestfs: trace: case_sensitive_path = NULL (error)

The full log supplied by the user is attached.

Version-Release number of selected component (if applicable):

virt-v2v 1.28.1.1.55
CentOS 7.2

The physical server being converted is Windows Server 2003.

How reproducible:

For the user this is 100% reproducible.  I have not managed to
reproduce it myself yet.

Comment 1 Richard W.M. Jones 2016-01-25 13:20:28 UTC
I'm adding V2V & P2V whiteboard flags, but this is not specifically
about v2v (or even about libguestfs).

Comment 5 Richard W.M. Jones 2016-01-25 13:52:39 UTC
Our working theory is that the c:\windows\system32 directory
contains a file called "Chaînes.scf".  I have attached the directory
listings supplied by the user as private attachments.

Comment 6 Richard W.M. Jones 2016-01-25 14:39:42 UTC
Created attachment 1118059 [details]
test1.img.xz

(In reply to Richard W.M. Jones from comment #5)
> Our working theory is that the c:\windows\system32 directory
> contains a file called "Chaînes.scf".  I have attached the directory
> listings supplied by the user as private attachments.

This theory was wrong, but I have reproduced the problem
by deliberately creating a malformed NTFS partition.  I did
this by touching a file called "/test/pqrst" and then (using
a hex editor) modifying the "t" character of the filename on disk
to be the invalid[1] UCS-2 character U+DF00.

The xz-compressed disk image is attached.

Opening the disk image in guestfish causes the error for various
commands, eg:

><fs> ll /test
libguestfs: error: ll: ls: reading directory /sysroot/test: Invalid or incomplete multibyte or wide character
><fs> case-sensitive-path /test/a
libguestfs: error: case_sensitive_path: readdir: Invalid or incomplete multibyte or wide character

So I suspect that the reporter's disk image contains some illegal
filename.  What's interesting is that Windows 2003 seems quite
happy, so the filename is only illegal for ntfs-3g and not for
Windows.  Or maybe this is a bug in ntfs-3g and the filename is not
illegal at all.

[1] http://www.fileformat.info/info/unicode/char/df00/index.htm
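
As a rough illustration of why U+DF00 counts as invalid here: it is a
low surrogate, which is only meaningful in UTF-16 when it follows a
high surrogate. The sketch below is not ntfs-3g code; it uses Python's
strict UTF-16 decoder as a stand-in for any strict converter. The byte
string raw_name is an assumption about how the modified name would look
as UTF-16LE, not bytes taken from the attached image.

```python
# Illustration only: the name "pqrst" as UTF-16LE, with the last
# character replaced by the lone low surrogate U+DF00 (bytes 00 DF).
# raw_name is a hypothetical reconstruction, not read from test1.img.
raw_name = "pqrs".encode("utf-16-le") + b"\x00\xdf"


def strict_decode_fails(data: bytes) -> bool:
    """Return True if a strict UTF-16LE decode rejects the data."""
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return True
    return False


print(strict_decode_fails("pqrst".encode("utf-16-le")))  # well-formed name
print(strict_decode_fails(raw_name))                     # name with lone surrogate
```

A strict converter has no valid output for the second name, which is
consistent with readdir failing for the whole directory.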

Comment 8 Richard W.M. Jones 2016-06-22 13:19:33 UTC
Long thread discussing this:

https://www.mail-archive.com/ntfs-3g-devel@lists.sourceforge.net/msg01174.html

The outcome was a couple of patches which went upstream (in ntfs-3g).
Including these in ntfs-3g (or libguestfs-winsupport) ought to make
our handling of these guests more robust.

Unfortunately the problem with this plan is that libguestfs-winsupport
is not on the RHEL 7.3 ACL.

commit d9c61dd60ec484909f70b7a916ada3a93af94b60
Author: Erik Larsson <*@*>
Date:   Fri Apr 8 05:39:48 2016 +0200

    unistr.c: Enable encoding broken UTF-16 into broken UTF-8, A.K.A. WTF-8.
    
    Windows filenames may contain invalid UTF-16 sequences (specifically
    broken surrogate pairs), which cannot be converted to UTF-8 if we do
    strict conversion.
    
    This patch enables encoding broken UTF-16 into similarly broken UTF-8 by
    encoding any surrogate character that don't have a match into a separate
    3-byte UTF-8 sequence.
    
    This is "sort of" valid UTF-8, but not valid Unicode since the code
    points used for surrogate pair encoding are not supposed to occur in a
    valid Unicode string... but on the other hand the source UTF-16 data is
    also broken, so we aren't really making things any worse.
    
    This format is sometimes referred to as WTF-8 (Wobbly Translation
    Format, 8-bit encoding) and is a common solution to represent broken
    UTF-16 as UTF-8.
    
    It is a lossless round-trip conversion, i.e converting from broken
    UTF-16 to "WTF-8" and back to UTF-16 yields the same broken UTF-16
    sequence. Because of this property it enables accessing these files
    by filename through ntfs-3g and the ntfsprogs (e.g. ls -la works as
    expected).
    
    To disable this behaviour you can pass the preprocessor/compiler flag
    '-DALLOW_BROKEN_SURROGATES=0' when building ntfs-3g.

commit f0370bfa9c47575d4e47c94e443aa91983683a43
Author: Erik Larsson <*@*>
Date:   Tue Apr 12 17:02:40 2016 +0200

    unistr.c: Unify the two defines NOREVBOM and ALLOW_BROKEN_SURROGATES.
    
    In the mailing list discussion we came to the conclusion that there
    doesn't seem to be any reason to keep these declarations separate since
    they address the same issue, namely libntfs-3g's tolerance for bad
    Unicode data in filenames and other UTF-16 strings in the file system,
    so merge the two defines into the new define ALLOW_BROKEN_UNICODE.
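
The conversion described in the first commit can be sketched outside
ntfs-3g. The following is a minimal Python illustration, not the
unistr.c implementation: it assumes Python's "surrogatepass" error
handler is an acceptable stand-in for the WTF-8 behaviour, and the
helper names utf16_to_wtf8 and wtf8_to_utf16 are made up for this
example.

```python
def utf16_to_wtf8(raw: bytes) -> bytes:
    """Convert possibly broken UTF-16LE to WTF-8 style bytes.

    Valid surrogate pairs become normal 4-byte UTF-8 sequences;
    an unpaired surrogate becomes its own 3-byte sequence.
    """
    text = raw.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogatepass")


def wtf8_to_utf16(data: bytes) -> bytes:
    """Reverse conversion, restoring the original (broken) UTF-16LE."""
    text = data.decode("utf-8", errors="surrogatepass")
    return text.encode("utf-16-le", errors="surrogatepass")


# Hypothetical name: "pqrs" followed by the lone low surrogate U+DF00.
broken = "pqrs".encode("utf-16-le") + b"\x00\xdf"
encoded = utf16_to_wtf8(broken)

print(encoded)
print(wtf8_to_utf16(encoded) == broken)
```

The round trip is the property the commit message relies on: a name
that survives conversion in both directions can still be opened by
name after it has been listed.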

Comment 10 Richard W.M. Jones 2017-02-22 11:28:52 UTC
Reproducer using RHEL 7.4 host:

$ guestfish \
    set-program virt-foo : \
    add-ro test1.img : run : mount /dev/sda1 / : \
    ll /test
libguestfs: error: ll: ls: reading directory /sysroot/test: Invalid or incomplete multibyte or wide character

Comment 11 Richard W.M. Jones 2017-02-22 11:30:25 UTC
(In reply to Richard W.M. Jones from comment #10)
> Reproducer using RHEL 7.4 host:
> 
> $ guestfish \
>     set-program virt-foo : \
>     add-ro test1.img : run : mount /dev/sda1 / : \
>     ll /test
> libguestfs: error: ll: ls: reading directory /sysroot/test: Invalid or
> incomplete multibyte or wide character

I should have said, you must download the test image from
comment 6.

If you use the fixed package, you will see this output instead:

$ guestfish set-program virt-foo : add-ro test1.img : run : mount /dev/sda1 / : ll /test
total 4
drwxrwxrwx 1 root root    0 Jan 25  2016 .
drwxrwxrwx 1 root root 4096 Jan 25  2016 ..
-rwxrwxrwx 1 root root    0 Jan 25  2016 pqrs���
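
The three replacement characters after "pqrs" are what one would expect
if the fixed package returns the lone surrogate as a 3-byte sequence
and the listing is then displayed by a strict UTF-8 decoder. The sketch
below rests on that assumption; the bytes in name_bytes are derived
from U+DF00, not captured from guestfish.

```python
# Assumed bytes of the listed name: "pqrs" plus U+DF00 encoded as the
# 3-byte sequence ED BC 80 (hypothetical, derived from the code point).
name_bytes = b"pqrs\xed\xbc\x80"

# A strict UTF-8 decoder rejects encoded surrogates, so a display layer
# that substitutes U+FFFD for invalid bytes replaces each byte in turn.
shown = name_bytes.decode("utf-8", errors="replace")

print(shown.count("\ufffd"))
```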

Comment 12 YongkuiGuo 2017-06-22 11:41:00 UTC
Verified with packages:
libguestfs-winsupport-7.2-2.el7.x86_64
libguestfs-1.36.3-5.el7.x86_64


1. Download the test1.img.xz in the attachment.
2. # xz -d test1.img.xz
3. # guestfish set-program virt-foo : add-ro test1.img : run : mount /dev/sda1 / : ll /test
--------------------------------------------------
drwxrwxrwx 1 root root    0 Jan 25  2016 .
drwxrwxrwx 1 root root 4096 Jan 25  2016 ..
-rwxrwxrwx 1 root root    0 Jan 25  2016 pqrs���
--------------------------------------------------

The guestfish command executes successfully, so this bug is verified.

Note: I reproduced the bug with the package libguestfs-winsupport-7.2-1.el7.x86_64.

Comment 13 errata-xmlrpc 2017-08-01 16:52:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1979