Bug 1778962 - ls fails on certain Windows 98 directories with UnicodeDecodeError
Keywords:
Status: NEW
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libguestfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Richard W.M. Jones
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-12-02 23:42 UTC by mathieu.tarral
Modified: 2025-10-17 12:52 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-10-17 00:10:58 UTC
Embargoed:


Attachments
libguestfs Python ls failure on Windows 98 SendTo directory (83.22 KB, image/png)
2019-12-02 23:42 UTC, mathieu.tarral

Description mathieu.tarral 2019-12-02 23:42:16 UTC
Created attachment 1641518
libguestfs Python ls failure on Windows 98 SendTo directory

Description of problem:

I can't run the ls method on certain Windows 98 subdirectories.


Version-Release number of selected component: 1.40.2-2ubuntu6


How reproducible: Always


Steps to Reproduce:
1. install Windows 98 in a VM
2. mount it in libguestfs
3. walk the filesystem using g.ls()
4. UnicodeDecodeError for /WINDOWS/SendTo
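
The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the reporter's actual script: the helper name `find_undecodable_dirs` is hypothetical, and the image name and device in the usage comment are taken from later comments in this report. It assumes `g` is a launched and mounted `guestfs.GuestFS` handle and only relies on `g.ls()` and `g.is_dir()`.

```python
def find_undecodable_dirs(g, root="/"):
    """Walk the guest filesystem with g.ls() and collect the directories
    whose listing raises UnicodeDecodeError (hypothetical helper)."""
    failures = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = g.ls(directory)
        except UnicodeDecodeError as exc:
            # Record the failing directory and keep walking the rest.
            failures.append((directory, str(exc)))
            continue
        for name in names:
            path = directory.rstrip("/") + "/" + name
            if g.is_dir(path):
                stack.append(path)
    return failures


# Possible usage (requires the libguestfs Python bindings and the disk image):
#   import guestfs
#   g = guestfs.GuestFS(python_return_dict=True)
#   g.add_drive_opts("win98.qcow2", readonly=1)
#   g.launch()
#   g.mount_ro("/dev/sda1", "/")
#   print(find_undecodable_dirs(g))
```

Collecting the failures instead of stopping at the first exception makes it easier to see whether /WINDOWS/SendTo is the only affected directory.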

Actual results:

libguestfs raises a UnicodeDecodeError:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 1: invalid start byte
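
The exact message can be reproduced without libguestfs. The sketch below rests on an assumption the thread does not confirm: that the first entry of the directory (shown later as "5? Floppy (A).lnk") reaches Python with a single 0xbc byte at offset 1, which is how ISO-8859-1 encodes "¼". The variable names are illustrative only.

```python
# Assumed raw directory entry: "5", then byte 0xbc, then " Floppy (A).lnk".
raw_name = b"5\xbc Floppy (A).lnk"

try:
    raw_name.decode("utf-8")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding the same bytes as Latin-1 succeeds and yields a fraction sign.
latin1_name = raw_name.decode("latin-1")
print(latin1_name)
```

If this assumption holds, the byte value and offset in the traceback both point at the non-ASCII character in that one filename.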

Expected results:
It should have listed the directory contents.


Additional info:

Please look at the attachment; it shows the error.

I can provide the qcow if needed.

Thanks !

Comment 1 Richard W.M. Jones 2019-12-03 14:34:40 UTC
This is probably related to the abject state of Unicode handling in
our Python bindings.  Is this Python 2 or Python 3?

You might also try:

$ virt-rescue --ro -a disk.img -i
><rescue> ls -l /sysroot/WINDOWS/SendTo | hexdump -C

and see what actual characters ntfs-3g sees in the name.

Comment 2 mathieu.tarral 2019-12-03 19:58:51 UTC
Hi Richard,

> Is this Python 2 or Python 3?
This is Python 3.7.5.

> virt-rescue --ro -a disk.img -i

virt-rescue: no operating system was found on this disk

Libguestfs cannot detect an operating system inside the qcow, even though it's there;
I can boot it.

The main partition is using vfat:
In [4]: g.list_partitions()
Out[4]: ['/dev/sda1']

In [5]: g.list_filesystems()
Out[5]: {'/dev/sda1': 'vfat'}

Thanks !

Comment 3 mathieu.tarral 2019-12-03 20:15:20 UTC
@Richard,

I uploaded the qcow here, in case that helps:
https://drive.google.com/open?id=1TBWx10bkQvk3i7H5JeXIs_gvxs0I8TrY

Comment 4 Richard W.M. Jones 2019-12-09 10:36:54 UTC
I can't see any control character in the names in that directory:

$ virt-rescue --ro -a win98.qcow2
><rescue> mount /dev/sda1 /sysroot
><rescue> ls -1 --show-control-chars --quoting-style=literal /sysroot/WINDOWS/SendTo  
5? Floppy (A).lnk
Desktop as Shortcut.DeskLink
Mail Recipient.MAPIMail
My Documents.mydocs
><rescue> ls -1 --show-control-chars --quoting-style=literal /sysroot/WINDOWS/SendTo  | hexdump -C
00000000  35 3f 20 46 6c 6f 70 70  79 20 28 41 29 2e 6c 6e  |5? Floppy (A).ln|
00000010  6b 0a 44 65 73 6b 74 6f  70 20 61 73 20 53 68 6f  |k.Desktop as Sho|
00000020  72 74 63 75 74 2e 44 65  73 6b 4c 69 6e 6b 0a 4d  |rtcut.DeskLink.M|
00000030  61 69 6c 20 52 65 63 69  70 69 65 6e 74 2e 4d 41  |ail Recipient.MA|
00000040  50 49 4d 61 69 6c 0a 4d  79 20 44 6f 63 75 6d 65  |PIMail.My Docume|
00000050  6e 74 73 2e 6d 79 64 6f  63 73 0a                 |nts.mydocs.|
0000005b

In particular no 0xbc byte.  Do you know which exact file in that image causes
a problem?
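
As a cross-check of the hexdump above, the bytes of the first filename can be scanned for anything outside ASCII. This is only a sketch: the helper `non_ascii_offsets` is hypothetical, and the byte string is copied from the first 17 bytes of the dump.

```python
# First filename from the hexdump above (bytes before the 0x0a newline).
first_name = bytes.fromhex("35 3f 20 46 6c 6f 70 70 79 20 28 41 29 2e 6c 6e 6b")


def non_ascii_offsets(data):
    """Return (offset, hex value) pairs for every byte >= 0x80."""
    return [(i, hex(b)) for i, b in enumerate(data) if b >= 0x80]


print(first_name.decode("ascii"))
print(non_ascii_offsets(first_name))
print(hex(first_name[1]))
```

The scan finds no high bytes, but the "?" (0x3f) sits at offset 1, the same position the Python traceback reports for byte 0xbc.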

Comment 5 Richard W.M. Jones 2019-12-09 10:39:16 UTC
Actually I wonder if ntfs-3g is doing some quoting of its own before
the Linux kernel even sees the strings.  That might explain the "?"
character that appears in one filename.  You might try playing with
mount options in virt-rescue (see https://linux.die.net/man/8/mount.ntfs-3g )

Comment 6 mathieu.tarral 2019-12-10 19:20:56 UTC
Hi Richard,

Thanks for looking at the qcow file.

> In particular no 0xbc byte.  Do you know which exact file in that image causes
> a problem?

I have no idea, I just have the screenshot that I sent you.
All I know is that when I list this directory, I get a Python UnicodeDecodeError.

You can try to reproduce the issue using libguestfs Python API, as I demonstrated.

Thanks !

Comment 7 mathieu.tarral 2019-12-23 03:03:00 UTC
Hi Richard,

I'm getting spammed by Bugzilla, requesting comments on this bug report because
the status is "NEEDINFO".

Do you need anything other than the qcow to reproduce the bug on your side?
Otherwise please change the bug status :)

Thanks.

Comment 8 Richard W.M. Jones 2020-03-02 10:58:23 UTC
Unsetting NEEDINFO.  I don't have any further insight into this bug though ...

Comment 9 Red Hat Bugzilla 2025-10-17 00:10:58 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

Comment 10 Alasdair Kergon 2025-10-17 12:52:13 UTC
Reopening because Virtualization Tools has not been discontinued.

