Created attachment 1641518
libguestfs Python ls failure on Windows 98 SendTo directory

Description of problem:
I can't run the ls method on certain Windows 98 subdirectories.

Version-Release number of selected component:
1.40.2-2ubuntu6

How reproducible:
Always

Steps to Reproduce:
1. Install Windows 98 in a VM.
2. Mount it in libguestfs.
3. Walk the filesystem using g.ls().
4. A UnicodeDecodeError is raised for /WINDOWS/SendTo.

Actual results:
libguestfs raises a UnicodeDecodeError:

  UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 1: invalid start byte

Expected results:
It should have listed the directory contents.

Additional info:
Please look at the attachment, which shows the error. I can provide the qcow if needed.

Thanks!
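The steps to reproduce can be sketched roughly as follows. This is a hedged sketch, not the script from the attachment: the helper names `open_image` and `walk`, the default device `/dev/sda1`, and mounting it at `/` are assumptions made for illustration. It relies only on `add_drive_opts`, `launch`, `mount_ro`, `ls`, and `is_dir` from the libguestfs Python bindings, and it records the directories where `g.ls()` raises UnicodeDecodeError instead of aborting the walk.

```python
def open_image(path, device="/dev/sda1"):
    """Launch a libguestfs appliance and mount `device` read-only at /.

    `path` and `device` are assumptions; adjust them for the image at hand.
    """
    import guestfs  # imported lazily so walk() is usable without libguestfs

    g = guestfs.GuestFS(python_return_dict=True)
    g.add_drive_opts(path, readonly=1)
    g.launch()
    g.mount_ro(device, "/")
    return g


def walk(g, root="/"):
    """Walk the guest filesystem with g.ls(), collecting decode failures.

    Returns (listed, failed): the directories listed successfully, and
    (directory, error message) pairs where g.ls() raised UnicodeDecodeError.
    """
    listed, failed = [], []
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            names = g.ls(directory)
        except UnicodeDecodeError as exc:
            # Record the failing directory and keep walking the rest.
            failed.append((directory, str(exc)))
            continue
        listed.append(directory)
        for name in names:
            child = directory.rstrip("/") + "/" + name
            if g.is_dir(child):
                pending.append(child)
    return listed, failed
```

With an image at hand, calling `walk(open_image("disk.img"))` and printing the second element of the result would be expected to show /WINDOWS/SendTo among the failures, if the behaviour matches this report.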
This is probably related to the abject state of Unicode handling in our Python bindings. Is this Python 2 or Python 3?

You might also try:

  $ virt-rescue --ro -a disk.img -i
  ><rescue> ls -l /sysroot/WINDOWS/SendTo | hexdump -C

and see what actual characters ntfs-3g sees in the name.
Hi Richard,

> Is this Python 2 or Python 3?

This is Python 3.7.5.

> virt-rescue --ro -a disk.img -i

  virt-rescue: no operating system was found on this disk

Libguestfs cannot detect an operating system inside the qcow, even though it's there and I can boot it. The main partition uses vfat:

  In [4]: g.list_partitions()
  Out[4]: ['/dev/sda1']

  In [5]: g.list_filesystems()
  Out[5]: {'/dev/sda1': 'vfat'}

Thanks!
@Richard, I uploaded the qcow here, in case that helps: https://drive.google.com/open?id=1TBWx10bkQvk3i7H5JeXIs_gvxs0I8TrY
I can't see any control character in the names in that directory:

  $ virt-rescue --ro -a win98.qcow2
  ><rescue> mount /dev/sda1 /sysroot
  ><rescue> ls -1 --show-control-chars --quoting-style=literal /sysroot/WINDOWS/SendTo
  5? Floppy (A).lnk
  Desktop as Shortcut.DeskLink
  Mail Recipient.MAPIMail
  My Documents.mydocs
  ><rescue> ls -1 --show-control-chars --quoting-style=literal /sysroot/WINDOWS/SendTo | hexdump -C
  00000000  35 3f 20 46 6c 6f 70 70  79 20 28 41 29 2e 6c 6e  |5? Floppy (A).ln|
  00000010  6b 0a 44 65 73 6b 74 6f  70 20 61 73 20 53 68 6f  |k.Desktop as Sho|
  00000020  72 74 63 75 74 2e 44 65  73 6b 4c 69 6e 6b 0a 4d  |rtcut.DeskLink.M|
  00000030  61 69 6c 20 52 65 63 69  70 69 65 6e 74 2e 4d 41  |ail Recipient.MA|
  00000040  50 49 4d 61 69 6c 0a 4d  79 20 44 6f 63 75 6d 65  |PIMail.My Docume|
  00000050  6e 74 73 2e 6d 79 64 6f  63 73 0a                 |nts.mydocs.|
  0000005b

In particular no 0xbc byte. Do you know which exact file in that image causes a problem?
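One way to read the reported error against this listing: the byte 0xbc at position 1 would line up with the "?" in "5? Floppy (A).lnk". The sketch below assumes, purely for illustration, that the raw name is `5\xbc Floppy (A).lnk` in a single-byte code page; that byte string is a guess and was not read from the image. It shows that strict UTF-8 decoding of such a name fails with the same message as in the report, while Latin-1 or `surrogateescape` decoding succeeds.

```python
# Hypothetical raw name: an assumption for illustration, not read from the image.
raw_name = b"5\xbc Floppy (A).lnk"

# Strict UTF-8 decoding rejects 0xbc, which is a continuation byte and
# therefore cannot start a character.
try:
    raw_name.decode("utf-8")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# In Latin-1 (and cp1252) byte 0xbc is U+00BC, the "one quarter" fraction sign.
latin1_name = raw_name.decode("latin-1")

# surrogateescape keeps the undecodable byte as a lone surrogate, so the
# original bytes can be recovered by encoding with the same error handler.
escaped_name = raw_name.decode("utf-8", errors="surrogateescape")
round_tripped = escaped_name.encode("utf-8", errors="surrogateescape")
```

Whether the bindings should use either of these strategies is a separate design question; the sketch only shows that a single stray byte is enough to produce the reported exception.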
Actually I wonder if ntfs-3g is doing some quoting of its own before the Linux kernel even sees the strings. That might explain the "?" character that appears in one filename. You might try playing with mount options in virt-rescue (see https://linux.die.net/man/8/mount.ntfs-3g )
Hi Richard,

Thanks for looking at the qcow file.

> In particular no 0xbc byte. Do you know which exact file in that image causes a problem?

I have no idea; all I have is the screenshot that I sent you. All I know is that listing this directory raises a Python UnicodeDecodeError.

You can try to reproduce the issue using the libguestfs Python API, as I demonstrated.

Thanks!
Hi Richard,

I'm getting spammed by Bugzilla, which keeps requesting comments on this bug report because the status is "NEEDINFO". Do you need anything other than the qcow to reproduce the bug on your side? Otherwise, please change the bug status :)

Thanks.
Unsetting NEEDINFO. I don't have any further insight into this bug though ...
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.
Reopening because Virtualization Tools has not been discontinued.