Bug 1648520

Summary: Hivex key collation disagrees with Windows so sometimes keys are missing after import
Product: [Community] Virtualization Tools
Reporter: Leonard <593749519>
Component: hivex
Assignee: Laszlo Ersek <lersek>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: unspecified
CC: 593749519, lersek, ptoscano, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Windows   
Whiteboard:
Clones: 1648523, 1648524
Last Closed: 2021-09-13 19:00:55 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1648523, 1648524    
Attachments: a registry example (flags: none)

Description Leonard 2018-11-10 09:20:52 UTC
Created attachment 1503955
a registry example

I found a special case: when two keys are created using the library compiled with VC, one of them (the one with the shorter name) is missing when the hive is loaded in Windows regedit.exe. The attachment is a sample.

Comment 1 Richard W.M. Jones 2018-11-10 09:30:40 UTC
Hivex refuses to open this file:

$ hivexsh -d test1.reg 
hivex: hivex_open: created handle 0x55d64a255dd0
hivex: hivex_open: returning EINVAL because: test1.reg: file is too small to be a Windows NT Registry hive file
hivexsh: failed to open hive file: test1.reg: Invalid argument

How did you create it?

Comment 2 Richard W.M. Jones 2018-11-10 09:52:31 UTC
The registry is:

[HKEY_LOCAL_MACHINE\SYSTEM\1\CurrentControlSet\Services\VSS\Diag\Lovelace(__?GLOBALROOT_Device_HarddiskVolume3)]

[HKEY_LOCAL_MACHINE\SYSTEM\1\CurrentControlSet\Services\VSS\Diag\Lovelace(C:_)]

Comment 3 Richard W.M. Jones 2018-11-10 10:15:59 UTC
To test this I created a new hive called 'bz1648520' by this method:

(1) Copy hivex/images/minimal (from hivex source) to bz1648520

(2) Edit it using hivexsh:

$ hivexsh -w bz1648520
bz1648520\> add Lovelace(__?GLOBALROOT_Device_HarddiskVolume3)
bz1648520\> add Lovelace(C:_)
bz1648520\> ls
Lovelace(__?GLOBALROOT_Device_HarddiskVolume3)
Lovelace(C:_)
bz1648520\> commit

(3) Load the hive into a temporary Windows VM:

$ virt-builder windows-6.2-server --upload bz1648520:/bz1648520 

(4) Boot Windows VM and open the hive in regedit:

$ qemu-system-x86_64 -nodefconfig -nodefaults -display gtk -vga qxl \
    -machine accel=kvm:tcg -cpu host -m 2048 \
    -drive file=windows-6.2-server.img,format=raw,if=ide

C:\regedit

Create a new temporary registry key anywhere in the tree.

Select File -> Import -> File of type *.* -> C:\bz1648520

Only one key appears in Windows ("Lovelace(__?GLOBALROOT_Device_HarddiskVolume3)").
The other key is missing.

I'm pretty sure this is caused by our key collation order being wrong.

Comment 4 Richard W.M. Jones 2018-11-10 10:17:58 UTC
ie. It's something to do with:

https://github.com/libguestfs/hivex/blob/be51757920b56a77e2e63247f9a8409ce994d33c/lib/write.c#L664

Our ordering probably doesn't match Windows's ordering.

Comment 5 Leonard 2018-11-10 11:52:30 UTC
(In reply to Richard W.M. Jones from comment #4)
> ie. It's something to do with:
> 
> https://github.com/libguestfs/hivex/blob/be51757920b56a77e2e63247f9a8409ce994d33c/lib/write.c#L664
> 
> Our ordering probably doesn't match Windows's ordering.

Is there any resolution for this?

Comment 6 Richard W.M. Jones 2018-11-10 12:23:32 UTC
I'm afraid it's a bug in open source software so you get what
you pay for!  Please have a look at the code and contribute a patch.

I have cloned this bug for RHEL 7 since it will also affect
RHEL so we will eventually allocate some time to fix it, but
that might take a long time since it doesn't directly affect
any software we are writing for RHEL.

Comment 7 Leonard 2018-11-10 14:44:43 UTC
(In reply to Richard W.M. Jones from comment #6)
> I'm afraid it's a bug in open source software so you get what
> you pay for!  Please have a look at the code and contribute a patch.
> 
> I have cloned this bug for RHEL 7 since it will also affect
> RHEL so we will eventually allocate some time to fix it, but
> that might take a long time since it doesn't directly affect
> any software we are writing for RHEL.

After repeated attempts, I found that changing the TOLOWER macro to TOUPPER in strcasecmp solved the problem for my sample. Hope it helps. I wanted to attach a test.reg exported from my own registry, which I think would cover most cases, but I could not find a way to attach it to this reply.
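
The effect of the folding direction on the two sample key names can be illustrated with a small sketch. It assumes plain ASCII folding, and the helper names (ascii_lower, ascii_upper, fold_cmp) are hypothetical; they are not taken from the hivex sources. The underscore (0x5F) sits between the upper-case letters (0x41..0x5A) and the lower-case letters (0x61..0x7A), so the relative order of "(__?..." and "(C:_)" may flip depending on which way the letters are folded:

```c
/* Hypothetical helpers for illustration only; not the hivex code. */
static unsigned char
ascii_lower (unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? (unsigned char) (c - 'A' + 'a') : c;
}

static unsigned char
ascii_upper (unsigned char c)
{
  return (c >= 'a' && c <= 'z') ? (unsigned char) (c - 'a' + 'A') : c;
}

/* Compare two NUL-terminated strings byte by byte after applying FOLD
 * to each byte.  Returns <0, 0 or >0, like strcmp().
 */
static int
fold_cmp (const char *a, const char *b,
          unsigned char (*fold) (unsigned char))
{
  for (;; a++, b++) {
    unsigned char ca = fold ((unsigned char) *a);
    unsigned char cb = fold ((unsigned char) *b);

    if (ca != cb)
      return (int) ca - (int) cb;
    if (ca == 0)
      return 0;
  }
}
```

With lower-case folding, fold_cmp places "Lovelace(__?GLOBALROOT_Device_HarddiskVolume3)" before "Lovelace(C:_)"; with upper-case folding the order is reversed. This is consistent with the TOLOWER/TOUPPER observation, although it does not by itself prove which ordering Windows expects.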

Comment 8 Laszlo Ersek 2021-09-07 21:35:44 UTC
(1) compare_name_with_nk_name() [lib/write.c] calls strcasecmp().

(2) main() [sh/hivexsh.c] calls setlocale (LC_ALL, "").

According to POSIX, strcasecmp() depends on the current locale (the
LC_CTYPE category). If LC_CTYPE is POSIX, then strcasecmp() behaves as
if it converted both strings to lower case, and then compared the
resultant byte arrays. In any other locale, the results are unspecified.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/strcasecmp.html

This means that calling strcasecmp() in a non-POSIX locale is not
invalid (not a programming error), but strcasecmp() is permitted to
return whatever it wants.

This further means that running hivexsh in a common non-POSIX locale,
such as en_US.UTF-8, will produce a hive file with keys in unspecified
order.

Comment 7 is consistent with this hypothesis; see its reference to
tolower()/toupper().

We should probably replace strcasecmp() with a newlocale() +
strcasecmp_l() combination (gnulib provides newlocale()). We should
force strcasecmp_l() to always operate in the POSIX locale, which is
most likely what Windows expects too, regarding key order.
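
A minimal sketch of the newlocale() + strcasecmp_l() idea, assuming a POSIX.1-2008 platform such as glibc. The wrapper name posix_strcasecmp() is hypothetical, and this is not the code that was eventually committed:

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <strings.h>

/* Hypothetical wrapper: compare A and B case-insensitively in the
 * POSIX locale, regardless of any earlier setlocale() call.
 * On success, stores the comparison result in *RESULT and returns 0.
 * Returns -1 if the locale object could not be created.
 */
static int
posix_strcasecmp (const char *a, const char *b, int *result)
{
  locale_t posix = newlocale (LC_ALL_MASK, "POSIX", (locale_t) 0);

  if (posix == (locale_t) 0)
    return -1;

  *result = strcasecmp_l (a, b, posix);
  freelocale (posix);
  return 0;
}
```

Creating and freeing the locale object on every call keeps the sketch short; real code would more likely create it once and reuse it for all comparisons.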

Alternatively, we should open-code strcasecmp() -- manually lower-casing
the ASCII characters A..Z, and manually comparing the bytes. While this
seems to take more code and to run more slowly, it also appears more
portable / robust.
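
The open-coded alternative might look roughly like the following. This is a sketch under the assumptions stated in this comment (lower-casing only A..Z, comparing raw bytes); the function name ascii_casecmp() is hypothetical and the code is not taken from hivex:

```c
/* Hypothetical locale-independent replacement for strcasecmp().
 * Only the ASCII letters A..Z are folded; every other byte,
 * including bytes >= 0x80, is compared unchanged.
 * Returns -1, 0 or 1.
 */
static int
ascii_casecmp (const char *a, const char *b)
{
  const unsigned char *pa = (const unsigned char *) a;
  const unsigned char *pb = (const unsigned char *) b;

  for (;; pa++, pb++) {
    unsigned char ca = *pa;
    unsigned char cb = *pb;

    if (ca >= 'A' && ca <= 'Z')
      ca = (unsigned char) (ca - 'A' + 'a');
    if (cb >= 'A' && cb <= 'Z')
      cb = (unsigned char) (cb - 'A' + 'a');

    if (ca != cb)
      return ca < cb ? -1 : 1;
    if (ca == 0)
      return 0;               /* both strings ended together */
  }
}
```

Because no locale-dependent library call is involved, the result is the same under any setlocale() setting. Whether folding to lower case or to upper case matches Windows is a separate question; comment 7 reports that upper-casing resolved the reporter's sample.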

Comment 9 Laszlo Ersek 2021-09-13 19:00:55 UTC
Fixed in upstream commit d5a522c0bb73 ("lib: write: improve key collation compatibility with Windows", 2021-09-13).

(I'm slightly hesitant between UPSTREAM vs. NEXTRELEASE as the BZ resolution. Looking at other upstream hivex tickets in this Bugzilla installation, UPSTREAM looks more appropriate.)