RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2051559 - work around g_file_query_info() succeeding on such files for which the underlying lstat() fails with EACCES
Summary: work around g_file_query_info() succeeding on such files for which the underl...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libosinfo
Version: 9.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: rc
: ---
Assignee: Laszlo Ersek
QA Contact: Xiaodai Wang
URL:
Whiteboard:
Depends On:
Blocks: 2053272
TreeView+ depends on / blocked
 
Reported: 2022-02-07 13:48 UTC by Xiaodai Wang
Modified: 2022-05-17 13:32 UTC (History)
12 users (show)

Fixed In Version: libosinfo-1.9.0-5.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2053272 (view as bug list)
Environment:
Last Closed: 2022-05-17 13:19:13 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
GNOME Gitlab GNOME glib issues 1237 0 None None None 2022-02-09 11:05:19 UTC
GNOME Gitlab GNOME glib issues 2543 0 None None None 2022-02-09 11:58:33 UTC
Red Hat Issue Tracker RHELPLAN-111353 0 None None None 2022-02-07 14:00:23 UTC
Red Hat Product Errata RHBA-2022:2490 0 None None None 2022-05-17 13:19:19 UTC

Description Xiaodai Wang 2022-02-07 13:48:53 UTC
Description of problem:
v2v command cannot return when setting 'HOME=/root'

Version-Release number of selected component (if applicable):
virt-v2v-1.45.97-3.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run v2v command with 'HOME=/root'.
$ HOME=/root LIBGUESTFS_BACKEND=direct /bin/virt-v2v -i libvirt -ic vpx://root@<ip>/data/<ip>?no_verify=1 -o null --mac 00:50:56:ac:59:8d:network:default esx6.7-win2019-x86_64 -it vddk -io vddk-libdir=/home/v2v_auto/vddk_libdir/latest -io vddk-thumbprint=xxx -on esx6.7-win2019-x86_64f8Jp -ip /tmp/v2v_vpx_passwd
[   0.0] Setting up the source: -i libvirt -ic vpx://root.x.x/data/x.x.x.x/?no_verify=1 -it vddk esx6.7-win2019-x86_64
[   1.8] Opening the source
[   5.9] Inspecting the source
[  10.6] Checking for sufficient free disk space in the guest
[  10.6] Converting Windows Server 2019 Standard to run on KVM
virt-v2v: error: failure: libosinfo error: 
osinfo_loader_process_default_path: Can't read path /root/.config/osinfo

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]
Killed


# ps axu | grep v2v
root       50076  0.0  0.0  15292  6828 pts/2    S    08:00   0:00 su - v2v_auto
v2v_auto   50077  0.0  0.0   7440  4380 pts/2    S    08:00   0:00 -bash
v2v_auto   50435  0.9  0.2 134412 39816 pts/2    Sl+  08:08   0:00 /bin/virt-v2v -i libvirt -ic vpx://root.x.x/data/x.x.x.x/?no_verify=1 -o null --mac 00:50:56:ac:59:8d:network:default esx6.7-win2019-x86_64 -it vddk -io vddk-libdir=/home/v2v_auto/vddk_libdir/latest -io vddk-thumbprint=x.x.x.x -on esx6.7-win2019-x86_64f8Jp -ip /tmp/v2v_vpx_passwd
v2v_auto   50463  4.1  0.0      0     0 pts/2    Z+   08:08   0:00 [exe] <defunct>
v2v_auto   50474 18.9  3.2 4017688 518988 pts/2  Sl+  08:08   0:03 /usr/libexec/qemu-kvm -global virtio-blk-pci.scsi=off -no-user-config -nodefaults -display none -machine accel=kvm:tcg -cpu max -smp 4 -m 2560 -no-reboot -rtc driftfix=slew -no-hpet -global kvm-pit.lost_tick_policy=discard -kernel /var/tmp/.guestfs-1003/appliance.d/kernel -initrd /var/tmp/.guestfs-1003/appliance.d/initrd -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -device virtio-scsi-pci,id=scsi -drive file=nbd:unix:/tmp/v2v.weL747/in0,cache=unsafe,format=raw,discard=unmap,id=hd0,if=none -device scsi-hd,drive=hd0 -drive file=/var/tmp/.guestfs-1003/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none -device scsi-hd,drive=appliance -device virtio-serial-pci -serial stdio -chardev socket,path=/tmp/libguestfszsD2Qw/guestfsd.sock,id=channel0 -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 -netdev user,id=usernet,net=169.254.0.0/16 -device virtio-net-pci,netdev=usernet -append panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=d6c759c4-e57e-4dea-a5bc-66e4cd1e1c16 selinux=0 quiet guestfs_network=1 TERM=xterm-256color guestfs_identifier=v2v
v2v_auto   50475  0.0  0.0  51956  8580 pts/2    S+   08:08   0:00 /bin/virt-v2v -i libvirt -ic vpx://root.x.x/data/x.x.x.x/?no_verify=1 -o null --mac 00:50:56:ac:59:8d:network:default esx6.7-win2019-x86_64 -it vddk -io vddk-libdir=/home/v2v_auto/vddk_libdir/latest -io vddk-thumbprint=xxx -on esx6.7-win2019-x86_64f8Jp -ip /tmp/v2v_vpx_passwd


Actual results:
The v2v command hangs forever.

Expected results:


Additional info:
1. if both 'HOME=/root' and '-v -x' exist, the v2v command can return successfully.
2. if don't set HOME=/root', the v2v command can return successfully.

Comment 1 Richard W.M. Jones 2022-02-07 14:11:40 UTC
What version of libosinfo is installed?
I'm now trying to reproduce the issue locally.

Comment 2 Xiaodai Wang 2022-02-07 14:12:19 UTC
(In reply to Richard W.M. Jones from comment #1)
> What version of libosinfo is installed?
> I'm now trying to reproduce the issue locally.

libosinfo-1.9.0-4.el9.x86_64

Comment 3 Richard W.M. Jones 2022-02-07 14:47:43 UTC
I can reproduce this with libosinfo-1.9.0-4.el9.x86_64.

Somewhat unrelated, but annoyingly, as well as failing it causes some
sort of deadlock with libguestfs on exit:

Thread 1 (Thread 0x7f7ba88aaac0 (LWP 3752309) "virt-v2v"):
#0  0x00007f7ba9effe9f in poll () at /lib64/libc.so.6
#1  0x00007f7baa5135d3 in read_data () at /lib64/libguestfs.so.0
#2  0x00007f7baa52e27f in guestfs_int_recv_from_daemon () at /lib64/libguestfs.so.0
#3  0x00007f7baa52e73a in guestfs_int_recv () at /lib64/libguestfs.so.0
#4  0x00007f7baa4fd22b in guestfs_internal_autosync () at /lib64/libguestfs.so.0
#5  0x00007f7baa522dac in shutdown_backend.lto_priv () at /lib64/libguestfs.so.0
#6  0x00007f7baa523042 in guestfs_close () at /lib64/libguestfs.so.0
#7  0x00007f7baa52318d in close_handles () at /lib64/libguestfs.so.0
#8  0x00007f7ba9e14475 in __run_exit_handlers () at /lib64/libc.so.6
#9  0x00007f7ba9e145f0 in on_exit () at /lib64/libc.so.6
#10 0x0000560ffcf41def in caml_sys_exit ()
#11 0x0000560ffce98bf3 in camlStdlib__exit_459 ()
#12 0x0000560ffcde2824 in camlV2v__entry () at v2v.ml:660
#13 0x0000560ffcdd0a29 in caml_program ()
#14 0x0000560ffcf4ff34 in caml_start_program ()
#15 0x0000560ffcf50347 in caml_startup_common ()
#16 0x0000560ffcf5038d in caml_startup ()
#17 0x0000560ffcdd0090 in main ()

(This only happens without debugging, when debugging is enabled it
still fails but at least doesn't deadlock).

Comment 4 Richard W.M. Jones 2022-02-07 14:56:09 UTC
Downgrading to libosinfo-1.9.0-3.el9.x86_64 gives the old error:

osinfo_loader_process_default_path: Unexpected file type

Comment 5 Victor Toso 2022-02-07 15:04:07 UTC
Running with virt-v2v without root but with HOME=/root should trigger

> osinfo_loader_process_default_path: Can't read path /root/.config/osinfo

by libosinfo POV. Seems like v2v_osinfo_db_load() is not handling the error properly?


> Downgrading to libosinfo-1.9.0-3.el9.x86_64 gives the old error:
> osinfo_loader_process_default_path: Unexpected file type

Yes, handling the error message was the intention of Bug 1942431 - https://gitlab.com/libosinfo/libosinfo/-/merge_requests/129/diffs?commit_id=d3b1587f7b77b630bae8ab3f4760eda69bd7fe66

Comment 6 Laszlo Ersek 2022-02-08 09:54:55 UTC
The "expected results" entry in comment 0 is not filled in. What is the
expected behavior?

Per bug 1942431, libosinfo is behaving correctly; it reports the right
error message.

By my understanding, v2v_osinfo_db_load() is also correct; it transforms
the libosinfo error into a Failure exception; caml_failwith_value() does
not return, and the argument passed to it ("errv") indeed stands for a
string (created with caml_copy_string()).

v2v_osinfo_db_load() is called "osinfo_db_load" in
"convert/libosinfo.ml"; it is used in a let expression of the class
osinfo_db:

> class osinfo_db () =
>   let h = osinfo_db_load () in
>   object (self)
>     method find_os_by_short_id name =
>       let os = osinfo_db_find_os_by_short_id h name in
>       new osinfo_os os
> end

Per my OCaml book, "Classes can be defined with leading let-definitions,
which are evaluated before a new object is created". This is how an
"osinfo_db" object is created [convert/libosinfo_utils.ml]:

> (* Singleton DB, created on the first access. *)
> let db = lazy (new Libosinfo.osinfo_db ())
> (*
>  * Helper function to get the DB -- use it as sole way to get the DB.
>  *)
> let get_db () =
>   Lazy.force db
> 
> let get_os_by_short_id os =
>   let os = (get_db ())#find_os_by_short_id os in
>   debug "libosinfo: loaded OS: %s" (os#get_id ());
>   os

So the exception is raised when "get_os_by_short_id" is first called, an
the object creation is forced.

Per <https://ocaml.org/api/Stdlib.Lazy.html>, Lazy handles exceptions
transparently: "force x forces the suspension x and returns its result.
If x has already been forced, Lazy.force x returns the same value again
without recomputing it. If it raised an exception, the same exception is
raised again."

Now, wherever "get_os_by_short_id" is called (such as in
"convert/convert_windows.ml" and "convert/windows_virtio.ml"), the
Not_found exception is caught and handled cleanly. (By falling back to
other means for determining guest capabilities and locating virtio-win
drivers etc.) A *Failure exception* escapes however.

So I think that's the primary bug here. The Not_found exception is
thrown explicitly by v2v_osinfo_os_find_os_by_short_id()
[convert/libosinfo-c.c]; however, if loading the osinfo database fails
*first*, the ocaml code in virt-v2v should arguably treat that
identically -- "osinfo is not usable now, for whatever reason". So the
question is *where* we should catch the Failure exception:

(a) in the get_db() helper function [convert/libosinfo_utils.ml],
mapping Failure to Not_found (after logging the error), or

(b) in every spot where "Libosinfo_utils.get_os_by_short_id" is called.

I think (a) looks better; something like:

> diff --git a/convert/libosinfo_utils.ml b/convert/libosinfo_utils.ml
> index 8504e2b2f001..8e2489f1f34d 100644
> --- a/convert/libosinfo_utils.ml
> +++ b/convert/libosinfo_utils.ml
> @@ -26,7 +26,10 @@ let db = lazy (new Libosinfo.osinfo_db ())
>   * Helper function to get the DB -- use it as sole way to get the DB.
>   *)
>  let get_db () =
> -  Lazy.force db
> +  try Lazy.force db
> +  with Failure errmsg ->
> +    warning "%s" errmsg;
> +    raise Not_found
>  
>  let get_os_by_short_id os =
>    let os = (get_db ())#find_os_by_short_id os in

Thoughts? Thanks.

Comment 7 Richard W.M. Jones 2022-02-08 10:21:50 UTC
The background to this is that the virt-v2v wrapper used by IMS has or had a bug
where it started as root (because it runs in a container) and then setuid to a non-root
account before running virt-v2v.  However it didn't reset $HOME.  So when virt-v2v
started it was running as non-root, but with HOME=/root.  This caused a failure in
libosinfo (bug 1942431).

So in one sense it's a bug in whatever ran virt-v2v with an incorrect $HOME variable.

However I also think that we should deal with this situation by not failing.
We're reading some local configuration file that has about a 0.0001% chance of
existing on any given machine.  If we cannot open the path, it doesn't matter
and we should try to continue.

The rest of the analysis around lazy values is correct, but I'm not sure about
the proposed fix.  Aside from the "leak" of Not_found (I understand it's not
really going to leak it, but raising Not_found always makes me uneasy), won't
this mean that we entirely skip opening the libosinfo database?  All that's
missing here is some irrelevant, unlikely-to-exist config file, why can't
we skip the check for the local config file and go on to try opening the libosinfo
database?

Comment 8 Laszlo Ersek 2022-02-09 09:56:10 UTC
(In reply to Richard W.M. Jones from comment #7)

> However I also think that we should deal with this situation by not
> failing. We're reading some local configuration file that has about a
> 0.0001% chance of existing on any given machine.  If we cannot open
> the path, it doesn't matter and we should try to continue.

Indeed, in my own testing (which has been quite osinfo-centric lately),
$HOME/.config/osinfo also does not exist, and that causes no issues at
all.

> The rest of the analysis around lazy values is correct, but I'm not
> sure about the proposed fix.  Aside from the "leak" of Not_found (I
> understand it's not really going to leak it, but raising Not_found
> always makes me uneasy), won't this mean that we entirely skip opening
> the libosinfo database?

Yes, it does mean that.

> All that's missing here is some irrelevant, unlikely-to-exist config
> file, why can't we skip the check for the local config file and go on
> to try opening the libosinfo database?

Well the reason we cannot go on an try opening the system-level osinfo
database, after the "$HOME/.config/osinfo" access fails, is that the
traversal of *all* of these paths is buried in the
osinfo_loader_process_default_path() API function. With libosinfo @
8ce1bee28ba6, the function looks like this:

> void osinfo_loader_process_default_path(OsinfoLoader *loader, GError **err)
> {
>     GFile *dirs[] = {
>         osinfo_loader_get_pci_path(),
>         osinfo_loader_get_usb_path(),
>         osinfo_loader_get_system_path(),
>         osinfo_loader_get_local_path(),
>         osinfo_loader_get_user_path(),
>         NULL,
>     };
>
>     osinfo_loader_process_list(loader, dirs, TRUE, err);
>     g_object_unref(dirs[0]);
>     g_object_unref(dirs[1]);
>     g_object_unref(dirs[2]);
>     g_object_unref(dirs[3]);
>     g_object_unref(dirs[4]);
> }

In turn, osinfo_loader_process_list() loops over "dirs" and calls
osinfo_loader_find_files() for each.

Victor's commit d3b1587f7b77 ("loader: add check for unknown file type",
2021-11-29; mentioned in comment 5 above) modifies the behavior of
osinfo_loader_find_files(). However, what that commit does is, it
creates a better error message. It does not change the fact that
osinfo_loader_find_files() fails, nor the fact that this failure aborts
-- or overrides the thus-far success of -- osinfo_loader_process_list().

In other words, if we wanted libosinfo to "just load what it can, and
ignore errors", that logic would belong in
osinfo_loader_process_default_path().

There is an alternative function (which we could use instead of
osinfo_loader_process_default_path()):
osinfo_loader_process_system_path(). This is almost identical to
osinfo_loader_process_default_path(), but (a) it drops the last two
elements of "dirs" (the "local" and the "user" paths), (b) the
"skipMissing" parameter passed to osinfo_loader_process_list() is FALSE,
not TRUE.

... Ultimately, shouldn't calling osinfo_loader_process_list() with
skipMissing=TRUE, like osinfo_loader_process_default_path() *already
does*, prevent precisely this error?...

Ah I know what's up. In osinfo_loader_find_files(), "skipMissing" is
only consulted if g_file_query_info() returns G_IO_ERROR_NOT_FOUND (and
then skipMissing does the right thing, and masks the error). However,
"skipMissing" does not make a difference for other kinds of errors.

>     info = g_file_query_info(file, "standard::*", G_FILE_QUERY_INFO_NONE, NULL, &error);
>     if (error) {
>         if (error->code == G_IO_ERROR_NOT_FOUND && skipMissing) {
>             g_error_free(error);
>             return;
>         }
>         g_propagate_error(err, error);
>         return;
>     }
>     type = g_file_info_get_attribute_uint32(info,
>                                             G_FILE_ATTRIBUTE_STANDARD_TYPE);

In our case, g_file_query_info() actually succeeds (!):

  https://docs.gtk.org/gio/method.File.query_info.html

and then we proceed to g_file_info_get_attribute_uint32():

  https://docs.gtk.org/gio/method.FileInfo.get_attribute_uint32.html
  https://docs.gtk.org/gio/const.FILE_ATTRIBUTE_STANDARD_TYPE.html
  https://docs.gtk.org/gio/enum.FileType.html

and that's the one returning G_FILE_TYPE_UNKNOWN, where Victor's
above-mentioned patch takes effect (and "skipMissing" is again ignored).

So, we could even argue that this is a GIO library bug:
g_file_query_info() succeeds on an inaccessible file -- not in the case
when the leading pathname components (directories) *can* be searched and
the last pathname component (the filename) is missing, but in the case
when the *directory search* fails.

Hmmm... I've found the following code in glib @ 2ea9f4b0d31c, file
"gio/gfile.c":

> /**
>  * g_file_query_file_type:
>  * @file: input #GFile
>  * @flags: a set of #GFileQueryInfoFlags passed to g_file_query_info()
>  * @cancellable: (nullable): optional #GCancellable object,
>  *   %NULL to ignore
>  *
>  * Utility function to inspect the #GFileType of a file. This is
>  * implemented using g_file_query_info() and as such does blocking I/O.
>  *
>  * The primary use case of this method is to check if a file is
>  * a regular file, directory, or symlink.
>  *
>  * Returns: The #GFileType of the file and #G_FILE_TYPE_UNKNOWN
>  *   if the file does not exist
>  *
>  * Since: 2.18
>  */
> GFileType
> g_file_query_file_type (GFile               *file,
>                         GFileQueryInfoFlags  flags,
>                         GCancellable        *cancellable)
> {
>   GFileInfo *info;
>   GFileType file_type;
>
>   g_return_val_if_fail (G_IS_FILE(file), G_FILE_TYPE_UNKNOWN);
>   info = g_file_query_info (file, G_FILE_ATTRIBUTE_STANDARD_TYPE, flags,
>                             cancellable, NULL);
>   if (info != NULL)
>     {
>       file_type = g_file_info_get_file_type (info);
>       g_object_unref (info);
>     }
>   else
>     file_type = G_FILE_TYPE_UNKNOWN;
>
>   return file_type;
> }

Note that this is similar to what osinfo_loader_find_files() does, but
not exactly the same: it calls g_file_query_info(), but for the error
check, it does not use the output "error" parameter. Instead, it
compares the return value against NULL.

So here's an idea: what if the g_file_query_info() call in
osinfo_loader_find_files() returns NULL, but fails to set "error"? In
that case, we proceed to g_file_info_get_attribute_uint32(), and maybe
that function works (?) on a NULL pointer, just returns
G_FILE_TYPE_UNKNOWN as the file type.

Indeed the GIO source code seems consistent with this! The
g_file_info_get_attribute_uint32() function starts like this
[gio/gfileinfo.c]:

> /**
>  * g_file_info_get_attribute_uint32:
>  * @info: a #GFileInfo.
>  * @attribute: a file attribute key.
>  *
>  * Gets an unsigned 32-bit integer contained within the attribute. If the
>  * attribute does not contain an unsigned 32-bit integer, or is invalid,
>  * 0 will be returned.
>  *
>  * Returns: an unsigned 32-bit integer from the attribute.
>  **/
> guint32
> g_file_info_get_attribute_uint32 (GFileInfo  *info,
> 				  const char *attribute)
> {
>   GFileAttributeValue *value;
>
>   g_return_val_if_fail (G_IS_FILE_INFO (info), 0);

Note first that the retval 0 stands for G_FILE_TYPE_UNKNOWN -- see
"GFileType" in "gio/gioenums.h":

> typedef enum {
>   G_FILE_TYPE_UNKNOWN = 0,
>   G_FILE_TYPE_REGULAR,
>   G_FILE_TYPE_DIRECTORY,
>   G_FILE_TYPE_SYMBOLIC_LINK,
>   G_FILE_TYPE_SPECIAL, /* socket, fifo, blockdev, chardev */
>   G_FILE_TYPE_SHORTCUT,
>   G_FILE_TYPE_MOUNTABLE
> } GFileType;

Note second the G_IS_FILE_INFO macro [gio/gfileinfo.h]:

> #define G_IS_FILE_INFO(o)        (G_TYPE_CHECK_INSTANCE_TYPE ((o), G_TYPE_FILE_INFO))

Then in "gobject/gtype.h":

> /**
>  * G_TYPE_CHECK_INSTANCE_TYPE:
>  * @instance: (nullable): Location of a #GTypeInstance structure.
>  * @g_type: The type to be checked
>  *
>  * Checks if @instance is an instance of the type identified by @g_type. If
>  * @instance is %NULL, %FALSE will be returned.
>  *
>  * This macro should only be used in type implementations.
>  *
>  * Returns: %TRUE on success
>  */
> #define G_TYPE_CHECK_INSTANCE_TYPE(instance, g_type)            (_G_TYPE_CIT ((instance), (g_type)))

So, that's most likely what's happening in osinfo_loader_find_files():

- g_file_query_info() returns NULL, and does not overwrite "error"

- then we proceed to g_file_info_get_attribute_uint32(), which does not
  crash on NULL, but returns G_FILE_TYPE_UNKNOWN.

Comment 9 Laszlo Ersek 2022-02-09 10:18:01 UTC
I've looked into the reproducer from <https://bugzilla.redhat.com/show_bug.cgi?id=1942431#c0> with gdb:

HOME=/root osinfo-query os

and I confirm that g_file_query_info() does not fail *at all*; it neither returns NULL nor sets "error" to non-NULL.

Comment 10 Victor Toso 2022-02-09 10:20:32 UTC
> Victor's commit d3b1587f7b77 ("loader: add check for unknown file type",
> 2021-11-29; mentioned in comment 5 above) modifies the behavior of
> osinfo_loader_find_files(). However, what that commit does is, it
> creates a better error message. It does not change the fact that
> osinfo_loader_find_files() fails, nor the fact that this failure aborts
> -- or overrides the thus-far success of -- osinfo_loader_process_list().
> 
> In other words, if we wanted libosinfo to "just load what it can, and
> ignore errors", that logic would belong in
> osinfo_loader_process_default_path().

IMHO, the main issue for libosinfo / glib is that we are asking access to
a path where the user does not have access to read from but we expect that
it should be able to. $HOME can be considered an API parameter to libosinfo.

Note that in the original https://bugzilla.redhat.com/show_bug.cgi?id=1901423#c18
virt-conversion-host was leaking $USER and $HOME so it is nice to have a failure
somewhere to find the real problem.
 
> and that's the one returning G_FILE_TYPE_UNKNOWN, where Victor's
> above-mentioned patch takes effect (and "skipMissing" is again ignored).
> 
> So, we could even argue that this is a GIO library bug:

I agree that more can be done in glib/gio level and I actually opened
https://gitlab.gnome.org/GNOME/glib/-/issues/2543 for better handling
this kind of scenario but I still think that libosinfo / glib should error
in this scenario.

Comment 11 Laszlo Ersek 2022-02-09 11:05:19 UTC
This is *absolutely* a glib bug.

Here's the call tree that I've traced, with gdb, in the reproducer from
comment#9.

First, osinfo_loader_get_user_path() reaches the following assignment:

>         file = g_file_new_for_path(dbdir);

This creates a GFile for pathname "/root/.config/osinfo".

Then we have:

osinfo_loader_process_default_path() [libosinfo/osinfo/osinfo_loader.c]
  osinfo_loader_process_list()       [libosinfo/osinfo/osinfo_loader.c]
    osinfo_loader_find_files()       [libosinfo/osinfo/osinfo_loader.c]
      g_file_query_info()            [glib/gio/gfile.c]
        g_local_file_query_info()    [glib/gio/glocalfile.c]
          _g_local_file_info_get()   [glib/gio/glocalfileinfo.c]
            g_local_file_lstat()     [glib/gio/glocalfileinfo.h]

Let me quote the relevant part from _g_local_file_info_get():

>   res = g_local_file_lstat (path,
>                             G_LOCAL_FILE_STAT_FIELD_BASIC_STATS | G_LOCAL_FILE_STAT_FIELD_BTIME,
>                             G_LOCAL_FILE_STAT_FIELD_ALL & (~G_LOCAL_FILE_STAT_FIELD_BTIME) & (~G_LOCAL_FILE_STAT_FIELD_ATIME),
>                             &statbuf);
>
>   if (res == -1)
>     {
>       int errsv = errno;
>
>       /* Don't bail out if we get Permission denied (SELinux?) */
>       if (errsv != EACCES)
>         {
>           char *display_name = g_filename_display_name (path);
>           g_object_unref (info);
>           g_set_error (error, G_IO_ERROR,
> 		       g_io_error_from_errno (errsv),
> 		       _("Error when getting information for file “%s”: %s"),
> 		       display_name, g_strerror (errsv));
>           g_free (display_name);
>           return NULL;
>         }
>     }
>
>   /* Even if stat() fails, try to get as much as other attributes possible */
>   stat_ok = res != -1;

The g_local_file_lstat() call totally fails, and sets errno to EACCES
(value 13, error string "Permission denied"), entirely as expected, for
"/root/.config/osinfo", given that the "/root" directory does not grant
"world search" (0001) rights. This has nothing at all to do with
SELinux.

And then _g_local_file_info_get() dutifully walks through the rest of
operations, and creates a bunch of file information that's not based in
reality, such as "application/octet-stream" for the "content type" of
the file (which could not even be lstated).

At the end of the function, a seemingly valid GFileInfo object is
returned, and "error" is never set.

This glib bug (regression) was introduced in historical commit
71e7b5800a31 ("Handle MLS selinux policy better", 2010-07-08), for the
following ticket:

  https://bugzilla.gnome.org/show_bug.cgi?id=623692

Note the comment there:

> When we get the EACCESS error on stat(), we still try to collect other
> attributes. No error is propagated to clients

In fact, the regression was reported in the GNOME bugzilla:

  "g_file_query_exists returns TRUE for files in inaccessible dirs"
  https://bugzilla.gnome.org/show_bug.cgi?id=777187

and there is still a "live" ticket for the regression:

  "g_file_query_exists returns TRUE for files in inaccessible dirs"
  https://gitlab.gnome.org/GNOME/glib/-/issues/1237

Given that the regression is more than a decade old, I don't think we
can fix it in glib. It needs to be worked around in libosinfo.

Here's my suggestion: in the branch added by Victor's libosinfo commit
d3b1587f7b77 ("loader: add check for unknown file type", 2021-11-29),
rework a the logic:

- add a comment pointing to the above glib issue (1237 on gitlab),

- don't propagate, but suppress, the error, if "skipMissing" is set.

In my opinion, it is unrealistic to get a G_FILE_TYPE_UNKNOWN type
unless the underlying stat() or lstat() fails, and if "skipMissing" is
set, then "failure to stat" should be suppressed.

Comment 12 Laszlo Ersek 2022-02-09 11:10:46 UTC
Huh, just seeing this:

(In reply to Victor Toso from comment #10)
> > Victor's commit d3b1587f7b77 ("loader: add check for unknown file
> > type", 2021-11-29; mentioned in comment 5 above) modifies the
> > behavior of osinfo_loader_find_files(). However, what that commit
> > does is, it creates a better error message. It does not change the
> > fact that osinfo_loader_find_files() fails, nor the fact that this
> > failure aborts -- or overrides the thus-far success of --
> > osinfo_loader_process_list().
> >
> > In other words, if we wanted libosinfo to "just load what it can,
> > and ignore errors", that logic would belong in
> > osinfo_loader_process_default_path().
>
> IMHO, the main issue for libosinfo / glib is that we are asking access
> to a path where the user does not have access to read from but we
> expect that it should be able to. $HOME can be considered an API
> parameter to libosinfo.

In that case, we don't have anything to do with this bug -- the Failure
exception is valid then. (Well, fixing the libguestfs shutdown hang
would be nice.)

> Note that in the original
> https://bugzilla.redhat.com/show_bug.cgi?id=1901423#c18
> virt-conversion-host was leaking $USER and $HOME so it is nice to have
> a failure somewhere to find the real problem.
>
> > and that's the one returning G_FILE_TYPE_UNKNOWN, where Victor's
> > above-mentioned patch takes effect (and "skipMissing" is again
> > ignored).
> >
> > So, we could even argue that this is a GIO library bug:
>
> I agree that more can be done in glib/gio level and I actually opened
> https://gitlab.gnome.org/GNOME/glib/-/issues/2543 for better handling
> this kind of scenario

Thanks, but (at this point) I'm thinking that ticket#2543 should just be
closed as a duplicate of ticket#1237.

> but I still think that libosinfo / glib should error in this scenario.

I agree that g_file_query_info() should definitely fail in this scenario
(it should return NULL and set "error").

On the other hand, I believe that, in osinfo_loader_find_files(), the
"skipMissing" branch should cover this case, even if the error code from
glib is not specifically G_IO_ERROR_NOT_FOUND, but (perhaps, as you
suggest) G_IO_ERROR_PERMISSION_DENIED.

Comment 13 Richard W.M. Jones 2022-02-09 12:37:57 UTC
(In reply to Victor Toso from comment #10)
> I agree that more can be done in glib/gio level and I actually opened
> https://gitlab.gnome.org/GNOME/glib/-/issues/2543 for better handling
> this kind of scenario but I still think that libosinfo / glib should error
> in this scenario.

An alternative then could be to have virt-v2v error out earlier on
with a clearer error message?  Current error message happens 10+ seconds
in and isn't actionable (ie. "your $HOME variable is wrong, fix it!")

Comment 17 Victor Toso 2022-02-10 20:01:58 UTC
Moving to POST, fix is being discussed upstream - comment #14

Setting all the metadata. I'll backport as soon as it gets merged upstream.

Many thanks again for the fix Laszlo!

Comment 18 Laszlo Ersek 2022-02-11 12:24:40 UTC
(In reply to Laszlo Ersek from comment #14)
> https://gitlab.com/libosinfo/libosinfo/-/merge_requests/130

Upstream commit e5bdc6759195.

Comment 23 Xiaodai Wang 2022-02-15 07:40:13 UTC
Run the script with libosinfo-1.9.0-5.el9.x86_64, virt-v2v command can run successfully now.
And a WARNING is given as shown below. so this bug can be VERIFIED.


[stdlog] 2022-02-15 02:13:42,748 avocado.utils.process DEBUG| [stdout] [   0.0] Setting up the source: -i libvirt -ic vpx://root.x.x/data/x.x.x.x/?no_verify=1 -it vddk esx6.7-win2019-x86_64
[stdlog] 2022-02-15 02:13:44,479 avocado.utils.process DEBUG| [stdout] [   1.7] Opening the source
[stdlog] 2022-02-15 02:13:51,074 avocado.utils.process DEBUG| [stdout] [   8.3] Inspecting the source
[stdlog] 2022-02-15 02:13:58,518 avocado.utils.process DEBUG| [stdout] [  15.8] Checking for sufficient free disk space in the guest
[stdlog] 2022-02-15 02:13:58,518 avocado.utils.process DEBUG| [stdout] [  15.8] Converting Windows Server 2019 Standard to run on KVM
[stdlog] 2022-02-15 02:14:01,103 avocado.utils.process DEBUG| [stderr]
[stdlog] 2022-02-15 02:14:01,103 avocado.utils.process DEBUG| [stderr] ** (process:357666): WARNING **: 02:14:01.103: Can't read path /root/.config/osinfo
[stdlog] 2022-02-15 02:14:08,146 avocado.utils.process DEBUG| [stdout] virt-v2v: This guest has virtio drivers installed.
[stdlog] 2022-02-15 02:14:08,293 avocado.utils.process DEBUG| [stdout] [  25.6] Mapping filesystem data to avoid copying unused and blank areas
[stdlog] 2022-02-15 02:14:09,145 avocado.utils.process DEBUG| [stdout] [  26.4] Closing the overlay
[stdlog] 2022-02-15 02:14:09,188 avocado.utils.process DEBUG| [stdout] [  26.5] Assigning disks to buses
[stdlog] 2022-02-15 02:14:09,188 avocado.utils.process DEBUG| [stdout] [  26.5] Checking if the guest needs BIOS or UEFI to boot
[stdlog] 2022-02-15 02:14:09,188 avocado.utils.process DEBUG| [stdout] [  26.5] Setting up the destination: -o json -os /home/v2v_auto
[stdlog] 2022-02-15 02:14:10,601 avocado.utils.process DEBUG| [stdout] [  27.9] Copying disk 1/1
[stdlog] 2022-02-15 02:19:51,984 avocado.utils.process DEBUG| [stdout] [ 369.2] Creating output metadata
[stdlog] 2022-02-15 02:19:51,984 avocado.utils.process DEBUG| [stdout] [ 369.2] Finishing off

Comment 25 errata-xmlrpc 2022-05-17 13:19:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: libosinfo), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2490


Note You need to log in before you can comment on or make changes to this bug.