Bug 1804207 - Libguestfs relies on /dev/sdX device enumeration order, kernel no longer enumerates them in order
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libguestfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Richard W.M. Jones
QA Contact:
URL:
Whiteboard:
Duplicates: 1803191
Depends On:
Blocks: PYTHON39
Reported: 2020-02-18 12:57 UTC by Richard W.M. Jones
Modified: 2020-03-12 10:14 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-12 10:14:17 UTC
Embargoed:



Description Richard W.M. Jones 2020-02-18 12:57:38 UTC
Description of problem:

In hindsight, libguestfs has an unfortunate ABI: it relies on
/dev/sdX devices being enumerated in the same order as the devices
appear in the libvirt XML.  For example, the first device appears as
/dev/sda, the second as /dev/sdb, and so on.
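
To make the assumed mapping concrete, here is a minimal sketch of the
index-to-name convention described above.  The helper name drive_name
is hypothetical and is not the libguestfs API; the sketch only shows
the conventional sda..sdz, sdaa.. naming sequence.

```python
import string


def drive_name(index: int, prefix: str = "sd") -> str:
    """Return the conventional name of the index-th disk (0-based).

    Hypothetical helper for illustration only:
    0 -> sda, 25 -> sdz, 26 -> sdaa, 27 -> sdab, ...
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = string.ascii_lowercase
    suffix = ""
    n = index
    while True:
        # Bijective base-26: there is no "zero" letter, so shift by one
        # each time we move to the next (more significant) position.
        suffix = letters[n % 26] + suffix
        n = n // 26 - 1
        if n < 0:
            break
    return prefix + suffix


if __name__ == "__main__":
    # The ordering assumption: the Nth disk in the XML is expected here.
    for i in (0, 1, 25, 26, 27):
        print(i, "/dev/" + drive_name(i))
```

The bug is that nothing guarantees the guest kernel hands out names in
this sequence any more.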

This was true until fairly recently.  What changed (in Linux) is
that the kernel now probes devices asynchronously:

- [drivers] driver core: Probe devices asynchronously instead of the driver
(Jeff Moyer) [1724965]
- [drivers] device core: Consolidate locking and unlocking of parent and device 
(Jeff Moyer) [1724965]
- [drivers] driver core: Establish order of operations for device_add and
device_del via bitflag (Jeff Moyer)
- [drivers] driver core: Add missing dev->bus->need_parent_lock checks (Jeff
Moyer) [1724965]
- [drivers] driver core: Move async_synchronize_full call (Jeff Moyer) [1724965]

This means that devices are no longer created in order, so the
/dev/sdX name of a given disk can change from boot to boot.
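
As a toy model of the effect (not a description of how the kernel
actually allocates names), suppose names are handed out in the order
in which probes happen to complete.  The function name and the model
itself are assumptions made for illustration.

```python
def names_by_completion(completion_order):
    """Map each disk to the name it would get in this toy model.

    completion_order lists disks (identified by their 0-based position
    in the libvirt XML) in the order their probes finish.  Names are
    handed out first come, first served.  Hypothetical model only.
    """
    letters = "abcdefghijklmnopqrstuvwxyz"
    if len(completion_order) > len(letters):
        raise ValueError("toy model only covers sda..sdz")
    return {
        disk: "sd" + letters[slot]
        for slot, disk in enumerate(completion_order)
    }


if __name__ == "__main__":
    # Probes finish in XML order: disk 0 is sda, as libguestfs expects.
    print(names_by_completion([0, 1, 2]))
    # Disk 1 happens to finish first: disk 0 is now sdb.
    print(names_by_completion([1, 0, 2]))
```

In the second call the first disk in the XML ends up as sdb, which is
exactly the situation the ordering assumption cannot handle.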

This also affects supermin when it's looking for the root
device (see bug 1803191 for an example).  I don't know yet if
we should file a separate bug for supermin.

Version-Release number of selected component (if applicable):

libguestfs 1.41.8

How reproducible:

Quite infrequent, but easier to reproduce if you add a large
number of disks (e.g. > 100).

Steps to Reproduce:

We actually have a regression test that picks this up, see:

https://github.com/libguestfs/libguestfs/blob/56834875b25a604983b1aa90b15a01e6cc22c9bc/tests/disks/test-add-disks.c#L311

(Thanks Vitaly Kuznetsov for bug analysis)

Comment 1 Richard W.M. Jones 2020-02-18 12:59:47 UTC
*** Bug 1803191 has been marked as a duplicate of this bug. ***

Comment 2 Richard W.M. Jones 2020-02-20 14:52:06 UTC
First part of the fix is:
https://www.redhat.com/archives/libguestfs/2020-February/msg00220.html

This isn't quite the whole story.  It does appear that we will need to
either modify supermin or else modify libguestfs to supply the
root=UUID=XXX parameter to supermin.  See also this commit in supermin:
https://github.com/libguestfs/supermin/commit/cd5281beed0af7b57473e36f6fa275eaecde4f09
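
A rough sketch of the second option, assuming the appliance kernel
command line is assembled as a string and that root=UUID=... is
accepted as described in the comment above.  The helper name and the
UUID shown are hypothetical; this is not the libguestfs code.

```python
def root_parameter(uuid=None, device=None):
    """Build the root= kernel parameter.

    Prefers a filesystem UUID, which does not depend on enumeration
    order, and falls back to a device name only if no UUID is given.
    Hypothetical helper for illustration only.
    """
    if uuid:
        return "root=UUID=" + uuid
    if device:
        return "root=" + device
    raise ValueError("need either a UUID or a device name")


if __name__ == "__main__":
    # Made-up UUID, for illustration.
    example_uuid = "01234567-89ab-cdef-0123-456789abcdef"
    print(root_parameter(uuid=example_uuid))
    # The fragile form that depends on /dev/sdX ordering:
    print(root_parameter(device="/dev/sdb"))
```

Identifying the root filesystem by UUID sidesteps the question of
which /dev/sdX name the appliance disk receives on a given boot.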

