Bug 2141860 - Race condition causes kpartx to create a dm device which uses itself as part of the target, creating an infinite recursion
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: device-mapper-multipath
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Lin Li
URL:
Whiteboard:
Depends On: 2128885
Blocks:
Reported: 2022-11-10 22:08 UTC by Ben Marzinski
Modified: 2023-05-09 13:50 UTC (History)
CC List: 10 users

Fixed In Version: device-mapper-multipath-0.8.7-15.el9
Doc Type: Bug Fix
Doc Text:
Cause: kpartx does not hold a disk device open while it creates partition devices from it, so the disk device can be removed while the partition devices are being created. When this happens, the device that the partition devices reference may end up being a different device, including the partition device itself.
Consequence: If the partition device ends up pointing to itself, the kernel can enter an infinite loop trying to set the device up.
Fix: kpartx now holds the disk device open while creating the partition devices, so that it cannot be removed.
Result: Partition devices end up pointing to the correct disk device, and the kernel no longer locks up in an infinite loop while trying to create the partition device.
Clone Of: 2128885
Environment:
Last Closed: 2023-05-09 08:14:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHELPLAN-139095  0        None      None    None     2022-11-10 22:11:22 UTC
Red Hat Product Errata  RHSA-2023:2459   0        None      None    None     2023-05-09 08:14:30 UTC

Description Ben Marzinski 2022-11-10 22:08:15 UTC
+++ This bug was initially created as a clone of Bug #2128885 +++

A customer's system crashed while trying to load a new dm table for a dm device. While processing the table's parameters, the kernel entered an infinite recursion that overflowed the kernel stack of the kpartx process:

PID: 513001   TASK: ffff8c43910ddc40  CPU: 26   COMMAND: "kpartx"
 #0 [fffffe0000464dc0] machine_kexec at ffffffff92059c8e
 #1 [fffffe0000464e18] __crash_kexec at ffffffff9215a27d
 #2 [fffffe0000464ee0] crash_kexec at ffffffff9215b15d
 #3 [fffffe0000464ef8] oops_end at ffffffff92021edd
 #4 [fffffe0000464f18] handle_stack_overflow at ffffffff9201f384
 #5 [fffffe0000464f30] do_double_fault.cold.15 at ffffffff9201f39e
 #6 [fffffe0000464f50] double_fault at ffffffff92a00dce
    [exception RIP: dm_dax_get_live_target+14]
    RIP: ffffffffc08f644e  RSP: ffffa6fc68e1c000  RFLAGS: 00010202
    RAX: ffff8c14d9760000  RBX: ffff8c14d9760000  RCX: ffffa6fc68e1fad8
    RDX: ffffa6fc68e1c01c  RSI: 0000000001810800  RDI: ffff8c14d9760000
    RBP: ffffa6fc68e1c01c   R8: ffffa6fc68e1fae8   R9: 00000000001fdefb
    R10: ffffffffc08fd380  R11: ffff8c3039115058  R12: 0000000001810800
    R13: ffffa6fc68e1fae8  R14: 0000000000000001  R15: 0000000001810800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <DOUBLEFAULT exception stack> ---
 #7 [ffffa6fc68e1c000] dm_dax_get_live_target at ffffffffc08f644e [dm_mod]
 #8 [ffffa6fc68e1c010] dm_dax_direct_access at ffffffffc08f64cf [dm_mod]
 #9 [ffffa6fc68e1c058] dax_direct_access at ffffffff925a8d9d
#10 [ffffa6fc68e1c068] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#11 [ffffa6fc68e1c098] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#12 [ffffa6fc68e1c0e0] dax_direct_access at ffffffff925a8d9d
...
#325 [ffffa6fc68e1f830] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#326 [ffffa6fc68e1f860] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#327 [ffffa6fc68e1f8a8] dax_direct_access at ffffffff925a8d9d
#328 [ffffa6fc68e1f8b8] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#329 [ffffa6fc68e1f8e8] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#330 [ffffa6fc68e1f930] dax_direct_access at ffffffff925a8d9d
#331 [ffffa6fc68e1f940] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#332 [ffffa6fc68e1f970] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#333 [ffffa6fc68e1f9b8] dax_direct_access at ffffffff925a8d9d
#334 [ffffa6fc68e1f9c8] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#335 [ffffa6fc68e1f9f8] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#336 [ffffa6fc68e1fa40] dax_direct_access at ffffffff925a8d9d
#337 [ffffa6fc68e1fa50] linear_dax_direct_access at ffffffffc08fd3de [dm_mod]
#338 [ffffa6fc68e1fa80] dm_dax_direct_access at ffffffffc08f6522 [dm_mod]
#339 [ffffa6fc68e1fac8] __generic_fsdax_supported at ffffffff925a93ff
#340 [ffffa6fc68e1fb50] device_supports_dax at ffffffffc08fa5fc [dm_mod]
#341 [ffffa6fc68e1fb58] dm_table_supports_dax at ffffffffc08fb918 [dm_mod]
#342 [ffffa6fc68e1fb80] dm_table_set_restrictions at ffffffffc08fc769 [dm_mod]
#343 [ffffa6fc68e1fbc0] dm_swap_table at ffffffffc08f9b01 [dm_mod]
#344 [ffffa6fc68e1fcb0] dev_suspend at ffffffffc08ff8d5 [dm_mod]
#345 [ffffa6fc68e1fcd8] ctl_ioctl at ffffffffc08fec1f [dm_mod]
#346 [ffffa6fc68e1fe78] dm_ctl_ioctl at ffffffffc08fee6a [dm_mod]
#347 [ffffa6fc68e1fe80] do_vfs_ioctl at ffffffff922dcde4
#348 [ffffa6fc68e1fef8] ksys_ioctl at ffffffff922dd3d0
#349 [ffffa6fc68e1ff30] __x64_sys_ioctl at ffffffff922dd416
#350 [ffffa6fc68e1ff38] do_syscall_64 at ffffffff9200419b
#351 [ffffa6fc68e1ff50] entry_SYSCALL_64_after_hwframe at ffffffff92a000ad
    RIP: 00007fc124a2087b  RSP: 00007ffd3f7798f8  RFLAGS: 00000206
    RAX: ffffffffffffffda  RBX: 00007fc124cfe0e0  RCX: 00007fc124a2087b
    RDX: 000055921b0603a0  RSI: 00000000c138fd06  RDI: 0000000000000003
    RBP: 00007fc124d39b23   R8: 00007fc124d3a6c0   R9: 00007ffd3f779760
    R10: 000000000000000f  R11: 0000000000000206  R12: 000055921b0603a0
    R13: 000055921b060450  R14: 0000000000000001  R15: 000055921b05bb20
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b


The kernel's device-mapper code was stuck in an infinite recursion while checking for DAX support, because the dm_table for the dm device listed the device itself as its own destination. The table was for dm-8, yet dm-8 was also the linear target's destination.


crash> p ((struct dm_table *)0xffff8c3039115000)->md->name
$8 = "253:8\000\000\000\000\000\000\000\000\000\000"



crash> p ((struct dm_table *)0xffff8c3039115000)->devices
$9 = {
  next = 0xffff8c0a17d6a2c0,
  prev = 0xffff8c0a17d6a2c0
}
crash> struct dm_dev_internal 0xffff8c0a17d6a2c0
struct dm_dev_internal {
  list = {
    next = 0xffff8c30391150f8,
    prev = 0xffff8c30391150f8
  },
  count = {
    refs = {
      counter = 1
    }
  },
  dm_dev = 0xffff8c19497c6d58
}


crash> struct dm_dev 0xffff8c19497c6d58
struct dm_dev {
  bdev = 0xffff8c24d71ecb00,
  dax_dev = 0xffff8c12c269bb80,
  mode = 3,
  name = "253:8\000\000\000\000\000\000\000\000\000\000"
}
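
The self-reference visible in the crash output above (a table for 253:8 whose only device is also 253:8) can be modeled as a cycle in a dependency graph. The following is only an illustrative sketch, not the kernel's code: the `tables` dictionary and the `find_cycle` helper are hypothetical names introduced here, and the "8:16"/"8:32" path devices in the healthy example are made up for illustration.

```python
def find_cycle(tables, start):
    """Return the chain of devices forming a dependency cycle, or None.

    `tables` maps a device ("major:minor") to the devices its table
    targets. Devices absent from `tables` are treated as leaf devices.
    """
    path = []        # devices on the current walk, in order
    on_path = set()  # same devices, for fast membership tests

    def visit(dev):
        if dev in on_path:
            # We came back to a device we are still walking through.
            return path[path.index(dev):] + [dev]
        on_path.add(dev)
        path.append(dev)
        for target in tables.get(dev, ()):
            cycle = visit(target)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(dev)
        return None

    return visit(start)


# The state seen in the vmcore: dm-8's table targets dm-8 itself.
broken = {"253:8": ["253:8"]}

# A healthy layout: a partition device on top of a multipath device,
# which in turn sits on two (hypothetical) path devices.
healthy = {"253:9": ["253:8"], "253:8": ["8:16", "8:32"]}

print(find_cycle(broken, "253:8"))
print(find_cycle(healthy, "253:9"))
```

Any walk that follows targets without such a check, as the DAX support check in the backtrace did, never terminates on the first layout.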


This bizarre state appears to be the result of a rare race condition. kpartx was creating a partition on device 3600a098038314255672b4f6b4b75795a, which no longer existed when the vmcore was captured. However, 3600a098038314255672b4f6b4b75795a is a valid multipath device that can be connected to the system.

crash> dmshow | grep 3600a098038314255672b4f6b4b75795a
dm-8    3600a098038314255672b4f6b4b75795a1                 0xffff8c14d9760000    flags: 0x7       [Device suspended][Device frozen]

It appears 3600a098038314255672b4f6b4b75795a was originally device dm-8 and was removed while kpartx was running, after kpartx had performed its partition scan. kpartx closes its file handle to the multipath device before it creates partition devices, so nothing guarantees that the multipath device remains unchanged. If the multipath device is removed before kpartx creates a new dm partition device, the new partition device can reuse the released multipath device's dm number. When the partition device is assigned the dm number originally owned by the multipath device, kpartx creates a dm device that uses itself as the target, causing the recursion that crashed the system.
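
The sequence described above can be walked through with a small simulation. This is a toy model under stated assumptions, not kpartx or kernel code: `MinorAllocator`, `kpartx_add_partition`, and `demo` are hypothetical names, and the model assumes that a new dm device receives the lowest free minor number.

```python
DM_MAJOR = 253  # major number seen in the crash output ("253:8")


class MinorAllocator:
    """Toy allocator that hands out the lowest free minor (an assumption)."""

    def __init__(self):
        self.used = set()

    def alloc(self):
        minor = 0
        while minor in self.used:
            minor += 1
        self.used.add(minor)
        return minor

    def free(self, minor):
        self.used.discard(minor)


def kpartx_add_partition(alloc, parent_minor, parent_removed_in_between):
    # The partition scan is done; the parent is remembered as "major:minor".
    target = f"{DM_MAJOR}:{parent_minor}"
    # The race window: the parent is not held open, so it may disappear.
    if parent_removed_in_between:
        alloc.free(parent_minor)
    # Creating the partition device allocates a new minor number.
    part_minor = alloc.alloc()
    return f"{DM_MAJOR}:{part_minor}", target


def demo(parent_removed_in_between):
    alloc = MinorAllocator()
    for _ in range(9):  # dm-0 .. dm-8; dm-8 is the multipath device
        alloc.alloc()
    return kpartx_add_partition(alloc, 8, parent_removed_in_between)


print(demo(False))  # partition device and its target are different devices
print(demo(True))   # partition device and its target are the same device
```

In the racy run the new partition device is given the minor that the multipath device just released, so the recorded target now names the partition device itself.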

Version-Release number of selected component (if applicable):
device-mapper-multipath-0.8.3-3.el8_2.5

How reproducible:
Seen rarely on a customer's OpenStack system.

Actual results:
The system crashes because a bad dm table causes a kernel stack overflow.

Expected results:
kpartx should not generate and load a bad dm table which targets itself.
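
One way to meet this expectation, and the approach named in the Doc Text field, is to keep the parent device open while the partition device is created. The sketch below models that idea only; it is not the actual kpartx patch. `DmRegistry` and `add_partition_holding_parent` are hypothetical names, and the model assumes that removing a device that is still open fails with EBUSY and that new devices get the lowest free minor number.

```python
import errno


class DmRegistry:
    """Toy registry of dm devices keyed by minor, tracking open counts."""

    def __init__(self):
        self.open_count = {}

    def create(self):
        minor = 0
        while minor in self.open_count:
            minor += 1
        self.open_count[minor] = 0
        return minor

    def open(self, minor):
        self.open_count[minor] += 1

    def close(self, minor):
        self.open_count[minor] -= 1

    def remove(self, minor):
        # Assumption: an open device cannot be removed.
        if self.open_count[minor] > 0:
            raise OSError(errno.EBUSY, "device is open")
        del self.open_count[minor]


def add_partition_holding_parent(reg, parent, concurrent_remove):
    """Create a partition device while holding the parent open.

    `concurrent_remove` stands in for another process trying to remove
    the parent inside what used to be the race window.
    """
    reg.open(parent)
    try:
        removed = True
        try:
            concurrent_remove(parent)
        except OSError:
            removed = False  # removal was refused, parent still exists
        part = reg.create()
        return part, parent, removed
    finally:
        reg.close(parent)


reg = DmRegistry()
for _ in range(9):  # dm-0 .. dm-8; dm-8 is the multipath device
    reg.create()
part, target, removed = add_partition_holding_parent(reg, 8, reg.remove)
print(part, target, removed)
```

Because the parent's minor stays allocated for the whole operation, the new partition device cannot be assigned it.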

Additional info:
The vmcore is available at /cores/retrace/tasks/995362692/crash/vmcore on galvatron-x86.cee.redhat.com

Comment 6 errata-xmlrpc 2023-05-09 08:14:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: device-mapper-multipath security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2459

