Bug 2155689

Summary: stratisd may put a set up pool into a stopped state if it is restarted
Product: Red Hat Enterprise Linux 9
Reporter: Bryan Gurney <bgurney>
Component: stratisd
Assignee: Bryan Gurney <bgurney>
Status: CLOSED ERRATA
QA Contact: Filip Suba <fsuba>
Severity: unspecified
Priority: unspecified
Version: 9.2
CC: amulhern, cwei, dkeefe
Target Milestone: rc
Keywords: Triaged
Target Release: 9.2
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Fixed In Version: stratisd-3.4.4-1.el9
Last Closed: 2023-05-09 07:41:19 UTC
Type: Bug

Description Bryan Gurney 2022-12-21 22:42:05 UTC
Description of problem:
A device-mapper error may occur when stratisd reads a device-mapper
device's table via a device-mapper ioctl. If stratisd receives this
error while setting up a pool that is already partially or completely
set up (for example, after stratisd is restarted), it will put the
pool in a stopped state.

The error is caused by a defect in the devicemapper-rs source code
which reuses a message buffer without re-writing the correct message
header at the start of the buffer. In some cases, the previous message
header may have had some fields overwritten with a partial result by
the devicemapper kernel module, so that reusing the modified header
results in an error.
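
The failure mode described above can be sketched with a small,
self-contained model. This is not the devicemapper-rs code and not the
kernel interface: the Header type, the fake_ioctl function, and the
"uuid-of-" value are hypothetical stand-ins, and the assumption that
the call writes a uuid back into the header is made only so that the
second call trips the same "only supply one of name or uuid" check.

```rust
// Illustrative model only. These types and functions are hypothetical;
// they are not the devicemapper-rs or kernel definitions.

/// Simplified stand-in for the header at the start of an ioctl buffer.
#[derive(Clone, Debug)]
struct Header {
    name: String,
    uuid: String,
}

/// Stand-in for the kernel side of the call. It rejects a header that
/// has both identifiers set and, on success, writes a result field back
/// into the caller's header (assumed behavior for this sketch).
fn fake_ioctl(hdr: &mut Header) -> Result<(), String> {
    if !hdr.name.is_empty() && !hdr.uuid.is_empty() {
        return Err("only supply one of name or uuid".to_string());
    }
    if hdr.uuid.is_empty() {
        hdr.uuid = format!("uuid-of-{}", hdr.name);
    }
    Ok(())
}

/// Defective pattern: the second call reuses the header that the first
/// call modified, so it now carries both a name and a uuid.
fn call_twice_reusing_header(name: &str) -> Result<(), String> {
    let mut hdr = Header { name: name.to_string(), uuid: String::new() };
    fake_ioctl(&mut hdr)?;
    fake_ioctl(&mut hdr)
}

/// Corrected pattern: the header is re-written from the original
/// request before every call.
fn call_twice_rewriting_header(name: &str) -> Result<(), String> {
    let request = Header { name: name.to_string(), uuid: String::new() };
    let mut hdr = request.clone();
    fake_ioctl(&mut hdr)?;
    hdr = request;
    fake_ioctl(&mut hdr)
}

fn main() {
    println!("reused header:    {:?}", call_twice_reusing_header("spool1"));
    println!("rewritten header: {:?}", call_twice_rewriting_header("spool1"));
}
```

In this model the reused header fails on the second call, while
re-writing the header before each call succeeds.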

Version-Release number of selected component (if applicable):
stratisd-3.4.3-1.el9

How reproducible:
Does not seem to be reproducible in RHEL 9.1; however, an upstream
user found the issue in Rocky Linux 9.1, running kernel
5.14.0-162.6.1.el9_1.0.1.x86_64 x86_64, stratisd-3.2.2-1.el9.x86_64,
and stratis-cli-3.2.0-1.el9.noarch.

Steps to Reproduce:
With a test device that is at least 50 GiB in size (example
"/dev/vdb"), running stratisd-3.2.2-1.el9:
1. stratis pool create spool1 /dev/vdb
2. stratis fs create spool1 sfs1
3. for j in $(seq 101 999); do echo $j; stratis pool set-fs-limit spool1 $j; free -m; done
4. Stop the stratis daemon via "systemctl stop stratisd"
5. Start the stratis daemon via "systemctl start stratisd"

Actual results:
Running "stratis pool list" does not list the pool that was created,
even though the device-mapper devices that comprise the pool are
online.  The following kernel message appears:

  kernel: device-mapper: ioctl: only supply one of name or uuid, cmd(12)


Expected results:
Running "stratis pool list" displays the pool that was created.

Additional info:

Comment 1 Bryan Gurney 2022-12-22 14:40:16 UTC
Additional reproduction steps, continuing from step 4:

4. Stop the stratis daemon via "systemctl stop stratisd"

5. Upgrade stratisd to version 3.4.3, and stratis-cli to version 3.4.0:
# dnf install stratisd-3.4.3-1.el9.x86_64.rpm stratis-cli-3.4.0-1.el9.noarch.rpm

6. Start the stratis daemon via "systemctl start stratisd"

Actual results:
Running "stratis pool list" does not list the pool that was created,
even though the device-mapper devices that comprise the pool are
online.  The following kernel message appears:

  kernel: device-mapper: ioctl: only supply one of name or uuid, cmd(12)


Expected results:
Running "stratis pool list" displays the pool that was created.

Comment 4 Filip Suba 2023-01-23 15:02:11 UTC
Verified with stratisd-3.4.4-1.el9.

Comment 6 errata-xmlrpc 2023-05-09 07:41:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (stratisd bug fix and enhancement
update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2272