Bug 2233533

Summary: LVM thin installations are failing on ppc64le: Check of pool rhel_ibm-p9z-16-lp2/pool00 failed (status:64). Manual repair required!
Product: Red Hat Enterprise Linux 9
Component: device-mapper-persistent-data
Version: 9.3
Hardware: ppc64le
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Target Milestone: beta
Target Release: 9.3
Reporter: Jan Stodola <jstodola>
Assignee: Ming-Hung Tsai <mtsai>
QA Contact: Filip Suba <fsuba>
CC: agk, awilliam, fsuba, heinzm, jwboyer, lvm-team, mcsontos, msnitzer, mtsai, pkis, pvauter, shdunne, thornber
Keywords: Regression, Triaged
Flags: pm-rhel: mirror+
Fixed In Version: device-mapper-persistent-data-1.0.6-3.el9_3
Last Closed: 2023-11-07 08:56:16 UTC
Type: Bug
Bug Blocks: 2180384

Description Jan Stodola 2023-08-22 12:50:37 UTC
Description of problem:
Starting with compose RHEL-9.3.0-20230821.32, installations on ppc64le are failing when LVM Thin provisioning is used in the kickstart file or selected during an interactive installation. Previous compose RHEL-9.3.0-20230817.2 worked fine. Also, I haven't seen this problem on the other architectures.


The installation fails with the following traceback in storage.log:

INFO:program:Running [94] lvm lvcreate -T rhel_ibm-p9z-16-lp2/pool00 -V 24412160K -n home --config= log {level=7 file=/tmp/lvm.log syslog=0} ...
INFO:program:stdout[94]: 
INFO:program:stderr[94]:   /usr/sbin/dmeventd: stat failed: No such file or directory
  WARNING: Failed to monitor rhel_ibm-p9z-16-lp2/pool00.
  /usr/sbin/dmeventd: stat failed: No such file or directory
  WARNING: Failed to unmonitor rhel_ibm-p9z-16-lp2/pool00.
  WARNING: Integrity check of metadata for pool rhel_ibm-p9z-16-lp2/pool00 failed.
  Check of pool rhel_ibm-p9z-16-lp2/pool00 failed (status:64). Manual repair required!
  Failed to activate thin pool rhel_ibm-p9z-16-lp2/pool00.

INFO:program:...done [94] (exit code: 5)
INFO:anaconda.threading:Thread Failed: AnaTaskThread-CreateStorageLayoutTask-1 (140735730151584)
ERROR:anaconda.modules.common.task.task:Thread AnaTaskThread-CreateStorageLayoutTask-1 has failed: Traceback (most recent call last):
  File "/usr/lib64/python3.9/site-packages/gi/overrides/BlockDev.py", line 1093, in wrapped
    ret = orig_obj(*args, **kwargs)
gi.repository.GLib.GError: g-bd-utils-exec-error-quark: Process reported exit code 5:   /usr/sbin/dmeventd: stat failed: No such file or directory
  WARNING: Failed to monitor rhel_ibm-p9z-16-lp2/pool00.
  /usr/sbin/dmeventd: stat failed: No such file or directory
  WARNING: Failed to unmonitor rhel_ibm-p9z-16-lp2/pool00.
  WARNING: Integrity check of metadata for pool rhel_ibm-p9z-16-lp2/pool00 failed.
  Check of pool rhel_ibm-p9z-16-lp2/pool00 failed (status:64). Manual repair required!
  Failed to activate thin pool rhel_ibm-p9z-16-lp2/pool00.
 (0)


The "Manual repair required!" error is reported even earlier, in the initramfs stage of the installation, right after the kickstart file is downloaded; see syslog:

...
12:25:37,660 INFO dracut-initqueue:anaconda: kickstart locations are: http://XYZ/...
12:25:37,660 INFO dracut-initqueue:anaconda: fetching kickstart from http://XYZ/...
12:25:37,671 INFO dracut-initqueue:  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
12:25:37,671 INFO dracut-initqueue:                                 Dload  Upload   Total   Spent    Left  Speed
12:25:37,696 INFO dracut-initqueue:#015  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0#015100 25833  100 25833    0     0  1009k      0 --:--:-- --:--:-- --:--:-- 1009k
12:25:37,697 INFO dracut-initqueue:anaconda: successfully fetched kickstart from http://XYZ/...
12:25:43,853 INFO systemd:Started cancel waiting for multipath siblings of sda.
12:25:43,867 INFO systemd:cancel-multipath-wait-sda.service: Deactivated successfully.
12:25:43,868 INFO systemd:cancel-multipath-wait-sda.timer: Deactivated successfully.
12:25:44,331 INFO dracut-initqueue:Scanning devices sda3  for LVM volume groups
12:25:44,348 INFO dracut-initqueue:Found volume group "rhel_ibm-p9z-16-lp2" using metadata type lvm2
12:25:44,599 INFO dracut-initqueue:Check of pool rhel_ibm-p9z-16-lp2/pool00 failed (status:64). Manual repair required!
12:25:44,601 INFO kernel:dm-2: detected capacity change from 147456 to 0
12:25:44,671 INFO kernel:dm-3: detected capacity change from 148832256 to 0
12:25:44,831 INFO dracut-initqueue:0 logical volume(s) in volume group "rhel_ibm-p9z-16-lp2" now active
12:25:45,464 INFO systemd:Finished dracut initqueue hook.
...


Version-Release number of selected component (if applicable):
RHEL-9.3.0-20230821.32
device-mapper-persistent-data-1.0.6-1.el9

How reproducible:
7 out of 7 attempts

Steps to Reproduce:
1. Run a kickstart installation creating Thin LVM, for example:

clearpart --all --initlabel
zerombr
bootloader --location=mbr --leavebootorder
autopart --type thinp


Actual results:
The installation fails with a traceback.

Expected results:
A successful installation.

Comment 10 Zdenek Kabelac 2023-08-23 10:06:23 UTC
This does not just point to a problem with the new thin_check tool on the ppc architecture; monitoring should not be enabled in the ramdisk at all. It should be started only AFTER the system has switched to the rootfs.

I wonder where the removal of monitoring in the ramdisk got lost; it was implemented for ages in dracut's copy of lvm.conf.

Anyway, the primary problem at the moment is that the new thin_check tool reports problems with the thin-pool metadata.

Comment 11 Shelley Dunne 2023-08-24 20:50:51 UTC
This was reviewed by the RHEL voting members and needs more information. Please fill out the template justifying the blocker request.

Comment 12 Ming-Hung Tsai 2023-08-25 20:25:39 UTC
The ppc64 architecture uses a different ioctl code for BLKGETSIZE64 (0x40081272, not 0x80081272). Will fix soon.
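
To illustrate where the two values come from, here is a minimal sketch that rebuilds the BLKGETSIZE64 request number by hand. It assumes the usual Linux ioctl encoding (8 bits for the command number, 8 bits for the type, then the argument size, then the direction) and that powerpc uses a 13-bit size field where most other architectures use 14 bits, which moves the direction bits down by one position. The helper names `ioc` and `blkgetsize64` are made up for this example and are not taken from thin-provisioning-tools.

```python
# Sketch only: rebuild the BLKGETSIZE64 ioctl number for two bit layouts.
# Assumption: BLKGETSIZE64 is _IOR(0x12, 114, size_t) with an 8-byte size_t.

NR_BITS = 8      # width of the command-number field
TYPE_BITS = 8    # width of the type ("magic") field
IOC_READ = 2     # direction value for "read", the same in both layouts


def ioc(direction, ioc_type, nr, size, size_bits):
    """Pack the ioctl fields; size_bits is the width of the size field."""
    type_shift = NR_BITS
    size_shift = type_shift + TYPE_BITS
    dir_shift = size_shift + size_bits
    return (
        (direction << dir_shift)
        | (size << size_shift)
        | (ioc_type << type_shift)
        | nr
    )


def blkgetsize64(size_bits):
    """BLKGETSIZE64 for a layout with the given size-field width."""
    return ioc(IOC_READ, 0x12, 114, 8, size_bits)


generic = blkgetsize64(14)   # layout used by most architectures
powerpc = blkgetsize64(13)   # powerpc layout with a narrower size field

print(hex(generic), hex(powerpc))  # 0x80081272 0x40081272
```

A tool that hard-codes the first constant would issue an ioctl request that does not match BLKGETSIZE64 on ppc64le, which is consistent with the failure being limited to that architecture.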

Comment 13 Ming-Hung Tsai 2023-08-28 15:38:25 UTC
Fixed upstream https://github.com/jthornber/thin-provisioning-tools/commit/afcbcd7d85

Comment 15 Adam Williamson 2023-08-31 16:27:34 UTC
Ah, this is kinda interesting! We had this problem in Fedora back in June, but solved it a different way - we just put device-mapper-event back in the installer environment: https://github.com/weldr/lorax/pull/1328

so...if I'm understanding correctly, this was the 'wrong fix', and if we ensure that https://github.com/jthornber/thin-provisioning-tools/commit/afcbcd7d85 is applied in Fedora, we could drop device-mapper-event from the installer environment again?

Comment 16 Adam Williamson 2023-08-31 16:30:07 UTC
Oh, wait, now I check, we never actually merged that PR. So this is probably still broken in Fedora too.

Comment 20 Adam Williamson 2023-09-07 06:33:29 UTC
FWIW, I can confirm the upstream fix works - it landed in Fedora Rawhide on 2023-09-01, and the openQA test has passed every compose since then.

Comment 27 Filip Suba 2023-09-11 16:48:01 UTC
Verified with device-mapper-persistent-data-1.0.6-3.el9_3.

# vgcreate vg /dev/loop0
  Physical volume "/dev/loop0" successfully created.
  Volume group "vg" successfully created
# lvcreate -l 100%PV -T vg/mythinpool
  Thin pool volume with chunk size 64.00 KiB can address at most <15.88 TiB of data.
  Logical volume "mythinpool" created.
# lvcreate -V512M -T vg/mythinpool -n thinvolume1
  Logical volume "thinvolume1" created.


For comparison, with device-mapper-persistent-data-1.0.6-1 (the version from the original report):
# vgcreate vg /dev/loop0
  Volume group "vg" successfully created
# lvcreate -l 100%PV -T vg/mythinpool
  Thin pool volume with chunk size 64.00 KiB can address at most <15.88 TiB of data.
  Logical volume "mythinpool" created.
# lvcreate -V512M -T vg/mythinpool -n thinvolume1
  WARNING: Integrity check of metadata for pool vg/mythinpool failed.
  Check of pool vg/mythinpool failed (status:64). Manual repair required!
  Failed to activate thin pool vg/mythinpool.

Comment 29 errata-xmlrpc 2023-11-07 08:56:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (device-mapper-persistent-data bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6701