Bug 2258485 - Fedora 39 dm-cache fails to start after system power failure
Summary: Fedora 39 dm-cache fails to start after system power failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: 39
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Ming-Hung Tsai
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-01-15 16:09 UTC by Rolf Fokkens
Modified: 2024-02-25 01:25 UTC
CC List: 16 users

Fixed In Version: device-mapper-persistent-data v1.0.11
Clone Of:
Environment:
Last Closed: 2024-02-17 00:57:30 UTC
Type: ---
Embargoed:


Attachments
Screenshot while the hang occurs (123.89 KB, image/png)
2024-01-15 16:16 UTC, Rolf Fokkens
dracut shell when boot fails (102.96 KB, image/jpeg)
2024-01-15 16:18 UTC, Rolf Fokkens
Normal boot screenshot (39.79 KB, image/jpeg)
2024-01-15 16:21 UTC, Rolf Fokkens
The dm-cache status while not being fully started after power failure (1.34 MB, image/jpeg)
2024-01-26 11:24 UTC, Rolf Fokkens
screened lvmdump output (95.73 KB, application/gzip)
2024-01-26 19:00 UTC, Rolf Fokkens
lvm lvs output (225.90 KB, image/jpeg)
2024-01-26 19:05 UTC, Rolf Fokkens
cache_check error: insert at index NNN exceeds fixbitset size NNN (273.49 KB, image/jpeg)
2024-01-28 12:21 UTC, Rolf Fokkens
/dev/mapper/base-cache_cvol-cmeta (9.70 MB, application/x-xz)
2024-01-29 21:36 UTC, Rolf Fokkens


Links:
Red Hat Issue Tracker RHEL-24962 (last updated 2024-02-09 11:45:45 UTC)

Description Rolf Fokkens 2024-01-15 16:09:40 UTC
After a power failure the system gets stuck during reboot and drops to a recovery shell.

The issue is reproducible with kernel-6.6.8-200.fc39.x86_64.
The issue does not manifest itself with kernel-6.6.8-100.fc38.x86_64.

Currently the only way to get the system running again is to boot with the fc38 kernel.

Reproducible: Always

Steps to Reproduce:
1. Power the system off
2. Boot the system
3. See it hang during boot.

Normally at step 3 there's a line "device-mapper: cache: You have created a cache device with a lot of individual cache blocks (3258112)" after which the boot continues.
Actual Results:  
Hang during boot

Expected Results:  
Successful boot

lsblk:
sda                         8:0    0 931.5G  0 disk  
├─sda1                      8:1    0   199M  0 part  
│ └─md125                   9:125  0   199M  0 raid1 /boot/efi
├─sda2                      8:2    0     1G  0 part  
│ └─md127                   9:127  0  1022M  0 raid1 /boot
└─sda3                      8:3    0 930.3G  0 part  
  └─md126                   9:126  0   1.8T  0 raid5 
    └─base-lukscache_corig
                          253:3    0   1.8T  0 lvm   
      └─base-lukscache    253:4    0   1.8T  0 lvm   
        └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7
                          253:5    0   1.8T  0 crypt 
          └─crypt--fedora-root
                          253:6    0   1.7T  0 lvm   /var/lib/snapd/snap
                                                     /
sdb                         8:16   0 931.5G  0 disk  
├─sdb1                      8:17   0   199M  0 part  
│ └─md125                   9:125  0   199M  0 raid1 /boot/efi
├─sdb2                      8:18   0     1G  0 part  
│ └─md127                   9:127  0  1022M  0 raid1 /boot
└─sdb3                      8:19   0 930.3G  0 part  
  └─md126                   9:126  0   1.8T  0 raid5 
    └─base-lukscache_corig
                          253:3    0   1.8T  0 lvm   
      └─base-lukscache    253:4    0   1.8T  0 lvm   
        └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7
                          253:5    0   1.8T  0 crypt 
          └─crypt--fedora-root
                          253:6    0   1.7T  0 lvm   /var/lib/snapd/snap
                                                     /
sdc                         8:32   0 931.5G  0 disk  
├─sdc1                      8:33   0   199M  0 part  
│ └─md125                   9:125  0   199M  0 raid1 /boot/efi
├─sdc2                      8:34   0     1G  0 part  
│ └─md127                   9:127  0  1022M  0 raid1 /boot
└─sdc3                      8:35   0 930.3G  0 part  
  └─md126                   9:126  0   1.8T  0 raid5 
    └─base-lukscache_corig
                          253:3    0   1.8T  0 lvm   
      └─base-lukscache    253:4    0   1.8T  0 lvm   
        └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7
                          253:5    0   1.8T  0 crypt 
          └─crypt--fedora-root
                          253:6    0   1.7T  0 lvm   /var/lib/snapd/snap
                                                     /
sr0                        11:0    1  1024M  0 rom   
nvme0n1                   259:0    0 238.5G  0 disk  
├─nvme0n1p1               259:1    0   200G  0 part  
│ └─base-cache_cvol       253:0    0   199G  0 lvm   
│   ├─base-cache_cvol-cdata
│   │                     253:1    0 198.9G  0 lvm   
│   │ └─base-lukscache    253:4    0   1.8T  0 lvm   
│   │   └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7
│   │                     253:5    0   1.8T  0 crypt 
│   │     └─crypt--fedora-root
│   │                     253:6    0   1.7T  0 lvm   /var/lib/snapd/snap
│   │                                                /
│   └─base-cache_cvol-cmeta
│                         253:2    0   144M  0 lvm   
│     └─base-lukscache    253:4    0   1.8T  0 lvm   
│       └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7
│                         253:5    0   1.8T  0 crypt 
│         └─crypt--fedora-root
│                         253:6    0   1.7T  0 lvm   /var/lib/snapd/snap
│                                                    /
└─nvme0n1p2               259:2    0    16G  0 part  [SWAP]

Comment 1 Rolf Fokkens 2024-01-15 16:16:58 UTC
Created attachment 2008746 [details]
Screenshot while the hang occurs

Comment 2 Rolf Fokkens 2024-01-15 16:18:45 UTC
Created attachment 2008747 [details]
dracut shell when boot fails

Comment 3 Rolf Fokkens 2024-01-15 16:21:15 UTC
Created attachment 2008748 [details]
Normal boot screenshot

Comment 4 Rolf Fokkens 2024-01-15 16:25:14 UTC
The issue could also be attributed to device-mapper, because that has changed in FC39.

Comment 5 Rolf Fokkens 2024-01-26 11:13:12 UTC
Changed the component to lvm2 because I now have a clear indication that dm-cache fails to start after a power failure.

Comment 6 Rolf Fokkens 2024-01-26 11:24:15 UTC
Created attachment 2010698 [details]
The dm-cache status while not being fully started after power failure

I added another screenshot showing the status of dm-cache while it is not starting after a power failure.

Comment 7 Zdenek Kabelac 2024-01-26 11:58:01 UTC
The boot log suggests that the device for decryption is not present in the system.

Unfortunately, from the provided info it's not yet clear why.

Is it possible your MD RAID arrays are not working properly with the newer kernel?

Or is the cached LV being activated but failing during activation?

When you are in the dracut rescue shell, are you able to see LVs? (see: "lvs -a" or "lvm lvs -a")

Are you able to activate your LVs? (see: "vgchange -ay" or "lvm vgchange -ay")

Do you see PVs at all? (see: "pvs" or "lvm pvs")

Are you using a devices file? In that case you need a properly configured file under /etc/lvm/devices/ (see "lvm lvmdevices").

What errors do you get?

You should probably attach the file generated by 'lvmdump' to this BZ so we can see the device stack more easily.


At the moment I'd suspect the issue is related to mdraid, which has received some patches recently, while there are no changes on the lvm2 side (especially if you only switch the kernel at boot).
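
For reference, a consolidated sketch of the diagnostics requested above, as they would be run from the dracut rescue shell. The "lvm <subcommand>" spellings are the ones most likely to be present in the initramfs, and the lvmdump output filename is only an example; actual output will vary per system.

lvm pvs                  # are the PVs visible at all?
lvm lvs -a               # are the LVs (including hidden cache sub-LVs) visible?
lvm vgchange -ay         # try to activate all VGs and note any errors
lvm lvmdevices           # is a devices file in use, and is it configured?
lvmdump                  # from a booted system: collects the device stack
                         # into a tarball (e.g. lvmdump-<host>-<date>.tgz)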

Comment 8 Rolf Fokkens 2024-01-26 17:24:17 UTC
Right after upgrading I ran the same kernel version both on F38 (kernel-6.6.8-100.fc38.x86_64) and F39 (kernel-6.6.8-200.fc39.x86_64); at that time I was able to use both kernels + initramfs to boot F39. However, after a power failure the F39 kernel would fail and the F38 one was my lifeline. Not sure if these are actually the same kernels, but they seem close.

As may be seen in the lsblk output in previous comments, the LUKS device stacks on top of the dm-cache device. So it's definitely not present (yet) because dm-cache is not started/created properly.

In the failing situation it's noteworthy that both md126 and md127 are active (auto-read-only). Not sure if that's normal behaviour (and not sure what rd.break setting is required to capture this on a healthy system). Is the read-only situation causing dm-cache to fail (or wait, or not do anything at all)? This is the mdstat situation on a functional system:

[root@home07 ~]# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] 
md125 : active raid1 sdb1[1] sda1[0] sdc1[2]
      203776 blocks super 1.0 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdb2[1] sdc2[2] sda2[0]
      1046528 blocks super 1.2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid5 sdb3[1] sdc3[3] sda3[0]
      1950750720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 7/8 pages [28KB], 65536KB chunk

unused devices: <none>
[root@home07 ~]#

Note that only md127 is required for dm-cache; the other arrays are (not cached) /boot and /boot/efi.

One of the screenshots shows the available dmsetup output (not the actual lvs output); the following is an approximation (typed manually, see the screenshot for accurate data):

[root]# dmsetup info -c -o name,uuid,suspended,readonly,open,tables_loaded,names_using_dev
Name                                      UUID                                                                                   Suspended Read-only Open Tables RefNames                                   
base-cache_cvol                           LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cvol              Active    Writeable    2 Live   base-cache_cvol-cdata
base-cache_cvol-cdata                     LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cdata             Active    Writeable    1 Live   
base-cache_cvol-cmeta                     LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cmeta             Active    Writeable    1 Live   
[root]# 

A functional system shows this:

[root@home07 ~]# dmsetup info -c -o name,uuid,suspended,readonly,open,tables_loaded,names_using_dev
Name                                      UUID                                                                                   Suspended Read-only Open Tables RefNames                                   
base-cache_cvol                           LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cvol              Active    Writeable    2 Live   base-cache_cvol-cdata,base-cache_cvol-cmeta
base-cache_cvol-cdata                     LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cdata             Active    Writeable    1 Live   base-lukscache                             
base-cache_cvol-cmeta                     LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XEwQE1n1DCedKMKA4IsP3k257IyimpgYC-cmeta             Active    Writeable    1 Live   base-lukscache                             
base-lukscache                            LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XKoCD1KnAEnW59iNak7M0A7Cps6RSQKxd                   Active    Writeable    1 Live   luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7  
base-lukscache_corig                      LVM-RLs0VskICCirKMnzAfDyVug37GpWhB1XV9EVvnfuYxJa4ywz1Bsen2BujY9yEok9-real              Active    Writeable    1 Live   base-lukscache                             
crypt--fedora-root                        LVM-XZBlAXyOzRw9LWedyo6luoH7fAQpbW0lFqHAJt8Q1acUW5JUSGZmk3AccAWHOiKd                   Active    Writeable    1 Live                                              
luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 CRYPT-LUKS2-3d87465c4df64215811493a5b9d4cdd7-luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 Active    Writeable    1 Live   crypt--fedora-root                         
[root@home07 ~]#

Regarding lvmdevices, there is no devices file:

[root@home07 ~]# lvm lvmdevices
  Devices file does not exist.
[root@home07 ~]# 

There are no errors; dracut just times out waiting for the LUKS device. See the screenshot. Furthermore, dmesg shows no additional info compared to the console.

I'll get back on the other questions/remarks after hitting the power switch and recreating the failing situation.

Comment 9 Rolf Fokkens 2024-01-26 19:00:34 UTC
Created attachment 2010792 [details]
screened lvmdump output

Comment 10 Rolf Fokkens 2024-01-26 19:05:02 UTC
Created attachment 2010794 [details]
lvm lvs output

Comment 11 Rolf Fokkens 2024-01-26 19:10:52 UTC
Attached a screenshot showing "lvm lvs" during a failing boot; this is the output on a working system:

[root@home07 ~]# lvm lvs -a
  LV                VG           Attr       LSize   Pool         Origin            Data%  Meta%  Move Log Cpy%Sync Convert
  [cache_cvol]      base         Cwi-aoC--- 199.00g                                                                       
  lukscache         base         Cwi-aoC---  <1.82t [cache_cvol] [lukscache_corig] 99.99  26.47           98.36           
  [lukscache_corig] base         owi-aoC---  <1.82t                                                                       
  root              crypt-fedora -wi-ao----  <1.68t                                                                       
[root@home07 ~]#

Comment 12 Rolf Fokkens 2024-01-26 22:12:25 UTC
FWIW I installed the latest FC38 kernel:

dnf install --release=38 kernel-6.6.13-100.fc38

It fails to boot after "reboot -f", the same as kernel-6.6.13-200.fc39.

Probably no surprise.

Comment 13 Rolf Fokkens 2024-01-28 12:21:04 UTC
Created attachment 2011132 [details]
cache_check error: insert at index NNN exceeds fixbitset size NNN

Comment 14 Rolf Fokkens 2024-01-28 12:32:57 UTC
Assuming the needs_check flag gets set in the status during a sudden reboot, one could assume that a cache_check is required. Doing so when the system fails to boot yields (more details in the screenshot):

cache_check /dev/mapper/base-cache_cvol-cmeta
thread 'main' panicking ...
insert at index 30479487 exceeds fixbitset size 30479487
...

This could mean anything, related or not related to F39, in combination with bad metadata and/or the cache_check tool.
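
Not something requested in the report, but a hypothetical next diagnostic step, assuming the metadata LV is active as /dev/mapper/base-cache_cvol-cmeta: cache_check is a Rust binary, so a backtrace can localize the panic, and a superblock-only pass can show whether the metadata header itself still reads cleanly.

RUST_BACKTRACE=1 cache_check /dev/mapper/base-cache_cvol-cmeta    # print the panic location
cache_check --super-block-only /dev/mapper/base-cache_cvol-cmeta  # header-only sanity check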

Comment 15 Zdenek Kabelac 2024-01-29 10:38:30 UTC
Passing to the author of cache_check. It looks like some issue is detected in the metadata; not sure yet how this may influence target correctness, and whether it's related to the --auto-repair issue (see bug #2233177).

On activation, lvm2 calls cache_check, which happens to detect some problems within the metadata and thus prevents the activation.

What I don't really get here is how it would be possible to hit the issue with the newer kernel and not with the old one.

So let me also clarify this: when you run the 'newer' kernel, are all other tools in the system bit-for-bit exactly the same?
Or do you just try with a standard 'F38' installation and a standard 'F39' installation?
(F39 has a newer cache_check tool.)

Comment 16 Ming-Hung Tsai 2024-01-29 15:38:38 UTC
Hi (In reply to Rolf Fokkens from comment #14)
> Assuming a needs_check is triggered in the status during a sudden reboot,
> one could assume that a cache_check is required. Doing so when the system
> fails to boot renders (more details in screenshot)
> 
> cache_check /dev/mapper/base-cache_cvol-cmeta
> thread 'main' panicking ...
> insert at index 30479487 exceeds fixbitset size 30479487
> ...
> 
> Could mean anything, related or not related to FC39 in combination with bad
> meta data and/or cache_check tool.

Would you be able to provide the cache metadata dump for us to check further?

dd if=/dev/mapper/base-cache_cvol-cmeta of=cmeta.bin
tar -czvf cmeta.tar.gz cmeta.bin

Then upload the compressed cmeta.tar.gz if possible. Thanks.

Comment 17 Ming-Hung Tsai 2024-01-29 15:47:49 UTC
(In reply to Rolf Fokkens from comment #12)
> FWIW I installed latest FC38 kernel:
> 
> dnf install --release=38 kernel-6.6.13-100.fc38
> 
> Fails to boot after "reboot -f" the same as kernel-6.6.13-200.fc39
> 
> No surprise probably.

It looks like you just downgraded the kernel but kept the userland tools unchanged (still cache_check v1.0.x), so the issue remains.

Comment 18 Rolf Fokkens 2024-01-29 21:36:29 UTC
Created attachment 2013100 [details]
/dev/mapper/base-cache_cvol-cmeta

Comment 19 Rolf Fokkens 2024-01-29 21:38:55 UTC
As requested:

[root@home07 ~]# dd if=/dev/mapper/base-cache_cvol-cmeta of=cmeta.bin
294912+0 records in
294912+0 records out
150994944 bytes (151 MB, 144 MiB) copied, 0.719608 s, 210 MB/s
[root@home07 ~]# tar -czvf cmeta.tar.gz cmeta.bin
cmeta.bin
[root@home07 ~]# ls -l cmeta.tar.gz cmeta.bin
-rw-r--r--. 1 root root 150994944 Jan 29 22:30 cmeta.bin
-rw-r--r--. 1 root root  22664211 Jan 29 22:30 cmeta.tar.gz
[root@home07 ~]# tar -cJvf cmeta.tar.xz cmeta.bin
cmeta.bin
[root@home07 ~]# ls -l cmeta.tar.xz cmeta.bin
-rw-r--r--. 1 root root 150994944 Jan 29 22:30 cmeta.bin
-rw-r--r--. 1 root root  10171900 Jan 29 22:34 cmeta.tar.xz
[root@home07 ~]# 

Used xz compression because the gz-compressed file was too large to upload.

Comment 20 Ming-Hung Tsai 2024-01-30 07:43:29 UTC
(In reply to Rolf Fokkens from comment #18)
> Created attachment 2013100 [details]
> /dev/mapper/base-cache_cvol-cmeta

There is a bug in checking the addresses of cached blocks, specific to metadata that was not shut down cleanly (e.g., after a power failure). I'll fix the bug soon. Thank you for the detailed information; it helps us improve product quality.

Comment 21 Rolf Fokkens 2024-01-31 11:26:33 UTC
Thanks for addressing the issue so swiftly.

Comment 22 Ming-Hung Tsai 2024-02-07 11:32:07 UTC
Fixed in thin-provisioning-tools v1.0.11 upstream, commit d2390a50f3.

Two new test cases were added accordingly:

$ cargo test [--release]
...
test metadata_v1_without_slow_dev_size_info ... ok
test metadata_v2_without_slow_dev_size_info ... ok
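
A sketch of how the fix could be verified locally, assuming a clone of the upstream repository (https://github.com/jthornber/thin-provisioning-tools) containing the commit above. The cmeta.bin path refers to the metadata dump from comment 19, and the multi-call pdata_tools binary is how recent releases package the tools; the name and path may differ for older builds or distro packages.

git clone https://github.com/jthornber/thin-provisioning-tools.git
cd thin-provisioning-tools
cargo test without_slow_dev_size_info    # filter matching both new test cases
cargo build --release
./target/release/pdata_tools cache_check /path/to/cmeta.bin   # re-check the saved metadata dump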

Comment 23 Fedora Update System 2024-02-09 13:04:19 UTC
FEDORA-2024-64e5a607ae (device-mapper-persistent-data-1.0.11-1.fc39) has been submitted as an update to Fedora 39.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-64e5a607ae

Comment 24 Fedora Update System 2024-02-09 13:05:12 UTC
FEDORA-2024-3381af2ed1 (device-mapper-persistent-data-1.0.11-1.fc38) has been submitted as an update to Fedora 38.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-3381af2ed1

Comment 25 Fedora Update System 2024-02-10 03:56:13 UTC
FEDORA-2024-3381af2ed1 has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-3381af2ed1`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-3381af2ed1

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 26 Rolf Fokkens 2024-02-10 10:26:17 UTC
I can confirm that this fixes the reported issue (a consolidated sketch of these steps follows the list):
* updated the device-mapper-persistent-data-1.0.11-1.fc39 package
* did "dracut /boot/initramfs-6.6.13-200.fc39.x86_64.img.2 6.6.13-200.fc39.x86_64"
* did "reboot -f"
* hit <shift> to enter grub and selected this initramfs
* system booted fine
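
A consolidated sketch of those confirmation steps. The dnf command is taken from the Bodhi notice in comment 27 below (how the package was actually installed here is not stated), and the initramfs name is the one used above.

sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-64e5a607ae
dracut /boot/initramfs-6.6.13-200.fc39.x86_64.img.2 6.6.13-200.fc39.x86_64
reboot -f
# then pick the rebuilt initramfs from the GRUB menu and verify the boot completes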

Comment 27 Fedora Update System 2024-02-11 08:07:51 UTC
FEDORA-2024-64e5a607ae has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-64e5a607ae`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-64e5a607ae

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 28 Fedora Update System 2024-02-17 00:57:30 UTC
FEDORA-2024-64e5a607ae (device-mapper-persistent-data-1.0.11-1.fc39) has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 29 Fedora Update System 2024-02-25 01:25:05 UTC
FEDORA-2024-3381af2ed1 (device-mapper-persistent-data-1.0.11-1.fc38) has been pushed to the Fedora 38 stable repository.
If problem still persists, please make note of it in this bug report.

