Bug 1732550 - LVM cache gets marked completely dirty on every boot when used for filesystem root
Summary: LVM cache gets marked completely dirty on every boot when used for filesystem root
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 36
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-23 16:07 UTC by Lucas Vinicius Hartmann
Modified: 2023-05-25 17:00 UTC (History)
18 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-25 17:00:13 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
Output of "journalctl --no-hostname -k > dmesg.txt" (74.29 KB, text/plain)
2019-07-23 16:07 UTC, Lucas Vinicius Hartmann

Description Lucas Vinicius Hartmann 2019-07-23 16:07:47 UTC
Created attachment 1592918 [details]
Output of "journalctl --no-hostname -k > dmesg.txt"

Issue Description:
==================
If an LVM cache is configured for the / mount point, then all of its contents are marked dirty every time the computer is restarted.

From there a normal cache flush occurs and no data is lost, but it may take several hours depending on cache size and cache-origin disk speed.
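For a sense of scale, the expected flush duration can be estimated from the dirty-block count. The 64 KiB chunk size and 100 MiB/s origin write speed below are illustrative assumptions, not values from this report; real chunk sizes are often larger and random I/O is much slower, so actual flushes can take far longer.

```python
# Rough writeback-duration estimate: dirty_blocks chunks of chunk_kib
# KiB each, written sequentially at write_mib_s MiB/s to the origin.
# Both parameters are assumptions, not measured values.
def flush_time_seconds(dirty_blocks, chunk_kib=64, write_mib_s=100):
    total_mib = dirty_blocks * chunk_kib / 1024
    return total_mib / write_mib_s

# e.g. ~147k dirty blocks, as reported later in this bug:
print(round(flush_time_seconds(147148)))  # → 92 seconds, sequential best case
```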

Other Info
==========
Kernel: 5.1.18-300.fc30.x86_64
Split off from similar bug: https://bugzilla.redhat.com/show_bug.cgi?id=1727319

Reproducing:
============
- Install Fedora 30 on a hard drive.
- Create an SSD cache using the regular LVM procedure; writeback or writethrough does not seem to make any difference.
- Play a little (maybe install updates) until some cache is populated.
- Restart and run: 
sudo lvs -o +cache_dirty_blocks,cache_used_blocks,cache_total_blocks,cache_mode
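For reference, the "regular LVM procedure" could look like the following sketch. The device path, volume group name, and cache size are hypothetical, and the commands require root and real block devices, so treat this as a configuration outline rather than a runnable script.

```shell
# Hypothetical device and size -- adjust to your system.
# Add the SSD to the volume group holding root:
vgextend fedora /dev/nvme0n1p1

# Create a cache volume on the SSD:
lvcreate --size 50G --name root_cache fedora /dev/nvme0n1p1

# Attach it to the root LV (newer LVM --cachevol syntax; older
# releases use a cache-pool instead):
lvconvert --type cache --cachevol root_cache \
          --cachemode writethrough fedora/root
```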

Results:
========
All USED blocks are marked DIRTY and written back over time.

Output of lvs follows. Notice the high dirty block count on root despite writethrough mode. The data LV is mounted on /home, uses writeback, and is NOT affected.

  LV   VG     Attr       LSize   Pool         Origin       Data%  Meta%  Move Log Cpy%Sync Convert CacheDirtyBlocks CacheUsedBlocks  CacheTotalBlocks CacheMode   
  data fedora Cwi-aoC--- 800,00g [data_cache] [data_corig] 8,22   8,79            0,01                            1            63804           775840 writeback   
  root fedora Cwi-aoC--- 128,50g [root_cache] [root_corig] 20,51  13,61           50,04                       28243            56444           275200 writethrough

Comment 1 Justin M. Forbes 2019-08-20 17:43:35 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 30 kernel bugs.

Fedora 30 has now been rebased to 5.2.9-200.fc30.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 31, and are still experiencing this issue, please change the version to Fedora 31.

If you experience different issues, please open a new bug report for those.

Comment 2 Lucas Vinicius Hartmann 2019-08-20 20:37:43 UTC
Tested, rebooted twice, still present on 5.2.9-200.fc30.x86_64.

# Before second rebooting:
sudo lvs -ovg_name,lv_name,cache_dirty_blocks,cache_used_blocks,cache_mode
  VG     LV   CacheDirtyBlocks CacheUsedBlocks  CacheMode   
  fedora data            16198           502794 writeback   
  fedora root                0           153332 writethrough

# After second reboot:
sudo lvs -ovg_name,lv_name,cache_dirty_blocks,cache_used_blocks,cache_mode
  VG     LV   CacheDirtyBlocks CacheUsedBlocks  CacheMode   
  fedora data             7844           502798 writeback   
  fedora root           147148           153436 writethrough

Notice that for fedora/data, which is mounted on /home, the dirty block count is preserved correctly (some were flushed before I could run the command, which is expected).
For fedora/root, mounted on /, all used blocks were marked dirty despite using writethrough mode.
After about 30 minutes both dirty counters were back at zero.
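The counters above can be turned into percentages to make the asymmetry explicit (a small illustrative calculation using the numbers from after the second reboot):

```python
# Dirty fraction per LV, from the post-reboot lvs output above.
counters = {
    "fedora/data (writeback)":    (7844, 502798),
    "fedora/root (writethrough)": (147148, 153436),
}

for lv, (dirty, used) in counters.items():
    print(f"{lv}: {100 * dirty / used:.1f}% of used blocks dirty")
# → fedora/data (writeback): 1.6% of used blocks dirty
# → fedora/root (writethrough): 95.9% of used blocks dirty
```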

Comment 3 Justin M. Forbes 2020-03-03 16:37:50 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 30 kernel bugs.

Fedora 30 has now been rebased to 5.5.7-100.fc30.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 31, and are still experiencing this issue, please change the version to Fedora 31.

If you experience different issues, please open a new bug report for those.

Comment 4 Justin M. Forbes 2020-03-25 22:32:50 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

Comment 5 Rolf Fokkens 2022-08-30 17:25:08 UTC
All the reason to reopen this: Fedora 36, root FS on LVM cache, still the same issue.

It's actually a little more complex than that:

SATA disks -> RAID5 -> base VG -> lukscache LV (+NVMe cache) -> crypt-fedora VG -> root LV

lsblk output:

sda                                                   8:0    0 931.5G  0 disk  
├─sda1                                                8:1    0   199M  0 part  
│ └─md125                                             9:125  0   199M  0 raid1 /boot/efi
├─sda2                                                8:2    0     1G  0 part  
│ └─md126                                             9:126  0  1022M  0 raid1 /boot
└─sda3                                                8:3    0 930.3G  0 part  
  └─md127                                             9:127  0   1.8T  0 raid5 
    ├─base-lukscache_corig                          253:3    0   1.7T  0 lvm   
    │ └─base-lukscache                              253:4    0   1.7T  0 lvm   
    │   └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 253:5    0   1.7T  0 crypt 
    │     └─crypt--fedora-root                      253:6    0   1.6T  0 lvm   /
    └─base-tmproot                                  253:8    0    60G  0 lvm   
sdb                                                   8:16   0 931.5G  0 disk  
├─sdb1                                                8:17   0   199M  0 part  
│ └─md125                                             9:125  0   199M  0 raid1 /boot/efi
├─sdb2                                                8:18   0     1G  0 part  
│ └─md126                                             9:126  0  1022M  0 raid1 /boot
└─sdb3                                                8:19   0 930.3G  0 part  
  └─md127                                             9:127  0   1.8T  0 raid5 
    ├─base-lukscache_corig                          253:3    0   1.7T  0 lvm   
    │ └─base-lukscache                              253:4    0   1.7T  0 lvm   
    │   └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 253:5    0   1.7T  0 crypt 
    │     └─crypt--fedora-root                      253:6    0   1.6T  0 lvm   /
    └─base-tmproot                                  253:8    0    60G  0 lvm   
sdc                                                   8:32   0 931.5G  0 disk  
├─sdc1                                                8:33   0   199M  0 part  
│ └─md125                                             9:125  0   199M  0 raid1 /boot/efi
├─sdc2                                                8:34   0     1G  0 part  
│ └─md126                                             9:126  0  1022M  0 raid1 /boot
└─sdc3                                                8:35   0 930.3G  0 part  
  └─md127                                             9:127  0   1.8T  0 raid5 
    ├─base-lukscache_corig                          253:3    0   1.7T  0 lvm   
    │ └─base-lukscache                              253:4    0   1.7T  0 lvm   
    │   └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 253:5    0   1.7T  0 crypt 
    │     └─crypt--fedora-root                      253:6    0   1.6T  0 lvm   /
    └─base-tmproot                                  253:8    0    60G  0 lvm   
nvme0n1                                             259:0    0 238.5G  0 disk  
├─nvme0n1p1                                         259:1    0   128G  0 part  
│ └─base-cache_cvol                                 253:0    0   127G  0 lvm   
│   ├─base-cache_cvol-cdata                         253:1    0   127G  0 lvm   
│   │ └─base-lukscache                              253:4    0   1.7T  0 lvm   
│   │   └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 253:5    0   1.7T  0 crypt 
│   │     └─crypt--fedora-root                      253:6    0   1.6T  0 lvm   /
│   └─base-cache_cvol-cmeta                         253:2    0    28M  0 lvm   
│     └─base-lukscache                              253:4    0   1.7T  0 lvm   
│       └─luks-3d87465c-4df6-4215-8114-93a5b9d4cdd7 253:5    0   1.7T  0 crypt 
│         └─crypt--fedora-root                      253:6    0   1.6T  0 lvm   /

Comment 6 Rolf Fokkens 2022-09-02 14:15:58 UTC
I set up the LV with cleaner=1, hoping lvconvert --uncache would be faster. No such luck:

[root@home07 ~]# lvconvert --uncache /dev/base/lukscache 
  Flushing 2 blocks for cache base/lukscache.
  Flushing 520074 blocks for cache base/lukscache.
  Flushing 520068 blocks for cache base/lukscache.
  Flushing 520068 blocks for cache base/lukscache.
^C  Interrupted...
  Flushing of base/lukscache aborted.
  Logical volume base/lukscache is not cached and base/cache_cvol is removed.
[root@home07 ~]# lvs -o+cache_used_blocks,cache_dirty_blocks,cache_policy,cache_settings /dev/base/lukscache 
  LV        VG   Attr       LSize Pool         Origin            Data%  Meta%  Move Log Cpy%Sync Convert CacheUsedBlocks  CacheDirtyBlocks CachePolicy CacheSettings
  lukscache base Cwi-aoC--- 1.68t [cache_cvol] [lukscache_corig] 99.99  21.85           99.93                      520076           519704 smq         cleaner=1    
[root@home07 ~]# 

Not sure if this is relevant, but it's the same tendency to just consider all blocks dirty when it's not called for.

Comment 7 Ben Cotton 2023-04-25 16:39:19 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 8 Rolf Fokkens 2023-04-29 19:17:43 UTC
Not sure if any kernel/LVM devs actually read these Bugzilla tickets, because the only responses reflect the urge to just close this bug. Since 2019 it was never about solving, only about closing.

Well, closing this bug may actually be appropriate now. The issue I had seems to be gone in Fedora 38; it may have been silently solved. Not sure whether the original reporter also considers this resolved, though.

Comment 9 Ludek Smid 2023-05-25 17:00:13 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

