Bug 1959159 - [RBD] Numerous data availability and corruption issues with persistent writeback cache in ssd mode
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.1
Assignee: Ilya Dryomov
QA Contact: Preethi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-10 19:41 UTC by Ilya Dryomov
Modified: 2022-04-04 10:21 UTC
CC List: 7 users

Fixed In Version: ceph-16.2.6-34.el8cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-04 10:20:39 UTC
Embargoed:




Links:
Red Hat Product Errata RHSA-2022:1174 (last updated 2022-04-04 10:21:06 UTC)

Comment 7 Bertrand 2021-08-24 16:41:30 UTC
Hello Ilya,
What about those 2 PRs?

PR#42843 (https://github.com/ceph/ceph/pull/42843) = Tracker for issue 52323
PR#42883 (https://github.com/ceph/ceph/pull/42883) = Tracker for issue 52341

Should those 2 fixes be backported as well?

Thanks,

Bertrand

Comment 21 Preethi 2021-12-15 06:19:03 UTC
@Ilya, @Deepika, we have used ceph-16.2.6-34.el8cp to verify the issue and below are my observations:

1) After updating the conf file to SSD mode, we see the cache status as rwl instead of ssd. Below are my observations -


[root@plena007 log]# cat /etc/ceph/ceph.conf
# minimal ceph.conf for d6e5c458-0f10-11ec-9663-002590fc25a4
[global]
	fsid = d6e5c458-0f10-11ec-9663-002590fc25a4
	mon_host = [v2:10.8.128.31:3300/0,v1:10.8.128.31:6789/0]
[client]
        rbd_cache = false
        rbd_persistent_cache_mode = ssd
        rbd_plugins = pwl_cache
        rbd_persistent_cache_size = 1073741824
        rbd_persistent_cache_path = /mnt/nvme/
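
To double-check that the client actually picked up the SSD mode, the PWL options can also be inspected (or set per-image) with the rbd config subcommands. A minimal sketch, assuming pool test1 and image image1; the exact rbd config behaviour on this build is an assumption:

# list the effective config for the image and filter for the PWL options
rbd config image list test1/image1 | grep persistent_cache

# or set the options per-image instead of in /etc/ceph/ceph.conf
rbd config image set test1/image1 rbd_plugins pwl_cache
rbd config image set test1/image1 rbd_persistent_cache_mode ssd
rbd config image set test1/image1 rbd_persistent_cache_size 1073741824
rbd config image set test1/image1 rbd_persistent_cache_path /mnt/nvme/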

Started IOs using rbd bench and checked the mounted path to see the data.
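
(The exact bench invocation isn't captured here; it would have been something along these lines, with the image name and sizes being illustrative:)

rbd bench --io-type write --io-size 4K --io-total 1G test1/image1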

[root@plena007 nvme]# ls
lost+found  rbd-pwl.test1.11da251a5d7044.pool
[root@plena007 nvme]# ls
lost+found
[root@plena007 nvme]# ls
lost+found
[root@plena007 nvme]#


[root@magna031 yum.repos.d]# rbd du --pool test1
NAME     PROVISIONED  USED 
image1         1 GiB  1 GiB
image2      1000 GiB  1 GiB
<TOTAL>     1001 GiB  2 GiB
[root@magna031 yum.repos.d]#

[root@magna031 yum.repos.d]# rbd status test1/image1
Watchers: none
[root@magna031 yum.repos.d]# rbd status test1/image2
Watchers:
	watcher=10.1.172.7:0/225138355 client.1170002 cookie=140197396313344
Image cache state: {"present":"true","empty":"true","clean":"true","cache_type":"rwl","pwl_host":"plena007","pwl_path":"/mnt/nvme//rbd-pwl.test1.11da251a5d7044.pool","pwl_size":1073741824}
[root@magna031 yum.repos.d]# rbd status test1/image2
Watchers: none
[root@magna031 yum.repos.d]#

The above status is seen as rwl instead of ssd.
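
For reference, a quick way to pull just the cache_type field out of that status line (the sed/jq usage is illustrative, not part of the original verification):

rbd status test1/image2 | sed -n 's/^Image cache state: //p' | jq -r .cache_type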

NOTE: the above behaviour was inconsistent; the second time IO was run we did not see data being written to the mounted path, and there were no watchers in the output when the rbd status command was issued.



We also upgraded the cluster to the latest build and saw that IOs failed to start (triggered IOs from both rbd bench and fio).
[root@magna031 ubuntu]# ceph version
ceph version 16.2.7-3.el8cp (54410e69e153d229a04fb6acc388f7e4afdd05e7) pacific (stable)


RBD bench output for reference -
[root@plena007 ubuntu]# rbd bench-write image1 --pool=test --io-threads=1
rbd: bench-write is deprecated, use rbd bench --io-type write ...
2021-12-14T07:25:30.666+0000 7fc3327fc700 -1 librbd::exclusive_lock::PostAcquireRequest: 0x7fc32c037000 handle_process_plugin_acquire_lock: failed to process plugins: (2) No such file or directory
rbd: failed to flush: 2021-12-14T07:25:30.669+0000 7fc3327fc700 -1 librbd::exclusive_lock::ImageDispatch: 0x7fc314002b60 handle_acquire_lock: failed to acquire exclusive lock: (2) No such file or directory
2021-12-14T07:25:30.669+0000 7fc3327fc700 -1 librbd::io::AioCompletion: 0x559cca568320 fail: (2) No such file or directory
(2) No such file or directory
bench failed: (2) No such file or directory


FIO output -
[root@plena007 ubuntu]# fio --name=test-1 --ioengine=rbd --pool=test1 --rbdname=image2 --numjobs=1 --rw=write --bs=4k --iodepth=1 --fsync=32 --runtime=480 --time_based --group_reporting --ramp_time=120
test-1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=1
fio-3.19
Starting 1 process
fio: io_u error on file test-1.0.0: No such file or directory: write offset=0, buflen=4096
fio: pid=1197333, err=2/file:io_u.c:1803, func=io_u error, error=No such file or directory

test-1: (groupid=0, jobs=1): err= 2 (file:io_u.c:1803, func=io_u error, error=No such file or directory): pid=1197333: Tue Dec 14 07:26:47 2021
  cpu          : usr=0.00%, sys=0.00%, ctx=2, majf=0, minf=5
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):

Disk stats (read/write):
  sda: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
[root@plena007 ubuntu]#

Comment 22 Preethi 2021-12-15 07:21:25 UTC
Filed a separate BZ and tracker for IO failing to start in SSD mode:
https://bugzilla.redhat.com/show_bug.cgi?id=2032764
https://tracker.ceph.com/issues/53613

Comment 23 Preethi 2021-12-24 10:48:46 UTC
To verify this BZ, we need steps and workloads. Please help by sharing the steps to verify.

Comment 27 Preethi 2022-02-02 10:06:49 UTC
Attached the script used for verification of the scenario below (a bash sketch of the loop follows the steps). We have completed 10k iterations and no issue was seen. Hence moving this to the verified state.

create an image
for i in {0..10000}:
    start "rbd bench" or "fio" (choose between sequential and random write workload at random, choose I/O size at random) in the background
    sleep for 10-100 seconds at random
    SIGKILL "rbd bench" or "fio"
    assert that the cache is dirty ("rbd status | grep image_cache_state" should produce output)
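
A rough bash sketch of that loop (using rbd bench only, for brevity), assuming pool test1 and image image1; the I/O sizes and totals are illustrative, and the grep pattern is based on the rbd status output shown earlier in this BZ:

#!/bin/bash
# Sketch of the verification loop described above (illustrative, not the attached script).
POOL=test1
IMAGE=image1

rbd create --size 1G ${POOL}/${IMAGE}

for i in $(seq 0 10000); do
    # choose sequential vs random write and an I/O size at random
    if (( RANDOM % 2 )); then PATTERN=seq; else PATTERN=rand; fi
    SIZES=(4K 16K 64K 256K)
    IOSIZE=${SIZES[RANDOM % ${#SIZES[@]}]}

    # start the write workload in the background
    rbd bench --io-type write --io-pattern ${PATTERN} --io-size ${IOSIZE} \
        --io-total 10G ${POOL}/${IMAGE} &
    BENCH_PID=$!

    # sleep for 10-100 seconds at random, then SIGKILL the workload
    sleep $(( (RANDOM % 91) + 10 ))
    kill -9 ${BENCH_PID}
    wait ${BENCH_PID} 2>/dev/null

    # the cache should be left dirty: rbd status should report an image cache state
    if ! rbd status ${POOL}/${IMAGE} | grep -q "Image cache state"; then
        echo "iteration ${i}: no image cache state reported" >&2
        exit 1
    fi
done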

Comment 29 errata-xmlrpc 2022-04-04 10:20:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1174

