Bug 2117093

Summary: OSD nodes are hung after initiating random writes to the pool
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: skanta
Component: RADOS
Assignee: Nitzan mordechai <nmordech>
Status: NEW
QA Contact: skanta
Severity: high
Priority: unspecified
Version: 6.0
CC: akupczyk, amathuri, bhubbard, ceph-eng-bugs, cephqe-warriors, choffman, ksirivad, lflores, nojha, pdhange, rfriedma, rzarzyns, sseshasa, vereddy, vumrao
Target Milestone: ---
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Type: Bug

Attachments: Top output on OSD node

Description skanta 2022-08-10 00:54:54 UTC
Created attachment 1904589 [details]
Top output on OSD node

Description of problem:

   OSD nodes hang and the OSDs on them are marked down after performing random write operations on the cluster. CPU utilization on the affected OSD nodes is also noticeably high (see the attached top output).

[ceph: root@ceph-skanta-yi6hpl-node1-installer /]# ceph osd tree
ID   CLASS  WEIGHT   TYPE NAME                           STATUS  REWEIGHT  PRI-AFF
 -1         0.48798  root default                                                 
 -9         0.09760      host ceph-skanta-yi6hpl-node10                           
 12    hdd  0.02440          osd.12                        down   1.00000  1.00000
 14    hdd  0.02440          osd.14                        down   1.00000  1.00000
 16    hdd  0.02440          osd.16                        down   1.00000  1.00000
 18    hdd  0.02440          osd.18                        down   1.00000  1.00000
 -7         0.09760      host ceph-skanta-yi6hpl-node3                            
  2    hdd  0.02440          osd.2                           up   1.00000  1.00000
  5    hdd  0.02440          osd.5                           up   1.00000  1.00000
  8    hdd  0.02440          osd.8                           up   1.00000  1.00000
 11    hdd  0.02440          osd.11                          up   1.00000  1.00000
 -3         0.09760      host ceph-skanta-yi6hpl-node4                            
  1    hdd  0.02440          osd.1                         down   1.00000  1.00000
  3    hdd  0.02440          osd.3                         down   1.00000  1.00000
  7    hdd  0.02440          osd.7                         down   1.00000  1.00000
 10    hdd  0.02440          osd.10                        down   1.00000  1.00000
 -5         0.09760      host ceph-skanta-yi6hpl-node5                            
  0    hdd  0.02440          osd.0                         down   1.00000  1.00000
  4    hdd  0.02440          osd.4                         down   1.00000  1.00000
  6    hdd  0.02440          osd.6                         down   1.00000  1.00000
  9    hdd  0.02440          osd.9                         down   1.00000  1.00000
-11         0.09760      host ceph-skanta-yi6hpl-node9                            
 13    hdd  0.02440          osd.13                        down         0  1.00000
 15    hdd  0.02440          osd.15                        down         0  1.00000
 17    hdd  0.02440          osd.17                          up   1.00000  1.00000
 19    hdd  0.02440          osd.19                          up   1.00000  1.00000
[ceph: root@ceph-skanta-yi6hpl-node1-installer /]#

 
[ceph: root@ceph-skanta-yi6hpl-node1-installer /]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    350 GiB  183 GiB  167 GiB   167 GiB      47.70
TOTAL  350 GiB  183 GiB  167 GiB   167 GiB      47.70
 
--- POOLS ---
POOL                    ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                     1    1  897 KiB        2  1.8 MiB      0    109 GiB
cephfs.cephfs_Qos.meta   2   16  1.2 MiB       25  2.6 MiB      0    109 GiB
cephfs.cephfs_Qos.data   3  512  102 GiB   17.39k  206 GiB  48.46    108 GiB
scrub_pool               4   32      0 B        0      0 B      0     73 GiB
recovery_pool            5   32      0 B        0      0 B      0     73 GiB
[ceph: root@ceph-skanta-yi6hpl-node1-installer /]#
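
For triage, the following is a minimal sketch of additional diagnostics that could be collected; the <fsid> in the systemd unit names is a placeholder, and osd.12 is used only as an example of one of the down OSDs.

# From the admin node (inside the cephadm shell): overall cluster state and down OSDs
ceph -s
ceph health detail
ceph osd tree down

# On a hung OSD node: daemon state and recent logs for one of the down OSDs
systemctl status ceph-<fsid>@osd.12.service
journalctl -u ceph-<fsid>@osd.12.service --since "1 hour ago"

# Kernel-level signs of hung tasks, memory pressure, or CPU saturation
dmesg -T | tail -n 100
top -b -n 1 | head -n 40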


 

Version-Release number of selected component (if applicable):
[ceph: root@ceph-skanta-yi6hpl-node1-installer /]# ceph -v
ceph version 17.2.2-1.el9cp (27ec6f23923e162bf6e6e48c8b789cf18fee6f31) quincy (stable)
[ceph: root@ceph-skanta-yi6hpl-node1-installer /]#

How reproducible:


Steps to Reproduce:
1. Configure cluster

2. Perform random writes by using fio.
    Command:
    fio --directory=/mnt/cephfs_Qos -direct=1 -iodepth 64 -thread -rw=randwrite \
        --end_fsync=0 -ioengine=libaio -bs=4096 -size=16384M --norandommap \
        -numjobs=1 -runtime=600 --time_based --invalidate=0 -group_reporting \
        -name=ceph_fs_Qos_4M \
        --write_iops_log=/tmp/cephfs/Fio/output.0 \
        --write_bw_log=/tmp/cephfs/Fio/output.0 \
        --write_lat_log=/tmp/cephfs/Fio/output.0 \
        --log_avg_msec=100 \
        --write_hist_log=/tmp/cephfs/Fio/output.0 \
        --output-format=json,normal > /tmp/cephfs/Fio/output.0
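
The fio command above assumes cephfs_Qos is already mounted at /mnt/cephfs_Qos and that the log directory /tmp/cephfs/Fio exists. A minimal preparation sketch is shown below; the monitor address, CephX user, and secret-file path are assumptions and must be adjusted for the actual cluster.

# Create the directory that receives the fio log/output files
mkdir -p /tmp/cephfs/Fio

# Mount the cephfs_Qos file system with the kernel client
# (<mon-host>, the "admin" user, and the secret-file path are placeholders;
#  older kernels use mds_namespace= instead of fs=)
mkdir -p /mnt/cephfs_Qos
mount -t ceph <mon-host>:6789:/ /mnt/cephfs_Qos \
      -o name=admin,secretfile=/etc/ceph/admin.secret,fs=cephfs_Qos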