Bug 2114615

Summary: pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vikhyat Umrao <vumrao>
Component: Ceph-Mgr Plugins
Sub component: volumes
Assignee: Kotresh HR <khiremat>
QA Contact: Amarnath <amk>
Docs Contact: Akash Raj <akraj>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: aivaraslaimikis, akraj, bhubbard, gfarnum, gjose, kelwhite, khiremat, mcaldeir, ngangadh, pdhange, pdonnell, saraut, sostapov, tserlin, vdas, vereddy, vshankar
Version: 5.1
Target Milestone: ---
Target Release: 6.1z2
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-17.2.6-120.el9cp
Doc Type: If docs needed, set a value
Clones: 2234610 (view as bug list)
Last Closed: 2023-10-12 16:34:25 UTC
Type: Bug
Bug Blocks: 2234610, 2235257    

Description Vikhyat Umrao 2022-08-02 23:54:21 UTC
Description of problem:
pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread

https://tracker.ceph.com/issues/51177
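
The general idea of the fix (a minimal illustrative sketch only, assuming a thread-pool based wrapper; this is not the actual mgr/volumes code, and call_with_timeout/blocking_fn are hypothetical names) is to run calls that may block inside libcephfs, such as ceph_mount against an unavailable MDS, on a helper thread bounded by a timeout, so the mgr command path returns ETIMEDOUT instead of hanging:

    from concurrent.futures import ThreadPoolExecutor, TimeoutError
    import errno

    _executor = ThreadPoolExecutor(max_workers=1)

    def call_with_timeout(blocking_fn, timeout_sec):
        # Run the potentially blocking libcephfs call on a helper thread so
        # the mgr command/finisher thread is never blocked indefinitely.
        future = _executor.submit(blocking_fn)
        try:
            return 0, future.result(timeout=timeout_sec)
        except TimeoutError:
            # Mirror the observed behaviour: report ETIMEDOUT rather than hang.
            return -errno.ETIMEDOUT, "error calling ceph_mount"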

Comment 17 Scott Ostapovicz 2023-07-17 14:38:10 UTC
Missed the window for 6.1 z1. Retargeting to 6.1 z2.

Comment 34 Amarnath 2023-09-21 17:22:07 UTC
Hi Kotresh,

Verified by following the steps below:

1. Created a CephFS volume.
2. Triggered ceph fs fail.
3. Tried to create subvolumes after the FS was down; subvolume creation blocked and eventually returned ETIMEDOUT.
4. In another window, triggered ceph iostat. The ceph iostat command ran without getting stuck; in older builds, ceph iostat would hang while subvolume creation was stuck.

[root@ceph-amk-bz-xvxd2m-node7 ~]# ceph fs status
cephfs - 2 clients
======
RANK  STATE                    MDS                       ACTIVITY     DNS    INOS   DIRS   CAPS  
 0    active  cephfs.ceph-amk-bz-xvxd2m-node6.erpcbe  Reqs:    0 /s  46.0k  46.0k  8059   36.1k  
 1    active  cephfs.ceph-amk-bz-xvxd2m-node4.uuyuvw  Reqs:    0 /s  14.3k  14.3k  2376   12.0k  
       POOL           TYPE     USED  AVAIL  
cephfs.cephfs.meta  metadata  1273M  53.8G  
cephfs.cephfs.data    data    3586M  53.8G  
             STANDBY MDS                
cephfs.ceph-amk-bz-xvxd2m-node5.tetxmg  
MDS version: ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)
[root@ceph-amk-bz-xvxd2m-node7 ~]# ceph fs fail cephfs
cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
[root@ceph-amk-bz-xvxd2m-node7 ~]# ceph fs subvolume create cephfs subvol_1;ceph fs subvolume create cephfs subvol_2;ceph fs subvolume create cephfs subvol_3;ceph fs subvolume create cephfs subvol_4
Error ETIMEDOUT: error calling ceph_mount

[root@ceph-amk-bz-xvxd2m-node7 ~]# ceph iostat
+---------------------------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+
|                            Read |                           Write |                           Total |                       Read IOPS |                      Write IOPS |                      Total IOPS |
+---------------------------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+
|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

|                           0 B/s |                           0 B/s |                           0 B/s |                               0 |                               0 |                               0 |

^CInterrupted
[root@ceph-amk-bz-xvxd2m-node7 ~]# ceph versions
{
    "mon": {
        "ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)": 2
    },
    "osd": {
        "ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)": 12
    },
    "mds": {
        "ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)": 3
    },
    "overall": {
        "ceph version 17.2.6-145.el9cp (9e560e1f3b30ccca0e2b0af8fd5af0e01d8d7fbe) quincy (stable)": 20
    }
}
[root@ceph-amk-bz-xvxd2m-node7 ~]#

Does this verification hold good for this BZ?

Regards,
Amarnath

Comment 36 Kotresh HR 2023-09-25 06:59:16 UTC
(In reply to Amarnath from comment #34)
> [verification steps and command output quoted from comment #34 above]
>
> Does this verification hold good for this BZ?

Looks good.

Comment 38 errata-xmlrpc 2023-10-12 16:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 6.1 security, enhancement, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:5693