Bug 2282533
Summary: | [CephFS - Consistency Group] - quiesce may time out or crash due to an interlock with exporting and other inter-rank operations | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Leonid Usov <lusov> |
Component: | CephFS | Assignee: | Leonid Usov <lusov> |
Status: | CLOSED ERRATA | QA Contact: | sumr |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.1 | CC: | ceph-eng-bugs, cephqe-warriors, jcaratza, ngangadh, sumr, tserlin |
Target Milestone: | --- | Flags: | ngangadh: needinfo+ |
Target Release: | 7.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | ceph-18.2.1-193 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2024-06-13 14:32:56 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Leonid Usov
2024-05-22 14:04:34 UTC
Test Plan:
1. Run functional and systemic regression tests for CG quiesce
2. On repeat, Perform the below ops,
   > Set authrules to subvolume, pin subvolume test_dir to a mds rank, perform dir rename
   > Parallel quiesce calls to same set
3. Verify if debug params to quiesce cmds have been removed

Verified fix on ceph build 18.2.1-194.el9cp.

FUNCTIONAL REGRESSION TESTS
---------------------------
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-BN7JSV/

SYSTEMIC REGRESSION TESTS
-------------------------
SCALE TEST - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-68WUGA
STRESS TEST: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1QDVX1/cg_snap_system_test_0.log

PERFORM FS OPS in parallel to Quiesce
-------------------------------------
Set authrules to subvolume, pin subvolume test_dir to a mds rank, perform dir rename : http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-3XROAN

Parallel quiesce calls to same set
----------------------------------
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-Z1HB2V/cg_snap_test_0.log

Verify debug param ?q=<secs> not working

[root@ceph-sumar-regression-narjcq-node8 ~]# ceph fs quiesce cephfs --set-id cg_dbg_params1 sv1?q=5 sv2?q=5 sv3?q=5 --timeout 300 --expiration 300
{
    "epoch": 290,
    "leader": 44133,
    "set_version": 2234,
    "sets": {
        "cg_dbg_params1": {
            "version": 2234,
            "age_ref": 0.0,
            "state": {
                "name": "QUIESCING",
                "age": 0.0
            },
            "timeout": 300.0,
            "expiration": 300.0,
            "members": {
                "file:/volumes/_nogroup/sv3/02b68c49-4327-4309-9417-cd85f629f8a5?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCING",
                        "age": 0.0
                    }
                },
                "file:/volumes/_nogroup/sv2/a7cc3735-6a4d-4ebd-9e67-b60bf9b80e10?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCING",
                        "age": 0.0
                    }
                },
                "file:/volumes/_nogroup/sv1/1936ca82-f30e-4e88-94f6-fed218be72d2?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCING",
                        "age": 0.0
                    }
                }
            }
        }
    }
}
[root@ceph-sumar-regression-narjcq-node8 ~]# ceph fs quiesce cephfs --query --set-id cg_dbg_params1
{
    "epoch": 290,
    "leader": 44133,
    "set_version": 2237,
    "sets": {
        "cg_dbg_params1": {
            "version": 2237,
            "age_ref": 0.0,
            "state": {
                "name": "QUIESCED",
                "age": 2.5
            },
            "timeout": 300.0,
            "expiration": 300.0,
            "members": {
                "file:/volumes/_nogroup/sv3/02b68c49-4327-4309-9417-cd85f629f8a5?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCED",
                        "age": 2.5
                    }
                },
                "file:/volumes/_nogroup/sv2/a7cc3735-6a4d-4ebd-9e67-b60bf9b80e10?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCED",
                        "age": 2.5
                    }
                },
                "file:/volumes/_nogroup/sv1/1936ca82-f30e-4e88-94f6-fed218be72d2?q=5": {
                    "excluded": false,
                    "state": {
                        "name": "QUIESCED",
                        "age": 2.5
                    }
                }
            }
        }
    }
}
[root@ceph-sumar-regression-narjcq-node8 ~]#

Marking the BZ as Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925
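
For anyone repeating the debug-param verification above, the state check on the `--query` output could be scripted rather than read by eye. The following is only a minimal sketch, assuming the JSON has the same shape as the output pasted in this report. The function name `all_members_in_state` and the trimmed `sample` payload (two of the three members) are illustrative and not part of Ceph.

```python
import json


def all_members_in_state(status_json: str, set_id: str, expected: str = "QUIESCED") -> bool:
    """Return True if the quiesce set and every one of its members report `expected`."""
    status = json.loads(status_json)
    qset = status["sets"][set_id]
    # The set-level state must match first.
    if qset["state"]["name"] != expected:
        return False
    # Then every member (keyed by its file: path) must match as well.
    return all(m["state"]["name"] == expected for m in qset["members"].values())


# Trimmed copy of the --query output from this report (illustration only).
sample = """
{
  "epoch": 290, "leader": 44133, "set_version": 2237,
  "sets": {
    "cg_dbg_params1": {
      "version": 2237, "age_ref": 0.0,
      "state": {"name": "QUIESCED", "age": 2.5},
      "timeout": 300.0, "expiration": 300.0,
      "members": {
        "file:/volumes/_nogroup/sv2/a7cc3735-6a4d-4ebd-9e67-b60bf9b80e10?q=5": {
          "excluded": false, "state": {"name": "QUIESCED", "age": 2.5}
        },
        "file:/volumes/_nogroup/sv1/1936ca82-f30e-4e88-94f6-fed218be72d2?q=5": {
          "excluded": false, "state": {"name": "QUIESCED", "age": 2.5}
        }
      }
    }
  }
}
"""

if __name__ == "__main__":
    print(all_members_in_state(sample, "cg_dbg_params1"))
```

In practice the string would come from capturing the command's stdout instead of the embedded sample.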