Bug 1854503
Summary: [tracker-rhcs-bug 1848503] cephfs: Provide alternatives to increase the total cephfs subvolume snapshot counts to greater than the current 400 across a Cephfs volume

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Humble Chirammal <hchiramm> |
| Component: | csi-driver | Assignee: | Humble Chirammal <hchiramm> |
| Status: | CLOSED ERRATA | QA Contact: | Avi Liani <alayani> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.6 | CC: | alayani, ceph-eng-bugs, etamir, kramdoss, madam, muagarwa, nberry, ocs-bugs, pdonnell, ratamir, rcyriac, sostapov, srangana, sweil |
| Target Milestone: | --- | Keywords: | AutomationBackLog, Reopened, Tracking |
| Target Release: | OCS 4.6.0 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1848503 | Environment: | |
| Last Closed: | 2020-12-17 06:22:31 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1848503 | | |
| Bug Blocks: | | | |
Comment 4
Humble Chirammal
2020-09-02 16:40:36 UTC
Running the test a few times gave the same result: only 100 snapshots could be created. Creation of snapshot number 101 failed; the snapshot was never created and stayed in the following status:

```
Spec:
  Source:
    Persistent Volume Claim Name:  pvc-test-8302e05ee60644b0a8691fa88e337d3c
  Volume Snapshot Class Name:      ocs-storagecluster-cephfsplugin-snapclass
Status:
  Bound Volume Snapshot Content Name:  snapcontent-753b5e90-462f-49da-b2ea-3df685394060
  Ready To Use:                        false
Events:
  Type    Reason            Age  From                 Message
  ----    ------            ---- ----                 -------
  Normal  CreatingSnapshot  11m  snapshot-controller  Waiting for a snapshot namespace-test-c45840f65bb74f2593db7b9d9b36d349/pvc-snap-101-8302e05ee60644b0a8691fa88e337d3c to be created by the CSI driver.
```

Tested on the following versions.

Driver versions:

OCP versions:

```
clientVersion:
  buildDate: "2020-10-08T07:17:21Z"
  compiler: gc
  gitCommit: 074039a0a9c137967fba3e667b9849d60e5054d8
  gitTreeState: clean
  gitVersion: openshift-clients-4.6.0-202006250705.p0-162-g074039a0a
  goVersion: go1.15.0
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.6.0-0.nightly-2020-10-22-034051
releaseClientVersion: 4.6.0-0.nightly-2020-10-10-041109
serverVersion:
  buildDate: "2020-10-08T15:58:07Z"
  compiler: gc
  gitCommit: d59ce3486ae3ca3a0c36e5498e56f51594076596
  gitTreeState: clean
  gitVersion: v1.19.0+d59ce34
  goVersion: go1.15.0
  major: "1"
  minor: "19"
  platform: linux/amd64
```

Cluster version:

```
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-10-22-034051   True        False         4d1h    Cluster version is 4.6.0-0.nightly-2020-10-22-034051
```

OCS versions:

```
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-141.ci   OpenShift Container Storage   4.6.0-141.ci              Succeeded
```

Rook versions:

```
rook: 4.6-67.afaf3353.release_4.6
go: go1.15.0
```

Ceph versions:

```
ceph version 14.2.8-111.el8cp (2e6029d57bc594eceba4751373da6505028c2650) nautilus (stable)
```

RHCOS versions:

```
NAME              STATUS   ROLES    AGE    VERSION           INTERNAL-IP    EXTERNAL-IP    OS-IMAGE                                                        KERNEL-VERSION                     CONTAINER-RUNTIME
compute-0         Ready    worker   4d1h   v1.19.0+d59ce34   10.1.160.92    10.1.160.92    Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
compute-1         Ready    worker   4d1h   v1.19.0+d59ce34   10.1.160.105   10.1.160.105   Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
compute-2         Ready    worker   4d1h   v1.19.0+d59ce34   10.1.160.103   10.1.160.103   Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
control-plane-0   Ready    master   4d1h   v1.19.0+d59ce34   10.1.160.99    10.1.160.99    Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
control-plane-1   Ready    master   4d1h   v1.19.0+d59ce34   10.1.160.36    10.1.160.36    Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
control-plane-2   Ready    master   4d1h   v1.19.0+d59ce34   10.1.160.97    10.1.160.97    Red Hat Enterprise Linux CoreOS 46.82.202010091720-0 (Ootpa)    4.18.0-193.24.1.el8_2.dt1.x86_64   cri-o://1.19.0-20.rhaos4.6.git97d715e.el8
```

Collecting must-gather from this test; I will update when it is ready.

I am inclined towards closing this. I don't think there is anything for QE to verify, because on the OCS side we do not expose any parameter that can manage the snapshot count. A doc bug has already been created for this (https://bugzilla.redhat.com/show_bug.cgi?id=1891757) and we can update it there. That said, I am not sure whether we should mention a Ceph-internal parameter in the OCS documentation. Closing; please feel free to reopen (with steps to verify) if anyone thinks otherwise.

@Humble - Based on the above comments, it appears that the intention of this bug is to allow the creation of 512 snapshots in OCS. In other words, the mds_max_snaps_per_dir value needs to be configured to 512 in OCS out of the box. If so, we still don't have this in. What do you think? P.S.: the ask is to have 512 snapshots (https://issues.redhat.com/browse/KNIP-661), while the default is now 100. We need @eran's confirmation on this.

Eran, please take a look at comments 14 and 15 and ack/nack the limit of CephFS snapshots to 100 only.

Thanks Eran. Moving back to ON_QA; please test according to https://bugzilla.redhat.com/show_bug.cgi?id=1854503#c23 and update the same in the doc BZ opened to address it.

According to https://bugzilla.redhat.com/show_bug.cgi?id=1854503#c23, this BZ is verified. I ran a test on RBD and successfully created 512 snapshots. For CephFS, see https://bugzilla.redhat.com/show_bug.cgi?id=1854503#c6: 100 snapshots is the limit. The versions used for the RBD test are the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1854503#c6.

Marking requires_doc_text as '-' because we already have a separate doc BZ to address the same.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605
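For context, the cap discussed in this bug is the Ceph MDS option `mds_max_snaps_per_dir` (default 100). Below is a minimal sketch, not an official OCS procedure, of how an administrator could raise it to 512 from the Rook toolbox. The `rook-ceph-tools` deployment name and the `openshift-storage` namespace are assumptions based on a typical OCS install, and whether OCS should carry such an override out of the box is exactly the open question in the comments above.

```shell
# Sketch only: assumes the Rook toolbox is deployed as
# deploy/rook-ceph-tools in the openshift-storage namespace.

# Raise the per-directory snapshot cap for all MDS daemons (default 100).
kubectl -n openshift-storage exec deploy/rook-ceph-tools -- \
  ceph config set mds mds_max_snaps_per_dir 512

# Confirm the override is recorded in the cluster configuration.
kubectl -n openshift-storage exec deploy/rook-ceph-tools -- \
  ceph config dump | grep mds_max_snaps_per_dir
```

When the cap is hit, snapshot creation fails on the Ceph side, which surfaces in OCS as a VolumeSnapshot stuck with `Ready To Use: false`, as in the test output quoted above.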