Bug 1542135

Summary: PVC stat metrics from kubelet are persistent.
Product: OpenShift Container Platform Reporter: Hemant Kumar <hekumar>
Component: Storage Assignee: Hemant Kumar <hekumar>
Status: CLOSED ERRATA QA Contact: Wenqi He <wehe>
Severity: low Docs Contact:
Priority: unspecified    
Version: 3.9.0 CC: aos-bugs, aos-storage-staff, bchilds, hchiramm, hekumar, jhou, jsafrane, kramdoss, smunilla, wehe
Target Milestone: ---   
Target Release: 3.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-17 06:42:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1553047    

Description Hemant Kumar 2018-02-05 16:23:37 UTC
Description of problem:

PVC stat metrics are emitted from a node even when the PVC is not in use on that node. This causes confusion and makes it hard to track where the PVC is actually being used.



https://github.com/kubernetes/kubernetes/issues/57686

Comment 1 Hemant Kumar 2018-02-05 16:25:38 UTC
Expected behavior: when a PVC is unmounted from a node, that node should stop emitting the metrics specific to that PVC.
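
To make the expected behavior concrete, here is a minimal sketch in Python. It is not the kubelet's implementation; the class `VolumeStatsRegistry` and its method names are hypothetical and exist only to illustrate the rule that a node emits stats for a PVC only while that PVC is mounted there.

```python
class VolumeStatsRegistry:
    """Hypothetical per-node registry of PVC usage stats (illustration only)."""

    def __init__(self):
        # Maps (namespace, pvc_name) -> used bytes for PVCs mounted on this node.
        self._used_bytes = {}

    def on_mount(self, namespace, pvc_name, used_bytes):
        # Start (or refresh) reporting for a PVC that is mounted on this node.
        self._used_bytes[(namespace, pvc_name)] = used_bytes

    def on_unmount(self, namespace, pvc_name):
        # Drop the entry so the node stops emitting stale stats for this PVC.
        self._used_bytes.pop((namespace, pvc_name), None)

    def render(self):
        # Produce one metric line per PVC that is still mounted.
        return [
            'kubelet_volume_stats_used_bytes{namespace="%s",persistentvolumeclaim="%s"} %s'
            % (namespace, pvc_name, value)
            for (namespace, pvc_name), value in sorted(self._used_bytes.items())
        ]
```

With this shape, the reported bug corresponds to an `on_unmount` that never removes the entry, so `render()` keeps returning lines for a PVC the node no longer uses.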

Comment 2 Jan Safranek 2018-02-16 08:57:25 UTC
Origin PR: https://github.com/openshift/origin/pull/18637

Comment 3 Humble Chirammal 2018-03-13 06:52:37 UTC
Hemant, this is already available in the OCP 3.9 build, isn't it? If so, we need to move this bug to ON_QA for verification. We have a tracker bug in CNS for this Bugzilla, hence the request.

Comment 10 Wenqi He 2018-04-17 02:13:46 UTC
Based on comment #7, this bug is verified. Thanks.

Comment 12 Wenqi He 2018-04-20 10:30:02 UTC
Tested on the following version:
openshift v3.9.24
kubernetes v1.9.1+a0ce1bc657

# uname -a
Linux 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.4 (Maipo)

However, the kubelet_volume_stats metrics did not show up.

# oc get sc
NAME                 PROVISIONER            AGE
standard (default)   kubernetes.io/cinder   8h
# oc get pvc
NAME      STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfsc      Bound     pvc-c7b91603-4474-11e8-aac1-fa163e7c3848   1Gi        RWO            standard       36m
# oc get pods
NAME      READY     STATUS    RESTARTS   AGE
nfs       1/1       Running   0          36m
# oc get pods nfs -o yaml | grep node
  nodeName: 172.16.120.13
  nodeSelector:
    node-role.kubernetes.io/compute: "true"
# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus -n openshift-metrics)" https://172.16.120.13:10250/metrics | grep volume_stats
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  112k  100  112k    0     0   916k      0 --:--:-- --:--:-- --:--:--  922k

I am not sure whether the issue is related to my environment; I will build a new one and retest.

Comment 14 Wenqi He 2018-04-23 09:40:59 UTC
The issue in comment #12 occurs on Cinder, and another bug tracks it.
Retested on AWS with the following version:
# oc version
openshift v3.9.24
kubernetes v1.9.1+a0ce1bc657

# uname -a
Linux ip-172-18-11-178.ec2.internal 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)

This bug is fixed: when the PVC is deleted, its metrics are gone. Before deleting the PVC:
# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus -n openshift-metrics)" https://172.18.8.29:10250/metrics | grep volume_stats
kubelet_volume_stats_used_bytes{namespace="o7ahm",persistentvolumeclaim="pvc-o7ahm"} 2.625536e+06
...

After deleting the PVC, the volume_stats metrics for pvc-o7ahm disappear.
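
The before/after check above can also be scripted. The following is a rough sketch in Python, assuming the kubelet `/metrics` output has already been saved as text (for example from the curl command). The function name `pvcs_with_volume_stats` is made up for this example, and the parsing is a simple regular expression that only handles lines shaped like the sample in this comment, not the full Prometheus text format.

```python
import re

# Matches samples shaped like:
# kubelet_volume_stats_used_bytes{namespace="o7ahm",persistentvolumeclaim="pvc-o7ahm"} 2.625536e+06
_SAMPLE = re.compile(r'^(kubelet_volume_stats_\w+)\{([^}]*)\}\s+(\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def pvcs_with_volume_stats(metrics_text):
    """Return the set of (namespace, pvc) pairs that have kubelet_volume_stats samples."""
    found = set()
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if not match:
            continue  # skips comment lines and unrelated metrics
        labels = dict(_LABEL.findall(match.group(2)))
        if "persistentvolumeclaim" in labels:
            found.add((labels.get("namespace", ""), labels["persistentvolumeclaim"]))
    return found
```

Running it on the output captured before and after deleting the PVC should show `("o7ahm", "pvc-o7ahm")` in the first set and not in the second.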

Comment 17 errata-xmlrpc 2018-05-17 06:42:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1566