Bug 1540039 - Prometheus metric "storage_operation_errors_total" does not work
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assigned To: Hemant Kumar
QA Contact: chaoyang
Reported: 2018-01-30 01:53 EST by chaoyang
Modified: 2018-03-28 10:24 EDT
Last Closed: 2018-03-28 10:23:55 EDT
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0489 None None None 2018-03-28 10:24 EDT

Description chaoyang 2018-01-30 01:53:00 EST
Description of problem:
The Prometheus metric "storage_operation_errors_total" does not work.

Version-Release number of selected component (if applicable):
oc v3.9.0-0.31.0
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-10-56.ec2.internal:443
openshift v3.9.0-0.31.0
kubernetes v1.9.1+a0ce1bc657


How reproducible:
Always

Steps to Reproduce:
1. Create a dynamic PVC and record the EBS volume ID.
2. Attach the volume to another instance from the AWS web console.
3. Create a pod that uses the PVC above.
4. A metric entry like the following should appear:
storage_operation_errors_total { volume_plugin = "aws-ebs", operation_name = "volume_attach" }

Actual results:
No metric entry is displayed. The metric does not work.

Expected results:
One metric entry should be displayed in the Prometheus web console.

Additional info:
Comment 1 Hemant Kumar 2018-02-01 10:12:28 EST
@chaoyang - Did you see any storage errors when this happened? If there were no errors, this metric will not be emitted. Once it has been emitted, it will continue to be emitted afterwards.
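
The behavior described in this comment can be illustrated with a minimal sketch. It is a toy model in plain Python, not the Kubernetes or OpenShift implementation: the `LabeledCounter` class and its methods are hypothetical names introduced only for illustration. It assumes a labeled counter exposes a series for a label combination only after that combination has been incremented at least once.

```python
class LabeledCounter:
    """Toy labeled counter (hypothetical, for illustration only).

    A label combination is exposed only after its first increment,
    and it stays exposed on every later scrape.
    """

    def __init__(self, name):
        self.name = name
        self._values = {}  # maps a sorted tuple of (label, value) pairs to a count

    def inc(self, **labels):
        # Sort the labels so the same combination always maps to the same key.
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0) + 1

    def expose(self):
        # Render one text line per label combination seen so far.
        lines = []
        for key, value in sorted(self._values.items()):
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            lines.append(f"{self.name}{{{label_text}}} {value}")
        return lines


errors = LabeledCounter("storage_operation_errors_total")
before = errors.expose()  # no errors yet, so nothing is exposed

# Simulate two failed attach operations.
for _ in range(2):
    errors.inc(operation_name="volume_attach",
               volume_plugin="kubernetes.io/aws-ebs")
after = errors.expose()
print(before, after)
```

Under this assumption, an empty scrape result only means that no error has been recorded yet for that label combination, which is why the absence of the metric is not by itself evidence of a bug.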
Comment 2 chaoyang 2018-02-01 21:00:34 EST
@hekumar - I saw an error message related to "delete volume", but no error message like "attach volume".
Comment 3 Hemant Kumar 2018-02-02 14:25:21 EST
I tried reproducing this in an AWS cluster and got the following metric after the attach operation failed a bunch of times:


storage_operation_errors_total{operation_name="volume_attach",volume_plugin="kubernetes.io/aws-ebs"} 2


And then I deleted a PVC which was being actively used by a pod and got:

storage_operation_errors_total{operation_name="volume_delete",volume_plugin="kubernetes.io/aws-ebs"} 2


So, I can't reproduce this problem. Can you post logs showing that errors did indeed happen and that the metrics were not recorded?
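
When checking scraped metrics output for these entries, a small filter can help. The following is a rough sketch, assuming the metrics text has already been fetched as a string; the helper names `parse_sample` and `find_errors` are hypothetical, and the parser is deliberately simplified (it does not handle escaped quotes or `}` inside label values). It uses the two samples quoted above as input.

```python
import re

# Simplified patterns for lines of the form: name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_sample(line):
    """Return (name, labels, value) for a sample line, or None if it does not match."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        return None
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


def find_errors(text, operation_name):
    """Collect storage_operation_errors_total samples for one operation."""
    hits = []
    for line in text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip comment lines such as "# TYPE ..."
        parsed = parse_sample(line)
        if parsed is None:
            continue
        name, labels, value = parsed
        if (name == "storage_operation_errors_total"
                and labels.get("operation_name") == operation_name):
            hits.append((labels, value))
    return hits


sample_text = '''
storage_operation_errors_total{operation_name="volume_attach",volume_plugin="kubernetes.io/aws-ebs"} 2
storage_operation_errors_total{operation_name="volume_delete",volume_plugin="kubernetes.io/aws-ebs"} 2
'''
attach_hits = find_errors(sample_text, "volume_attach")
print(attach_hits)
```

The sketch filters on `operation_name` only. The plugin label in the output above is "kubernetes.io/aws-ebs", while the reproduction steps wrote "aws-ebs", so an exact match on that shorter plugin name would find nothing.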
Comment 7 Hemant Kumar 2018-02-05 14:01:48 EST
https://github.com/openshift/origin/pull/18442
Comment 9 chaoyang 2018-02-22 01:15:35 EST
It passed on:
oc v3.9.0-0.47.0
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-12-213.ec2.internal:8443
openshift v3.9.0-0.47.0
kubernetes v1.9.1+a0ce1bc657

# TYPE storage_operation_errors_total counter
storage_operation_errors_total{operation_name="volume_attach",volume_plugin="kubernetes.io/aws-ebs"} 8
Comment 12 errata-xmlrpc 2018-03-28 10:23:55 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
