Bug 1631667 - [free-stg] write error "file already closed" for prometheus container
Summary: [free-stg] write error "file already closed" for prometheus container
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Website
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Abhishek Gupta
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-21 08:51 UTC by Junqi Zhao
Modified: 2023-05-15 19:01 UTC

Fixed In Version: 3.11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments:
prometheus container logs (7.25 MB, text/plain)
2018-09-21 08:51 UTC, Junqi Zhao

Description Junqi Zhao 2018-09-21 08:51:28 UTC
Created attachment 1485428
prometheus container logs

Description of problem:
The prometheus container has recently started logging the error
"write data/wal/000001: file already closed". A few days ago this error did not occur.
$ oc -n openshift-devops-monitor logs -c prometheus prometheus-0 
level=warn ts=2018-09-21T05:06:32.477030856Z caller=scrape.go:717 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://172.31.79.85:10250/metrics/cadvisor msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:32.478906866Z caller=scrape.go:713 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://172.31.73.38:10250/metrics/cadvisor msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:32.479216896Z caller=scrape.go:717 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://172.31.73.38:10250/metrics/cadvisor msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:32.593700506Z caller=scrape.go:713 component="scrape manager" scrape_pool=kubernetes-nodes target=https://172.31.72.79:10250/metrics msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:32.594301784Z caller=scrape.go:717 component="scrape manager" scrape_pool=kubernetes-nodes target=https://172.31.72.79:10250/metrics msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:32.905714099Z caller=manager.go:402 component="rule manager" group="Node rules" msg="rule sample appending failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
level=warn ts=2018-09-21T05:06:33.215831934Z caller=scrape.go:713 component="scrape manager" scrape_pool=kubernetes-nodes target=https://172.31.77.169:10250/metrics msg="append failed" err="WAL log samples: log series: write data/wal/000001: file already closed"
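To see which scrape pools and components are affected, the log lines above can be tallied. The following is a minimal sketch, assuming the container output is in the same key=value (logfmt) form as the excerpt; the names parse_logfmt and count_wal_errors are illustrative only and are not part of Prometheus or oc.

```python
import re
from collections import Counter

# Error text taken from the log excerpt in this report.
ERR = "file already closed"

# One key=value pair; the value is either double-quoted or a run of non-spaces.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Return the key/value pairs of one log line, with surrounding quotes removed."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def count_wal_errors(lines):
    """Count lines whose err field contains ERR, grouped by scrape_pool.

    Lines without a scrape_pool (for example rule manager warnings)
    are grouped by their component field instead.
    """
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if ERR in fields.get("err", ""):
            key = fields.get("scrape_pool") or fields.get("component", "unknown")
            counts[key] += 1
    return counts
```

One possible use is to save the output of the oc logs command to a file and pass the open file object to count_wal_errors, which accepts any iterable of lines.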


Version-Release number of selected component (if applicable):

OpenShift Master:
    v3.11.7 
Kubernetes Master:
    v1.11.0+d4cacc0 
OpenShift Web Console:
    v3.11.7 

How reproducible:
Observed recently.

Steps to Reproduce:
1. oc -n openshift-devops-monitor logs -c prometheus prometheus-0 

Actual results:
The prometheus container logs the error
"write data/wal/000001: file already closed".

Expected results:
The prometheus container should not log this error.

Additional info:

Comment 2 Paul Gier 2018-09-21 12:37:13 UTC
I cleared out the data and restarted prometheus.  It seems to be working now.

Comment 10 yasun 2018-09-28 08:15:18 UTC
Prometheus has been running well for over 30 hours. The issue reported in this bug has been fixed.

