Bug 1589785 - Heketi reports "Failed to get list of pods", periodic health checks stopped and all POST requests (volume create) are hung at heketi end
Summary: Heketi reports "Failed to get list of pods", periodic health checks stopped ...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: cns-3.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Michael Adam
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks: OCS-3.11.1-devel-triage-done
Reported: 2018-06-11 11:54 UTC by Neha Berry
Modified: 2019-02-07 22:21 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-07 22:21:50 UTC
Embargoed:



Comment 7 Raghavendra Talur 2018-06-28 13:00:55 UTC
It does look like an issue with kubeexec.

The "Failed to get list of pods" error is seen when the master node does not reply to heketi's query asking which pods are the gluster nodes. This means either the master is not responding or the communication path from the heketi pod to the master is broken.

How to debug:
1. Check whether oc commands are working.
2. Log on to the heketi pod and use heketi-cli to run a command that does not require the kubeexec path, such as heketi-cli volume list. If that works, heketi is responding.

At this point it is verified that heketi is not the culprit, but we still don't know whether the master or the communication path is the problem. I don't have a way to debug further from here.
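
The two debug steps can be sketched as a small Python script. This is only an illustration under some assumptions: the oc client is logged in and pointed at the right project, the heketi pod name is supplied by the caller, the pod environment already provides whatever server and credential settings heketi-cli needs, and `oc get pods` is an adequate stand-in for "oc commands are working". The function names and returned labels are made up for this sketch.

```python
import subprocess


def run_check(cmd, timeout=30):
    """Run cmd and return (ok, output).

    ok is False on a non-zero exit status, a timeout (a hung command),
    or a missing binary.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, FileNotFoundError) as exc:
        return False, str(exc)
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


def diagnose(heketi_pod, run=run_check):
    """Walk through the two debug steps and return a short label.

    heketi_pod is the name of the heketi pod (supplied by the caller).
    run is injectable so the logic can be exercised without a cluster.
    """
    # Step 1: do oc commands work at all from this host?
    oc_ok, _ = run(["oc", "get", "pods"])
    if not oc_ok:
        return "oc-failing"

    # Step 2: run a heketi-cli command that does not need the kubeexec path.
    heketi_ok, _ = run(["oc", "rsh", heketi_pod, "heketi-cli", "volume", "list"])
    if not heketi_ok:
        return "heketi-not-responding"

    # heketi answers; what remains is the master or the path from the
    # heketi pod to the master, which this script cannot tell apart.
    return "heketi-ok"
```

Calling `diagnose("<heketi pod name>")` on a host with a working oc login returns one of the three labels. A result of "heketi-ok" corresponds to the state described above: heketi is responding, and the remaining suspects are the master and the communication path.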

