Description of problem:
When deploying Prometheus in OCP 3.7 on the AWS platform with EBS storage, the Prometheus playbook completes without any failure, but the Prometheus pod stays in Pending state. The pod events show: "0/6 nodes are available: 3 CheckServiceAffinity, 3 MatchNodeSelector, 6 NoVolumeZoneConflict".
Version-Release number of selected component (if applicable):
Prometheus deployment fails in OCP 3.7 on the AWS platform with EBS storage.
Steps to Reproduce:
1. Run the playbook (it always completes successfully):
# ansible-playbook -i <inventory-host> /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-prometheus.yml
2. Check the pod status.
Actual results:
The Prometheus pods stay in Pending state.
Expected results:
The Prometheus pods should run without any error.
When tested with an emptyDir volume for Prometheus storage, the Prometheus pod runs successfully.
This might be a configuration issue. Could you provide controller logs to see what is going on?
It also affects 3.9, same steps, same result.
Looks like issue: https://github.com/kubernetes/kubernetes/issues/39178
I am facing the same issue on deploying prometheus on OCP3.7 on Google Storage also.
The PVs are created, but they are not assigned to nodes.
The pod stays stuck in "Pending" indefinitely.
jmselmi, this issue is not specific to AWS: any cluster spanning multiple AZs that has a storage class provisioning storage bound to a single AZ can hit it.
You can work around it by manually creating all the persistent volume claims in the same AZ.
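A minimal sketch of the workaround: a manually created, EBS-backed PV pinned to one AZ via zone labels, so the scheduler places the pod on a node in the same zone. The zone, region, size, and volume ID below are placeholder values, not taken from this bug.

```yaml
# Example PV bound to a single AZ (all values are placeholders).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
  labels:
    failure-domain.beta.kubernetes.io/region: us-east-1
    failure-domain.beta.kubernetes.io/zone: us-east-1a   # must match a node's AZ
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: aws://us-east-1a/vol-0123456789abcdef0   # placeholder EBS volume ID
    fsType: ext4
```

The matching PVC then binds to this PV, and the pod lands in the same AZ, avoiding NoVolumeZoneConflict.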
*** Bug 1579607 has been marked as a duplicate of this bug. ***
See also Bug 1565405
Specify the zone in the StorageClass:
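For example, a StorageClass using the aws-ebs provisioner can pin all dynamically provisioned volumes to one AZ with the `zone` parameter. This is a sketch; the class name, volume type, and zone are assumed values:

```yaml
# Example StorageClass restricted to a single AZ (name, type, and zone are placeholders).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-ebs
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1a   # provision volumes only in this AZ
```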
Prometheus images version: v3.11.0-0.25.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.