+++ This bug was initially created as a clone of Bug #1632869 +++

Description of problem:
There is no easy way to rerun a Kubernetes job. As discussed in bug 1632852 there are times when the job terminates with a failure and does not run again. In that sort of scenario the job has to be recreated in order for it to run again. This should be automated through the installer.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Fix is available in openshift-ansible-3.10.59-1
Tested with openshift-ansible-3.10.60-1; the issue is fixed.

Steps:
Scale down the cassandra and hawkular-metrics rc, then after a while scale them back up. The hawkular-metrics pod reports an error, "The schema version check failed". Then run the playbooks/openshift-metrics/schema.yml playbook; afterwards all pods are running well.

# oc -n openshift-infra get pod
NAME                            READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-l4zpg      1/1       Running     0          28m
hawkular-metrics-lzz5g          1/1       Running     3          28m
hawkular-metrics-schema-9qlxd   0/1       Completed   0          5m
heapster-tnq2m                  1/1       Running     2          1h

NOTE: heapster should also be scaled down and back up; otherwise the metrics diagrams may not show in the web UI.
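
For anyone repeating the verification, the steps above could be scripted roughly as follows. This is only a sketch under assumptions: the replication controller names (hawkular-cassandra-1, hawkular-metrics, heapster) are inferred from the pod names in the listing, the inventory path is a hypothetical placeholder, and the `run`, `scale_metrics`, and `rerun_schema_job` helpers are made up for illustration. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch of the verification steps; not part of openshift-ansible.
NAMESPACE="openshift-infra"
INVENTORY="${INVENTORY:-/path/to/inventory}"   # hypothetical inventory path
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print only, 0 = execute

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

scale_metrics() {
    # $1 = desired replica count for each metrics rc (rc names are assumed).
    for rc in hawkular-cassandra-1 hawkular-metrics heapster; do
        run oc -n "$NAMESPACE" scale rc "$rc" --replicas="$1"
    done
}

rerun_schema_job() {
    # Re-run the schema job through the installer playbook named in this bug.
    run ansible-playbook -i "$INVENTORY" playbooks/openshift-metrics/schema.yml
}
```

A typical sequence would be `scale_metrics 0`, a short wait, `scale_metrics 1` (assuming one replica each), then `rerun_schema_job` from the openshift-ansible checkout, followed by `oc -n openshift-infra get pod` to confirm that the hawkular-metrics-schema pod reaches Completed.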
*** Bug 1626908 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0026