Description of problem:
elasticsearch-rollover and elasticsearch-delete pods are in error states with 'ValueError: No JSON object could be decoded'.

~~~
$ oc get pods | grep Error
elasticsearch-delete-app-1604990700-m5tzq        0/1   Error   0   6d7h
elasticsearch-delete-audit-1605496500-69cc2      0/1   Error   0   11h
elasticsearch-delete-infra-1605519000-lzngc      0/1   Error   0   4h45m
elasticsearch-rollover-app-1605261600-hk5nq      0/1   Error   0   3d4h
elasticsearch-rollover-audit-1605097800-bj4vm    0/1   Error   0   5d1h
elasticsearch-rollover-infra-1605372300-877hc    0/1   Error   0   45h

$ oc logs jobs/elasticsearch-delete-app-1604990700
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

$ oc logs jobs/elasticsearch-delete-infra-1605519000
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/usr/lib64/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
~~~

Version-Release number of selected component (if applicable):
4.5

How reproducible:
NA

Actual results:
elasticsearch-rollover and elasticsearch-delete pods fail with 'ValueError: No JSON object could be decoded'.

Expected results:
elasticsearch-rollover and elasticsearch-delete jobs should complete successfully.

Additional info:
The cluster was recently upgraded to 4.5.16. The failures are intermittent: some runs of the same jobs complete successfully.

~~~
$ oc -n openshift-logging get jobs
NAME                                      COMPLETIONS   DURATION   AGE
curator-1605497400                        1/1           4s         10h
elasticsearch-delete-app-1604990700       0/1           6d7h       6d7h
elasticsearch-delete-app-1605535200       1/1           8s         13m
elasticsearch-delete-audit-1605496500     0/1           10h        10h
elasticsearch-delete-audit-1605535200     1/1           8s         13m
elasticsearch-delete-infra-1605519000     0/1           4h43m      4h43m
elasticsearch-delete-infra-1605535200     1/1           3s         13m
elasticsearch-rollover-app-1605261600     0/1           3d4h       3d4h
elasticsearch-rollover-app-1605535200     1/1           8s         13m
elasticsearch-rollover-audit-1605097800   0/1           5d1h       5d1h
elasticsearch-rollover-audit-1605535200   1/1           8s         13m
elasticsearch-rollover-infra-1605372300   0/1           45h        45h
elasticsearch-rollover-infra-1605535200   1/1           3s         13m
~~~
Same issue here - started with 4.5.16 and still occurs on 4.5.19.
Tested with elasticsearch-operator.4.7.0-202012080225.p0: the jobs no longer print a Python traceback, and the error message has changed to `Invalid JSON: ''`.
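For anyone trying to reproduce the two messages locally: the traceback in the description comes from Python's `json` module, and the quoted `''` in the newer message suggests the parser was handed an empty string. That reading is an inference from the message text, not something confirmed in this report. The sketch below only illustrates the behavior; `parse_json_or_report` is a hypothetical helper and is not the script shipped in the operator image.

~~~python
import json


def parse_json_or_report(raw):
    """Hypothetical helper: return (parsed, error_message) for a JSON string.

    Not the operator's actual code; it only shows how an empty input
    can be turned into a readable message instead of a traceback.
    """
    try:
        return json.loads(raw), None
    except ValueError:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError,
        # so this clause works on both Python 2.7 and Python 3.
        return None, "Invalid JSON: %r" % (raw,)


if __name__ == "__main__":
    # A well-formed body parses normally.
    print(parse_json_or_report('{"acknowledged": true}'))
    # An empty body is reported with its repr, which makes the emptiness visible.
    print(parse_json_or_report(""))
~~~

If the empty-input reading is right, the question left open by this report is why the upstream request sometimes returned no body at all; the message change by itself does not answer that.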
The fix has been cherry-picked to OCP 4.6.
@cam, I will cherry pick to 4.5 as well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Errata Advisory for Openshift Logging 5.0.0), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0652