https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.1/313

Apr 27 05:25:47.689 E ns/openshift-monitoring pod/prometheus-adapter-575b89dd7d-55qtv node/ip-10-0-154-190.ec2.internal container=prometheus-adapter container exited with code 2:
Apr 27 05:26:09.512 E ns/openshift-image-registry pod/node-ca-vhq7d node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:
Apr 27 05:37:31.362 E ns/openshift-machine-config-operator pod/machine-config-daemon-f754s node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 05:43:13.101 E ns/openshift-machine-config-operator pod/machine-config-daemon-wx22g node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 05:43:43.147 E ns/openshift-image-registry pod/node-ca-244fh node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:
Apr 27 05:44:30.213 E ns/openshift-machine-config-operator pod/machine-config-daemon-9jww2 node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 05:45:00.266 E ns/openshift-image-registry pod/node-ca-7tnpd node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:
Apr 27 05:50:55.850 E ns/openshift-machine-config-operator pod/machine-config-daemon-jlvw2 node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 05:51:25.909 E ns/openshift-image-registry pod/node-ca-sf2h9 node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:
Apr 27 06:02:21.213 E ns/openshift-machine-config-operator pod/machine-config-daemon-npfpc node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 06:02:51.265 E ns/openshift-image-registry pod/node-ca-fv7pz node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:
Apr 27 06:05:00.488 E ns/openshift-machine-config-operator pod/machine-config-daemon-w27wx node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:
Apr 27 06:05:30.535 E ns/openshift-image-registry pod/node-ca-xtfl4 node/ip-10-0-154-190.ec2.internal container=node-ca container exited with code 137:

This causes a test to fail, but needs independent investigation to understand why it is exiting and restarting every ~5 minutes.
Apr 27 05:44:30.213 E ns/openshift-machine-config-operator pod/machine-config-daemon-9jww2 node/ip-10-0-154-190.ec2.internal container=machine-config-daemon container exited with code 143:

Someone (likely the kubelet) is killing us with SIGTERM.
The 143 exit code means the MCD was terminated by SIGTERM (128 + 15, where 15 is the signal number of SIGTERM). In the MCD we only have a SIGTERM handler during our sync; the rest of the code doesn't handle SIGTERM, so outside of a sync we don't catch it and exit with 143 instead of the 0 we would return if we had a handler.
A PR that fixes this by adding a handler for SIGTERM and exiting cleanly is here: https://github.com/openshift/machine-config-operator/pull/697
Alright, all daemonsets without a SIGTERM handler exhibit this behavior when they are terminated (full conversation here: https://coreos.slack.com/archives/CEKNRGF25/p1556821026430400). The MCO in that job isn't erroring out either. As outlined in that conversation, this may be just noise (but we do have a PR anyway). I'm moving the target to 4.2.
No reports of 'container exited with code 143' in the last 14 days of test runs. Closing as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922