
Bug 1395663

Summary: Scaling down pods with a running Java process results in a warning message
Product: OpenShift Container Platform
Reporter: James Netherton <jnethert>
Component: RFE
Assignee: Ryan Phillips <rphillips>
Status: CLOSED WONTFIX
QA Contact: Xiaoli Tian <xtian>
Severity: high
Docs Contact:
Priority: high
Version: 3.3.0
CC: aileenc, aos-bugs, bparees, clichybi, decarr, jnethert, jokerman, mmccomas, pkanthal, rphillips, sjenning, sreber, suchaudh, vwalek
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-19 15:14:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description James Netherton 2016-11-16 11:45:42 UTC
Description of problem:

Original problem described here: https://issues.jboss.org/browse/OSFUSE-427

In short, the issue is related to the exit codes returned from the JVM when it receives signals such as SIGINT, SIGTERM, etc. The JVM default is to return an exit code of 128 plus the signal number. For example, for SIGTERM (signal 15) you'd see an exit code of 143 (128+15).
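
The 128+signal arithmetic can be checked from any shell. The following is only a minimal sketch: it uses `sleep` as a stand-in for the JVM (an assumption made for illustration). Unlike the JVM, `sleep` is killed by the signal rather than exiting on its own, but the shell reports its status under the same 128+n convention.

```shell
# Stand-in for the JVM: any long-running child process will do.
sleep 30 &
child=$!

# Deliver SIGTERM (signal 15), the signal discussed above.
kill -TERM "$child"

# `wait` reports 128 + signal number for a child that died from a signal.
status=0
wait "$child" || status=$?

echo "exit status: $status (128 + 15 = $((128 + 15)))"
```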

This causes OpenShift to display a warning message on pod scale down stating that the container did not stop cleanly (when in fact it did).

Most of the XPAAS images seem to cause this problem, with the exception of the EAP image, which launches the JVM differently from most other apps (i.e. it launches the process in the background).

How reproducible:

Always for any JVM process launched in the foreground or via 'exec'.


Steps to Reproduce:

1. Clone Java project:

https://github.com/fabric8-quickstarts/spring-boot-camel

2. Deploy project to OpenShift:

mvn fabric8:deploy

3. Scale down spring-boot-camel pod in the console

Actual results:

Warning message appears: "The container spring-boot did not stop cleanly when terminated (exit code 143)."

Expected results:

The message is a potential source of confusion because the container did stop correctly.

Ideally there should not be any warning messages in these instances.

Comment 1 Jessica Forrester 2016-11-16 14:17:00 UTC
We had a similar problem with our Ruby image before and were able to resolve it. I don't think we want to stop warning people when their process fails to exit cleanly. We should look into whether there is anything we can do in the Java images to provide a more appropriate exit code.

Comment 2 James Netherton 2016-11-16 14:39:22 UTC
Agreed, we don't want to stop showing warnings - just suppress them in certain scenarios.

Modifying images is tricky because the OpenShift advice seems to be that processes are started with 'exec' (so that everything runs as PID 1). Therefore, you effectively replace the shell process and lose the capability to intercept the exit code and return something more appropriate.

The only workaround I see would be to do things the EAP way: starting Java in the background, trapping signals, and then forwarding them on to the JVM process. But this seems to go against the recommended OpenShift advice. It also adds a bunch of additional complexity that any image wanting to use Java would need to implement - so it's less than ideal.
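
For illustration, the background-and-forward approach could look roughly like the sketch below. This is not the actual EAP launch script; the helper name `run_with_forwarding` and its variables are hypothetical, and it only handles SIGTERM.

```shell
# Hypothetical helper: run a command in the background, relay SIGTERM to it,
# and return the command's own exit status.
run_with_forwarding() {
    "$@" &
    child=$!

    # On SIGTERM, note that we were interrupted and pass the signal on.
    trap 'signalled=1; kill -TERM "$child" 2>/dev/null' TERM

    # A trapped signal makes `wait` return early, so wait again until the
    # call completes without having been interrupted.
    while :; do
        signalled=0
        status=0
        wait "$child" || status=$?
        [ "$signalled" -eq 1 ] || break
    done

    trap - TERM
    return "$status"
}
```

An image entrypoint might then call something like `run_with_forwarding java -jar app.jar` (the jar name is a placeholder) instead of `exec java ...`, at the cost of the shell, not the JVM, being PID 1.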

Comment 3 Ben Parees 2016-11-16 16:16:37 UTC
We can modify our images, but this sounds like a more general concern that anyone writing their own Java image is going to run into. We can't solve that, short of telling people as a best practice how to deal with it.

I wonder if there is a k8s way to indicate what the "expected/success" error code is as part of the pod/container definition?  That seems like the best way to generically solve this problem.

Comment 4 Derek Carr 2016-12-12 21:13:37 UTC
This should be an RFE.

Comment 8 Ryan Phillips 2018-05-31 15:54:58 UTC
Every application may have different needs on graceful shutdown logic. In this case, Java does not set a default error code upon exit (it needs to be explicitly set with a System.exit(0) call).

There is a project called springboot-graceful-shutdown[1] that can be injected into a Spring or SpringBoot application to help with OpenShift rolling deployments or projects that need a Spring-based graceful shutdown.


[1] https://github.com/SchweizerischeBundesbahnen/springboot-graceful-shutdown

Comment 9 Vladislav Walek 2018-06-29 11:32:25 UTC
Hello Ryan,

I have a reply from the customer:

The provided answer is not helping us at all and is not connected to the problem stated in the ticket in any way. Our problem is not with graceful shutdown.
Our problem, in short, is that OpenShift (since 3.6) displays a warning if our application ends with an exit code other than 0, even if exit code 143 is okay for us.

Comment 10 Ryan Phillips 2018-06-29 15:28:29 UTC
Any exit code other than 0 is considered to be an error. Other comments in this thread are about ungraceful Java shutdowns, and I was addressing that issue.

The customer could capture the Java exit code in a shell script and rewrite it to be 0, which seems like the most portable solution.
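
A rough sketch of the exit-code rewrite, assuming only code 143 (128+15, SIGTERM) should count as a clean stop. The helper name `sanitize_exit_code` is hypothetical, and `sh -c 'exit 143'` stands in for the JVM. For the wrapper to see the JVM's status at all, it would also have to relay SIGTERM to the JVM, as Comment 2 describes.

```shell
# Hypothetical helper: map the "terminated by SIGTERM" status to success and
# pass every other exit code through unchanged.
sanitize_exit_code() {
    case "$1" in
        143) echo 0 ;;      # 128 + 15 (SIGTERM): treat as a clean stop
        *)   echo "$1" ;;   # real failures keep their original code
    esac
}

# Stand-in for a JVM that exited after receiving SIGTERM.
rc=0
sh -c 'exit 143' || rc=$?

code=$(sanitize_exit_code "$rc")
echo "original exit code: $rc, reported exit code: $code"
```

A wrapper script would finish with `exit "$code"`. The trade-off is that a JVM terminated by an unexpected SIGTERM would also be reported as successful.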

Comment 11 Seth Jennings 2018-07-02 15:40:16 UTC
This situation should be addressed inside the container, not by OCP.

There should be some wrapper around the JVM that sanitizes its return code since non-zero return codes have historically indicated an error.

The only thing that could be done in OCP for this is to add a field on the container spec like "successCode: 143". I do not see that flying upstream, though: it is easily worked around from within the container, and it would involve a lot of change in the kubelet and CLI tools, since they all assume that a non-zero return code in the container status indicates an error.