Bug 1549683 - [CRI-O] Liveness probe failed for EAP Quickstart app
Summary: [CRI-O] Liveness probe failed for EAP Quickstart app
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Templates
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.9.z
Assignee: Ben Parees
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On:
Blocks: 1549259
Reported: 2018-02-27 16:16 UTC by Vikas Laad
Modified: 2018-08-16 21:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-16 21:25:03 UTC
Target Upstream Version:
Embargoed:


Attachments
describe pod (6.36 KB, text/plain)
2018-02-27 16:22 UTC, Vikas Laad

Description Vikas Laad 2018-02-27 16:16:39 UTC
Description of problem:
Node logs are full of the following errors, although the app is working fine.

Feb 27 15:45:18 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:18.794124   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:19 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:19.794011   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:27 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:27.796265   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:28 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:28.794146   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:29 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:29.794044   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:37 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:37.793475   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Feb 27 15:45:39 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:39.794672   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
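
For triage, the repeated ExecSync errors above can be grouped by command and failure reason. The following is a minimal Python sketch, assuming the log format shown in the lines above; the regular expression and the summarize helper are illustrative names and not part of any kubelet tooling.

```python
import re
from collections import Counter

# Pattern for the kubelet ExecSync error lines quoted above.
# The group names are illustrative, not part of any kubelet API.
EXECSYNC_RE = re.compile(
    r"ExecSync (?P<container>[0-9a-f]+) '(?P<command>[^']*)' "
    r"from runtime service failed: rpc error: code = (?P<code>\w+) "
    r"desc = (?P<desc>.*)$"
)


def summarize(lines):
    """Count ExecSync failures per (command, description) pair."""
    counts = Counter()
    for line in lines:
        match = EXECSYNC_RE.search(line)
        if match:
            counts[(match["command"], match["desc"])] += 1
    return counts


# One of the log lines from this report, without the syslog prefix.
sample = [
    "E0227 15:45:18.794124   15639 remote_runtime.go:332] ExecSync "
    "18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 "
    "'/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: "
    "rpc error: code = Unknown desc = command error: command timed out, "
    "stdout: , stderr: , exit code -1",
]

result = summarize(sample)
for (command, desc), count in result.items():
    print(count, command, "->", desc)
```

Grouping by description separates the "command timed out" errors from the "cannot allocate tty" errors that appear in the pod events.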



Version-Release number of selected component (if applicable):
openshift v3.9.0-0.53.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

How reproducible:
Reproducible with the EAP quickstart app.

Steps to Reproduce:
1. oc new-project test
2. oc new-app --template=eap64-mysql-s2i
3. oc describe pod <eap-app-pod>

Events are as follows:
  Warning  Unhealthy              17m                 kubelet, ip-172-31-11-165.us-west-2.compute.internal  Readiness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
  Warning  Unhealthy              13m (x24 over 16m)  kubelet, ip-172-31-11-165.us-west-2.compute.internal  Liveness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
  Warning  Unhealthy              2m (x46 over 10m)   kubelet, ip-172-31-11-165.us-west-2.compute.internal  Liveness probe errored: rpc error: code = Unknown desc = command error: exec failed: cannot allocate tty if runc will detach without setting console socket
, stdout: , stderr: , exit code -1

Actual results:
Errors in describe pod and node logs

Expected results:
No errors

Additional info:
Please find attached node logs from both the compute nodes.

Comment 1 Vikas Laad 2018-02-27 16:22:22 UTC
Created attachment 1401433
describe pod

Comment 3 Seth Jennings 2018-02-27 19:16:29 UTC
Sending to containers to investigate:

Liveness probe errored: rpc error: code = Unknown desc = command error: exec failed: cannot allocate tty if runc will detach without setting console socket

Comment 4 Mrunal Patel 2018-02-28 00:10:16 UTC
Created - https://github.com/kubernetes-incubator/cri-o/pull/1386

Comment 5 Antonio Murdaca 2018-02-28 15:53:37 UTC
Should be fixed by https://github.com/kubernetes-incubator/cri-o/pull/1386

Comment 6 DeShuai Ma 2018-03-07 06:14:19 UTC
The issue still occurs on OpenShift v3.9.3 with CRI-O 1.9.8.
[root@ip-172-18-12-111 bin]# pwd
/var/lib/containers/atomic/cri-o/rootfs/bin
[root@ip-172-18-12-111 bin]# ./crio --version
crio version 1.9.8
[root@ip-172-18-12-111 bin]# openshift version
openshift v3.9.3
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
[root@ip-172-18-12-111 bin]# oc get po
NAME                    READY     STATUS      RESTARTS   AGE
eap-app-1-build         0/1       Completed   0          54m
eap-app-1-deploy        0/1       Error       0          51m
eap-app-2-build         0/1       Completed   0          9m
eap-app-2-qf87d         1/1       Running     0          8m
eap-app-mysql-1-vlwkr   1/1       Running     0          54m
[root@ip-172-18-12-111 bin]# oc describe po eap-app-2-qf87d
Name:           eap-app-2-qf87d
Namespace:      test
Node:           ip-172-18-12-111.ec2.internal/172.18.12.111
Start Time:     Wed, 07 Mar 2018 01:04:55 -0500
Labels:         app=eap64-mysql-s2i
                application=eap-app
                deployment=eap-app-2
                deploymentConfig=eap-app
                deploymentconfig=eap-app
Annotations:    openshift.io/deployment-config.latest-version=2
                openshift.io/deployment-config.name=eap-app
                openshift.io/deployment.name=eap-app-2
                openshift.io/generated-by=OpenShiftNewApp
                openshift.io/scc=restricted
Status:         Running
IP:             10.129.0.8
Controlled By:  ReplicationController/eap-app-2
Containers:
  eap-app:
    Container ID:   cri-o://1ddbc6b01386af931b32313a156de076fb84e69f04f06045e29027f2427c9ddc
    Image:          docker-registry.default.svc:5000/test/eap-app@sha256:93f916b092d3227299df32a6e6655f88925d7b90b2f5c8fc7d21961ce9886828
    Image ID:       docker-registry.default.svc:5000/test/eap-app@sha256:93f916b092d3227299df32a6e6655f88925d7b90b2f5c8fc7d21961ce9886828
    Ports:          8778/TCP, 8080/TCP, 8443/TCP, 8888/TCP
    State:          Running
      Started:      Wed, 07 Mar 2018 01:07:06 -0500
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  1Gi
    Requests:
      memory:   1Gi
    Liveness:   exec [/bin/bash -c /opt/eap/bin/livenessProbe.sh] delay=60s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/bin/bash -c /opt/eap/bin/readinessProbe.sh] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      DB_SERVICE_PREFIX_MAPPING:        eap-app-mysql=DB
      DB_JNDI:                          java:jboss/datasources/TodoListDS
      DB_USERNAME:                      user3sQ
      DB_PASSWORD:                      FSfbEw6Y
      DB_DATABASE:                      root
      TX_DATABASE_PREFIX_MAPPING:       eap-app-mysql=DB
      DB_MIN_POOL_SIZE:                 
      DB_MAX_POOL_SIZE:                 
      DB_TX_ISOLATION:                  
      JGROUPS_PING_PROTOCOL:            openshift.DNS_PING
      OPENSHIFT_DNS_PING_SERVICE_NAME:  eap-app-ping
      OPENSHIFT_DNS_PING_SERVICE_PORT:  8888
      HTTPS_KEYSTORE_DIR:               /etc/eap-secret-volume
      HTTPS_KEYSTORE:                   keystore.jks
      HTTPS_KEYSTORE_TYPE:              
      HTTPS_NAME:                       
      HTTPS_PASSWORD:                   
      HORNETQ_CLUSTER_PASSWORD:         ayMclADM
      HORNETQ_QUEUES:                   
      HORNETQ_TOPICS:                   
      JGROUPS_ENCRYPT_SECRET:           eap-app-secret
      JGROUPS_ENCRYPT_KEYSTORE_DIR:     /etc/jgroups-encrypt-secret-volume
      JGROUPS_ENCRYPT_KEYSTORE:         jgroups.jceks
      JGROUPS_ENCRYPT_NAME:             
      JGROUPS_ENCRYPT_PASSWORD:         
      JGROUPS_CLUSTER_PASSWORD:         g3wXn2RP
      TIMER_SERVICE_DATA_STORE:         eap-app-mysql
      AUTO_DEPLOY_EXPLODED:             false
    Mounts:
      /etc/eap-secret-volume from eap-keystore-volume (ro)
      /etc/jgroups-encrypt-secret-volume from eap-jgroups-keystore-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-r8p8g (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  eap-keystore-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  eap-app-secret
    Optional:    false
  eap-jgroups-keystore-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  eap-app-secret
    Optional:    false
  default-token-r8p8g:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-r8p8g
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason                 Age               From                                    Message
  ----     ------                 ----              ----                                    -------
  Normal   Scheduled              8m                default-scheduler                       Successfully assigned eap-app-2-qf87d to ip-172-18-12-111.ec2.internal
  Normal   SuccessfulMountVolume  8m                kubelet, ip-172-18-12-111.ec2.internal  MountVolume.SetUp succeeded for volume "default-token-r8p8g"
  Normal   SuccessfulMountVolume  8m                kubelet, ip-172-18-12-111.ec2.internal  MountVolume.SetUp succeeded for volume "eap-keystore-volume"
  Normal   SuccessfulMountVolume  8m                kubelet, ip-172-18-12-111.ec2.internal  MountVolume.SetUp succeeded for volume "eap-jgroups-keystore-volume"
  Normal   Pulling                8m                kubelet, ip-172-18-12-111.ec2.internal  pulling image "docker-registry.default.svc:5000/test/eap-app@sha256:93f916b092d3227299df32a6e6655f88925d7b90b2f5c8fc7d21961ce9886828"
  Normal   Pulled                 6m                kubelet, ip-172-18-12-111.ec2.internal  Successfully pulled image "docker-registry.default.svc:5000/test/eap-app@sha256:93f916b092d3227299df32a6e6655f88925d7b90b2f5c8fc7d21961ce9886828"
  Normal   Created                6m                kubelet, ip-172-18-12-111.ec2.internal  Created container
  Normal   Started                6m                kubelet, ip-172-18-12-111.ec2.internal  Started container
  Warning  Unhealthy              5m (x3 over 6m)   kubelet, ip-172-18-12-111.ec2.internal  Readiness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
  Warning  Unhealthy              2m (x16 over 5m)  kubelet, ip-172-18-12-111.ec2.internal  Liveness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
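
The probe settings in the describe output above (delay, timeout, period, thresholds) can be pulled out for comparison across pods. This is a rough Python sketch, assuming the one-line probe summary format printed above; parse_probe and PROBE_RE are hypothetical names used only for illustration.

```python
import re

# Matches the exec probe summary as printed in the describe output above, e.g.
#   exec [/bin/bash -c ...] delay=60s timeout=1s period=10s #success=1 #failure=3
PROBE_RE = re.compile(
    r"exec \[(?P<command>[^\]]*)\] delay=(?P<delay>\d+)s "
    r"timeout=(?P<timeout>\d+)s period=(?P<period>\d+)s "
    r"#success=(?P<success>\d+) #failure=(?P<failure>\d+)"
)


def parse_probe(line):
    """Return the probe fields as a dict, or None if the line does not match."""
    match = PROBE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    command = fields.pop("command")
    # All remaining fields are whole numbers (seconds or counts).
    return {"command": command, **{k: int(v) for k, v in fields.items()}}


liveness = parse_probe(
    "    Liveness:   exec [/bin/bash -c /opt/eap/bin/livenessProbe.sh] "
    "delay=60s timeout=1s period=10s #success=1 #failure=3"
)
print(liveness["command"], "timeout:", liveness["timeout"], "s")
```

Here the parsed timeout is 1 second, which is the value the later comments identify as too low for this probe script.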

Comment 7 Antonio Murdaca 2018-03-10 12:16:03 UTC
This is what I get when running this with the latest CRI-O. It looks like either a misconfiguration or a network issue, not a CRI-O issue with the liveness probe.

Mar 10 13:13:55 runcom.internal crio[3617]: time="2018-03-10 13:13:55.741882115+01:00" level=info msg="Received container exit code: 1, message: "
Mar 10 13:13:55 runcom.internal crio[3617]: time="2018-03-10 13:13:55.742038929+01:00" level=debug msg="execsync response stdout {
Mar 10 13:13:55 runcom.internal crio[3617]:     "probe.eap.dmr.EapProbe": {
Mar 10 13:13:55 runcom.internal crio[3617]:         "probe.eap.dmr.ServerStatusTest": "running",
Mar 10 13:13:55 runcom.internal crio[3617]:         "probe.eap.dmr.DeploymentTest": {
Mar 10 13:13:55 runcom.internal crio[3617]:             "ROOT.war": "FAILED",
Mar 10 13:13:55 runcom.internal crio[3617]:             "activemq-rar.rar": "FAILED"
Mar 10 13:13:55 runcom.internal crio[3617]:         },
Mar 10 13:13:55 runcom.internal crio[3617]:         "probe.eap.dmr.BootErrorsTest": [
Mar 10 13:13:55 runcom.internal crio[3617]:             {
Mar 10 13:13:55 runcom.internal crio[3617]:                 "failed-operation": {
Mar 10 13:13:55 runcom.internal crio[3617]:                     "operation": "add",
Mar 10 13:13:55 runcom.internal crio[3617]:                     "address": [
Mar 10 13:13:55 runcom.internal crio[3617]:                         {
Mar 10 13:13:55 runcom.internal crio[3617]:                             "subsystem": "transactions"
Mar 10 13:13:55 runcom.internal crio[3617]:                         }
Mar 10 13:13:55 runcom.internal crio[3617]:                     ]
Mar 10 13:13:55 runcom.internal crio[3617]:                 },
Mar 10 13:13:55 runcom.internal crio[3617]:                 "failure-timestamp": 1520683986890,
Mar 10 13:13:55 runcom.internal crio[3617]:                 "failure-description": "{\"JBAS014671: Failed services\" => {\"jboss.txn.ArjunaRecoveryManager\" => \"org.jboss.msc.service.StartException in service jboss.txn.ArjunaRecoveryManager: JBAS010101: Recovery manager create failed\n    Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException\n    Caused by: java.lang.reflect.InvocationTargetException\n    Caused by: com.arjuna.ats.arjuna.exceptions.ObjectStoreException: java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: javax.resource.ResourceException: IJ000658: Unexpected throwable while trying to create a connection: null\n    Caused by: javax.resource.ResourceException: Could not create connection\n    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure\n\nThe last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.\n    Caused by: java.net.ConnectException: Connection timed out (Connection timed out)\"}}",
Mar 10 13:13:55 runcom.internal crio[3617]:                 "failed-services": {
Mar 10 13:13:55 runcom.internal crio[3617]:                     "jboss.txn.ArjunaRecoveryManager": "org.jboss.msc.service.StartException in service jboss.txn.ArjunaRecoveryManager: JBAS010101: Recovery manager create failed\n    Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException\n    Caused by: java.lang.reflect.InvocationTargetException\n    Caused by: com.arjuna.ats.arjuna.exceptions.ObjectStoreException: java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/TodoListDSObjectStore\n    Caused by: javax.resource.ResourceException: IJ000658: Unexpected throwable while trying to create a connection: null\n    Caused by: javax.resource.ResourceException: Could not create connection\n    Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure\n\nThe last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.\n    Caused by: java.net.ConnectException: Connection timed out (Connection timed out)"
Mar 10 13:13:55 runcom.internal crio[3617]:                 }
Mar 10 13:13:55 runcom.internal crio[3617]:             }
Mar 10 13:13:55 runcom.internal crio[3617]:         ]
Mar 10 13:13:55 runcom.internal crio[3617]:     }
Mar 10 13:13:55 runcom.internal crio[3617]: }
Mar 10 13:13:55 runcom.internal crio[3617]: "
Mar 10 13:13:55 runcom.internal crio[3617]: time="2018-03-10 13:13:55.742055724+01:00" level=debug msg="execsync response stderr "
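
To read the probe's verdict without scanning the whole dump, the execsync stdout above can be parsed as JSON. Below is a small Python sketch, assuming the key names that appear in the output above; the sample is trimmed to the deployment section, and failed_deployments is a hypothetical helper.

```python
import json


def failed_deployments(probe_stdout):
    """Return the names of deployments that the EAP probe reports as FAILED."""
    report = json.loads(probe_stdout)
    probe = report.get("probe.eap.dmr.EapProbe", {})
    deployments = probe.get("probe.eap.dmr.DeploymentTest", {})
    return sorted(name for name, state in deployments.items() if state == "FAILED")


# Trimmed copy of the stdout logged above (boot error details omitted).
sample_stdout = """
{
    "probe.eap.dmr.EapProbe": {
        "probe.eap.dmr.ServerStatusTest": "running",
        "probe.eap.dmr.DeploymentTest": {
            "ROOT.war": "FAILED",
            "activemq-rar.rar": "FAILED"
        }
    }
}
"""

failed = failed_deployments(sample_stdout)
print(failed)
```

In this run the server status is "running" while both deployments are FAILED, which matches the exit code 1 returned by the probe.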

Comment 8 Antonio Murdaca 2018-03-10 12:17:00 UTC
Can you double-check that it works with docker? It looks like a template issue to me.

Comment 9 Vikas Laad 2018-03-12 02:27:50 UTC
I see the following error in a cluster using the docker runtime:
Mar 10 07:24:25 ip-172-31-24-44.us-west-2.compute.internal atomic-openshift-node[6500]: E0310 07:24:25.218340    6500 remote_runtime.go:332] ExecSync 47d123e12c544e283f8ab698fdad175bf33d16aabf21594d79132cb2ee85a527 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = Error: No such container: 47d123e12c544e283f8ab698fdad175bf33d16aabf21594d79132cb2ee85a527

Comment 10 Vikas Laad 2018-03-12 02:31:23 UTC
Note: the above errors show up only in the node logs; I don't see anything in describe pod.

Comment 11 Daniel Walsh 2018-05-03 19:41:16 UTC
Antonio, it looks like it is failing with docker as well. Should we close this or push it to a different team?

Comment 12 Antonio Murdaca 2018-05-21 09:35:11 UTC
The docker error is not related to the original bug. If you can't reproduce this in a CRI-O environment, I'd close this one.

Comment 13 Vikas Laad 2018-05-21 18:38:33 UTC
I can still see the following errors in the node logs:

May 21 18:24:00 ip-172-31-31-156.us-west-2.compute.internal atomic-openshift-node[16206]: I0521 18:24:00.649502   16206 prober.go:106] Liveness probe for "eap-app-3-2x8nr_eap64-mysql-s2i-u-49-19-67-27(2b$
d0623-5d1f-11e8-8725-02455301af2a):eap-app" errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1

in the following version:
openshift v3.10.0-0.47.0
kubernetes v1.10.0+b81c8f8
etcd 3.2.16

Comment 14 Mrunal Patel 2018-05-21 20:22:14 UTC
Vikas,
Is that with cri-o? If yes, can we access the cluster?

Comment 16 Mrunal Patel 2018-05-21 22:41:02 UTC
I manually executed the liveness/readiness probes and they succeeded but they do take longer than 1 second. We should bump up the timeout values.

I don't see any issue in cri-o with this besides the timeout that is configured too low.

https://github.com/jboss-openshift/application-templates/blob/master/eap/eap64-mysql-s2i.json#L824

https://github.com/jboss-openshift/application-templates/blob/master/eap/eap64-mysql-s2i.json#L816



root@ip-172-31-31-156: ~ # runc exec -p /tmp/my-process.json f0393992084a6af3bd00e3e3c562dc25ed920ef1877f0077da22ce730c49da32 
root@ip-172-31-31-156: ~ # echo $?
0

my-process.json
-----------------
{"user":{"uid":1000560000,"gid":0,"additionalGids":[1000560000]},"args":["/bin/bash","-c","/opt/eap/bin/livenessProbe.sh"],"env":["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","TERM=xterm","HOSTNAME=eap-app-2-t7v5z","DB_SERVICE_PREFIX_MAPPING=eap-app-mysql=DB","JGROUPS_PING_PROTOCOL=openshift.DNS_PING","JGROUPS_ENCRYPT_KEYSTORE=jgroups.jceks","DB_PASSWORD=sP1Wv7if","OPENSHIFT_DNS_PING_SERVICE_PORT=8888","HTTPS_PASSWORD=","HORNETQ_CLUSTER_PASSWORD=3xaOMFq8","JGROUPS_ENCRYPT_KEYSTORE_DIR=/etc/jgroups-encrypt-secret-volume","DB_USERNAME=userNh0","DB_MAX_POOL_SIZE=","JGROUPS_ENCRYPT_PASSWORD=","TIMER_SERVICE_DATA_STORE=eap-app-mysql","TX_DATABASE_PREFIX_MAPPING=eap-app-mysql=DB","OPENSHIFT_DNS_PING_SERVICE_NAME=eap-app-ping","JGROUPS_ENCRYPT_NAME=","JGROUPS_CLUSTER_PASSWORD=Y6aVGEyq","AUTO_DEPLOY_EXPLODED=false","DB_TX_ISOLATION=","JGROUPS_ENCRYPT_SECRET=eap-app-secret","DB_MIN_POOL_SIZE=","HTTPS_KEYSTORE_TYPE=","DB_JNDI=java:jboss/datasources/TodoListDS","DB_DATABASE=root","HTTPS_KEYSTORE_DIR=/etc/eap-secret-volume","HTTPS_KEYSTORE=keystore.jks","HTTPS_NAME=","HORNETQ_QUEUES=","HORNETQ_TOPICS=","EAP_APP_MYSQL_PORT_3306_TCP=tcp://172.30.17.84:3306","EAP_APP_MYSQL_PORT_3306_TCP_PROTO=tcp","EAP_APP_MYSQL_PORT_3306_TCP_ADDR=172.30.17.84","KUBERNETES_PORT=tcp://172.30.0.1:443","KUBERNETES_PORT_53_TCP_ADDR=172.30.0.1","EAP_APP_PORT_8080_TCP_PROTO=tcp","EAP_APP_PORT_8080_TCP_PORT=8080","KUBERNETES_PORT_443_TCP_PROTO=tcp","KUBERNETES_PORT_53_UDP_ADDR=172.30.0.1","EAP_APP_PORT=tcp://172.30.94.94:8080","KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1","KUBERNETES_PORT_53_TCP_PROTO=tcp","SECURE_EAP_APP_PORT_8443_TCP=tcp://172.30.141.77:8443","EAP_APP_PORT_8080_TCP_ADDR=172.30.94.94","KUBERNETES_SERVICE_PORT=443","KUBERNETES_SERVICE_PORT_DNS=53","KUBERNETES_PORT_443_TCP_PORT=443","SECURE_EAP_APP_PORT_8443_TCP_ADDR=172.30.141.77","EAP_APP_MYSQL_PORT_3306_TCP_PORT=3306","KUBERNETES_SERVICE_HOST=172.30.0.1","KUBERNETES_PORT_53_UDP_PROTO=udp","SECURE_EAP_APP_SERVICE_HOST=172.30.141.77","EAP_APP_PORT_8080_TCP=tcp://172.30.94.94:8080","EAP_APP_MYSQL_SERVICE_PORT=3306","KUBERNETES_SERVICE_PORT_HTTPS=443","KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443","KUBERNETES_PORT_53_UDP=udp://172.30.0.1:53","KUBERNETES_PORT_53_TCP_PORT=53","EAP_APP_SERVICE_PORT=8080","EAP_APP_SERVICE_HOST=172.30.94.94","KUBERNETES_SERVICE_PORT_DNS_TCP=53","KUBERNETES_PORT_53_UDP_PORT=53","SECURE_EAP_APP_PORT=tcp://172.30.141.77:8443","SECURE_EAP_APP_PORT_8443_TCP_PORT=8443","EAP_APP_MYSQL_SERVICE_HOST=172.30.17.84","EAP_APP_MYSQL_PORT=tcp://172.30.17.84:3306","KUBERNETES_PORT_53_TCP=tcp://172.30.0.1:53","SECURE_EAP_APP_SERVICE_PORT=8443","SECURE_EAP_APP_PORT_8443_TCP_PROTO=tcp","OPENSHIFT_BUILD_NAME=eap-app-2","OPENSHIFT_BUILD_NAMESPACE=eap64-mysql-s2i-u-33-12-167-62","OPENSHIFT_BUILD_SOURCE=https://github.com/jboss-openshift/openshift-quickstarts","OPENSHIFT_BUILD_REFERENCE=1.2","OPENSHIFT_BUILD_COMMIT=caec20220374804b2cb3d3622a754f9091af7c57","MAVEN_MIRROR_URL=","ARTIFACT_DIR=","container=oci","JBOSS_IMAGE_NAME=jboss-eap-6/eap64-openshift","JBOSS_IMAGE_VERSION=1.8","HOME=/home/jboss","JAVA_HOME=/usr/lib/jvm/java-1.8.0","JAVA_VENDOR=openjdk","JAVA_VERSION=1.8.0","LAUNCH_JBOSS_IN_BACKGROUND=true","JBOSS_PRODUCT=eap","JBOSS_EAP_VERSION=6.4.20.GA","PRODUCT_VERSION=6.4.20.GA","JBOSS_HOME=/opt/eap","STI_BUILDER=jee","JBOSS_MODULES_SYSTEM_PKGS=org.jboss.logmanager,jdk.nashorn.api","DEFAULT_ADMIN_USERNAME=eapadmin","JOLOKIA_VERSION=1.5.0","AB_JOLOKIA_PASSWORD_RANDOM=true","AB_JOLOKIA_AUTH_OPENSHIFT=true","AB_JOLOKIA_HTTPS=true","MAVEN_VERSION=3.5"],"cwd":"/home/jboss","capabilities":{"bounding":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_AUDIT_WRITE","CAP_NET_RAW","CAP_SYS_CHROOT","CAP_NET_BIND_SERVICE","CAP_SETFCAP","CAP_SETPCAP"],"effective":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_AUDIT_WRITE","CAP_NET_RAW","CAP_SYS_CHROOT","CAP_NET_BIND_SERVICE","CAP_SETFCAP","CAP_SETPCAP"],"inheritable":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_AUDIT_WRITE","CAP_NET_RAW","CAP_SYS_CHROOT","CAP_NET_BIND_SERVICE","CAP_SETFCAP","CAP_SETPCAP"],"permitted":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_AUDIT_WRITE","CAP_NET_RAW","CAP_SYS_CHROOT","CAP_NET_BIND_SERVICE","CAP_SETFCAP","CAP_SETPCAP"]},"oomScoreAdj":936,"selinuxLabel":"system_u:system_r:svirt_lxc_net_t:s0:c24,c4"}
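
The observation in this comment, that the probe succeeds but needs more than 1 second, can be checked by timing a command against a deadline. The following Python sketch only illustrates the idea: the slow child process is a stand-in for the probe script (it is not livenessProbe.sh), and run_with_timeout is a hypothetical helper, not how the kubelet or CRI-O implement exec probes.

```python
import subprocess
import sys
import time


def run_with_timeout(argv, timeout_seconds):
    """Run argv and return (outcome, elapsed_seconds).

    outcome is the string "timed out" if the command exceeds
    timeout_seconds, otherwise the command's exit code.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
        outcome = completed.returncode
    except subprocess.TimeoutExpired:
        outcome = "timed out"
    return outcome, time.monotonic() - start


# Stand-in for the probe script: a child process that sleeps 1.5 seconds.
slow_probe = [sys.executable, "-c", "import time; time.sleep(1.5)"]

short = run_with_timeout(slow_probe, 1)   # mirrors the 1 second probe timeout
longer = run_with_timeout(slow_probe, 5)  # 5 seconds is an arbitrary larger value

print("timeout=1s:", short[0])
print("timeout=5s:", longer[0])
```

The same command is reported as timed out under the 1 second limit and exits with code 0 under the larger one, which is the pattern described in this comment.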

Comment 17 Mrunal Patel 2018-05-21 22:47:07 UTC
Actually, the one for EAP is here, and it doesn't have a timeout set, so I think it must be defaulting to 1 second, as that is what I see getting passed down. We should add a higher value here. I will confirm the default in the Kubernetes code.

https://github.com/jboss-openshift/application-templates/blob/4d0ec5ad8b8d49c5f2de4167aa6cdd390463aac6/eap/eap64-basic-s2i.json#L348

Comment 18 Mrunal Patel 2018-05-21 22:50:13 UTC
Here is the comment about the 1 second default - https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/types.go#L1852
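
As a sketch of how the defaulting plays out, the Python below merges a probe definition that omits the timeout with the defaults of the Kubernetes core/v1 Probe type (timeout 1s, period 10s, success threshold 1, failure threshold 3, which match the describe output earlier in this bug). The template_probe dict is an assumption reconstructed from that describe output, not a copy of the template, and effective_probe is a hypothetical helper.

```python
# Defaults of the Kubernetes core/v1 Probe type for fields that are left unset.
PROBE_DEFAULTS = {
    "timeoutSeconds": 1,
    "periodSeconds": 10,
    "successThreshold": 1,
    "failureThreshold": 3,
}


def effective_probe(probe):
    """Return a copy of probe with unset fields filled from PROBE_DEFAULTS."""
    merged = dict(PROBE_DEFAULTS)
    merged.update(probe)
    return merged


# Assumed shape of the liveness probe: command and delay taken from the
# describe output (delay=60s); no timeoutSeconds is set.
template_probe = {
    "exec": {"command": ["/bin/bash", "-c", "/opt/eap/bin/livenessProbe.sh"]},
    "initialDelaySeconds": 60,
}

effective = effective_probe(template_probe)
print("effective timeoutSeconds:", effective["timeoutSeconds"])
```

With no explicit value, the effective timeout is 1 second; setting timeoutSeconds in the probe overrides the default.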

Comment 20 DeShuai Ma 2018-07-26 02:11:53 UTC
(In reply to Mrunal Patel from comment #18)
> Here is the comment about the 1 second default -
> https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/core/v1/types.go#L1852

We also hit the same error in a CRI-O environment. After setting the timeout to a higher value, the probe succeeds.

The event log does not show what went wrong, which makes this hard to debug.

"Feb 27 15:45:39 ip-172-31-1-79.us-west-2.compute.internal atomic-openshift-node[15639]: E0227 15:45:39.794672   15639 remote_runtime.go:332] ExecSync 18525764a47e8f770a7ca13b58631568e1c2804ac9937f8de1fe9c972e6abd53 '/bin/bash -c /opt/eap/bin/livenessProbe.sh' from runtime service failed: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1"

The bug was moved to ON_QA by the errata. What is the final fix for the bug?

Comment 22 Mrunal Patel 2018-08-16 21:04:49 UTC
This should really be fixed by increasing the timeout in the template.
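
A rough Python sketch of what such a template change could look like when applied programmatically. The sample template is a hypothetical, heavily reduced stand-in for eap64-mysql-s2i.json, the 5 second value is an arbitrary illustration (no specific value was agreed in this bug), and with_probe_timeout is a made-up helper name.

```python
import copy

PROBE_KEYS = ("livenessProbe", "readinessProbe")


def with_probe_timeout(template, timeout_seconds):
    """Return a copy of template with timeoutSeconds set on every probe."""
    patched = copy.deepcopy(template)
    for obj in patched.get("objects", []):
        # Only objects that embed a pod template (such as a
        # DeploymentConfig) carry container probes.
        pod_spec = obj.get("spec", {}).get("template", {}).get("spec", {})
        for container in pod_spec.get("containers", []):
            for key in PROBE_KEYS:
                if key in container:
                    container[key]["timeoutSeconds"] = timeout_seconds
    return patched


# Hypothetical minimal template; the real one has many more objects and fields.
sample_template = {
    "kind": "Template",
    "objects": [
        {"kind": "Service", "spec": {"ports": [{"port": 8080}]}},
        {
            "kind": "DeploymentConfig",
            "spec": {"template": {"spec": {"containers": [{
                "name": "eap-app",
                "livenessProbe": {"exec": {"command": [
                    "/bin/bash", "-c", "/opt/eap/bin/livenessProbe.sh"]}},
                "readinessProbe": {"exec": {"command": [
                    "/bin/bash", "-c", "/opt/eap/bin/readinessProbe.sh"]}},
            }]}}},
        },
    ],
}

patched = with_probe_timeout(sample_template, 5)
container = patched["objects"][1]["spec"]["template"]["spec"]["containers"][0]
print(container["livenessProbe"]["timeoutSeconds"])
```

The helper works on a deep copy, so the original template data is left unchanged.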

Comment 23 Ben Parees 2018-08-16 21:25:03 UTC
We don't own the template; you'll have to take it up with JBoss via their JIRA or open an issue against their GitHub repo.

