Online/Image registry.access.redhat.com/openshift3/jenkins-1-rhel7 908b6dd3dafb Version-Release number of selected component (if applicable): kubernetes v1.2.0-alpha.7-703-gbc4550d Docker 1.8.2-el7, build a01dc02/1.8.2 kernel 3.10.0-327.10.1.el7.x86_64 How reproducible: always Description of problem: Error syncing pod when using jenkins-ephemeral-template to create jenkins Steps to Reproduce: 1. $ oc new-project test 2. $oc policy add-role-to-user admin system:serviceaccount:test:default -n test 3. $oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/jenkins-ephemeral-template.json 4. $ Check the pod # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 0/1 ContainerCreating 0 1h [root@dhcp-128-91 backup]# oc describe pod jenkins-1-deploy Name: jenkins-1-deploy Namespace: test Image(s): openshift3/ose-deployer:v3.1.1.910 Node: ip-172-31-15-139.ec2.internal/172.31.15.139 Start Time: Thu, 17 Mar 2016 11:46:47 +0800 Labels: openshift.io/deployer-pod-for.name=jenkins-1 Status: Pending Reason: Message: IP: Controllers: <none> Containers: deployment: Container ID: Image: openshift3/ose-deployer:v3.1.1.910 Image ID: Port: QoS Tier: memory: BestEffort cpu: BestEffort State: Waiting Reason: ContainerCreating Ready: False Restart Count: 0 Environment Variables: KUBERNETES_MASTER: https://ip-172-31-4-121.ec2.internal OPENSHIFT_MASTER: https://ip-172-31-4-121.ec2.internal BEARER_TOKEN_FILE: /var/run/secrets/kubernetes.io/serviceaccount/token OPENSHIFT_CA_DATA: -----BEGIN CERTIFICATE----- MIIC5jCCAdCgAwIBAgIBATALBgkqhkiG9w0BAQswJjEkMCIGA1UEAwwbb3BlbnNo aWZ0LXNpZ25lckAxNDU3MzkxMjUxMB4XDTE2MDMwNzIyNTQxMVoXDTIxMDMwNjIy NTQxMlowJjEkMCIGA1UEAwwbb3BlbnNoaWZ0LXNpZ25lckAxNDU3MzkxMjUxMIIB IjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAswpnGk9V5a1/BMPhRPkkY/bz iad06kV8tzM7KXCga11S74x7D1cJ6TEpRx7PDHrzkfFc6EuGr8h6IgAD9twsxM2I sUkGfeRbWlBx+UmB7mdpfbdbyWsP0JV/h1pgHXVZbHn3P42KfkpdCOoxfbdXmwBd 
fEdlBX6+DVQoNE0leHnoQ1B52uXhYeUJF9yjGh47CHNSnJH1HbdOS3UPb46WYC0e dp/U7Ho0zFpmsHAJMnSMN0EJU8EZiF8LmSS/S9y27lfK8Wji8W0f6B2bAALHGvwC J9vCYsci86eOlEcFsLmevEwOxLYaNCe9xFM7ujMk/Ic5fmv+PFgHVPS9WGoFiQID AQABoyMwITAOBgNVHQ8BAf8EBAMCAKQwDwYDVR0TAQH/BAUwAwEB/zALBgkqhkiG 9w0BAQsDggEBAF6wzswIVTXRHW+26AmIbq6ZQWoJY3Nsw0fYl3wKDOFsBzxKe/Wf iI0yikZl07m2gY/oBvTzuKiuuiiD7WjMxjbJUKTnLNOzQ2HJ8893SWf0vIFeXyVs fkTjZFV9yMJyl4pso69zsRurh+7whb7tnxpyCNQ5Dx9S9wQ1tRnSl1p0rrUxh4cc JyNE6SCHW9rXDlUwqD/9DIqgE3Org8EewMVCH65YwXV2Xny0+wQGIBeThJN9TI7T HvfhrPXMa8J7yhv2MqjqFAYLbcJh/8fRRNITDuVG5PDlWEY7bGieKo8ElVyShYlH HNJ2fm3e8L9ZRRzWy4TtC1e++DLXSsz8G04= -----END CERTIFICATE----- OPENSHIFT_DEPLOYMENT_NAME: jenkins-1 OPENSHIFT_DEPLOYMENT_NAMESPACE: test Conditions: Type Status Ready False Volumes: deployer-token-a7woy: Type: Secret (a secret that should populate this volume) SecretName: deployer-token-a7woy Events: FirstSeen LastSeen Count From SubobjectPath Type Reason Message --------- -------- ----- ---- ------------- -------- ------ ------- 1h 1h 1 {default-scheduler } Normal Scheduled Successfully assigned jenkins-1-deploy to ip-172-31-15-139.ec2.internal 1h 1h 1 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: No such container: 8ace102fd4462e67bdbd4eea4f32d0c5257e0f914dbe1f2aac92786acdce1753 1h 1h 1 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: No such container: 1085c0d8ef8cf554dce5048e9e4193f7aef50fe54374f0c229fa4d3bfdec9663 1h 1h 1 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: API error (500): Error running DeviceResume dm_task_run failed\n" 1h 1h 122 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "jenkins-1-deploy_test" with SetupNetworkError: "Failed to setup network for pod 
\"jenkins-1-deploy_test(dfeb8369-ebf2-11e5-90a5-0aadb0f8cf89)\" using network plugins \"redhat/openshift-ovs-subnet\": exit status 1; Skipping pod" 1h 1h 132 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: API error (500): Unknown device acc7e11a7e98340f0efcfefefc267e43a07eec266b4960508d3bd005b449a3b5 1h 21s 296 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: API error (500): Unknown device acc7e11a7e98340f0efcfefefc267e43a07eec266b4960508d3bd005b449a3b5 Actual results: pod is not running Expected results: pod should be running status
Passing to the networking team based on this event:
1h 1h 122 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "jenkins-1-deploy_test" with SetupNetworkError: "Failed to setup network for pod \"jenkins-1-deploy_test(dfeb8369-ebf2-11e5-90a5-0aadb0f8cf89)\" using network plugins \"redhat/openshift-ovs-subnet\": exit status 1; Skipping pod"
Tested it again, pod is CrashLoopBackOff: # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 38s jenkins-1-yfahj 0/1 Running 0 34s # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 1m jenkins-1-yfahj 0/1 CrashLoopBackOff 1 1m # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 1m jenkins-1-yfahj 0/1 Running 2 1m # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 1m jenkins-1-yfahj 0/1 Running 2 1m # oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 1m jenkins-1-yfahj 0/1 CrashLoopBackOff 2 1m # oc describe jenkins-1-yfahj the server doesn't have a resource type "jenkins-1-yfahj" #oc describe pod jenkins-1-yfahj Name: jenkins-1-yfahj Namespace: wewang7 Image(s): registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest Node: ip-172-31-15-139.ec2.internal/172.31.15.139 Start Time: Fri, 18 Mar 2016 10:43:26 +0800 Labels: deployment=jenkins-1,deploymentconfig=jenkins,name=jenkins Status: Running Reason: Message: IP: 10.1.0.237 Controllers: ReplicationController/jenkins-1 Containers: jenkins: Container ID: docker://68050e3ce2b12550e08d3fe62e433c1bf110679500698de950fc60bec201fd83 Image: registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest Image ID: docker://908b6dd3dafbabbb1cf38b60bb8a281988c04f4953854df19e0ed804fe9d4dfa Port: QoS Tier: cpu: Burstable memory: Guaranteed Limits: cpu: 1 memory: 512Mi Requests: cpu: 60m memory: 512Mi State: Waiting Reason: CrashLoopBackOff Last State: Terminated Reason: OOMKilled Exit Code: 137 Started: Fri, 18 Mar 2016 10:44:39 +0800 Finished: Fri, 18 Mar 2016 10:44:54 +0800 Ready: False Restart Count: 2 Liveness: http-get http://:8080/login delay=30s timeout=3s period=10s #success=1 #failure=3 Readiness: http-get http://:8080/login delay=3s timeout=3s period=10s #success=1 #failure=3 Environment Variables: JENKINS_PASSWORD: password Conditions: Type Status Ready False Volumes: jenkins-data: Type: EmptyDir 
(a temporary directory that shares a pod's lifetime) Medium: default-token-m8i52: Type: Secret (a secret that should populate this volume) SecretName: default-token-m8i52 Events: FirstSeen LastSeen Count From SubobjectPath Type Reason Message --------- -------- ----- ---- ------------- -------- ------ ------- 2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned jenkins-1-yfahj to ip-172-31-15-139.ec2.internal 2m 2m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 5a465869c4db 2m 2m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Started Started container with docker id 5a465869c4db 1m 1m 2 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.1.0.237:8080/login: read tcp 10.1.0.237:8080: use of closed network connection 1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 86e52801394e 1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Started Started container with docker id 86e52801394e 1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503 1m 1m 2 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 10s restarting failed container=jenkins pod=jenkins-1-yfahj_wewang7(306dc2f6-ecb3-11e5-9d8d-0aadb0f8cf89)" 2m 51s 3 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Pulled Container image "registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest" already present on machine 50s 50s 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 68050e3ce2b1 50s 50s 1 {kubelet ip-172-31-15-139.ec2.internal} 
spec.containers{jenkins} Normal Started Started container with docker id 68050e3ce2b1 1m 43s 2 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.1.0.237:8080/login: dial tcp 10.1.0.237:8080: connection refused 1m 19s 5 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning BackOff Back-off restarting failed docker container 34s 19s 3 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 20s restarting failed container=jenkins pod=jenkins-1-yfahj_wewang7(306dc2f6-ecb3-11e5-9d8d-0aadb0f8cf89)"
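A note on the "Exit Code: 137" shown next to "Reason: OOMKilled" in the describe output above: by the usual shell and Docker convention, an exit code above 128 means the process was ended by a signal, and the signal number is the code minus 128. So 137 corresponds to signal 9 (SIGKILL), which is what the kernel OOM killer sends when the container goes over its 512Mi memory limit. The helper below is only a sketch of that arithmetic; the function name is made up for illustration.

```shell
# Decode a container exit code.
# Convention: codes above 128 mean "killed by signal (code - 128)".
# exit_code_to_signal is a hypothetical helper name, not an oc/docker command.
exit_code_to_signal() {
    code=$1
    if [ "$code" -gt 128 ]; then
        echo "signal $((code - 128))"
    else
        echo "exit $code"
    fi
}

# 137 - 128 = 9, i.e. SIGKILL
exit_code_to_signal 137
```

The same rule can be applied to any other exit code reported by oc describe pod.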
Can you please get us the output from the OpenShift node log, and also run the troubleshooting script at https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh
The node logs are below. I have no permission to run the script; can you track down the problem with these logs?
[root@dev-preview-int-node-compute-d5623 ~]# docker ps -a | grep jenkins-1-deploy
ab8870ea68fb openshift3/ose-deployer:v3.1.1.910 "/usr/bin/openshift-d" 7 minutes ago Exited (255) 5 minutes ago k8s_deployment.86d070f9_jenkins-1-deploy_test_28d451a5-ef29-11e5-9d8d-0aadb0f8cf89_1eecc842
0229e3387213 openshift3/ose-pod:v3.1.1.910 "/pod" 7 minutes ago Exited (0) 5 minutes ago k8s_POD.e5c1dc5a_jenkins-1-deploy_test_28d451a5-ef29-11e5-9d8d-0aadb0f8cf89_64850473
[root@dev-preview-int-node-compute-d5623 ~]# docker logs ab8870ea68fbdec7bab4b6046ef9e49a7aafb78c428d2f03590896efe4367f2d
I0321 01:52:59.062514 1 deployer.go:199] Deploying test/jenkins-1 for the first time (replicas: 1)
I0321 01:52:59.066739 1 recreate.go:126] Scaling test/jenkins-1 to 1 before performing acceptance check
F0321 01:55:00.099291 1 deployer.go:69] couldn't scale test/jenkins-1 to 1: timed out waiting for the condition
Hit this issue in an OSE env (3.2/2016-03-18.4):
0s 0s 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1e806e9afc6b
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1e806e9afc6b: pod "jenkins-1-a1vio_jenkins(17142722-ef4a-11e5-bcc1-fa163efe3ad5)" container "jenkins" is unhealthy, it will be killed and re-created.
5m <invalid> 6 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" already present on machine
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 24a815714336
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 24a815714336
5m <invalid> 23 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.2.1.6:8080/login: dial tcp 10.2.1.6:8080: connection refused
4m <invalid> 8 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: Get http://10.2.1.6:8080/login: dial tcp 10.2.1.6:8080: connection refused
I need the output from: journalctl -flu atomic-openshift-node (Or perhaps openshift-node... do systemctl status | grep openshift to get the unit name if those fail)
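The unit-name lookup described above can be sketched as a small helper that tries each candidate name in turn. This is only an illustration: the function name is made up, the first two candidates come from the comment above, and origin-node is an assumed third candidate for Origin installs. It relies on systemctl cat returning a non-zero status for a unit that does not exist.

```shell
# Print the first OpenShift node unit that systemd knows about.
# node_unit is a hypothetical helper; the candidate list is an assumption
# (origin-node is included for Origin installs).
node_unit() {
    for unit in atomic-openshift-node openshift-node origin-node; do
        if systemctl cat "$unit" >/dev/null 2>&1; then
            echo "$unit"
            return 0
        fi
    done
    echo "no openshift node unit found" >&2
    return 1
}
```

With that in place, the log request becomes: journalctl -flu "$(node_unit)"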
Ben, Here are my debug logs that your requested while in the IRC... Mar 21 15:32:38 node1_vm origin-node[17123]: I0321 15:32:38.315960 17177 plugin.go:138] SetUpPod network plugin output: + lock_file=/var/lock/openshift-sdn.lock Mar 21 15:32:38 node1_vm origin-node[17123]: + action=setup Mar 21 15:32:38 node1_vm origin-node[17123]: + net_container=41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f Mar 21 15:32:38 node1_vm origin-node[17123]: + tenant_id=0 Mar 21 15:32:38 node1_vm origin-node[17123]: + lockwrap run Mar 21 15:32:38 node1_vm origin-node[17123]: + flock 200 Mar 21 15:32:38 node1_vm origin-node[17123]: + run Mar 21 15:32:38 node1_vm origin-node[17123]: + get_ipaddr_pid_veth Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.HostConfig.NetworkMode}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f Mar 21 15:32:38 node1_vm origin-node[17123]: + network_mode=default Mar 21 15:32:38 node1_vm origin-node[17123]: + '[' default == host ']' Mar 21 15:32:38 node1_vm origin-node[17123]: + [[ default =~ container:.* ]] Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.NetworkSettings.IPAddress}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f Mar 21 15:32:38 node1_vm origin-node[17123]: + ipaddr=172.17.0.2 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.State.Pid}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f Mar 21 15:32:38 node1_vm origin-node[17123]: + pid=28810 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ get_veth_host 28810 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local pid=28810 Mar 21 15:32:38 node1_vm origin-node[17123]: +++ nsenter -n -t 28810 -- ethtool -S eth0 Mar 21 15:32:38 node1_vm origin-node[17123]: +++ sed -n -e 's/.*peer_ifindex: //p' Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local veth_ifindex=1358 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ ip link show Mar 21 15:32:38 node1_vm 
origin-node[17123]: ++ sed -ne 's/^1358: \([^:@]*\).*/\1/p' Mar 21 15:32:38 node1_vm origin-node[17123]: + veth_host=veth14e6fda Mar 21 15:32:38 node1_vm origin-node[17123]: ++ get_container_mac 28810 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local pid=28810 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ nsenter -n -t 28810 -- ip link show dev eth0 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ sed -n -e 's/.*link.ether \([^ ]*\).*/\1/p' Mar 21 15:32:38 node1_vm origin-node[17123]: + macaddr=02:42:ac:11:00:02 Mar 21 15:32:38 node1_vm origin-node[17123]: + source /run/openshift-sdn/config.env Mar 21 15:32:38 node1_vm origin-node[17123]: ++ export OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16 Mar 21 15:32:38 node1_vm origin-node[17123]: ++ OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16 Mar 21 15:32:38 node1_vm origin-node[17123]: + case "$action" in Mar 21 15:32:38 node1_vm origin-node[17123]: + add_ovs_port Mar 21 15:32:38 node1_vm origin-node[17123]: + brctl delif lbr0 veth14e6fda Mar 21 15:32:38 node1_vm origin-node[17123]: device veth14e6fda is not a slave of lbr0 Mar 21 15:32:38 node1_vm origin-node[17123]: , exit status 1 Mar 21 15:32:38 node1_vm origin-node[17123]: E0321 15:32:38.316051 17177 manager.go:1791] Failed to setup network for pod "docker-registry-1-deploy_default(4006a8ee-ef9b-11e5-bbd7-0050568848d8)" using network plugins "redhat/openshift-ovs-subnet" : exit status 1; Skipping pod Thanks, digi691
What RPM version of atomic-openshift is installed on this cluster? "rpm -q atomic-openshift" will tell you... Sorry if I missed it above.
It's looking like docker's network setup isn't correct. Can you also grab: 1) the contents of /run/openshift-sdn/docker-network 2) 'ps ax | grep docker'
@Ben, here is the info you need:
[root@dev-preview-int-master-167b1 ~]# cat /run/openshift-sdn/docker-network
# This file has been modified by openshift-sdn.
DOCKER_NETWORK_OPTIONS='-b=lbr0 --mtu=8951'
[root@dev-preview-int-master-167b1 ~]# ps ax | grep docker
29489 ? Ss 0:00 /bin/sh -c /usr/bin/docker daemon $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY 2>&1 | /usr/bin/forward-journald -tag docker
29490 ? Sl 64:01 /usr/bin/docker daemon --selinux-enabled --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker_vg-docker--pool --storage-opt dm.use_deferred_removal=true -b=lbr0 --mtu=8951 --add-registry registry.qe.openshift.com --add-registry registry.access.redhat.com
29491 ? Sl 0:21 /usr/bin/forward-journald -tag docker
46201 pts/1 S+ 0:00 grep --color=auto docker
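One way to read the two outputs above together is to check that every option listed in DOCKER_NETWORK_OPTIONS actually shows up on the running docker daemon command line. The sketch below does that on plain strings, so it can be run against pasted output; the function name is made up for illustration and is not part of openshift-sdn.

```shell
# Check that each option from DOCKER_NETWORK_OPTIONS appears as a separate
# word on the docker daemon command line.
# check_docker_network_opts is a hypothetical helper name.
check_docker_network_opts() {
    opts=$1
    cmdline=$2
    for opt in $opts; do
        case " $cmdline " in
            *" $opt "*) ;;                        # option is present
            *) echo "missing: $opt"; return 1 ;;  # option is absent
        esac
    done
    echo "ok"
}

# Values taken from the output pasted in this comment.
check_docker_network_opts '-b=lbr0 --mtu=8951' \
    '/usr/bin/docker daemon --selinux-enabled -b=lbr0 --mtu=8951 --add-registry registry.access.redhat.com'
```

For the values pasted here both options are present, so the check prints ok.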
using jenkins-ephemeral-template in ose 3.2.0.16 env. Image is brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7(a414317f519d). error: [root@dhcp-128-91 backup]# oc get pods NAME READY STATUS RESTARTS AGE jenkins-1-deploy 1/1 Running 0 8m jenkins-1-o0gh5 0/1 CrashLoopBackOff 6 8m [root@dhcp-128-91 backup]# oc describe pod jenkins-1-o0gh5 Name: jenkins-1-o0gh5 Namespace: wewang Image(s): brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest Node: openshift-210.lab.sjc.redhat.com/10.14.6.210 Start Time: Mon, 18 Apr 2016 17:08:46 +0800 Labels: deployment=jenkins-1,deploymentconfig=jenkins,name=jenkins Status: Running Reason: Message: IP: 10.2.2.3 Controllers: ReplicationController/jenkins-1 Containers: jenkins: Container ID: docker://1dd4a1aa921eff1a0c7200839a0f05302d98226c4f9ffa5a05fdb32f70df265e Image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest Image ID: docker://a414317f519d1c75b400d848bb658931eb71a4ba1cd7e6075783cd99c9ca8759 Port: QoS Tier: memory: Guaranteed cpu: BestEffort Limits: memory: 512Mi Requests: memory: 512Mi State: Waiting Reason: CrashLoopBackOff Last State: Terminated Reason: Error Exit Code: 143 Started: Mon, 18 Apr 2016 17:16:04 +0800 Finished: Mon, 18 Apr 2016 17:16:36 +0800 Ready: False Restart Count: 6 Liveness: http-get http://:8080/login delay=30s timeout=3s period=10s #success=1 #failure=3 Readiness: http-get http://:8080/login delay=3s timeout=3s period=10s #success=1 #failure=3 Environment Variables: JENKINS_PASSWORD: password Conditions: Type Status Ready False Volumes: jenkins-data: Type: EmptyDir (a temporary directory that shares a pod's lifetime) Medium: default-token-pimf1: Type: Secret (a secret that should populate this volume) SecretName: default-token-pimf1 Events: FirstSeen LastSeen Count From SubobjectPath Type Reason Message --------- -------- ----- ---- ------------- -------- ------ ------- 8m 8m 1 {default-scheduler } 
Normal Scheduled Successfully assigned jenkins-1-o0gh5 to openshift-210.lab.sjc.redhat.com 8m 8m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulling pulling image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" 7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Successfully pulled image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" 7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id ce3582dcba64 7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id ce3582dcba64 6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id ce3582dcba64: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id ff36f822940d 6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id ff36f822940d 5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id ff36f822940d: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 00f46abdf54d 5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 00f46abdf54d 5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 00f46abdf54d: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 1965693838bf 5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1965693838bf 4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1965693838bf: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 51e2f2b0ffae 4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 51e2f2b0ffae 3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 51e2f2b0ffae: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 
3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id e9aeb42ec2b6 3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id e9aeb42ec2b6 6m 3m 2 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 503 6m 3m 2 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503 3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id e9aeb42ec2b6: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 3m 1m 8 {kubelet openshift-210.lab.sjc.redhat.com} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=jenkins pod=jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" 6m 1m 6 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" already present on machine 1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 1dd4a1aa921e 1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1dd4a1aa921e 7m 1m 23 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.2.2.3:8080/login: dial tcp 10.2.2.3:8080: connection refused 6m 1m 7 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: Get http://10.2.2.3:8080/login: dial tcp 
10.2.2.3:8080: connection refused 1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1dd4a1aa921e: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created. 3m 14s 14 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning BackOff Back-off restarting failed docker container 1m 14s 6 {kubelet openshift-210.lab.sjc.redhat.com} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=jenkins pod=jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)"
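The back-off delays in the FailedSync events above (1m20s, then 2m40s) are consistent with a restart delay that doubles after each failed start. Below is a minimal sketch of that sequence, assuming a 10s starting delay (the smallest value reported earlier in this thread) and a 300s cap. The cap is an assumption about the kubelet default and is not something these logs show; the function name is made up for illustration.

```shell
# Print the first N restart back-off delays in seconds.
# Assumptions: start at 10s, double each time, cap at 300s (override with $2).
# backoff_sequence is a hypothetical helper name.
backoff_sequence() {
    count=$1
    cap=${2:-300}
    delay=10
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$delay"
        delay=$((delay * 2))
        if [ "$delay" -gt "$cap" ]; then
            delay=$cap
        fi
        i=$((i + 1))
    done
}

# Prints 10, 20, 40, 80, 160, 300 (one per line);
# 80s is the 1m20s and 160s is the 2m40s seen in the events.
backoff_sequence 6
```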
Image: openshift3/jenkins-1-rhel7 (a414317f519d). As I said in comment #14, my Jenkins pods can run, and I can log in from the web console and run Jenkins jobs. But sometimes the Jenkins pod keeps restarting before it finally reaches Running (maybe the network between Beijing and the US is not stable). I will paste logs later.
Created attachment 1148362: jenkins-restart.log
It seems the pod is running and there is just a warning when describing the pod, so I am changing the severity to low.
wewang, your later log messages don't show the SetupNetwork errors that the original report and Chris DiGiovanni had. Do you still see those SetupNetwork errors? I think the 'unhealthy container' messages are a different issue.