| Summary: | pod is ContainerCreating when using jenkins-ephemeral-template to create jenkins | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | wewang <wewang> | ||||
| Component: | Networking | Assignee: | Dan Williams <dcbw> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Meng Bo <bmeng> | ||||
| Severity: | low | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 3.2.0 | CC: | aos-bugs, cdigiovanni, dcbw, jokerman, mmccomas, tdawson, wewang, wzheng, xiuwang | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1328727 (view as bug list) | Environment: | |||||
| Last Closed: | 2016-11-22 23:21:20 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1328727 | ||||||
Description
wewang
2016-03-17 06:16:15 UTC
passing to networking team based on:
1h 1h 122 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "jenkins-1-deploy_test" with SetupNetworkError: "Failed to setup network for pod \"jenkins-1-deploy_test(dfeb8369-ebf2-11e5-90a5-0aadb0f8cf89)\" using network plugins \"redhat/openshift-ovs-subnet\": exit status 1; Skipping pod"
Tested it again, pod is CrashLoopBackOff:
# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 38s
jenkins-1-yfahj 0/1 Running 0 34s
# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 1m
jenkins-1-yfahj 0/1 CrashLoopBackOff 1 1m
# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 1m
jenkins-1-yfahj 0/1 Running 2 1m
# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 1m
jenkins-1-yfahj 0/1 Running 2 1m
# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 1m
jenkins-1-yfahj 0/1 CrashLoopBackOff 2 1m
# oc describe jenkins-1-yfahj
the server doesn't have a resource type "jenkins-1-yfahj"
# oc describe pod jenkins-1-yfahj
Name: jenkins-1-yfahj
Namespace: wewang7
Image(s): registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest
Node: ip-172-31-15-139.ec2.internal/172.31.15.139
Start Time: Fri, 18 Mar 2016 10:43:26 +0800
Labels: deployment=jenkins-1,deploymentconfig=jenkins,name=jenkins
Status: Running
Reason:
Message:
IP: 10.1.0.237
Controllers: ReplicationController/jenkins-1
Containers:
jenkins:
Container ID: docker://68050e3ce2b12550e08d3fe62e433c1bf110679500698de950fc60bec201fd83
Image: registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest
Image ID: docker://908b6dd3dafbabbb1cf38b60bb8a281988c04f4953854df19e0ed804fe9d4dfa
Port:
QoS Tier:
cpu: Burstable
memory: Guaranteed
Limits:
cpu: 1
memory: 512Mi
Requests:
cpu: 60m
memory: 512Mi
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Fri, 18 Mar 2016 10:44:39 +0800
Finished: Fri, 18 Mar 2016 10:44:54 +0800
Ready: False
Restart Count: 2
Liveness: http-get http://:8080/login delay=30s timeout=3s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/login delay=3s timeout=3s period=10s #success=1 #failure=3
Environment Variables:
JENKINS_PASSWORD: password
Conditions:
Type Status
Ready False
Volumes:
jenkins-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-m8i52:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-m8i52
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned jenkins-1-yfahj to ip-172-31-15-139.ec2.internal
2m 2m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 5a465869c4db
2m 2m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Started Started container with docker id 5a465869c4db
1m 1m 2 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.1.0.237:8080/login: read tcp 10.1.0.237:8080: use of closed network connection
1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 86e52801394e
1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Started Started container with docker id 86e52801394e
1m 1m 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503
1m 1m 2 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 10s restarting failed container=jenkins pod=jenkins-1-yfahj_wewang7(306dc2f6-ecb3-11e5-9d8d-0aadb0f8cf89)"
2m 51s 3 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Pulled Container image "registry.access.redhat.com/openshift3/jenkins-1-rhel7:latest" already present on machine
50s 50s 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Created Created container with docker id 68050e3ce2b1
50s 50s 1 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Normal Started Started container with docker id 68050e3ce2b1
1m 43s 2 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.1.0.237:8080/login: dial tcp 10.1.0.237:8080: connection refused
1m 19s 5 {kubelet ip-172-31-15-139.ec2.internal} spec.containers{jenkins} Warning BackOff Back-off restarting failed docker container
34s 19s 3 {kubelet ip-172-31-15-139.ec2.internal} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 20s restarting failed container=jenkins pod=jenkins-1-yfahj_wewang7(306dc2f6-ecb3-11e5-9d8d-0aadb0f8cf89)"
Can you please get us the output from the openshift node log? Also, please run the troubleshooting script at https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh
The node logs are below. I have no permission to run the script; can you track down the problem from the logs?
[root@dev-preview-int-node-compute-d5623 ~]# docker ps -a | grep jenkins-1-deploy
ab8870ea68fb openshift3/ose-deployer:v3.1.1.910 "/usr/bin/openshift-d" 7 minutes ago Exited (255) 5 minutes ago k8s_deployment.86d070f9_jenkins-1-deploy_test_28d451a5-ef29-11e5-9d8d-0aadb0f8cf89_1eecc842
0229e3387213 openshift3/ose-pod:v3.1.1.910 "/pod" 7 minutes ago Exited (0) 5 minutes ago k8s_POD.e5c1dc5a_jenkins-1-deploy_test_28d451a5-ef29-11e5-9d8d-0aadb0f8cf89_64850473
[root@dev-preview-int-node-compute-d5623 ~]# docker logs ab8870ea68fbdec7bab4b6046ef9e49a7aafb78c428d2f03590896efe4367f2d
I0321 01:52:59.062514 1 deployer.go:199] Deploying test/jenkins-1 for the first time (replicas: 1)
I0321 01:52:59.066739 1 recreate.go:126] Scaling test/jenkins-1 to 1 before performing acceptance check
F0321 01:55:00.099291 1 deployer.go:69] couldn't scale test/jenkins-1 to 1: timed out waiting for the condition
Met this issue in the OSE env (3.2/2016-03-18.4):
0s 0s 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1e806e9afc6b
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1e806e9afc6b: pod "jenkins-1-a1vio_jenkins(17142722-ef4a-11e5-bcc1-fa163efe3ad5)" container "jenkins" is unhealthy, it will be killed and re-created.
5m <invalid> 6 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" already present on machine
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 24a815714336
<invalid> <invalid> 1 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 24a815714336
5m <invalid> 23 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.2.1.6:8080/login: dial tcp 10.2.1.6:8080: connection refused
4m <invalid> 8 {kubelet openshift-133.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: Get http://10.2.1.6:8080/login: dial tcp 10.2.1.6:8080: connection refused
I need the output from: journalctl -flu atomic-openshift-node
(Or perhaps openshift-node... run systemctl status | grep openshift to get the unit name if those fail.)
Ben,
Here are the debug logs that you requested on IRC...
Mar 21 15:32:38 node1_vm origin-node[17123]: I0321 15:32:38.315960 17177 plugin.go:138] SetUpPod network plugin output: + lock_file=/var/lock/openshift-sdn.lock
Mar 21 15:32:38 node1_vm origin-node[17123]: + action=setup
Mar 21 15:32:38 node1_vm origin-node[17123]: + net_container=41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f
Mar 21 15:32:38 node1_vm origin-node[17123]: + tenant_id=0
Mar 21 15:32:38 node1_vm origin-node[17123]: + lockwrap run
Mar 21 15:32:38 node1_vm origin-node[17123]: + flock 200
Mar 21 15:32:38 node1_vm origin-node[17123]: + run
Mar 21 15:32:38 node1_vm origin-node[17123]: + get_ipaddr_pid_veth
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.HostConfig.NetworkMode}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f
Mar 21 15:32:38 node1_vm origin-node[17123]: + network_mode=default
Mar 21 15:32:38 node1_vm origin-node[17123]: + '[' default == host ']'
Mar 21 15:32:38 node1_vm origin-node[17123]: + [[ default =~ container:.* ]]
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.NetworkSettings.IPAddress}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f
Mar 21 15:32:38 node1_vm origin-node[17123]: + ipaddr=172.17.0.2
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ docker inspect --format '{{.State.Pid}}' 41c971017f78f95334cd8ded26239c1edd6f6f5eb219f9554dd4b7eb2058452f
Mar 21 15:32:38 node1_vm origin-node[17123]: + pid=28810
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ get_veth_host 28810
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local pid=28810
Mar 21 15:32:38 node1_vm origin-node[17123]: +++ nsenter -n -t 28810 -- ethtool -S eth0
Mar 21 15:32:38 node1_vm origin-node[17123]: +++ sed -n -e 's/.*peer_ifindex: //p'
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local veth_ifindex=1358
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ ip link show
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ sed -ne 's/^1358: \([^:@]*\).*/\1/p'
Mar 21 15:32:38 node1_vm origin-node[17123]: + veth_host=veth14e6fda
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ get_container_mac 28810
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ local pid=28810
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ nsenter -n -t 28810 -- ip link show dev eth0
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ sed -n -e 's/.*link.ether \([^ ]*\).*/\1/p'
Mar 21 15:32:38 node1_vm origin-node[17123]: + macaddr=02:42:ac:11:00:02
Mar 21 15:32:38 node1_vm origin-node[17123]: + source /run/openshift-sdn/config.env
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ export OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16
Mar 21 15:32:38 node1_vm origin-node[17123]: ++ OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16
Mar 21 15:32:38 node1_vm origin-node[17123]: + case "$action" in
Mar 21 15:32:38 node1_vm origin-node[17123]: + add_ovs_port
Mar 21 15:32:38 node1_vm origin-node[17123]: + brctl delif lbr0 veth14e6fda
Mar 21 15:32:38 node1_vm origin-node[17123]: device veth14e6fda is not a slave of lbr0
Mar 21 15:32:38 node1_vm origin-node[17123]: , exit status 1
Mar 21 15:32:38 node1_vm origin-node[17123]: E0321 15:32:38.316051 17177 manager.go:1791] Failed to setup network for pod "docker-registry-1-deploy_default(4006a8ee-ef9b-11e5-bbd7-0050568848d8)" using network plugins "redhat/openshift-ovs-subnet"
: exit status 1; Skipping pod
Thanks,
digi691
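In the trace above, the infra container was given ipaddr=172.17.0.2 while OPENSHIFT_CLUSTER_SUBNET is 10.1.0.0/16, and the veth was not attached to lbr0. The following is a minimal sketch, not part of openshift-sdn, for checking whether an address falls inside the cluster subnet; the helper names ip_to_int and ip_in_cidr are hypothetical and the code assumes bash and IPv4.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1                      # split the address on dots into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if address $1 lies inside CIDR block $2.
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  # Build the netmask from the prefix length, e.g. /16 -> 0xFFFF0000.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Values taken from the trace above.
if ip_in_cidr 172.17.0.2 10.1.0.0/16; then
  echo "address is inside the cluster subnet"
else
  echo "address is outside the cluster subnet"
fi
```

An address outside the cluster subnet would be consistent with the container having been created on a bridge other than lbr0, but that is an inference from the trace, not something this check proves.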
What RPM version of atomic-openshift is installed on this cluster? "rpm -q atomic-openshift" will tell you... Sorry if I missed it above.
It's looking like docker's network setup isn't correct. Can you also grab:
1) the contents of /run/openshift-sdn/docker-network
2) 'ps ax | grep docker'
@Ben, here is the info you need:
[root@dev-preview-int-master-167b1 ~]# cat /run/openshift-sdn/docker-network
# This file has been modified by openshift-sdn.
DOCKER_NETWORK_OPTIONS='-b=lbr0 --mtu=8951'
[root@dev-preview-int-master-167b1 ~]# ps ax | grep docker
29489 ? Ss 0:00 /bin/sh -c /usr/bin/docker daemon $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY 2>&1 | /usr/bin/forward-journald -tag docker
29490 ? Sl 64:01 /usr/bin/docker daemon --selinux-enabled --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker_vg-docker--pool --storage-opt dm.use_deferred_removal=true -b=lbr0 --mtu=8951 --add-registry registry.qe.openshift.com --add-registry registry.access.redhat.com
29491 ? Sl 0:21 /usr/bin/forward-journald -tag docker
46201 pts/1 S+ 0:00 grep --color=auto docker
Using jenkins-ephemeral-template in the OSE 3.2.0.16 env.
Image is brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7(a414317f519d).
error:
[root@dhcp-128-91 backup]# oc get pods
NAME READY STATUS RESTARTS AGE
jenkins-1-deploy 1/1 Running 0 8m
jenkins-1-o0gh5 0/1 CrashLoopBackOff 6 8m
[root@dhcp-128-91 backup]# oc describe pod jenkins-1-o0gh5
Name: jenkins-1-o0gh5
Namespace: wewang
Image(s): brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest
Node: openshift-210.lab.sjc.redhat.com/10.14.6.210
Start Time: Mon, 18 Apr 2016 17:08:46 +0800
Labels: deployment=jenkins-1,deploymentconfig=jenkins,name=jenkins
Status: Running
Reason:
Message:
IP: 10.2.2.3
Controllers: ReplicationController/jenkins-1
Containers:
jenkins:
Container ID: docker://1dd4a1aa921eff1a0c7200839a0f05302d98226c4f9ffa5a05fdb32f70df265e
Image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest
Image ID: docker://a414317f519d1c75b400d848bb658931eb71a4ba1cd7e6075783cd99c9ca8759
Port:
QoS Tier:
memory: Guaranteed
cpu: BestEffort
Limits:
memory: 512Mi
Requests:
memory: 512Mi
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Mon, 18 Apr 2016 17:16:04 +0800
Finished: Mon, 18 Apr 2016 17:16:36 +0800
Ready: False
Restart Count: 6
Liveness: http-get http://:8080/login delay=30s timeout=3s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/login delay=3s timeout=3s period=10s #success=1 #failure=3
Environment Variables:
JENKINS_PASSWORD: password
Conditions:
Type Status
Ready False
Volumes:
jenkins-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-pimf1:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-pimf1
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
8m 8m 1 {default-scheduler } Normal Scheduled Successfully assigned jenkins-1-o0gh5 to openshift-210.lab.sjc.redhat.com
8m 8m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulling pulling image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest"
7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Successfully pulled image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest"
7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id ce3582dcba64
7m 7m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id ce3582dcba64
6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id ce3582dcba64: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id ff36f822940d
6m 6m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id ff36f822940d
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id ff36f822940d: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 00f46abdf54d
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 00f46abdf54d
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 00f46abdf54d: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 1965693838bf
5m 5m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1965693838bf
4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1965693838bf: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 51e2f2b0ffae
4m 4m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 51e2f2b0ffae
3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 51e2f2b0ffae: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id e9aeb42ec2b6
3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id e9aeb42ec2b6
6m 3m 2 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 503
6m 3m 2 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503
3m 3m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id e9aeb42ec2b6: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
3m 1m 8 {kubelet openshift-210.lab.sjc.redhat.com} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=jenkins pod=jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)"
6m 1m 6 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Pulled Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/jenkins-1-rhel7:latest" already present on machine
1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Created Created container with docker id 1dd4a1aa921e
1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Started Started container with docker id 1dd4a1aa921e
7m 1m 23 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Readiness probe failed: Get http://10.2.2.3:8080/login: dial tcp 10.2.2.3:8080: connection refused
6m 1m 7 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning Unhealthy Liveness probe failed: Get http://10.2.2.3:8080/login: dial tcp 10.2.2.3:8080: connection refused
1m 1m 1 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Normal Killing Killing container with docker id 1dd4a1aa921e: pod "jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)" container "jenkins" is unhealthy, it will be killed and re-created.
3m 14s 14 {kubelet openshift-210.lab.sjc.redhat.com} spec.containers{jenkins} Warning BackOff Back-off restarting failed docker container
1m 14s 6 {kubelet openshift-210.lab.sjc.redhat.com} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "jenkins" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=jenkins pod=jenkins-1-o0gh5_wewang(27bc1236-0545-11e6-8ce7-fa163e1af809)"
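The two describe outputs above report different terminations: Exit Code 137 with Reason OOMKilled in the first, and Exit Code 143 with Reason Error in the second. By the usual shell convention, a code above 128 means the process was ended by signal (code - 128). The sketch below decodes that convention; describe_exit_code is a hypothetical helper name, and a process can in principle also exit with such a value on its own.

```shell
# Hypothetical helper: interpret a container exit code using the
# "128 + signal number" convention.
describe_exit_code() {
  code=$1
  if [ "$code" -le 128 ]; then
    echo "exited with status $code"
    return
  fi
  sig=$((code - 128))
  case "$sig" in
    9)  echo "killed by signal 9 (SIGKILL)" ;;
    15) echo "killed by signal 15 (SIGTERM)" ;;
    *)  echo "killed by signal $sig" ;;
  esac
}

describe_exit_code 137   # code from the first describe output
describe_exit_code 143   # code from the second describe output
```

SIGKILL fits the OOMKilled reason in the first output, and SIGTERM fits the kubelet stopping a container it reported as unhealthy in the second, which suggests the two restart loops may not share a cause.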
openshift3/jenkins-1-rhel7(a414317f519d)
As I said in comment #14, my jenkins pods can run, and I can log in from the web console and run jenkins jobs. But sometimes the jenkins pod hits the restart issue; eventually the pod ends up running (maybe the network between Beijing and the US is not stable). Will paste logs later.
Created attachment 1148362
jenkins-restart.log
It seems the pod is running, with just a warning when describing the pod, so changing severity to low.
wewang, your later log messages don't show the SetupNetwork errors that the original report and Chris DiGiovanni had. Do you still see those SetupNetwork errors? I think the 'unhealthy container' messages are a different issue.