Description of problem:
The build pod was deleted automatically roughly 1 hour after it was created, so the build log can no longer be retrieved.

$ oc build-logs ruby-sample-build-4 -n gits
API error (404): no such id: 7ef69dfb99d623cad1f0351b16b2bfa6c77f36336c4fec3063083e2750ba1155

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.0.2.902
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
always

Steps to Reproduce:
1. Trigger a build
2. Check the build log after 1 hour

Actual results:
$ oc get builds -n gits
NAME                  TYPE      FROM      STATUS     STARTED
ruby-sample-build-1   Source    Git       Failed     2 hours ago
ruby-sample-build-2   Source    Git       Failed     About an hour ago
ruby-sample-build-3   Source    Git       Failed     About an hour ago
ruby-sample-build-4   Source    Git       Complete   About an hour ago

$ oc build-logs ruby-sample-build-4 -n gits
API error (404): no such id: 7ef69dfb99d623cad1f0351b16b2bfa6c77f36336c4fec3063083e2750ba1155

Expected results:
The build pod should still exist, and its build logs should be retrievable.

Additional info:
Could reproduce this bug in the ose 3.1 env:

oc v3.1.0.4-9-g72d3991
kubernetes v1.1.0-origin-1107-g4c8e6f4
Can you enable loglevel 5 on the openshift master, reproduce the issue, and provide those logs? We need to determine whether the pod is being deleted by our sync logic, which tries to delete pods when the build is deleted. Specifically, we'll be looking for a trace like: "Handling deletion of build <buildname>"
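For reference, a minimal sketch of how the master log could be searched for that trace once loglevel 5 is enabled. The helper name `find_build_deletions` and the log path in the usage line are assumptions for illustration; only the message text "Handling deletion of build" comes from the comment above.

```shell
# Hypothetical helper: print, with line numbers, every log line that records
# the build-deletion handling and also mentions the given build name.
#   $1 = log file to search
#   $2 = build name, e.g. ruby-sample-build-4
# Exits non-zero when no matching line is found.
find_build_deletions() {
    grep -n -F "Handling deletion of build" "$1" | grep -F -- "$2"
}
```

Usage, assuming the master logs to syslog at /var/log/messages (adjust the path for your setup): `find_build_deletions /var/log/messages ruby-sample-build-4`. No output would suggest the build-deletion sync logic did not handle that build.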
This looks to me like the container has been deleted, but I assume the pod is still there; otherwise, you would get a different error message.
Good point Andy, sounds like this is probably working as designed, though the logging API should report a better error message when the container doesn't exist. That's probably an upstream issue.

XiuJuan, can you check on your container garbage collection settings?
https://docs.openshift.org/latest/admin_guide/garbage_collection.html#container-garbage-collection

See also this clarification to the docs:
https://github.com/openshift/openshift-docs/pull/1219/files
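A rough sketch of one way to check those settings on a node. The helper name `check_gc_args` is hypothetical, and the check is a plain text match on the key names rather than a real YAML parse, so it only reports whether each key appears in the file, not whether it has a value.

```shell
# Hypothetical helper: report which of the three container garbage collection
# kubelet arguments appear as keys in a node config file.
#   $1 = path to the node config, e.g. /etc/origin/node/node-config.yaml
check_gc_args() {
    for arg in minimum-container-ttl-duration \
               maximum-dead-containers-per-container \
               maximum-dead-containers; do
        # The key must be followed directly by a colon, so
        # maximum-dead-containers does not match the longer per-container key.
        if grep -q -E "^[[:space:]]*${arg}:" "$1"; then
            echo "$arg: present"
        else
            echo "$arg: absent"
        fi
    done
}
```

Usage: `check_gc_args /etc/origin/node/node-config.yaml`. An argument reported as absent means the node config does not set it explicitly.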
Ben,

I checked the /etc/origin/node/node-config.yaml file in the ose 3.1 env; none of these three arguments is set:

kubeletArguments:
  minimum-container-ttl-duration:
    - 10s
  maximum-dead-containers-per-container:
    - 2
  maximum-dead-containers:

In today's ose env, build logs can still be streamed back for builds that were created 3 hours ago, so I will move this bug to verified.

oc v3.1.0.4-5-gebe80f5
kubernetes v1.1.0-origin-1107-g4c8e6f4

$ oc get builds
NAME                 TYPE      FROM      STATUS     STARTED       DURATION
ruby-hello-world-1   Docker    Git       Complete   3 hours ago   2m17s
ruby-hello-world-2   Docker    Git       Complete   3 hours ago   2m2s
ruby-hello-world-3   Docker    Git       Complete   3 hours ago   2m6s

The openshift log after setting loglevel=5:

111949:Nov 19 16:52:38 openshift-146 atomic-openshift-master: I1119 16:52:38.076031 17677 controller.go:81] Handling build xiuwang/ruby-hello-world-3
113431:Nov 19 16:54:39 openshift-146 atomic-openshift-master: I1119 16:54:39.075729 17677 controller.go:81] Handling build xiuwang/ruby-hello-world-3
113823:Nov 19 16:55:13 openshift-146 atomic-openshift-master: I1119 16:55:13.953638 17677 factory.go:448] Found build pod xiuwang/ruby-hello-world-3-build
113824:Nov 19 16:55:13 openshift-146 atomic-openshift-master: I1119 16:55:13.959613 17677 factory.go:472] Found build xiuwang/ruby-hello-world-3 for pod ruby-hello-world-3-build
113832:Nov 19 16:55:13 openshift-146 atomic-openshift-master: I1119 16:55:13.997732 17677 factory.go:528] Found build xiuwang/ruby-hello-world-3
113833:Nov 19 16:55:13 openshift-146 atomic-openshift-master: I1119 16:55:13.997738 17677 factory.go:530] Ignoring build xiuwang/ruby-hello-world-3 because it is complete
114950:Nov 19 16:56:40 openshift-146 atomic-openshift-master: I1119 16:56:40.074691 17677 controller.go:81] Handling build xiuwang/ruby-hello-world-3
I've opened an upstream issue (https://github.com/kubernetes/kubernetes/issues/17501) for the bad error message that occurs when this happens.