Bug 1304266
| Summary: | Pod status keeps in pending status on dedicated env | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Online | Reporter: | Wang Haoran <haowang> | ||||
| Component: | Containers | Assignee: | Jhon Honce <jhonce> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | DeShuai Ma <dma> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 3.x | CC: | agrimm, akostadi, aos-bugs, haowang, jhonce, jokerman, mmccomas, pruan, whearn, wzheng | ||||
| Target Milestone: | --- | Keywords: | TestBlocker | ||||
| Target Release: | --- | Flags: | jhonce: needinfo- | ||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-05-23 15:08:33 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1303130 | ||||||
| Attachments: | |||||||
Description
Wang Haoran
2016-02-03 08:37:46 UTC
The environment cannot build or deploy now:
[vagrant@ose ~]$ oc get event
FIRSTSEEN LASTSEEN COUNT NAME KIND SUBOBJECT REASON SOURCE MESSAGE
17m 10m 11 database-1 ReplicationController FailedCreate {deployer } Error creating deployer pod for haowang2/database-1: Pod "database-1-deploy" is forbidden: service account haowang2/deployer was not found, retry after the service account is created
17m 12m 2 database-1 ReplicationController FailedCreate {deployer } Error creating deployer pod for haowang2/database-1: Internal error occurred: Get http://api.stage.openshift.com/api/v1/namespaces/haowang2: dial tcp 52.5.122.7:80: connection refused
17m 14m 2 database-1 ReplicationController FailedCreate {deployer } Error creating deployer pod for haowang2/database-1: Internal error occurred: Get http://api.stage.openshift.com/api/v1/namespaces/haowang2: dial tcp 52.72.220.72:80: connection refused
8m 17s 10 database-2 ReplicationController FailedCreate {deployer } Error creating deployer pod for haowang2/database-2: Pod "database-2-deploy" is forbidden: service account haowang2/deployer was not found, retry after the service account is created
17m 17m 1 database DeploymentConfig DeploymentCreated {deploymentconfig-controller } Created new deployment "database-1" for version 1
8m 8m 1 database DeploymentConfig DeploymentCreated {deploymentconfig-controller } Created new deployment "database-2" for version 2
The initial report is that the project does not exist, and no other pods are showing this.

Comment 1 is related to https://bugzilla.redhat.com/show_bug.cgi?id=1304586
Can you recreate the OutOfDisk error so I can actually look into it?

There are no errors like those in comment #1 and no OutOfDisk error now; however, pods stay in Pending status on the nodes:
[wzheng@localhost ~]$ oc get builds
NAME TYPE FROM STATUS STARTED DURATION
php-sample-build-1 Source Git Pending
[wzheng@localhost ~]$ oc get pods -o wide
NAME READY STATUS RESTARTS AGE NODE
database-1-deploy 1/1 Running 0 21m ip-172-31-5-179.ec2.internal
database-1-kbnn8 1/1 Running 0 21m ip-172-31-5-179.ec2.internal
database-1-posthook 0/1 Pending 0 21m ip-172-31-5-180.ec2.internal
database-1-prehook 0/1 Completed 0 21m ip-172-31-5-179.ec2.internal
php-sample-build-1-build 0/1 Pending 0 19m ip-172-31-5-180.ec2.internal
[wzheng@localhost ~]$ oc get pods -n wzheng3 -o wide
NAME READY STATUS RESTARTS AGE NODE
php-sample-build-2-build 0/1 Pending 0 19m ip-172-31-5-179.ec2.internal
[wzheng@localhost ~]$ oc get pods -o wide -n wzheng123
NAME READY STATUS RESTARTS AGE NODE
php-sample-build-3-build 0/1 Pending 0 21m ip-172-31-5-177.ec2.internal

This is related to docker getting hung. It seems to be happening at a higher rate for us in 3.1.1.6.

During the next hang, please attach strace to the docker process (-f -v -y -yy -s 4096) and log for approximately 5 minutes. Please attach the log or forward it via email. Thanks.

Created attachment 1122835: python build failed
It might be related to working in the web console: I got "the image cannot be retrieved" several times while trying to open:
https://console.stage.openshift.com/console/project/ctrnl/create/fromimage?imageName=python&imageTag=3.4&namespace=ctrnl
At some point it succeeded, but then the build failed with "failed to push image". See the attached console log.

I've tested it with my limited runs and have not seen the problem. I will need to run more tests and will mark it VERIFIED if I still can't reproduce the problem.

I've run more tests today on top of yesterday's and have not seen the issue again. Marking it as verified.
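
For anyone reproducing this, the two diagnostic steps mentioned in the comments (finding the Pending pods and capturing an strace of the docker process for about 5 minutes) could be sketched roughly as below. This is only a sketch under assumptions: the function names `pending_pods` and `docker_strace_cmd`, the default log path `/tmp/docker-strace.log`, and the use of `timeout` to limit the capture to 300 seconds are illustrative and not taken from the report. Only the strace flags `-f -v -y -yy -s 4096` come from the request above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the default log path are hypothetical.

# Print "NAME NODE" for every pod whose STATUS column is "Pending".
# Reads `oc get pods -o wide` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE NODE).
pending_pods() {
    awk 'NR > 1 && $3 == "Pending" { print $1, $6 }'
}

# Print (do not run) an strace command line for the given PID, using the
# flags requested in the report. The capture is bounded with `timeout`
# (assumption: 300 seconds, i.e. roughly 5 minutes).
# Usage: docker_strace_cmd PID [LOGFILE] [SECONDS]
docker_strace_cmd() {
    local pid="$1"
    local log="${2:-/tmp/docker-strace.log}"
    local secs="${3:-300}"
    echo "timeout ${secs} strace -f -v -y -yy -s 4096 -o ${log} -p ${pid}"
}
```

Possible usage: `oc get pods -o wide | pending_pods` shows which nodes hold the stuck pods. On such a node, run the command printed by `docker_strace_cmd <pid>` as root, where `<pid>` is the PID of the docker daemon (how to find it depends on the docker version installed), and attach the resulting log to the bug.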