Description of problem:
Logged in to the console to see that the MongoDB pod has a failed deployment, although I haven't made any changes to the pod that would trigger a re-deploy - it has been stable for 20 days. The console overview says the deployment was from: "10 hours ago from image change". But, as stated, no image change was made.

The Python app deployment has also failed. However, I did a git push before stopping work last night and don't remember checking whether the rebuild/deploy was successful, so that may have caused the Python app deployment to fail. I have re-deployed the Python app, but just get the error events:

Unable to mount volumes for pod "python-app-92-xw0d5_my-app(4d43470a-6fe5-11e6-8f30-12d79454368d)": Could not attach EBS Disk "aws://us-east-1c/vol-abfa5979": Error attaching EBS volume: VolumeInUse: vol-abfa5979 is already attached to an instance status code: 400, request id:

Version-Release number of selected component (if applicable):
Dev preview.

How reproducible:
I don't know.

Steps to Reproduce:
1. Log in to the console.

Actual results:
Failed deployments for the MongoDB and Python pods.

Expected results:
Normally functioning pods.

Additional info:
After a few tries at re-deploying the app, it finally worked, but when I scaled the pod up it just got stuck in the "pulling" state (grey circle indicator).
The last comment is in regard to the Python app.
I deleted the failed mongo deployment (as the previous one was up and running), and again a re-deployment has been triggered without me doing anything. :/
Both pods have failed and I'm unable to get them back up.
This seems so strange, I scale down Python and MongoDB pods, and then just look at the console for about 5 minutes, and MongoDB pod deployment #11 just scales down and MongoDB pod deployment #12 (a new deployment) gets deployed without me doing anything. (This has happened a couple of times, and each time I delete MongoDB pod deployment #12). I can't scale up either the MongoDB or Python pod from 0 to 1.
(In reply to bugreport398 from comment #5)
> This seems so strange, I scale down Python and MongoDB pods, and then just
> look at the console for about 5 minutes, and MongoDB pod deployment #11 just
> scales down and MongoDB pod deployment #12 (a new deployment) gets deployed
> without me doing anything. (This has happened a couple of times, and each
> time I delete MongoDB pod deployment #12). I can't scale up either the
> MongoDB or Python pod from 0 to 1.

OK, there are a couple of things grouped in this bug:

1) You see "Could not attach EBS Disk" because your Python app deployment config is using the "rolling" strategy but has a ReadWriteOnce (RWO) persistent volume attached. What is happening is that the rolling strategy starts a new pod when a new deployment occurs, but it does not scale down the old pod first... so at that point there are two pods running that have the same volume defined. To fix this, you can switch to the "recreate" strategy.

2) "gets deployed without me doing anything": if you removed the replication controller that the deployment config created, the deployment config will see that it no longer has the latest version running where it should be, and will re-create it for you. That is expected.

3) "however I haven't made any changes to the pod that would trigger a re-deploy": this is probably caused by a new version of the MongoDB image being pushed into the Docker registry. Because you have an ImageChangeTrigger defined, the deployment config is automatically re-deployed whenever a new image is available. You can remove this behavior by removing "automatic: true" from the trigger (edit the YAML in the web console or use `oc edit dc/foo`).
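For reference, after switching the strategy and turning off the automatic image-change trigger, the relevant part of the DeploymentConfig would look roughly like the sketch below (this is only an illustration - the container and ImageStreamTag names are placeholders for whatever your DC actually references):

spec:
  strategy:
    type: Recreate
  triggers:
  - type: ConfigChange
  - type: ImageChange
    imageChangeParams:
      automatic: false
      containerNames:
      - mongodb
      from:
        kind: ImageStreamTag
        name: mongodb:3.2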
The Mongo deployment probably started because an updated image was pushed to the Mongo ImageStream. If you don't want automatic deployments to happen on image updates, edit your Mongo DeploymentConfig and remove "automatic: true" from its ImageChangeTrigger: https://docs.openshift.org/latest/dev_guide/deployments.html#image-change-trigger

This way, you will need to manually enable the trigger again when you wish to update to the latest Mongo image. From the CLI, that should be `oc set triggers dc/mongodb --auto`; from the console, you will need to add "automatic: true" back to the DeploymentConfig. In 1.3.1 we will enable manual deployments without the need to enable/disable triggers.

> I deleted the failed mongo deployment (as the previous one was up and running), and again a re-deployment has been triggered without me doing anything. :/

You are not supposed to manipulate ReplicationControllers owned by DeploymentConfigs. We lag in our docs, but the notice is added in https://github.com/openshift/openshift-docs/pull/2694. Long story short, if you delete your latest RC, the DC controller will notice it and re-create a new one.

Can you post the output of the following commands from your project?

oc get events
oc get all -o yaml
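P.S. If your oc client already has `oc set triggers`, toggling the automatic behaviour from the CLI should look roughly like the following (a sketch; older clients may need the YAML edit described above instead):

# stop image pushes from automatically rolling out a new deployment
oc set triggers dc/mongodb --manual

# turn automatic image-change deployments back on later
oc set triggers dc/mongodb --auto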
@Michal Fojtik, both MongoDB and Python were already set to the Recreate strategy (not Rolling):

Browse > Deployments > mongodb > Edit YAML:

spec:
  strategy:
    type: Recreate
    recreateParams:
      timeoutSeconds: 600
    resources: { }

Browse > Deployments > python-app > Edit YAML:

spec:
  strategy:
    type: Recreate
    recreateParams:
      timeoutSeconds: 600
    resources: { }

@Michail Kargakis, this seems likely: "The Mongo deployment started probably because an updated image was pushed to the Mongo ImageStream".

Based on these two bits of feedback (thanks), I am thinking that the errors occurred because resources weren't available to build, deploy, or scale both pods up at once. (I still cannot scale either pod up from 0 - the Python behaviour is reminiscent of that described in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1369644.) I am scratching my head though, because I think it should be able to handle the current configuration, i.e.:

Browse > Deployments > mongodb > Set Resource Limits: Memory: 1GB
Browse > Deployments > python-app > Set Resource Limits: Memory: 1GB

oc get pvc
NAME        STATUS    VOLUME         CAPACITY   ACCESSMODES   AGE
mongodb     Bound     pv-aws-1dj3b   4Gi        RWO           17d
pvc-nf0kl   Bound     pv-aws-e1agr   4Gi        RWO           16d

oc volume dc --all
deploymentconfigs/mongodb
  pvc/mongodb (allocated 4GiB) as mongodb-data
    mounted at /var/lib/mongodb/data
deploymentconfigs/python-app
  pvc/pvc-nf0kl (allocated 4GiB) as mypythonvolume
    mounted at /opt/app-root/src/static/media_files

oc describe quota compute-resources -n my-app
Name:           compute-resources
Namespace:      my-app
Scopes:         NotTerminating
 * Matches all pods that do not have an active deadline.
Resource        Used    Hard
--------        ----    ----
limits.cpu      0       8
limits.memory   0       4Gi

Below is the requested output for `oc get events`. Is there a way to minimise output of `oc get all -o yaml` so that only required information is displayed - it currently is very lengthy and includes `tokens` and `secrets` fields - not sure if ok to post here?
oc get events
FIRSTSEEN   LASTSEEN   COUNT   NAME                   KIND                    SUBOBJECT                     TYPE      REASON              SOURCE                                    MESSAGE
18m         18m        1       python-app-57-build    Pod                                                   Normal    Scheduled           {default-scheduler }                      Successfully assigned python-app-57-build to ip-172-31-54-158.ec2.internal
18m         18m        1       python-app-57-build    Pod                     spec.containers{sti-build}    Normal    Pulling             {kubelet ip-172-31-54-158.ec2.internal}   pulling image "openshift3/ose-sti-builder:v3.2.1.7"
18m         18m        1       python-app-57-build    Pod                     spec.containers{sti-build}    Normal    Pulled              {kubelet ip-172-31-54-158.ec2.internal}   Successfully pulled image "openshift3/ose-sti-builder:v3.2.1.7"
18m         18m        1       python-app-57-build    Pod                     spec.containers{sti-build}    Normal    Created             {kubelet ip-172-31-54-158.ec2.internal}   Created container with docker id 23c106a2a94e
18m         18m        1       python-app-57-build    Pod                     spec.containers{sti-build}    Normal    Started             {kubelet ip-172-31-54-158.ec2.internal}   Started container with docker id 23c106a2a94e
2h          48m        82      python-app-93-4gnk7    Pod                                                   Warning   FailedMount         {kubelet ip-172-31-54-168.ec2.internal}   Unable to mount volumes for pod "python-app-93-4gnk7_my-app(77f4b738-7027-11e6-8f30-12d79454368d)": Could not attach EBS Disk "aws://us-east-1c/vol-abfa5979": Error attaching EBS volume: VolumeInUse: vol-abfa5979 is already attached to an instance status code: 400, request id:
2h          48m        82      python-app-93-4gnk7    Pod                                                   Warning   FailedSync          {kubelet ip-172-31-54-168.ec2.internal}   Error syncing pod, skipping: Could not attach EBS Disk "aws://us-east-1c/vol-abfa5979": Error attaching EBS volume: VolumeInUse: vol-abfa5979 is already attached to an instance status code: 400, request id:
49m         49m        1       python-app-93          ReplicationController                                 Normal    SuccessfulDelete    {replication-controller }                 Deleted pod: python-app-93-4gnk7
15m         15m        1       python-app-94-deploy   Pod                                                   Normal    Scheduled           {default-scheduler }                      Successfully assigned python-app-94-deploy to ip-172-31-54-168.ec2.internal
15m         15m        1       python-app-94-deploy   Pod                     spec.containers{deployment}   Normal    Pulling             {kubelet ip-172-31-54-168.ec2.internal}   pulling image "openshift3/ose-deployer:v3.2.1.7"
15m         15m        1       python-app-94-deploy   Pod                     spec.containers{deployment}   Normal    Pulled              {kubelet ip-172-31-54-168.ec2.internal}   Successfully pulled image "openshift3/ose-deployer:v3.2.1.7"
15m         15m        1       python-app-94-deploy   Pod                     spec.containers{deployment}   Normal    Created             {kubelet ip-172-31-54-168.ec2.internal}   Created container with docker id 0f1645b84a7e
15m         15m        1       python-app-94-deploy   Pod                     spec.containers{deployment}   Normal    Started             {kubelet ip-172-31-54-168.ec2.internal}   Started container with docker id 0f1645b84a7e
15m         15m        1       python-app             DeploymentConfig                                      Normal    DeploymentCreated   {deploymentconfig-controller }            Created new deployment "python-app-94" for version 94
15m         15m        1       python-app             DeploymentConfig                                      Warning   FailedUpdate        {deployment-controller }                  Cannot update deployment my-app/python-app-94 status to Pending: replicationcontrollers "python-app-94" cannot be updated: the object has been modified; please apply your changes to the latest version and try again
I just changed:

Browse > Deployments > python-app > Set Resource Limits: Memory: 1GB

to:

Browse > Deployments > python-app > Set Resource Limits: Memory: 525MB

and the Python pod deployed and scaled up quickly. I then tried to scale up the MongoDB pod and it got to the light blue "not ready" stage - the Events tab shows:

10:16:36 PM  Warning  Unhealthy  Readiness probe failed: sh: cannot set terminal process group (-1): Inappropriate ioctl for device
sh: no job control in this shell
sh: mongostat: command not found
(9 times in the last minute)
10:15:05 PM  Normal   Created    Created container with docker id 7d1071cb67ce
10:15:05 PM  Normal   Started    Started container with docker id 7d1071cb67ce
10:15:04 PM  Normal   Pulled     Successfully pulled image "registry.access.redhat.com/rhscl/mongodb-32-rhel7@sha256:888c0b99e71bf21382e7471f5f6a48d4e52cf7b43b10ce57df05e7b03843c964"
10:15:02 PM  Normal   Pulling    pulling image "registry.access.redhat.com/rhscl/mongodb-32-rhel7@sha256:888c0b99e71bf21382e7471f5f6a48d4e52cf7b43b10ce57df05e7b03843c964"
10:14:56 PM  Normal   Scheduled  Successfully assigned mongodb-12-chn9u to ip-172-31-54-168.ec2.internal
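I'm guessing the "mongostat: command not found" comes from the readiness probe defined in the mongodb DC - presumably an exec probe shaped something like the sketch below. This is just my guess at its shape based on the error text; the exact command and timings in the actual DC may well differ:

readinessProbe:
  exec:
    command:
    - /bin/sh
    - -i
    - -c
    - mongostat --host 127.0.0.1 -u admin -p $MONGODB_ADMIN_PASSWORD -n 1 --noheaders
  initialDelaySeconds: 3
  timeoutSeconds: 1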
If I scale the Python pod down to 0 (hoping to give the MongoDB pod more available juice) and try to scale up the MongoDB pod, the same Events tab message above is displayed.
I rolled back to MongoDB deployment #11 (where #12 was the automatically triggered deployment that I didn't trigger) and the MongoDB pod scaled up. The Python pod could then be scaled up. So, two issues:

- The new automatic deployment 'doesn't work'.
- The system seems to prefer MongoDB memory at 1GB and Python memory at 525MB - even though I think I have 4GB of memory available :/
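For the record, the rollback was done from the web console. I believe the CLI equivalent would be something along these lines, though I haven't verified the exact syntax on this oc version:

oc rollback mongodb --to-version=11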
> Is there a way to minimise output of `oc get all -o yaml` so that only required information is displayed - it currently is very lengthy and includes `tokens` and `secrets` fields - not sure if ok to post here?

`oc get dc,rc -o yaml`, then, and put it in a pastebin.
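If you want to trim the output further before posting and your client supports it, `oc export` should also strip the cluster-specific and status fields, e.g. something like:

oc export dc,rc -n my-app -o yaml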
Output of `oc get dc,rc -o yaml`: http://pastebin.com/U069VnWR

Can anyone give a definitive answer as to why the 4GB memory allowance for the project does not allow two pods (set to the Recreate strategy) to comfortably build, deploy, and scale when both the MongoDB and Python pods are allocated 1GB of memory? I recently had to take the Python pod down to 525MB in order to be able to scale it up without errors, as explained above. Thank you.
PS - It just occurred to me that the reason all the pods went down, and were so difficult to get back up again, was that Python and MongoDB were on 1GB of memory each, and then the automatic MongoDB deployment was triggered (while both pods were in a scaled-up state). It was this action that caused a lack of resources, which resulted in the unable-to-mount and readiness probe errors, etc. Just a theory, but if a similar thing happens to someone else, perhaps it can be tested. It would still be good to know why a 4GB memory resource is not adequate for supporting 2 x 1GB deployments plus an automatically triggered deployment (all with the Recreate strategy).
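If someone else wants to test this theory, checking the project's quota usage and watching events while a deployment is triggered should show whether resources are being exhausted, e.g.:

oc describe quota -n my-app
oc get events -w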
To your question about why 4GB of "non-terminating" quota is not enough... your "terminating" quota is perhaps coming into play. This quota is used to run the builder/deployer pods as well as the hook pods. The resources that these pods specify are the same as what is specified in the corresponding DeploymentConfig and BuildConfig. Can you check if this is the issue (i.e. your deployment is failing on account of a lack of resources in your "terminating" quota)?
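For context, a Terminating-scoped quota (if your project has one) would look something like the sketch below - the name and values here are purely illustrative - and it is this quota that the short-lived builder/deployer/hook pods are charged against. `oc get quota -n my-app` should list every quota defined in the project.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources-timebound
spec:
  hard:
    limits.cpu: "8"
    limits.memory: 4Gi
  scopes:
  - Terminating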