Bug 1317783 - Should give proper reason in event that memory and cpu are exceeded in dc
Summary: Should give proper reason in event that memory and cpu are exceeded in dc
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Deployments
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Michail Kargakis
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-03-15 08:29 UTC by XiaochuanWang
Modified: 2016-05-12 17:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 17:10:30 UTC
Target Upstream Version:



Description XiaochuanWang 2016-03-15 08:29:58 UTC
Description of problem:
Set memory and CPU limits, then edit the dc so that its resource values exceed those limits and trigger a new deployment. `oc get event` should report the actual reason the deployer pod could not be created, instead of "Error creating deployer pod for project1/database-2: <nil>".

Version-Release number of selected component (if applicable):
openshift v1.1.4-4-gd41c3de
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Process and create dc by template "https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/deployment/deployments_nobc_cpulimit.json"
2. Create limit for mem and cpu:
https://raw.githubusercontent.com/openshift/origin/master/examples/project-quota/limits.yaml
https://raw.githubusercontent.com/openshift/origin/master/examples/project-quota/quota.yaml
3. Edit the dc so that its resource values exceed the limits, and trigger a new deployment (or trigger one manually once deployment #1 is done)
4. `oc get event`

Actual results:
1) The failure message is insufficient, as shown below:
48s         43s        6         database            DeploymentConfig                                                    Warning   FailedCreate        {deployer }                             Error creating deployer pod for xiaocwan3-o/database-2: <nil>
2) As shown under "Additional info", the rc has no event giving the reason; it just shows "No events".

Expected results:
The failure message should give the reason, for example:
Error creating deployer pod for quota-demo/database-2: Pod "deploy-database-2" is forbidden: Limited to 750Mi memory

Additional info:
`oc describe rc` and `oc describe dc` do not show the failure reason either.
# oc describe rc database-2
Name:        database-2
Namespace:    xiaocwan3-o
Image(s):    openshift/mysql-55-centos7
Selector:    deployment=database-2,deploymentconfig=database,name=database
Labels:        openshift.io/deployment-config.name=database,template=application-template-isdc
Replicas:    0 current / 0 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
No volumes.
No events.
# oc describe dc database
...
  1m        25s        6    {deployer }                    Warning        FailedCreate        Error creating deployer pod for xiaocwan-t/database-2: <nil>

Comment 1 Michail Kargakis 2016-03-15 13:38:11 UTC
This is a regression. We moved event emission in the deployment controller from the rc to the dc (generally speaking, looking at the dc alone should be enough to debug your app) and in the process shadowed some errors that shouldn't be shadowed :) https://github.com/openshift/origin/pull/8015 fixes the shadowing; once it merges you will be able to see the actual error instead of <nil>.
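
For readers unfamiliar with this failure mode, here is a minimal Go sketch of how a shadowed error can end up printed as <nil>. It is not the code from the pull request: `createPod`, `shadowedMessage`, and `fixedMessage` are hypothetical names, and the error text is only illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// createPod is a hypothetical stand-in for the API call that creates the
// deployer pod. Here it always fails, with an illustrative message.
func createPod(name string) (string, error) {
	return "", errors.New(`pods "database-2-deploy" is forbidden: over limit`)
}

// shadowedMessage shows the bug pattern: ":=" inside the if statement declares
// a NEW err scoped to that statement, so the outer err is never assigned.
func shadowedMessage(name string) string {
	var err error
	if _, err := createPod(name); err != nil {
		// The inner err holds the real failure, but it goes out of scope here.
	}
	// The outer err is still nil, so %v prints "<nil>".
	return fmt.Sprintf("Error creating deployer pod for %s: %v", name, err)
}

// fixedMessage assigns with "=" so the outer err receives the real failure.
func fixedMessage(name string) string {
	var err error
	if _, err = createPod(name); err != nil {
		// The outer err now carries the failure out of the if statement.
	}
	return fmt.Sprintf("Error creating deployer pod for %s: %v", name, err)
}

func main() {
	fmt.Println(shadowedMessage("demo/database-2"))
	fmt.Println(fixedMessage("demo/database-2"))
}
```

The only difference between the two functions is `:=` versus `=` in the if statement, which is why this kind of mistake is easy to introduce when code is moved around.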

Comment 2 openshift-github-bot 2016-03-16 01:12:33 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/85f2ebed5e043721e87c6edb851cbfe71c23486d
Bug 1317783: avoid shadowing errors in the deployment controller

Comment 3 XiaochuanWang 2016-03-16 09:49:13 UTC
Just FYI, I tested on the latest origin (on devenv-rhel7_3734) again and got the error message "Error updating deployment cqrhj/database-1 status to Pending".
oadm v1.1.4-16-gb5da002
kubernetes v1.2.0-origin-41-g91d3e75

$ oc get events
FIRSTSEEN   LASTSEEN   COUNT     NAME                KIND                    SUBOBJECT                                   TYPE      REASON              SOURCE                                  MESSAGE
4m          4m         1         database-1-deploy   Pod                                                                 Normal    Scheduled           {default-scheduler }                    Successfully assigned database-1-deploy to ip-172-18-5-28.ec2.internal
4m          4m         1         database-1-deploy   Pod                     spec.containers{deployment}                 Normal    Pulled              {kubelet ip-172-18-5-28.ec2.internal}   Container image "openshift/origin-deployer:v1.1.4" already present on machine
4m          4m         1         database-1-deploy   Pod                     spec.containers{deployment}                 Normal    Created             {kubelet ip-172-18-5-28.ec2.internal}   Created container with docker id 5d2bb9fb6f11
4m          4m         1         database-1-deploy   Pod                     spec.containers{deployment}                 Normal    Started             {kubelet ip-172-18-5-28.ec2.internal}   Started container with docker id 5d2bb9fb6f11
3m          3m         1         database-1-xjkp4    Pod                                                                 Normal    Scheduled           {default-scheduler }                    Successfully assigned database-1-xjkp4 to ip-172-18-5-28.ec2.internal
3m          3m         1         database-1-xjkp4    Pod                     spec.containers{ruby-helloworld-database}   Normal    Pulled              {kubelet ip-172-18-5-28.ec2.internal}   Container image "openshift/mysql-55-centos7" already present on machine
3m          3m         1         database-1-xjkp4    Pod                     spec.containers{ruby-helloworld-database}   Normal    Created             {kubelet ip-172-18-5-28.ec2.internal}   Created container with docker id 2b782e5f002a
3m          3m         1         database-1-xjkp4    Pod                     spec.containers{ruby-helloworld-database}   Normal    Started             {kubelet ip-172-18-5-28.ec2.internal}   Started container with docker id 2b782e5f002a
3m          3m         1         database-1          ReplicationController                                               Normal    SuccessfulCreate    {replication-controller }               Created pod: database-1-xjkp4
4m          4m         1         database            DeploymentConfig                                                    Normal    DeploymentCreated   {deploymentconfig-controller }          Created new deployment "database-1" for version 1
4m          4m         1         database            DeploymentConfig                                                    Warning   FailedUpdate        {deployer }                             Error updating deployment cqrhj/database-1 status to Pending

Comment 4 Michail Kargakis 2016-03-16 10:10:24 UTC
I don't think you are using my code changes, because I changed that error message 1) by rewording it and 2) to include the error (see https://github.com/openshift/origin/pull/8015/files#diff-fa925ff7d2acc462649ccd349e76d981R230).

Comment 5 Michail Kargakis 2016-03-16 11:07:03 UTC
Also, instead of the wordier `oc get events`, try `oc describe dc/database`. It should show all the deploymentconfig events.

Comment 6 XiaochuanWang 2016-03-18 07:53:44 UTC
This is verified on latest origin
openshift v3.2.0.4
kubernetes v1.2.0-origin-41-g91d3e75
etcd 2.2.5

Got the expected message, as shown below:
Error creating deployer pod for xiaocwan-q/database-2: pods "database-2-deploy" is forbidden: [Maximum cpu usage per Pod is 500m, but limit is 1100m., Maximum memory usage per Pod is 750Mi, but limit is 796917760., Maximum cpu usage per Container is 500m, but limit is 1100m., Maximum memory usage per Container is 750Mi, but limit is 760Mi.]
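
Note that the message above prints the pod-level memory limit in bytes (796917760) and the container-level limit as 760Mi. These appear to be the same quantity. A small Go check of the arithmetic, where `bytesToMi` is a helper name made up for this illustration:

```go
package main

import "fmt"

// mebibyte is the number of bytes in one Mi (2^20).
const mebibyte = 1024 * 1024

// bytesToMi converts a byte count to whole mebibytes and reports whether
// the byte count is an exact multiple of one mebibyte.
func bytesToMi(b int64) (int64, bool) {
	return b / mebibyte, b%mebibyte == 0
}

func main() {
	mi, exact := bytesToMi(796917760)
	fmt.Println(mi, exact) // prints "760 true"
}
```

So both figures exceed the 750Mi maximum by the same 10Mi.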

