Bug 996629 - Node rescues on partially deleted application
Summary: Node rescues on partially deleted application
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Hiro Asari
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-08-13 14:44 UTC by Sten Turpin
Modified: 2015-05-14 23:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-13 13:43:59 UTC
Target Upstream Version:
Embargoed:


Attachments
mcollective log for deconfigure operation (8.68 KB, text/plain)
2013-08-13 14:44 UTC, Sten Turpin

Description Sten Turpin 2013-08-13 14:44:26 UTC
Created attachment 786189
mcollective log for deconfigure operation

Description of problem: Applications are sometimes left halfway created. When this occurs, the deconfigure operation fails and the node logs the rescued errors:

E, [2013-08-12T23:01:51.094286 #3760] ERROR -- : openshift.rb:302:in `rescue in with_container_from_args' User does not exist in cgroups: 5207709d5973cad178000078
User does not exist in cgroups: 5207709d5973cad178000078
E, [2013-08-12T23:01:51.236228 #3760] ERROR -- : openshift.rb:963:in `rescue in has_app_cartridge_action' can't find user for 5207709d5973cad178000078
  {"--with-app-uuid"=>"5207709d5973cad178000078",
   "--with-container-uuid"=>"5207709d5973cad178000078",


Version-Release number of selected component (if applicable): 
rubygem-openshift-origin-node-1.12.10-1.el6oso.noarch

How reproducible: sometimes


Steps to Reproduce:
1. Wait for an application to be halfway created
2. Review mcollective logs

Actual results: Application is left on the node


Expected results: Application should be removed


Additional info: see attached logfile

Comment 1 Hiro Asari 2013-08-13 20:58:51 UTC
Could you elaborate on what you mean by 'Applications are sometimes halfway created'? Is there a way to reproduce this error? Even if it is not reliably reproducible, it is better than guessing what you might have done when you saw this error.

The first sign of a problem is not the quoted part. It is here:

E, [2013-08-12T23:01:27.193240 #3760] ERROR -- : openshift.rb:302:in `rescue in with_container_from_args' CLIENT_ERROR: Unexpected error: User does not exist in cgroups: 5207709d5973cad178000078
CLIENT_ERROR: Unexpected error: User does not exist in cgroups: 5207709d5973cad178000078
  {"--with-app-uuid"=>"5207709d5973cad178000078",
   "--with-container-uuid"=>"5207709d5973cad178000078",

The subsequent operations involving this user, including "deconfigure", would thus fail. We need to figure out why the user doesn't exist.
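To locate the earliest failure for a given gear in a log like the attached one, a small scan along these lines might help. This is only a sketch: `first_error_for` and `ERROR_LINE` are hypothetical names, and the pattern assumes the standard Ruby Logger prefix seen in the excerpts quoted in this bug. Timestamps are compared as strings, which works because they share one ISO-like format.

```ruby
# Hypothetical helper, not part of openshift.rb or mcollective.
# Matches Ruby Logger error lines such as:
#   E, [2013-08-12T23:01:27.193240 #3760] ERROR -- : message
ERROR_LINE = /\AE, \[(?<ts>\S+) #(?<pid>\d+)\]\s+ERROR -- : (?<msg>.*)\z/

# Returns the MatchData of the earliest ERROR line mentioning uuid,
# or nil when no such line exists.
def first_error_for(lines, uuid)
  lines
    .map { |line| ERROR_LINE.match(line.chomp) }
    .compact
    .select { |m| m[:msg].include?(uuid) }
    .min_by { |m| m[:ts] } # same-format timestamps sort lexicographically
end

# Two entries taken from this bug, deliberately out of order.
log = [
  "E, [2013-08-12T23:01:51.094286 #3760] ERROR -- : openshift.rb:302:in `rescue in with_container_from_args' User does not exist in cgroups: 5207709d5973cad178000078",
  "E, [2013-08-12T23:01:27.193240 #3760] ERROR -- : openshift.rb:302:in `rescue in with_container_from_args' CLIENT_ERROR: Unexpected error: User does not exist in cgroups: 5207709d5973cad178000078"
]

first = first_error_for(log, "5207709d5973cad178000078")
puts first[:ts] # prints 2013-08-12T23:01:27.193240
```

With a real log file, the lines could come from `File.foreach(path)` instead of the inline array.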

After this is observed, what sort of state is the application in? Does it exist? If so, can it be removed?
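A rough way to answer the state questions on the node is sketched below. It assumes the gear UUID doubles as the Unix user name and that gear home directories live under `/var/lib/openshift`; both are assumptions for illustration, and `gear_state` is a hypothetical helper. `Etc.getpwnam` raises `ArgumentError` when there is no passwd entry for the name, with a message of the same "can't find user for ..." form that appears in the reported log.

```ruby
require "etc"

# Hypothetical helper, not part of the node code.
# Assumptions: the gear UUID is also the Unix user name, and gear
# homes live under base_dir (the default path here is illustrative).
def gear_state(uuid, base_dir = "/var/lib/openshift")
  user = begin
    Etc.getpwnam(uuid)
  rescue ArgumentError
    nil # no passwd entry for this name
  end

  {
    user_exists: !user.nil?,
    home_exists: File.directory?(File.join(base_dir, uuid))
  }
end

p gear_state("5207709d5973cad178000078")
```

A gear whose home directory exists while its user does not (or the reverse) would be consistent with a partially created or partially deleted application.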

Comment 2 Hiro Asari 2013-09-12 20:20:27 UTC
Sten,

Have you had a chance to look at this?

Comment 3 Sten Turpin 2013-09-13 14:00:14 UTC
This bz can be closed; we haven't been able to reproduce it at all. If it recurs, we'll open a new bz.

