| Summary: | Failed app creates sometimes leave around fragments | | |
|---|---|---|---|
| Product: | OKD | Reporter: | Thomas Wiest <twiest> |
| Component: | Containers | Assignee: | Rajat Chopra <rchopra> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 2.x | CC: | jialiu, mpatel |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-04-13 18:35:00 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 811192 | | |

Rajat, can you take a look at this and make sure we don't have a case where deconfigure and destroy aren't being called?

Some more fixes with rev#04dff1e6fdb228ebea0aba2678658e8d94c5688c. The app's gears were not being cleaned up properly when the mongo save failed. Keeping this open for other issues that may still be lurking.

Found a code path where, if the max_gears limit is reached and the application is denied creation, gears are left behind. Fixed that with rev#1be5f271b378b8425702997b80e02f9ab346c464.

Patch 04dff1e6fdb228ebea0aba2678658e8d94c5688c has been verified in BUG 807045.
Patch 1be5f271b378b8425702997b80e02f9ab346c464 was verified on devenv_stage_169: PASS.

Reproduce steps:
1). Launch an instance with the sprint 8 release version, devenv_stage_157.
2). Create 2 apps, so that only 1 more gear is allowed to be created.
3). Try to create a scalable app, which consumes at least 2 gears; creation fails because max_gears is reached.
4). On the instance, check /var/lib/stickshift; some leftover gears are seen.
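
The two fixes described above (cleaning up when the mongo save fails, and when max_gears denies creation) both amount to rolling back already-created gears on any failure. The following is only a minimal sketch of that pattern, not the actual stickshift code: `create_app`, `GearLimitReached`, and the `create_gear` / `destroy_gear` / `save_app` callables are hypothetical names introduced for illustration.

```python
class GearLimitReached(Exception):
    """Hypothetical error: creating another gear would exceed max_gears."""


def create_app(gears_needed, consumed_gears, max_gears,
               create_gear, destroy_gear, save_app):
    """Create gears for an app; destroy every gear created so far on failure.

    create_gear, destroy_gear and save_app are caller-supplied callables
    (assumptions for this sketch, standing in for the real node and
    datastore operations).
    """
    created = []
    try:
        for _ in range(gears_needed):
            # Deny creation once the user's gear limit would be exceeded.
            if consumed_gears + len(created) >= max_gears:
                raise GearLimitReached("max_gears limit reached")
            created.append(create_gear())
        # Persist the app record; this may fail too (e.g. a datastore error).
        save_app(created)
        return created
    except Exception:
        # Roll back in reverse order so no gear fragments are left behind,
        # then re-raise so the caller still sees the original failure.
        for gear in reversed(created):
            destroy_gear(gear)
        raise
```

In this sketch the limit is checked gear by gear, which mirrors the reported scenario where one gear is created before the limit is hit; checking the limit up front would avoid creating anything in that case, but the rollback is still needed for save failures.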

Description of problem:
Failed app creates sometimes leave around fragments.

This morning we got some e-mail from STG that showed that rhc-last-access is failing with this error:

EXCEPTION: No such file or directory - /var/lib/stickshift/2300b6a3f8ef4aa0834847ec326730c6/.env/OPENSHIFT_GEAR_DNS

In total, there are 9 apps that are failing with this error in STG.

When we go to look at these apps, here's what we see:
* The app _does not_ have an entry in Mongo
* The app _does not_ have a ProxyPass file
* The app _has_ a user on the machine
* The app _has_ a directory under /var/lib/stickshift

Under the /var/lib/stickshift/uuid directory, it looks like this:

[root@ex-std-node1 2300b6a3f8ef4aa0834847ec326730c6]# ll -a
total 24
drwxr-x---. 4 root 2300b6a3f8ef4aa0834847ec326730c6 4096 Mar 31 05:08 .
drwxr-x--x. 171 root root 12288 Apr 2 16:51 ..
drwxr-x---. 2 root 2300b6a3f8ef4aa0834847ec326730c6 4096 Mar 31 05:08 .env
d---------. 3 root root 4096 Mar 31 12:46 .tmp
[root@ex-std-node1 2300b6a3f8ef4aa0834847ec326730c6]#

Again, the other apps are in a similar situation.
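
The filesystem symptom above (a gear directory with no `.env/OPENSHIFT_GEAR_DNS`) can be scanned for on a node. The sketch below is a rough illustration only: it assumes gear directories are named by 32-character lowercase hex UUIDs, as in the example above, and it checks only the missing file. The Mongo, ProxyPass, and user checks are left out because the report does not say where those records live. The function name `find_fragment_gears` is hypothetical.

```python
import os
import re

GEAR_BASE_DIR = "/var/lib/stickshift"  # gear root named in this report
MARKER = os.path.join(".env", "OPENSHIFT_GEAR_DNS")  # file rhc-last-access could not find
UUID_RE = re.compile(r"[0-9a-f]{32}")  # assumption: gear dirs are 32-hex-char UUIDs


def find_fragment_gears(base_dir=GEAR_BASE_DIR):
    """Return sorted names of UUID-like directories lacking .env/OPENSHIFT_GEAR_DNS."""
    fragments = []
    for name in sorted(os.listdir(base_dir)):
        gear_dir = os.path.join(base_dir, name)
        # Skip anything that does not look like a gear directory.
        if not UUID_RE.fullmatch(name) or not os.path.isdir(gear_dir):
            continue
        # A gear without the marker file matches the fragment symptom.
        if not os.path.isfile(os.path.join(gear_dir, MARKER)):
            fragments.append(name)
    return fragments
```

Calling `find_fragment_gears()` as root on a node would list candidate fragments; each hit would still need to be cross-checked against Mongo before anything is removed.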

This is the only thing in the mcollective log referring to this uuid:

D, [2012-03-31T05:08:39.403655 #10995] DEBUG -- : libra.rb:60:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f03903a54b8 @action="cartridge_do", @agent="libra", @caller="cert=mcollective-public", @data={:cartridge=>"stickshift-node", :args=>"--with-app-uuid '419a0a962d0243a980bc99892341b65f' --with-container-uuid '2300b6a3f8ef4aa0834847ec326730c6' -i '5185' --named 'scalephp' --with-namespace 'bmeng7s'", :action=>"app-create", :process_results=>true}, @sender="mcollect.cloud.redhat.com", @time=1333184919, @uniqid="c88b6e359f3e824eb5ea5f7dbf86c491">
D, [2012-03-31T05:08:39.403945 #10995] DEBUG -- : libra.rb:61:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-create --with-app-uuid '419a0a962d0243a980bc99892341b65f' --with-container-uuid '2300b6a3f8ef4aa0834847ec326730c6' -i '5185' --named 'scalephp' --with-namespace 'bmeng7s'
D, [2012-03-31T05:08:40.368987 #10995] DEBUG -- : libra.rb:102:in `cartridge_do_action' cartridge_do_action (0)
------
CART_DATA: PROXY_HOST=b5053c3948-bmeng8s.stg.rhcloud.com
CART_DATA: PROXY_PORT=58906
CART_DATA: HOST=127.10.27.129
CART_DATA: PORT=8080
------)

Version-Release number of selected component (if applicable):
rhc-node-0.89.2-1.el6_2.x86_64

How reproducible:
Very sporadic, but since it's happening in STG with the latest release, I think the bug still exists.

Steps to Reproduce:
1. unknown

Actual results:
Failed app create leaves fragments.

Expected results:
Nothing should be left around.