Bug 986380 - /etc/cgconfig.conf~ and /etc/cgrules.conf~ are leftover on file system after app-destroy
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified Linux
Priority: medium, Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Rob Millner
QA Contact: libra bugs
Depends On:
Blocks:
Reported: 2013-07-19 11:58 EDT by Matt Woodson
Modified: 2015-05-14 19:24 EDT

See Also:
Fixed In Version: fork_ami_origin_runtime_183_and_191_720
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-07 18:55:41 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Matt Woodson 2013-07-19 11:58:04 EDT
Description of problem:

This problem is blowing up our selinux monitoring system.

After an app-destroy is issued through mcollective, the two files:

/etc/cgconfig.conf~
/etc/cgrules.conf~ 

are left on the file system.  These files have the wrong SELinux context, so we are being alerted.

A little bit of debugging info:

Notice the ctime on these files:

[root@ex-std-node106.prod etc]# ls -lc /etc/cg*~
-rw-r--r--. 1 root root 131219 Jul 19 10:01 /etc/cgconfig.conf~
-rw-r--r--. 1 root root  43865 Jul 19 10:01 /etc/cgrules.conf~


The ctime is 10:01.  The mcollective.log entries from 10:01 are below. The app involved is a test app used for monitoring purposes that is created and then immediately removed:

--------------------------------------------------------------------------------
I, [2013-07-19T10:01:50.937734 #14752]  INFO -- : openshift.rb:53:in `cartridge_do_action' cartridge_do_action validation = openshift-origin-node app-destroy {"--with-app-uuid"=>"51e8727f5004465aae000013", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"51e8727f5004465aae000013", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"openshiftnagios", "--with-uid"=>3967, "--with-request-id"=>"fbfeeab7c4ea8cbd704da0c37f0f0631", "--cart-name"=>"openshift-origin-node"}
I, [2013-07-19T10:01:50.938423 #14752]  INFO -- : openshift.rb:151:in `execute_action' Executing action [app-destroy] using method oo_app_destroy with args [{"--with-app-uuid"=>"51e8727f5004465aae000013", "--with-app-name"=>"chkexsrv2", "--with-container-uuid"=>"51e8727f5004465aae000013", "--with-container-name"=>"chkexsrv2", "--with-namespace"=>"openshiftnagios", "--with-uid"=>3967, "--with-request-id"=>"fbfeeab7c4ea8cbd704da0c37f0f0631", "--cart-name"=>"openshift-origin-node"}]
I, [2013-07-19T10:01:55.218808 #14752]  INFO -- : openshift.rb:166:in `execute_action' Finished executing action [app-destroy] (0)
I, [2013-07-19T10:01:55.219158 #14752]  INFO -- : openshift.rb:130:in `cartridge_do_action' cartridge_do_action reply (0):
--------------------------------------------------------------------------------
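The correlation described above (file ctime against the app-destroy log entries) could be checked along these lines. This is only a sketch: the timestamp pattern is inferred from the log excerpt above, and `destroy_times` and `within_window` are hypothetical helper names, not part of mcollective or OpenShift.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout inferred from the mcollective.log excerpt (assumption).
LOG_TS = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) #\d+\]")


def destroy_times(log_lines):
    """Return the timestamps of log lines that mention app-destroy."""
    times = []
    for line in log_lines:
        match = LOG_TS.search(line)
        if match and "app-destroy" in line:
            times.append(
                datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            )
    return times


def within_window(ctime, times, seconds=60):
    """True if ctime lies within `seconds` of any timestamp in `times`."""
    window = timedelta(seconds=seconds)
    return any(abs(ctime - t) <= window for t in times)
```

A file ctime obtained from `os.stat(path).st_ctime` would need to be converted with `datetime.fromtimestamp` before being passed to `within_window`.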

The differences between these files and their non-~ counterparts:

--------------------------------------------------------------------------------
[root@ex-std-node106.prod etc]# diff /etc/cgrules.conf /etc/cgrules.conf~
489a490
> 51e8727f5004465aae000013	cpu,cpuacct,memory,net_cls,freezer	/openshift/51e8727f5004465aae000013
[root@ex-std-node106.prod etc]# diff /etc/cgconfig.conf /etc/cgconfig.conf~
461a462
> group /openshift/51e8727f5004465aae000013  { perm { task { uid = 3967; gid = 3967; } admin { uid = root; gid = root; } } net_cls { net_cls.classid = 69503; } cpu { cpu.cfs_quota_us = 30000; cpu.shares = 128; } memory { memory.limit_in_bytes = 536870912; memory.memsw.limit_in_bytes = 641728512; } }
--------------------------------------------------------------------------------

The (~) files still list the app that was destroyed, while the regular files do not.
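The same comparison can be scripted when many nodes need checking. The following is a minimal sketch, not a replacement for `diff`: it uses a simple set lookup rather than a real diff algorithm, and the function names are hypothetical.

```python
from pathlib import Path


def lines_only_in_backup(current_lines, backup_lines):
    """Return backup lines that do not appear in the current file, in order."""
    current = {line.rstrip("\n") for line in current_lines}
    return [
        line.rstrip("\n")
        for line in backup_lines
        if line.rstrip("\n") not in current
    ]


def stale_entries(conf_path):
    """Compare conf_path with its ~ backup and return the backup-only lines."""
    conf = Path(conf_path)
    backup = Path(str(conf) + "~")
    return lines_only_in_backup(
        conf.read_text().splitlines(), backup.read_text().splitlines()
    )
```

For the node shown above, `stale_entries("/etc/cgrules.conf")` would be expected to return the single entry for the destroyed app, assuming both files are readable.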



Version-Release number of selected component (if applicable):

This is a result of the 2_0_30 release that went to production July 18

How reproducible:

Very

Steps to Reproduce:
1. remove /etc/cgrules.conf~ and /etc/cgconfig.conf~
2. destroy app on node.
3. verify that these files return on destroy
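Step 3 can be automated with a small check such as the one below. It is a sketch only: `leftover_backups` is a hypothetical helper, and the directory defaults to `/etc` because that is where the files appear in this report.

```python
import os

# The two backup files named in this report.
BACKUP_NAMES = ("cgconfig.conf~", "cgrules.conf~")


def leftover_backups(etc_dir="/etc"):
    """Return the paths of any cgroup backup files present in etc_dir."""
    return [
        os.path.join(etc_dir, name)
        for name in BACKUP_NAMES
        if os.path.lexists(os.path.join(etc_dir, name))
    ]
```

An empty list after step 1 and a non-empty list after step 2 would indicate that the problem has been reproduced.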

Actual results:

The files exist.

Expected results:

These are probably temp files.  These files should be removed after destroying the app.  

If these files are not destroyed, we need to change their selinux context to the proper context.

Additional info:

Info provided above.
Comment 1 Rob Millner 2013-07-19 12:26:17 EDT
They are backup files which can be used in the event that a corrupt edit made its way into cgrules.conf.  I'll get rid of them in the code.
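For illustration, one way to edit such a file without leaving a ~ backup behind is to write the new contents to a temporary file and rename it over the original. The sketch below is an assumption about the general technique, not the change made in origin-server (which is written in Ruby); `remove_app_entries` is a hypothetical name, and matching the UUID as a substring is a simplification.

```python
import os
import tempfile


def remove_app_entries(conf_path, app_uuid):
    """Drop lines mentioning app_uuid from conf_path, leaving no ~ backup.

    Returns the number of lines removed.
    """
    with open(conf_path) as src:
        lines = src.readlines()
    kept = [line for line in lines if app_uuid not in line]

    conf_dir = os.path.dirname(os.path.abspath(conf_path))
    # Write to a temporary file in the same directory, then rename it over
    # the original so that readers never see a partially written file.
    fd, tmp_path = tempfile.mkstemp(dir=conf_dir, prefix=".cgtmp-")
    try:
        with os.fdopen(fd, "w") as dst:
            dst.writelines(kept)
        os.replace(tmp_path, conf_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return len(lines) - len(kept)
```

Note that a file created this way has the temporary file's mode and ownership, and possibly a different SELinux context, so a real implementation would need to restore those (for example with `restorecon`) to avoid the kind of alert described in this bug.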
Comment 2 Rob Millner 2013-07-19 17:05:30 EDT
Stage pull request:
https://github.com/openshift/origin-server/pull/3124

The fix for master will be delivered as part of:
https://trello.com/c/e3bx08kC/183-5-cgroups-templates
Comment 3 Meng Bo 2013-07-23 22:42:17 EDT
The issue is fixed on devenv-stage.

The two temp files are no longer generated when deleting an app.
Comment 4 Rob Millner 2013-07-24 21:12:46 EDT
Fixed in:
fork_ami_origin_runtime_183_and_191_720
Comment 5 Thomas Wiest 2013-07-25 18:38:08 EDT
This is still happening in INT with the latest 2.0.31 code.

rhc-node-1.12.2-1.el6oso.x86_64
Comment 6 Xiaoli Tian 2013-07-26 06:35:13 EDT
(In reply to Thomas Wiest from comment #5)
> This is still happening in INT with the latest 2.0.31 code.
> 
> rhc-node-1.12.2-1.el6oso.x86_64

The fix has presumably not been merged into INT yet.

It's fixed in  fork_ami_origin_runtime_183_and_191_724

After the app is deleted, the related cgroup entries are deleted from cgrules.conf and cgconfig.conf:

#diff beforedestorycgrules.conf /etc/cgrules.conf 
53d52
< 51f23f1b529abf0a0d000001	cpu,cpuacct,memory,net_cls,freezer	/openshift/51f23f1b529abf0a0d000001

#diff beforedestroycgconfig.conf /etc/cgconfig.conf 
30d29
< group /openshift/51f23f1b529abf0a0d000001 { perm { task { uid = 501;  gid = 501; } admin { uid = root;  gid = root; }} cpu { cpu.cfs_quota_us = 100000;  cpu.shares = 128; } cpuacct {} memory { memory.limit_in_bytes = 536870912;  memory.memsw.limit_in_bytes = 641728512; } net_cls { net_cls.classid = 66037; } freezer {}}
Comment 7 Thomas Wiest 2013-07-26 09:55:47 EDT
I'd prefer to leave this open until we see the fix deployed to INT.

Otherwise I don't know how we'll track it.

Switching back to ON_QA so that we know we need to verify the bug fix in INT (or an equivalent devenv to INT).
Comment 8 Meng Bo 2013-07-30 07:31:28 EDT
Checked on devenv_3582,

the temp files cgrules.conf~ and cgconfig.conf~ no longer remain on the node after app creation and deletion.


@Thomas Wiest 
Can you check this on INT? If it is fixed, we will move the bug to VERIFIED.
Comment 9 Thomas Wiest 2013-07-30 10:42:06 EDT
I checked in INT and both /etc/cgconfig.conf~ and /etc/cgrules.conf~ are still there. 

They are, however, in the proper selinux context, so our monitoring is no longer alerting for them.

Do I need to remove these manually (as in, are they no longer used)?
Comment 10 Rob Millner 2013-07-30 12:39:11 EDT
The new code doesn't create them but it does not delete them either (also, the ~ extension is used by some editors as the backup file).

Go ahead and delete them.
Comment 11 Thomas Wiest 2013-07-30 13:01:09 EDT
Ok, I've deleted them in INT.

I believe this bug is now fixed as we haven't gotten alerts for these files since the deploy yesterday.
Comment 12 Meng Bo 2013-07-30 21:54:10 EDT
Per comment #11, moving the bug to VERIFIED.
