Bug 1019142

Summary: oo-accept-node shows error messages after restarting the node server or creating apps in parallel
Product: OpenShift Container Platform Reporter: Ma xiaoqiang <xiama>
Component: Containers Assignee: Brenton Leanhardt <bleanhar>
Status: CLOSED DEFERRED QA Contact: libra bugs <libra-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.0.0 CC: libra-onpremise-devel, xiama, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-21 13:13:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ma xiaoqiang 2013-10-15 08:27:45 UTC
Description of problem:
oo-accept-node shows error messages after the node server is restarted or when apps are created in parallel.

Version-Release number of selected component (if applicable):
puddle:[2.0/2013-10-14.3] 
http://buildvm-devops.usersys.redhat.com/puddle/build/OpenShiftEnterpriseErrata/2.0/2013-10-14.3/

How reproducible:
always

Steps to Reproduce:
First method:
1. Create several apps.
2. Reboot the node server.
3. Run "oo-accept-node":
# oo-accept-node
Second method:
1. Create several apps in parallel:
for i in `seq 1 9`; do rhc app create testapp$i php & done
2. Run "oo-accept-node":
# oo-accept-node


Actual results:
Errors like the following are reported intermittently.

Output:
FAIL: 525ce8455f101a5da90002f4 has a process missing from cgroups: 9385 cgroups controller: freezer
FAIL: 525ce8455f101a5da90002f4 has a process missing from cgroups: 9393 cgroups controller: freezer
FAIL: 525ce84c5f101a395f0019a1 has a process missing from cgroups: 10102 cgroups controller: net_cls
FAIL: 525ce84c5f101a395f0019a1 has a process missing from cgroups: 10106 cgroups controller: net_cls
FAIL: 525ce84c5f101a395f0019a1 has a process missing from cgroups: 10102 cgroups controller: freezer
FAIL: 525ce84c5f101a395f0019a1 has a process missing from cgroups: 10106 cgroups controller: freezer

This means that some processes have not been placed in their cgroups, so the cgroup limits do not apply to them.
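
To see at a glance which processes and controllers are affected, the FAIL lines can be condensed. The following is a minimal sketch, assuming the output format is exactly as shown above; the function names summarize_cgroup_failures and failing_controllers are hypothetical helpers, not part of oo-accept-node.

```shell
# Condense "process missing from cgroups" failures to: ID PID controller.
# Reads oo-accept-node output on stdin; lines in any other format are ignored.
summarize_cgroup_failures() {
    sed -n 's/^FAIL: \([^ ]*\) has a process missing from cgroups: \([0-9]*\) cgroups controller: \(.*\)$/\1 \2 \3/p'
}

# List the distinct controllers named in those failures, one per line.
failing_controllers() {
    summarize_cgroup_failures | awk '{print $3}' | sort -u
}
```

Possible usage: oo-accept-node 2>&1 | summarize_cgroup_failures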

Expected results:
oo-accept-node should report no errors.

Additional info:
If the above error messages are shown, restarting the cgred service and then re-running oo-accept-node clears them.
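
One way to confirm that a process has been classified after cgred is restarted is to read its /proc/<pid>/cgroup file. The sketch below assumes the cgroup v1 line format "hierarchy-ID:controller-list:path" and treats a path of "/" as "not placed in any child cgroup"; cgroup_path_for and in_non_root_cgroup are hypothetical helper names.

```shell
# Print the cgroup path a process occupies for one controller.
# $1 = controller name (e.g. freezer)
# $2 = cgroup file to read (for a live process, /proc/<pid>/cgroup)
cgroup_path_for() {
    awk -F: -v want="$1" '
        {
            # The second field may list several controllers, e.g. "cpu,cpuacct".
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == want) {
                    # Drop the first two fields and print the rest as the path.
                    sub(/^[^:]*:[^:]*:/, "")
                    print
                    exit
                }
        }' "$2"
}

# Succeed only if the process sits somewhere other than the root cgroup
# for the given controller.
in_non_root_cgroup() {
    path=$(cgroup_path_for "$1" "$2")
    [ -n "$path" ] && [ "$path" != "/" ]
}
```

Possible usage, with a PID taken from a FAIL line: in_non_root_cgroup freezer /proc/9385/cgroup || echo "still unclassified"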

Comment 3 Ma xiaoqiang 2014-01-21 04:27:25 UTC
I checked this problem on devenv__4247 and puddle [2.0.2/2014-01-16.1]. The problem can no longer be reproduced.
# ll /etc/rc.d/rc3.d/S[0-9][0-9]c*
lrwxrwxrwx. 1 root root 18 Jan 14 23:14 /etc/rc.d/rc3.d/S05cgconfig -> ../init.d/cgconfig
lrwxrwxrwx. 1 root root 18 Jan 14 23:03 /etc/rc.d/rc3.d/S13cpuspeed -> ../init.d/cpuspeed
lrwxrwxrwx. 1 root root 14 Jan 14 23:01 /etc/rc.d/rc3.d/S25cups -> ../init.d/cups
lrwxrwxrwx. 1 root root 15 Jan 14 23:14 /etc/rc.d/rc3.d/S30cgred -> ../init.d/cgred
lrwxrwxrwx. 1 root root 15 Jan 14 23:01 /etc/rc.d/rc3.d/S90crond -> ../init.d/crond
lrwxrwxrwx. 1 root root 20 Mar 21  2013 /etc/rc.d/rc3.d/S99certmonger -> ../init.d/certmonger

The cgred service (S30) is now started after the cgconfig service (S05), so the problem can no longer be reproduced.
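
The ordering check above can also be scripted. This is a minimal sketch, assuming SysV-style SNNname start links as in the listing; start_priority and starts_before are hypothetical helper names.

```shell
# Print the start priority (the NN in SNNname) of a service.
# $1 = rc directory (e.g. /etc/rc.d/rc3.d), $2 = service name
start_priority() {
    for link in "$1"/S[0-9][0-9]"$2"; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}      # e.g. S30cgred
        prio=${name#S}        # e.g. 30cgred
        echo "${prio%"$2"}"   # e.g. 30
        return 0
    done
    return 1
}

# Succeed if service $2 starts before service $3 in rc directory $1.
# Returns 2 if either service has no start link there.
starts_before() {
    first=$(start_priority "$1" "$2") || return 2
    second=$(start_priority "$1" "$3") || return 2
    [ "$first" -lt "$second" ]
}
```

Possible usage: starts_before /etc/rc.d/rc3.d cgconfig cgred && echo "order OK"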