Bug 1021583

Summary: [PERF] oo-scheduled-jobs calls chmod on every run of every cron script in a gear
Product: OpenShift Online Reporter: Andy Grimm <agrimm>
Component: ContainersAssignee: Jhon Honce <jhonce>
Status: CLOSED CURRENTRELEASE QA Contact: libra bugs <libra-bugs>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2.xCC: jgoulding, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-30 00:49:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Andy Grimm 2013-10-21 14:59:45 UTC
oo-scheduled jobs makes an excessive number of calls to "chmod".  It would be better to test files with "-x" and only chmod them if necessary.  This would likely avoid over 99% of the chmod invocations from this code.

Comment 1 Andy Grimm 2013-10-21 20:36:45 UTC
I realize that the current code uses "chmod -R" to minimize the number of calls, but here's some data:

[agrimm@dhcp129-30 ~]$ for x in $( seq 1 10000 ); do touch file$x; done
[agrimm@dhcp129-30 ~]$ time for x in $( seq 1 10000 ); do chmod +x file$x; done

real	0m4.576s
user	0m0.268s
sys	0m0.896s

Real time is about the same for the shell if you stat every file:

[agrimm@dhcp129-30 ~]$ time for x in $( seq 1 10000 ); do if [ -x file$x ]; then chmod +x file$x; fi; done

real	0m4.776s
user	0m0.357s
sys	0m0.962s

It gets much faster if you don't have to chmod:

[agrimm@dhcp129-30 ~]$ time for x in $( seq 1 10000 ); do if [ \! -x file$x ]; then chmod +x file$x; fi; done

real	0m0.085s
user	0m0.072s
sys	0m0.011s


And just for good measure, this measures a double loop with some math, so that we can attempt to count the cost of the extra shell loop that we'd be introducing:

[agrimm@dhcp129-30 ~]$  time for x in $( seq 0 99 ); do for y in $( seq 1 100 );  do z=$(( 100*x + $y )); if [ \! -x file$z ]; then chmod +x file$z; fi; done; done

real	0m0.256s
user	0m0.146s
sys	0m0.041s

This isn't quite as scientific as we can get, but knowing the overhead of executing a process on a busy node, I'm pretty convinced that the "-x" test is worth it.

Comment 2 Andy Grimm 2013-10-21 21:01:48 UTC
One more test:

first script: one thousand lines of:
chmod +x file1

second script: one thousand copies of:
for x in file{1,2,3,4,5}; do
	if [ \! -x $x ]; then
		chmod +x $x
	fi
done

The first take about 0.44 seconds on my machine, while the second takes about 0.065.

Comment 3 openshift-github-bot 2013-10-22 01:47:10 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/2bc19d0fc0f47263108d7f09b4c95b4b19cddacf
Bug 1021583 - Stop chmod'ing all cron scripts

* stat script files before chmod'ing

Comment 4 Qiushui Zhang 2013-10-22 06:31:39 UTC
Tested on devenv_3927.
The chmod code is changed.
It saves much time to check the mode than directly +x.
[walter@dhcp-10-238 abc]$ time for x in `seq 1 10000`; do chmod +x file$x ; done

real	0m9.293s
user	0m0.626s
sys	0m1.853s

[walter@dhcp-10-238 abc]$ time for x in `seq 1 10000`; do [ -x file$x ] ; done

real	0m0.099s
user	0m0.081s
sys	0m0.017s

It will obviously reduce the time when lots of the files already have an "x" mode.

Mark the bug as verified.