The code that drives cron jobs in gears uses flock to ensure that a cron job isn't run multiple times concurrently. As it is used, though, the open write-mode file descriptor for the lock file is inherited by child processes. In one case, I've seen a cron job that starts a service (which is ugly, but that's orthogonal). The service inherits the descriptor and keeps the file locked indefinitely, so the cron job that launched it can never run again as long as the service is running and holding the lock.
Potential fix: use 9>&- to close the flocked descriptor for each child command, so the child does not inherit it. E.g., ( flock -n 9 || exit 1 ; sleep 100 9>&- ) 9>.flock
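The difference between the leaking and the fixed form can be checked locally with a small sketch like the one below. It is not taken from the OpenShift scripts: the lock files come from mktemp, the lock_state helper is a hypothetical name, and a background sleep stands in for the long-lived service. It assumes flock from util-linux is installed.

```shell
#!/bin/bash
# Sketch only: lock files and helper name are illustrative assumptions.
leaky_lock=$(mktemp)
fixed_lock=$(mktemp)

lock_state() {
  # Open the file afresh and try for up to 1 second to take the lock.
  if flock -w 1 "$1" true; then echo free; else echo held; fi
}

# Leaking form: the background child inherits fd 9, so the lock stays
# held after the subshell that took it has exited.
(
  flock -n 9 || exit 1
  sleep 3 &            # stands in for a service started by the cron job
) 9>"$leaky_lock"
leaky=$(lock_state "$leaky_lock")

# Fixed form: 9>&- closes fd 9 for the child only, so the lock is
# released as soon as the subshell exits.
(
  flock -n 9 || exit 1
  sleep 3 9>&- &
) 9>"$fixed_lock"
fixed=$(lock_state "$fixed_lock")

echo "leaking form: lock is $leaky"
echo "fixed form:   lock is $fixed"

rm -f "$leaky_lock" "$fixed_lock"
```

With the leaking form the lock should still be reported as held while the background sleep is alive. With the fixed form it should be reported as free right away.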
This class of issue potentially exists in the following scripts/modules:

  oo-httpd-singular
  frontend_httpd.rb
  cron_runjobs.sh
  haproxy_ctld.rb
  set-proxy
  oo-autoidler
  oo-last-access
  openshift-origin-stale-lockfiles
  OO_setup_helper.rb
  fix_local.sh
  set-gear-endpoints
Added extra protections to the lock file descriptor in various scripts. https://github.com/openshift/origin-server/pull/2957
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/1389b18352d66cadb54d96f764a86d969447569b
Bug 977493 - Avoid leaking the lock file descriptor to child processes.
Here are the steps for QE:

1. Create an app with cron embedded.

rhc app create rm1 php-5.3 cron-1.4

2. Create a cron script that will attempt to write to the lock file descriptor.

cd rm1
cat > .openshift/cron/minutely/broken <<_EOF_
echo "foo" >&9
_EOF_
chmod +x .openshift/cron/minutely/broken
git add .openshift/cron/minutely/broken
git commit -m 'Add broken crontab'
git push

3. Wait a few minutes and inspect the logs from cron in the gear.

4. You should see the following in the logs:

line 1: 9: Bad file descriptor
Checked on devenv_3430 with the steps in comment #5. Tailing the cron log shows the following lines:

[php1-bmengdev.dev.rhcloud.com log]> tailf cron.minutely.log
__________________________________________________________________________
Mon Jul 1 02:01:06 EDT 2013: START minutely cron run
__________________________________________________________________________
/var/lib/openshift/35ee7d48e21311e285a322000aa40b62/app-root/runtime/repo//.openshift/cron/minutely/broken: /var/lib/openshift/35ee7d48e21311e285a322000aa40b62/app-root/runtime/repo//.openshift/cron/minutely/broken: line 1: 9: Bad file descriptor
__________________________________________________________________________
Mon Jul 1 02:01:07 EDT 2013: END minutely cron run - status=0
__________________________________________________________________________

Moving bug to verified.