Bug 1148192 - Race condition in `oo-httpd-singular graceful` when using apache-vhost
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 2.1.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Luke Meyer
QA Contact: libra bugs
Keywords: Upstream
Duplicates: 1154645
Depends On: 1147054 1151744
Blocks:
Reported: 2014-09-30 18:13 EDT by Timothy Williams
Modified: 2014-11-03 14:55 EST (History)

See Also:
Fixed In Version: openshift-origin-node-util-1.30.3.1-1.el6op
Doc Type: Bug Fix
Doc Text:
Cause: There was a race condition when using the apache-vhost frontend. If "oo-httpd-singular graceful" is run to incorporate one gear vhost update while another gear is creating its vhost configuration, the configuration is left in a bad state and httpd will not (re)start.
Consequence: When this condition is hit, vhost configuration will cease being updated and newly-added gears will be unreachable via the vhost frontend. If httpd is stopped, it will fail to start until the config is fixed.
Fix: A lock was extended around the call to oo-httpd-singular, preventing the race condition.
Result: This should no longer occur.
Story Points: ---
Clone Of:
Clones: 1148418 1155794
Environment:
Last Closed: 2014-11-03 14:55:18 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---



External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2014:1796 | normal | SHIPPED_LIVE | Moderate: Red Hat OpenShift Enterprise 2.2 Release Advisory | 2014-11-03 19:52:02 EST

Description Timothy Williams 2014-09-30 18:13:24 EDT
Description of problem:
There appears to be a race condition when using the apache-vhost frontend. If `oo-httpd-singular graceful` is run while an application is creating its frontend configuration, the configuration is left in a bad state and httpd will not start.

Version-Release number of selected component (if applicable):
2.1.6

How reproducible:
Very Rarely

Steps to Reproduce:
1. Create many applications while removing others

Actual results:
We see the following in the node's platform.log. 
-=~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~=-
September 22 13:39:32 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=0 out=
September 22 13:39:32 INFO Connecting frontend mapping for 54206ba04970fa5329000003/haproxy: [/haproxy-status] => [127.2.20.3:8080/] with options: {"protocols"=>["http"]}
September 22 13:39:32 WARN V2CartModel#connect_frontend: No such file or directory - /etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_54206ba04970fa5329000003/599984_element-_haproxy-status.conf
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-common-1.22.5.11/lib/openshift-origin-common/utils/file_needs_sync.rb:36:in `initialize'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-common-1.22.5.11/lib/openshift-origin-common/utils/file_needs_sync.rb:36:in `open'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-common-1.22.5.11/lib/openshift-origin-common/utils/file_needs_sync.rb:36:in `open'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-frontend-apache-vhost-0.5.2.4/lib/openshift/runtime/frontend/http/plugins/apache-vhost.rb:152:in `block (2 levels) in connect'
  [... backtrace cut for clarity ...]
September 22 13:39:32 ERROR Unexpected error during configure: No such file or directory - /etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_54206ba04970fa5329000003/599984_element-_haproxy-status.conf (Errno::ENOENT)
September 22 13:39:32 INFO openshift-agent: request end: action=cartridge_do, requestid=5373aabdbd865534bec108f8bf32d199, senderid=lae-alln-brk02, statuscode=1, data={:time=>nil, :output=>"CLIENT_ERROR: Unexpected error: No such file or directory - /etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_54206ba04970fa5329000003/599984_element-_haproxy-status.conf\n", :exitcode=>1, :addtl_params=>nil}
September 22 13:39:32 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=1 out=
September 22 13:39:32 ERROR ERROR: failure from oo-httpd-singular(1): : stdout:  stderr:httpd.worker: Syntax error on line 221 of /etc/httpd/conf/httpd.conf: Syntax error on line 45 of /etc/httpd/conf.d/000001_openshift_origin_frontend_vhost.conf: Syntax error on line 29 of /etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_0_54206ba04970fa5329000003.conf: Include directory '/etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_54206ba04970fa5329000003' not found
-=~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~=-


Expected results:
All applications created/removed successfully

Additional info:
This looks very similar to Bugzilla 1146194 but the same error messages are not observed.
Comment 3 Luke Meyer 2014-10-01 15:23:28 EDT
Per bug 1147054, the fix should arrive with the next 2.2 rebase.
Comment 6 Luke Meyer 2014-10-22 16:06:22 EDT
In addition to the rebase, https://github.com/openshift/origin-server/pull/5885 is required.
Comment 7 Luke Meyer 2014-10-22 16:13:48 EDT
origin-server cherrypick:
commit dc07a6d177263a128f9b1506db9a7a20e64df451
Author: Rajat Chopra <rchopra@redhat.com>
Date:   Fri Oct 17 13:43:53 2014 -0700
    bz1151744 - wrap the wait for reload to finish inside of the lockfile
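
The idea named in the commit message, keeping the reload inside the lock, can be illustrated with a minimal Python sketch. This is not the origin-server code (which is Ruby); every name here (`vhost_lock`, `validate`, `add_gear`, `run_demo`, the `.lock` file) is hypothetical, and `validate` only stands in for the httpd syntax check that `oo-httpd-singular graceful` triggers. It assumes a POSIX system, since it relies on `fcntl.flock`.

```python
import fcntl
import os
import shutil
import tempfile
import threading
from contextlib import contextmanager


@contextmanager
def vhost_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def validate(conf_dir):
    """Stand-in for the httpd syntax check: each <gear>.conf needs a <gear>/ directory."""
    for name in os.listdir(conf_dir):
        if name.endswith(".conf"):
            include_dir = os.path.join(conf_dir, name[: -len(".conf")])
            if not os.path.isdir(include_dir):
                return False
    return True


def add_gear(conf_dir, lock_path, gear, results):
    """Write one gear's config and 'reload', all under the same lock."""
    with vhost_lock(lock_path):
        # Two-step update: a reload between these steps would see a broken config.
        open(os.path.join(conf_dir, gear + ".conf"), "w").close()
        os.mkdir(os.path.join(conf_dir, gear))
        # The reload (here, only validation) stays inside the lock.
        results.append(validate(conf_dir))


def run_demo(gear_count=20):
    """Add gears concurrently and return the validation result seen by each one."""
    conf_dir = tempfile.mkdtemp()
    lock_path = os.path.join(conf_dir, ".lock")
    results = []
    threads = [
        threading.Thread(target=add_gear, args=(conf_dir, lock_path, f"gear{i}", results))
        for i in range(gear_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    shutil.rmtree(conf_dir)
    return results


if __name__ == "__main__":
    print(all(run_demo()))
```

Because each thread opens the lock file separately, the exclusive `flock` calls block one another, so no validation can run while another gear's update is half written.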
Comment 11 Anping Li 2014-10-24 08:22:21 EDT
Verified; the fix passes.
1) Enable httpd.worker
[root@node1 ~]# ps -ef|grep httpd
root     27730     1  0 05:01 ?        00:00:00 /usr/sbin/httpd.worker
apache   27732 27730  0 05:01 ?        00:00:00 /usr/sbin/httpd.worker
apache   27733 27730  0 05:01 ?        00:00:00 /usr/sbin/httpd.worker
apache   27735 27730  0 05:01 ?        00:00:00 /usr/sbin/httpd.worker
root     29466 26319  0 05:04 pts/0    00:00:00 grep httpd

2) Create many applications while removing others, and run regression testing.

3) Check platform.log. oo-httpd-singular was executed and no oo-httpd-singular errors were reported.
[root@node1 node]# cat platform.log|grep oo-httpd
October 24 05:12:57 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=0 out=
October 24 05:12:58 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=0 out=
October 24 05:13:00 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=0 out

[root@node1 node]# grep error platform.log
[root@node1 node]# grep warn platform.log
git archive --format=tar master | (cd /var/lib/openshift/544a4251e5fed5c217000186/app-root/runtime/repo && tar --warning=no-timestamp -xf -);
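
The same check can be sketched in Python by matching on the log level field and on the `rc=` value of each reload, rather than on a lowercase substring. The level names in the excerpts above are uppercase, so a case-sensitive `grep warn` would not match a WARN line unless the message also happened to contain that lowercase word. The line layout assumed here (month, day, time, level, message) is inferred only from the excerpts in this report, and `scan` and both regular expressions are hypothetical names.

```python
import re

# Layout inferred from the excerpts in this report: "<Month> <day> <HH:MM:SS> <LEVEL> <message>".
LINE_RE = re.compile(r"^(?P<stamp>\w+ \d+ \d\d:\d\d:\d\d) (?P<level>[A-Z]+) (?P<msg>.*)$")
RC_RE = re.compile(r"oo-httpd-singular\s+graceful' ran\. rc=(?P<rc>\d+)")


def scan(lines):
    """Return (failed_reloads, problems) found in platform.log-style lines."""
    failed_reloads = []
    problems = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # backtrace lines and anything else without a level field
        if match.group("level") in ("WARN", "ERROR"):
            problems.append(line)
        reload_run = RC_RE.search(match.group("msg"))
        if reload_run and int(reload_run.group("rc")) != 0:
            failed_reloads.append(line)
    return failed_reloads, problems


# Sample lines copied from the log excerpt in the description of this bug.
SAMPLE = [
    "September 22 13:39:32 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=0 out=",
    "September 22 13:39:32 ERROR Unexpected error during configure: No such file or directory - /etc/httpd/conf.d/openshift/54206ba04970fa5329000003_e2e_54206ba04970fa5329000003/599984_element-_haproxy-status.conf (Errno::ENOENT)",
    "September 22 13:39:32 INFO Shell command '/usr/sbin/oo-httpd-singular  graceful' ran. rc=1 out=",
]

if __name__ == "__main__":
    failed, problems = scan(SAMPLE)
    print(len(failed), len(problems))
```

To run it against a real file, pass an open file object to `scan`, for example `scan(open("platform.log"))`.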
Comment 12 Luke Meyer 2014-10-31 11:08:19 EDT
*** Bug 1154645 has been marked as a duplicate of this bug. ***
Comment 14 errata-xmlrpc 2014-11-03 14:55:18 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1796.html
