| Summary: | User getting intermittent "417 Expectation Failed" errors when accessing his app(s) | ||
|---|---|---|---|
| Product: | OKD | Reporter: | Nam Duong <nduong> |
| Component: | Pod | Assignee: | Rob Millner <rmillner> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | ||
| Version: | 2.x | CC: | dmcphers, jofernan, mfisher, qgong, rmillner, tkramer, xtian |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-07-13 23:43:50 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Nam Duong
2012-02-29 19:29:06 UTC
Sect 8 of RFC 2616 discusses "417 Expectation Failed" in this context. http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html From section 8.2.3, it seems like there's a few places for the 417 message to be created. First place to check would be whether the Apache proxy is somehow deciding that its back-end server doesn't support HTTP 1.1. Related discussions and bugs: http://stackoverflow.com/questions/3889574/apache-and-mod-proxy-not-handling-http-100-continue-from-client-http-417 https://issues.apache.org/bugzilla/show_bug.cgi?id=47087 https://issues.apache.org/bugzilla/show_bug.cgi?id=46709 I can't get this to repeat. Tried setting the server up with an extremely high load to force too-slow of a response and expose the Apache bugs. Tried extremely large (200MB) uploads. No luck in either case. Apache bugzilla 47087 was fixed in trunk and there's a patch which was back-ported to 2.2.18. RHEL 6 ships with 2.2.15. We could see if the patch back-ports cleanly to the RHEL 6 Apache. http://svn.apache.org/viewvc?view=revision&revision=1098104 The patch is already applied to the RHEL 6 httpd package (httpd-2.2.15-expectnoka.patch). Found this thread that seems to be related to Opera but not Apache. http://my.opera.com/community/forums/topic.dml?id=1065252&t=1330730053&page=1#comment10574472 Have the end-user disable "Non-Compliant Server 100 Continue" in Opera settings and see if the problem persists. http://www.opera.com/support/usingopera/operaini/ I still can't get the problem to reproduce in dev; but I can get it to 100% reproduce in production. Something is really different about how Apache or PHP work in prod. curl -v --header 'Expect: 100-continue' http://wordpress-rmillnertest1.rhcloud.com/ * About to connect() to wordpress-rmillnertest1.rhcloud.com port 80 (#0) * Trying 23.20.65.163... connected * Connected to wordpress-rmillnertest1.rhcloud.com (23.20.65.163) port 80 (#0) > GET / HTTP/1.1 > User-Agent: curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.1.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7 > Host: wordpress-rmillnertest1.rhcloud.com > Accept: */* > Expect: 100-continue > < HTTP/1.1 417 Expectation Failed < Date: Mon, 05 Mar 2012 23:51:28 GMT < Content-Length: 466 < Connection: close < Content-Type: text/html; charset=iso-8859-1 < <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>417 Expectation Failed</title> </head><body> <h1>Expectation Failed</h1> <p>The expectation given in the Expect request-header field could not be met by this server. The client sent<pre> Expect: 100-continue </pre> </p><p>Only the 100-continue expectation is supported.</p> <hr> <address>Apache/2.2.15 (Red Hat) Server at wordpress-rmillnertest1.rhcloud.com Port 80</address> </body></html> * Closing connection #0 I can verify on stg that the back-end Apache accepts 100-continue; just not the front-end Apache. It seems likely that the front-end Apache is downgrading the connection to HTTP 1.0 for some reason. Requested configuration and log data from ops. STG/PROD are downgrading the proxy connection to HTTP 1.0; which according to RFC 2616 must send back an expectation failed error. ==> rmtest/logs/access_log-20120626-000000-EST <== 209.132.181.86 - - [26/Jun/2012:18:01:43 -0400] "GET / HTTP/1.0" 200 5213 "-" "curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.3.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7" Whereas on devenv; the proxy connection uses http 1.1. ==> rmtest/logs/access_log-20120626-000000-EST <== 209.132.181.86 - - [26/Jun/2012:18:08:52 -0400] "GET / HTTP/1.1" 200 5213 "-" "curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.3.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7" Is it necessary and/or intentional to downgrade to http 1.0 on the proxy connection. At a little under 2000 apps; we hit a problem where the mcollective configure call times out. Its likely the deconfigure call runs in parallel.
D, [2012-06-26T20:01:00.563982 #1091] DEBUG -- : libra.rb:306:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f03474322f8
@action="cartridge_do",
@agent="libra",
@caller="cert=mcollective-public",
@data=
{:process_results=>true,
:action=>"configure",
:args=>"'c9r4m1981' 'c9r4m1981' '34b9e13bd52c4dc180df94d716f2ff8b'",
:cartridge=>"diy-0.1"},
@sender="ip-10-110-27-216",
@time=1340755260,
@uniqid="1a8163f9e2014ab20e59a8f378192d70">
D, [2012-06-26T20:01:00.564166 #1091] DEBUG -- : libra.rb:307:in `cartridge_do_action' cartridge_do_action validation = diy-0.1 configure 'c9r4m1981' 'c9r4m1981' '34b9e13bd52c4dc180df94
d716f2ff8b'
***** note, there's no output or exit code here.
D, [2012-06-26T20:02:00.875772 #1091] DEBUG -- : amqp.rb:91:in `receive' Received message
Broker has an rpc tmeout of 60s. That's likely where the problem is. Should this timeout be higher to accommodate heavy IO load on the system (300?).
config.rpc_opts = {
:disctimeout => 2,
:timeout => 60,
:verbose => false,
:progress_bar=> false,
:filter => {"identity"=>[], "fact"=>[], "agent"=>[], "cf_class"=>[]},
:config => "/etc/mcollective/client.cfg"
}
Ignore last two comments; they were intended for a different ticket. INT/STG/PROD all configure mod_proxy to downgrade to http 1.0. Sent out a query for whether this is necessary. Release ticket for 2.0.14 updated with instructions to allow KeepAlive and http 1.1 between mod_proxy and the back-end. Also made arrangements with ops to correct the settings on INT so we can try it now. Deployed and tested in INT. To test, create a PHP application (it can be scalable or not) and do a curl command similar to the following: $ curl -k -v --header 'Expect: 100-continue' -X POST -d foo="bar" https://rmtest-rlmint.int.rhcloud.com/health_check.php * About to connect() to rmtest-rlmint.int.rhcloud.com port 443 (#0) * Trying 23.22.154.9... connected * Connected to rmtest-rlmint.int.rhcloud.com (23.22.154.9) port 443 (#0) * Initializing NSS with certpath: sql:/etc/pki/nssdb * warning: ignoring value of ssl.verifyhost * skipping SSL peer certificate verification * SSL connection using TLS_RSA_WITH_AES_256_CBC_SHA * Server certificate: * subject: CN=*.rhcloud.com,OU=RHC Cloud Opoerations,O=Red Hat Inc,L=Raleigh,ST=North Carolina,C=US,serialNumber=wm-moQb/Ntz0RGKXwD13T3vGVlHChq5H * start date: Mar 22 16:16:59 2011 GMT * expire date: Mar 24 16:38:45 2013 GMT * common name: *.rhcloud.com * issuer: CN=GeoTrust SSL CA,O="GeoTrust, Inc.",C=US > POST /health_check.php HTTP/1.1 > User-Agent: curl/7.21.7 (x86_64-redhat-linux-gnu) libcurl/7.21.7 NSS/3.13.3.0 zlib/1.2.5 libidn/1.22 libssh2/1.2.7 > Host: rmtest-rlmint.int.rhcloud.com > Accept: */* > Expect: 100-continue > Content-Length: 7 > Content-Type: application/x-www-form-urlencoded > < HTTP/1.1 100 Continue < HTTP/1.1 100 Continue < HTTP/1.1 200 OK < Date: Fri, 29 Jun 2012 21:04:02 GMT < Server: Apache/2.2.15 (Red Hat) < X-Powered-By: PHP/5.3.3 < Content-Length: 1 < Content-Type: text/html < Set-Cookie: GEAR=3cde56694d-rlmint; path=/ < Strict-Transport-Security: max-age=15768000, includeSubDomains < * Connection #0 to host rmtest-rlmint.int.rhcloud.com left intact * Closing connection #0 retest on int 1.create an nonscalable php application 2.[root@localhost ~]# curl -k -v --header 'Expect: 100-continue' -X POST -d foo="bar" https://qphp-qgong.int.rhcloud.com/health_check.php * About to connect() to qphp-qgong.int.rhcloud.com port 443 (#0) * Trying 174.129.125.96... connected * Connected to qphp-qgong.int.rhcloud.com (174.129.125.96) port 443 (#0) * Initializing NSS with certpath: /etc/pki/nssdb * warning: ignoring value of ssl.verifyhost * skipping SSL peer certificate verification * SSL connection using TLS_RSA_WITH_AES_256_CBC_SHA * Server certificate: * subject: CN=*.rhcloud.com,OU=RHC Cloud Opoerations,O=Red Hat Inc,L=Raleigh,ST=North Carolina,C=US,serialNumber=wm-moQb/Ntz0RGKXwD13T3vGVlHChq5H * start date: Mar 22 16:16:59 2011 GMT * expire date: Mar 24 16:38:45 2013 GMT * common name: *.rhcloud.com * issuer: CN=GeoTrust SSL CA,O="GeoTrust, Inc.",C=US > POST /health_check.php HTTP/1.1 > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2 > Host: qphp-qgong.int.rhcloud.com > Accept: */* > Expect: 100-continue > Content-Length: 7 > Content-Type: application/x-www-form-urlencoded > < HTTP/1.1 100 Continue < HTTP/1.1 100 Continue < HTTP/1.1 200 OK < Date: Mon, 02 Jul 2012 02:24:40 GMT < Server: Apache/2.2.15 (Red Hat) < X-Powered-By: PHP/5.3.3 < Content-Length: 1 < Content-Type: text/html < Strict-Transport-Security: max-age=15768000, includeSubDomains < * Connection #0 to host qphp-qgong.int.rhcloud.com left intact * Closing connection #0 check scalable php application: [root@localhost ~]# curl -k -v --header 'Expect: 100-continue' -X POST -d foo="bar" https://qsphp-qgong.int.rhcloud.com/health_check.php * About to connect() to qsphp-qgong.int.rhcloud.com port 443 (#0) * Trying 174.129.125.96... connected * Connected to qsphp-qgong.int.rhcloud.com (174.129.125.96) port 443 (#0) * Initializing NSS with certpath: /etc/pki/nssdb * warning: ignoring value of ssl.verifyhost * skipping SSL peer certificate verification * SSL connection using TLS_RSA_WITH_AES_256_CBC_SHA * Server certificate: * subject: CN=*.rhcloud.com,OU=RHC Cloud Opoerations,O=Red Hat Inc,L=Raleigh,ST=North Carolina,C=US,serialNumber=wm-moQb/Ntz0RGKXwD13T3vGVlHChq5H * start date: Mar 22 16:16:59 2011 GMT * expire date: Mar 24 16:38:45 2013 GMT * common name: *.rhcloud.com * issuer: CN=GeoTrust SSL CA,O="GeoTrust, Inc.",C=US > POST /health_check.php HTTP/1.1 > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2 > Host: qsphp-qgong.int.rhcloud.com > Accept: */* > Expect: 100-continue > Content-Length: 7 > Content-Type: application/x-www-form-urlencoded > < HTTP/1.1 100 Continue < HTTP/1.1 100 Continue < HTTP/1.1 200 OK < Date: Mon, 02 Jul 2012 02:32:45 GMT < Server: Apache/2.2.15 (Red Hat) < X-Powered-By: PHP/5.3.3 < Content-Length: 1 < Content-Type: text/html < Set-Cookie: GEAR=9fe473b427-qgong; path=/ < Strict-Transport-Security: max-age=15768000, includeSubDomains < * Connection #0 to host qsphp-qgong.int.rhcloud.com left intact * Closing connection #0 Then check wordpress, uploading works well. |