Bug 1623632
| Summary: | Incorrect http request method to Router stats port results in connection reset. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ryan Howe <rhowe> |
| Component: | Networking | Assignee: | Ram Ranganathan <ramr> |
| Networking sub component: | router | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | unspecified | CC: | aos-bugs, bbennett, dmace, hongli, piqin, xtian |
| Version: | 3.10.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 3.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-11 07:25:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Ram, want to fix up the handler to return a 501 for unsupported methods?

@Dan will do. It looks like this is happening because the method matching inside the cockroachdb cmux code looks for exact matches on the method name when called from:
https://github.com/openshift/origin/blob/master/pkg/router/metrics/metrics.go#L135
Because of that, the request doesn't match HTTP1Fast, and as a result the code defaults to assuming it is a TLS request at:
https://github.com/openshift/origin/blob/master/pkg/router/metrics/metrics.go#L148
Since it is not a TLS request, the router logs an error, and it looks like the code just closes the connection. I will see if we can use the slow HTTP route and whether that works. Alternatively, we could return an error. I don't know what hooks we have available via the cockroachdb code, so the actual HTTP status we can return will depend on that.

Ok, so it looks like if we just use the HTTP1 protocol matcher (instead of HTTP1Fast) in cockroachdb, it does an ignore-case check.
Associated PR: https://github.com/openshift/origin/pull/20873
Some rudimentary tests at: https://gist.github.com/ramr/ced20f285c07b26942f90ae5a961b249

I checked, and https://github.com/openshift/origin/pull/20873 was merged only in the master branch, not in the 3.11 branch. So updating the status to 'MODIFIED'.

Just checked again. I saw that https://github.com/openshift/origin/pull/20873 was merged to branch 'enterprise-3.11-backup-eparis' but not to 'enterprise-3.11', and the latest release, 3.11.8-1, uses branch 'enterprise-3.11'. So this PR needs to be rebased onto 'enterprise-3.11'. Please correct me if I'm wrong.

Verified this bug on v3.11.12. The connection was not reset even when using the mixed-case method 'Get':

    # telnet 127.0.0.1 1936
    Trying 127.0.0.1...
    Connected to 127.0.0.1.
    Escape character is '^]'.
    Get /healthz HTTP/1.1
    host:apps.ocp.example.com

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:2652
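The matching behavior discussed in the comments above can be illustrated with a minimal Go sketch. It does not use the cmux API: `knownMethods`, `matchExact`, `matchIgnoreCase`, and `classify` are hypothetical names introduced only for illustration, and the real cmux matchers are implemented differently. The sketch only shows why an exact, case-sensitive method comparison sends `Get` down the TLS path, while a case-insensitive comparison treats it as HTTP.

```go
package main

import (
	"fmt"
	"strings"
)

// knownMethods is a hypothetical list of HTTP/1.x methods used by both matchers.
var knownMethods = []string{"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "PATCH", "TRACE", "CONNECT"}

// requestMethod returns the first space-delimited token of a request line.
func requestMethod(requestLine string) string {
	return strings.SplitN(requestLine, " ", 2)[0]
}

// matchExact reports whether the method equals a known method byte for byte.
func matchExact(requestLine string) bool {
	m := requestMethod(requestLine)
	for _, k := range knownMethods {
		if m == k {
			return true
		}
	}
	return false
}

// matchIgnoreCase reports whether the method equals a known method, ignoring case.
func matchIgnoreCase(requestLine string) bool {
	m := requestMethod(requestLine)
	for _, k := range knownMethods {
		if strings.EqualFold(m, k) {
			return true
		}
	}
	return false
}

// classify mimics a multiplexer that falls back to TLS when the HTTP matcher fails.
func classify(requestLine string, match func(string) bool) string {
	if match(requestLine) {
		return "http"
	}
	return "tls"
}

func main() {
	line := "Get /healthz HTTP/1.1"
	fmt.Println(classify(line, matchExact))      // prints "tls"
	fmt.Println(classify(line, matchIgnoreCase)) // prints "http"
}
```

With the exact matcher, the plain-text request is handed to the TLS path, which is consistent with the error and closed connection described above.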
Description of problem:
Incorrect http request method to Router stats port results in connection reset.

Version-Release number of selected component (if applicable):
3.7+

How reproducible:
100%

Steps to Reproduce:

    # telnet 10.10.94.150 1936
    Trying 10.10.94.150...
    Connected to 10.10.94.150.
    Escape character is '^]'.
    Get /healthz HTTP/1.1
    host:apps.ocp.example.com
    connection:close
    Connection closed by foreign host.

Actual results:
Connection reset.

    1 0.000000 10.10.94.149 → 10.10.94.150 TCP 74 54824 → 1936 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=819120536 TSecr=0 WS=128
    2 0.000085 10.10.94.150 → 10.10.94.149 TCP 74 1936 → 54824 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=819121880 TSecr=819120536 WS=128
    3 0.000259 10.10.94.149 → 10.10.94.150 TCP 66 54824 → 1936 [ACK] Seq=1 Ack=1 Win=29312 Len=0 TSval=819120537 TSecr=819121880
    4 0.448752 10.10.94.149 → 10.10.94.150 TCP 89 54824 → 1936 [PSH, ACK] Seq=1 Ack=1 Win=29312 Len=23 TSval=819120985 TSecr=819121880
    5 0.448811 10.10.94.150 → 10.10.94.149 TCP 66 1936 → 54824 [ACK] Seq=1 Ack=24 Win=29056 Len=0 TSval=819122329 TSecr=819120985
    6 0.448925 10.10.94.150 → 10.10.94.149 TCP 66 1936 → 54824 [RST, ACK] Seq=1 Ack=24 Win=29056 Len=0 TSval=0 TSecr=819120985
    7 0.449097 10.10.94.149 → 10.10.94.150 TCP 119 54824 → 1936 [PSH, ACK] Seq=24 Ack=1 Win=29312 Len=53 TSval=819120985 TSecr=819122329
    8 0.449122 10.10.94.150 → 10.10.94.149 TCP 54 1936 → 54824 [RST] Seq=1 Win=0 Len=0

Expected results:
Either respond with a 501 or accept the Get even though the method is case sensitive.
https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.1
"... An origin server SHOULD return the status code 405 (Method Not Allowed) if the method is known by the origin server but not allowed for the requested resource, and 501 (Not Implemented) if the method is unrecognized or not implemented by the origin server. . . ."
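The status codes named in the RFC 2616 quote can be sketched as a small Go handler. This is an illustration under stated assumptions, not the router's code: the `recognized` and `allowed` sets, the `healthz` handler, and the `statusFor` helper are hypothetical. It returns 405 for a known method that the resource does not allow and 501 for a method the server does not recognize, such as the mixed-case `Get`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// recognized is a hypothetical set of methods this server knows about.
var recognized = map[string]bool{
	"GET": true, "HEAD": true, "POST": true, "PUT": true, "DELETE": true,
	"OPTIONS": true, "PATCH": true, "TRACE": true, "CONNECT": true,
}

// allowed is a hypothetical set of methods permitted on /healthz.
var allowed = map[string]bool{"GET": true, "HEAD": true}

// healthz answers 200 for allowed methods, 405 for known but disallowed
// methods, and 501 for anything it does not recognize (methods are case sensitive).
func healthz(w http.ResponseWriter, r *http.Request) {
	switch {
	case allowed[r.Method]:
		w.WriteHeader(http.StatusOK)
	case recognized[r.Method]:
		w.Header().Set("Allow", "GET, HEAD")
		w.WriteHeader(http.StatusMethodNotAllowed)
	default:
		w.WriteHeader(http.StatusNotImplemented)
	}
}

// statusFor runs the handler in memory and returns the recorded status code.
func statusFor(method string) int {
	rec := httptest.NewRecorder()
	healthz(rec, httptest.NewRequest(method, "/healthz", nil))
	return rec.Code
}

func main() {
	for _, m := range []string{"GET", "POST", "Get"} {
		fmt.Println(m, statusFor(m)) // GET 200, POST 405, Get 501
	}
}
```

Either this behavior or accepting the method case-insensitively would avoid the abrupt reset shown in the capture above.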
Additional info:
https://github.com/openshift/origin/blob/release-3.7/pkg/router/metrics/metrics.go
A "Get" worked in 3.6, but due to the following commit the connection is now reset when it is passed:
https://github.com/openshift/origin/commit/feefce602f5c2abb54c9b493fc4831d6af6867f8
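The telnet steps used in this report can also be scripted. The following Go sketch is one possible way to do that, assuming a plain TCP connection to the stats port; `probe` and `localProbe` are hypothetical helpers. `localProbe` runs against an in-process test server so the example is self-contained, and it does not reproduce the router's behavior.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// probe sends a raw HTTP/1.1 request for /healthz using the given method
// string verbatim and returns the first line of the response.
func probe(addr, method string) (string, error) {
	conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(5 * time.Second)); err != nil {
		return "", err
	}
	req := method + " /healthz HTTP/1.1\r\nHost: apps.ocp.example.com\r\nConnection: close\r\n\r\n"
	if _, err := conn.Write([]byte(req)); err != nil {
		return "", err
	}
	// A reset or early close surfaces here as a read error.
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(line), nil
}

// localProbe runs probe against a throwaway in-process server that answers 200
// to every request, so the sketch can run without a router.
func localProbe(method string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	return probe(srv.Listener.Addr().String(), method)
}

func main() {
	line, err := localProbe("Get")
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Println(line)
}
```

To check a router, call `probe("127.0.0.1:1936", "Get")` instead of `localProbe`. On an affected version the call would be expected to return an error rather than a status line.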