| Summary: | Response to HEAD requests does not contain a "Connection: close" header which leads to "IOError: [Errno 24] Too many open files" | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Kobi Hakimi <khakimi> | ||||
| Component: | RestAPI | Assignee: | Juan Hernández <juan.hernandez> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Kobi Hakimi <khakimi> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 4.0.0 | CC: | bugs, gklein, juan.hernandez, mgoldboi, mperina, oourfali | ||||
| Target Milestone: | ovirt-4.0.0-beta | Keywords: | Regression | ||||
| Target Release: | 4.0.0 | Flags: | rule-engine:
ovirt-4.0.0+
rule-engine: blocker+ mgoldboi: planning_ack+ juan.hernandez: devel_ack+ pstehlik: testing_ack+ |
||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Release Note | |||||
| Doc Text: |
The default configuration of the web server in EL6 disables persistent connections, adding the following parameter to the /etc/httpd/conf/httpd.conf file:
KeepAlive Off
This means that programs using the API will always receive the following response header, for all requests:
Connection: close
In EL7 persistent connections are enabled by default, which is good for performance in general, but may cause issues for programs that expect that "Connection: close" header. We advise users to update those programs so that they don't require the header, but if that isn't possible then the previous behavior of the server can be restored by adding the "KeepAlive Off" parameter to the web server configuration.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-06-16 12:23:10 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
|
Description
Kobi Hakimi
2016-04-06 15:20:07 UTC
Seems to be a Jenkins issue.
After investigating with Juan we found that the problem is that in 4.0 the server doesn't send the "Connection: close" header for HEAD requests.
We tried to run API calls:
- on a 3.6 engine - not reproduced
- on a rhel6 + python 2.6 machine, run against a remote 4.0 engine - reproduced
- on a rhel7 + python 2.7 machine, run locally or against a remote 4.0 engine - reproduced

It is true that the server doesn't send the "Connection: close" header like it used to do in version 3.6. We should probably change that, to avoid other similar issues. But after studying the issue I believe that it can be solved in the client, by making sure that it consumes the (empty) body of the HEAD response. As the client is using the Python "httplib" module I'd suggest always doing the following for HEAD requests:
connection.request('HEAD', ...)
response = connection.getresponse()
response.read()
That call to "read" should make sure that the body is consumed, and the connection released.
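The suggestion above can be tried out with a minimal, self-contained sketch. It assumes Python 3, where "httplib" is named "http.client", and it uses a small local test server (the hypothetical KeepAliveHandler, not the engine) that keeps connections open the way a web server with persistent connections would. It only illustrates that a connection can be reused for a second HEAD request once the first response has been read.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for the real server: HTTP/1.1 keeps the
    # connection open unless the client asks for it to be closed.
    protocol_version = "HTTP/1.1"

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def head_twice_on_one_connection():
    # Start the test server on a free local port.
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        statuses = []
        for _ in range(2):
            connection.request("HEAD", "/")
            response = connection.getresponse()
            response.read()  # consume the (empty) body before the next request
            statuses.append(response.status)
        return statuses
    finally:
        connection.close()  # release the socket explicitly
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(head_twice_on_one_connection())
```

Both requests travel over the same socket, and the explicit close() in the "finally" block releases it regardless of which headers the server sent.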
Note that my analysis in comment 3 wasn't correct. The problem wasn't related to the consumption of the response body. It was a connection leak in the testing framework.

This leak wasn't problematic with version 3.6 of the engine, as the connections were leaked, but closed, so they didn't consume any resource other than memory. But with version 4 of the engine the connections are leaked, but they stay open, because the engine doesn't send the "Connection: close" response header for failed requests. This means that the leaked connections consume file descriptors and sockets, thus generating a real problem.

That leak in the testing framework has been fixed.

We also want to modify the engine so that it sends the "Connection: close" response header for failed connections, which is why we are keeping this bug open. However, that may be difficult, or even impossible, because that header is managed by the application server, not by the application. We are investigating it, but we may eventually close the bug as CANTFIX.

(In reply to Juan Hernández from comment #7)
> Note that my analysis in comment 3 wasn't correct. The problem wasn't
> related to the consumption of the response body. It was a connection leak in
> the testing framework.
>
> This leak wasn't problematic with version 3.6 of the engine, as the
> connections were leaked, but closed, so they didn't consume any resource
> other than memory. But with version 4 of the engine the connections are
> leaked, but they stay open, because the engine doesn't send the "Connection:
> close" response header for failed requests. This means that the leaked
> connections consume file descriptors and sockets, thus generating a real
> problem.

AFAIU, a 3.x engine does send a "Connection: close" response header for failed requests, which makes this issue a regression in the behaviour we had before. I'm marking this issue as a Regression, based on this info.

This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.

Looking at this deeper I see that the "Connection: close" response header is added by the Apache web server, not by the application server, and only when running in EL6. The difference between EL6 and the other distributions is that the version of Apache used there is 2.2 instead of 2.4. The EL6 packaging of that version of Apache includes the following configuration:
KeepAlive Off
This completely disables the use of persistent connections, so that the "Connection: close" response header is sent for all responses, not only failed ones. Newer versions of Apache (2.4 and newer) don't include this directive, so persistent connections are enabled.

We could explicitly disable persistent connections by adding "KeepAlive Off" as part of the changes that engine-setup makes to the system, but this would affect all the applications deployed to the web server. We can also disable it for specific locations, for example only for the API, with something like this inside /etc/httpd/conf.d/z-ovirt-engine-proxy.conf:
SetEnvIf Request_URI "^/(ovirt-engine/)?api(/.*)?$" nokeepalive
But doing this would actually mean a change in behavior for users that are already using EL7.

As persistent connections improve performance and are a good thing, I'm in favor of not changing this configuration, and writing a release note explaining that this has been changed, and how to restore the previous behavior for those users that may find an issue.

As there will be no change to the source I'm moving to ON_QA.

should update in the release note |
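The advice for client programs, not depending on the "Connection: close" header, can be sketched as follows. This is a minimal illustration under stated assumptions, not the fix applied to the testing framework: it uses Python 3's "http.client", and a hypothetical local test server (PersistentHandler) that answers with a failed request on a persistent connection. The client closes the connection itself with contextlib.closing, so no socket stays open even though the server never asks for the connection to be closed.

```python
import http.client
import threading
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer


class PersistentHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for a server with persistent connections
    # enabled: it never sends "Connection: close", even on failures.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"not found"
        self.send_response(404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def failed_request_with_explicit_close():
    server = HTTPServer(("127.0.0.1", 0), PersistentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        # closing() calls connection.close() on exit, even if the request raises.
        with closing(connection):
            connection.request("GET", "/missing")
            response = connection.getresponse()
            response.read()
            header = response.getheader("Connection")
            open_before_close = connection.sock is not None
        return response.status, header, open_before_close, connection.sock is None
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(failed_request_with_explicit_close())
```

The returned tuple shows the failed status, the absence of a "Connection" header, that the socket was still open after the response was read, and that it was released once the "with" block ended.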