Bug 1732834
| Summary: | Amphora RST instead of FIN connection with server side | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Priscila <pveiga> |
| Component: | openstack-octavia | Assignee: | Michael Johnson <michjohn> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Bruna Bonguardo <bbonguar> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 13.0 (Queens) | CC: | amuller, astafeye, broose, cgoncalves, ihrachys, lpeer, majopela, marjones, michjohn, njohnston, scohen |
| Target Milestone: | --- | Keywords: | Triaged, ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-03-25 15:36:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1709925, 1759254 | | |
| Bug Blocks: | | | |
Description
Priscila
2019-07-24 13:22:04 UTC
Thank you for providing the pcap of the traffic you are concerned about. I can see from the content of the flows that this was a benchmarking run, with 563 transfers in this 76-second capture. I analyzed the HTTP GET /1024k.html flow in this pcap. It starts with packet 140 and ends with a RST/ACK in packet 1784. The flow has jumbo frames enabled, and the amphora was communicating with the web server using 8960-byte segments (a 9000-byte jumbo MTU minus 40 bytes of IP/TCP header overhead). The 1,048,576-byte HTTP payload on a network with 8960-byte segments took 118 TCP segments to transfer, and the total transfer time for this payload was 0.204659 seconds. I was unable to find any packets with IP fragmentation in the pcap.

The TCP window size stayed fairly consistent through the beginning of the transfer (approximately the first 0.12 seconds) but did shift down towards the end of it, and I also see a delayed ACK in roughly that time frame. This flow did not experience a window-full event, though others in the capture did, especially the flows later in the capture. In analyzing this flow, I do not see anything wrong with how the amphora handled the request.

The RST you see at the end of the flows is expected behavior and does not impact the HTTP payload transfer time. The initial HTTP transfer finished at packet 1233 with the final ACK for the transfer. The amphora then held the connection to the back-end server open for a short period to see if another request could be serviced over the same connection. This is a form of back-end keep-alive: it reduces the latency between flows and the load on the back-end servers. The benchmarking tool being used does not send follow-on requests, so the back-end connection is eventually reset, with the RST flag, to close the TCP session. The tool is likely not using HTTP keep-alive or reusing the client-to-amphora TCP connections.
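The abortive close described above is easy to reproduce with a plain socket. Enabling SO_LINGER with a zero timeout makes close() send a RST and discard any queued data instead of performing the normal FIN/ACK exchange; this is a common way for load balancers to tear down idle back-end keep-alive connections. A minimal stdlib sketch (the helper name is ours, not part of the amphora code):

```python
import socket
import struct

def abortive_close(sock: socket.socket) -> None:
    """Close a TCP socket with a RST instead of the normal FIN/ACK.

    Setting SO_LINGER with l_onoff=1 and l_linger=0 tells the kernel
    to discard unsent data and reset the connection on close(), which
    is why a capture of such a session ends in RST/ACK rather than FIN.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))  # l_onoff=1, l_linger=0
    sock.close()
```

On the peer, the next read on the reset connection fails with ECONNRESET rather than seeing a clean end-of-stream.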
The delayed ACKs and TCP window-full events are likely being caused by the client connecting to the amphora being unable to handle the data it is receiving in a timely manner. The amphora will have to slow the rate of data from the server if the client cannot keep up. This is common with clients that do not have tuned kernel settings and are using benchmarking tools such as ApacheBench. To confirm this, you can look at a pcap from the client-to-amphora side that aligns with the pcap from the amphora to the back-end server. You should see some indication that the client was not responding in a timely manner to the data packets from the amphora.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
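One way to make that client-side check concrete without a GUI is to pull the advertised receive window out of each packet in the client-side capture: a window that keeps shrinking toward zero is the signature of a receiver that cannot drain data as fast as it arrives. A minimal stdlib sketch, assuming a classic little-endian pcap (not pcapng) with an Ethernet link layer and IPv4 traffic; the function name is ours:

```python
import struct

def tcp_windows(pcap: bytes):
    """Yield the advertised TCP receive window of every TCP/IPv4
    packet in a classic little-endian pcap with Ethernet framing."""
    magic, = struct.unpack_from("<I", pcap, 0)
    if magic != 0xA1B2C3D4:
        raise ValueError("expected a classic little-endian pcap")
    off = 24                                 # skip the pcap global header
    while off + 16 <= len(pcap):
        caplen, = struct.unpack_from("<I", pcap, off + 8)
        frame = pcap[off + 16: off + 16 + caplen]
        off += 16 + caplen
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue                         # not IPv4 over Ethernet
        ihl = (frame[14] & 0x0F) * 4         # IP header length in bytes
        if frame[23] != 6:                   # IP protocol 6 = TCP
            continue
        tcp = frame[14 + ihl:]
        if len(tcp) >= 16:
            yield struct.unpack_from(">H", tcp, 14)[0]  # window field
```

Running this over the client-side capture (e.g. `list(tcp_windows(open("client.pcap", "rb").read()))`, filename assumed) and plotting or eyeballing the sequence shows whether the client's window collapsed during the transfer; note that it reads the raw 16-bit field and does not apply any window-scale factor negotiated in the handshake.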