Bug 1915080

Summary:         Large number of TCP connections with shiftstack OCP cluster in about 24 hours
Product:         OpenShift Container Platform
Component:       Networking (sub component: runtime-cfg)
Version:         4.6
Target Release:  4.7.0
Status:          CLOSED ERRATA
Severity:        urgent
Priority:        urgent
Keywords:        UpcomingSprint
Hardware:        Unspecified
OS:              Unspecified
Reporter:        Alex Krzos <akrzos>
Assignee:        Yossi Boaron <yboaron>
QA Contact:      Oleg Sher <osher>
CC:              adduarte, florin-alexandru.peter, llopezmo, m.andre, pprinett
Cloned as:       1926730, 1926732 (view as bug list)
Bug Blocks:      1926732
Type:            Bug
Last Closed:     2021-02-24 15:51:54 UTC
Description (Alex Krzos, 2021-01-11 22:00:18 UTC):
Comment 1 (Adolfo Duarte):

@Alex Krzos

Any chance we can get the port number the connections are established to? The full output of netstat -s may be useful as well, and also netstat -nat to get an idea of the TCP ports, or:

netstat -nat | grep ESTABLISHED > somefilebecauseislarge.txt

Comment 2:

Accepted as valid; to be investigated. Tentatively setting the severity to "low"; to be revisited if some impact is detected on clusters.

Comment 3 (Alex Krzos):

(In reply to Adolfo Duarte from comment #1)

From master-0:

# netstat -s
Ip:
    Forwarding: 1
    677095483 total packets received
    24115304 forwarded
    0 incoming packets discarded
    652962145 incoming packets delivered
    579072763 requests sent out
    8232 outgoing packets dropped
    423 dropped because of missing route
Icmp:
    42594 ICMP messages received
    49 input ICMP message failed
    ICMP input histogram:
        destination unreachable: 42593
        timeout in transit: 1
    52816 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 44891
        redirect: 7925
IcmpMsg:
        InType3: 42593
        InType11: 1
        OutType3: 44891
        OutType5: 7925
Tcp:
    5458764 active connection openings
    3803949 passive connection openings
    855534 failed connection attempts
    1449090 connection resets received
    48205 connections established
    1075856680 segments received
    1233135213 segments sent out
    138885 segments retransmitted
    0 bad segments received
    4837830 resets sent
Udp:
    125620098 packets received
    44564 packets to unknown port received
    0 packet receive errors
    1236699 packets sent
    0 receive buffer errors
    0 send buffer errors
UdpLite:
TcpExt:
    81 invalid SYN cookies received
    8 packets pruned from receive queue because of socket buffer overrun
    15 ICMP packets dropped because they were out-of-window
    1142247 TCP sockets finished time wait in fast timer
    19069 time wait sockets recycled by time stamp
    5852 packetes rejected in established connections because of timestamp
    14601950 delayed acks sent
    1841 delayed acks further delayed because of locked socket
    Quick ack mode was activated 90996 times
    74 times the listen queue of a socket overflowed
    74 SYNs to LISTEN sockets dropped
    208293095 packet headers predicted
    319649263 acknowledgments not containing data payload received
    194189954 predicted acknowledgments
    TCPSackRecovery: 7569
    Detected reordering 155401 times using SACK
    Detected reordering 715 times using time stamp
    3714 congestion windows fully recovered without slow start
    487 congestion windows partially recovered using Hoe heuristic
    TCPDSACKUndo: 2348
    23 congestion windows recovered without slow start after partial ack
    TCPLostRetransmit: 4266
    TCPSackFailures: 16
    12778 fast retransmits
    TCPTimeouts: 6883
    TCPLossProbes: 126375
    TCPLossProbeRecovery: 87
    TCPSackRecoveryFail: 523
    TCPBacklogCoalesce: 1337411
    TCPDSACKOldSent: 91196
    TCPDSACKOfoSent: 71
    TCPDSACKRecv: 104750
    TCPDSACKOfoRecv: 34
    1661653 connections reset due to unexpected data
    495095 connections reset due to early user close
    76 connections aborted due to timeout
    2 times unable to send RST due to no memory
    TCPDSACKIgnoredOld: 132
    TCPDSACKIgnoredNoUndo: 88417
    TCPSackShifted: 14290
    TCPSackMerged: 37705
    TCPSackShiftFallback: 200618
    TCPRcvCoalesce: 22390816
    TCPOFOQueue: 104674
    TCPOFOMerge: 71
    TCPChallengeACK: 4185
    TCPSpuriousRtxHostQueues: 1786
    TCPAutoCorking: 20637
    TCPFromZeroWindowAdv: 7
    TCPToZeroWindowAdv: 7
    TCPWantZeroWindowAdv: 159
    TCPSynRetrans: 1120
    TCPOrigDataSent: 482111603
    TCPHystartTrainDetect: 6580
    TCPHystartTrainCwnd: 132741
    TCPHystartDelayDetect: 2
    TCPHystartDelayCwnd: 184
    TCPACKSkippedPAWS: 1190
    TCPACKSkippedSeq: 1685
    TCPACKSkippedChallenge: 18
    TCPKeepAlive: 280264131
    TCPDelivered: 482402249
    TCPAckCompressed: 3626
IpExt:
    InNoRoutes: 1
    InMcastPkts: 7098323
    OutMcastPkts: 1216946
    InOctets: 352544381659
    OutOctets: 355411710695
    InMcastOctets: 1136073832
    OutMcastOctets: 185810790
    InNoECTPkts: 680308975

Comment 4 (Alex Krzos):

Created attachment 1747109 [details]
Established connections
Note: since yesterday there was a DNS outage in the lab, which "reset" all of the established connections.
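A minimal shell sketch (not taken from this bug) of the per-port breakdown requested in comment #1, assuming the usual net-tools netstat layout with the foreign address in column 5 and the state in column 6:

# Count established TCP connections by remote port (hedged sketch; adjust
# the column indexes if your netstat build formats its output differently):
netstat -nat | awk '$6 == "ESTABLISHED" { n = split($5, a, ":"); print a[n] }' | sort | uniq -c | sort -rn | head

Each output line is a count followed by a remote port, so the top entries show where the accumulating connections terminate.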
Comment 5 (Alex Krzos):

Created attachment 1747112 [details]
Updated screenshot showing TCP established connection growth.
Note how rebooting one master reduced the established connection counts on the other two masters proportionally and (as expected) reset the count on the rebooted node itself. Also note that the lab DNS outage reset the established connection counts as well.
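In the same hedged spirit, grouping established connections by peer host is one way to watch the redistribution described above; after a master reboot, the count for that peer should drop to zero and then rebuild:

# Group established TCP connections by remote host (same assumed netstat layout):
netstat -nat | awk '$6 == "ESTABLISHED" { sub(/:[0-9]+$/, "", $5); print $5 }' | sort | uniq -c | sort -rn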
Comment 6 (egarcia):

By the way, I was thinking about your DNS outage and I have a theory. We host the LB and DNS on the master nodes. Each LB allows for a maximum of 20,000 active TCP connections, so once the cluster hits 60,000, no new connections will be accepted by the load balancers. The health check, which runs every 2 seconds or so, would then be unable to query haproxy's healthz API, causing the VIP to move around the master nodes until one of the health checks goes through. This absolutely has implications at scale: as the number of services running on your cluster grows, it rapidly approaches the upper limit on connections allowed per node. That may mean more master nodes are needed to run larger clusters, for all platforms using this networking architecture.

(In reply to egarcia from comment #6)

Is there a hard limit of 20,000 connections for haproxy in this application?

Yes, per instance of haproxy. There is an instance on each master node.
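A hedged illustration of that per-instance limit follows; the config path and health-check port are placeholders for illustration, not values taken from this bug:

# Each master runs its own haproxy instance; a global "maxconn 20000" caps it
# at 20,000 concurrent connections (the rendered config path is an assumption
# and may vary by release):
grep -n maxconn /etc/haproxy/haproxy.cfg

# Keepalived-style probe of haproxy's healthz endpoint. If haproxy is saturated
# at maxconn, the probe cannot connect, the check fails, and the VIP fails over
# to another master:
HEALTHZ_PORT=50936   # placeholder value; take the real port from the rendered keepalived check script
curl -sf --max-time 2 "http://localhost:${HEALTHZ_PORT}/healthz" || echo 'health check failed; VIP would move'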
Verified in:

(.venv) 17:05:15 ocp-edge-auto-sher(master) > oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-02-02-052812   True        False         144m    Cluster version is 4.7.0-0.nightly-2021-02-02-052812

Managed, 3 masters, 2 workers, baremetal IPv6.

[core@master-0-0 ~]$ echo `uptime` > net.log
[core@master-0-0 ~]$ for i in {1..10}; do echo `date` >> net.log; netstat -s | grep "connections established" >> net.log; sleep 60; done
[core@master-0-0 ~]$ cat net.log
09:11:43 up 21:13, 1 user, load average: 1.27, 0.92, 0.90
Wed Feb 3 09:11:46 UTC 2021
1038 connections established
Wed Feb 3 09:12:46 UTC 2021
1040 connections established
Wed Feb 3 09:13:46 UTC 2021
1029 connections established
Wed Feb 3 09:14:46 UTC 2021
1034 connections established
Wed Feb 3 09:15:46 UTC 2021
1021 connections established
Wed Feb 3 09:16:46 UTC 2021
1037 connections established
Wed Feb 3 09:17:46 UTC 2021
1037 connections established
Wed Feb 3 09:18:46 UTC 2021
1032 connections established
Wed Feb 3 09:19:47 UTC 2021
1036 connections established
Wed Feb 3 09:20:47 UTC 2021
1035 connections established

The established connection count stays flat over the sampling window instead of growing.

*** Bug 1906194 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

A KCS article related to this problem has been written: https://access.redhat.com/solutions/5865521