Bug 201166
Summary: | osad leaves dangling network connections | ||
---|---|---|---|
Product: | [Retired] Red Hat Network | Reporter: | Jose Plans <jplans> |
Component: | RHN/Other | Assignee: | Pradeep Kilambi <pkilambi> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | wes hayutin <whayutin> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | rhn370 | CC: | cperry, hgarcia, pstyles, rhn-bugs |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | sat510 | Doc Type: | Bug Fix |
Last Closed: | 2008-04-03 00:18:26 UTC | Type: | --- |
Bug Blocks: | 248627 |
Description
Jose Plans
2006-08-03 10:49:29 UTC
*** Bug 203731 has been marked as a duplicate of this bug. ***

Was able to get the osad client to drop the zombie socket when we run out of servers to connect to, but it kills the service as well, leaves a dangling pid file, and can't reconnect since it's dead. Need some more time on this one.

Moving to sat510 triage due to time constraints.

This is what is going on:

When osad gets into this state, it calls jabber.Client.disconnected(self), which in turn calls xmlstream.Client.disconnect. The disconnect method tries to close the connection and then the socket, but only if the process is not alive. In our case the process is always alive, because we go into the sleep state. This leaves the older ports in the CLOSE_WAIT state.

client># while true; do echo; date; netstat -npt | grep -i 5222; sleep 150; done

Wed Oct 3 14:19:42 EDT 2007
tcp 0 0 10.10.76.162:33133 10.10.76.168:5222 ESTABLISHED 30378/python
tcp 0 0 10.10.76.162:33111 10.10.76.168:5222 TIME_WAIT -

Wed Oct 3 14:22:12 EDT 2007
tcp 0 0 10.10.76.162:33133 10.10.76.168:5222 ESTABLISHED 30378/python

Wed Oct 3 14:24:42 EDT 2007
tcp 0 0 10.10.76.162:33133 10.10.76.168:5222 ESTABLISHED 30378/python

Wed Oct 3 14:27:12 EDT 2007
tcp 0 0 10.10.76.162:33133 10.10.76.168:5222 ESTABLISHED 30378/python

Forgot to mention: the above run is after adding the fix. As we can see, there are no connections in the CLOSE_WAIT state.

Verified on build 47.

[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 0.0.0.0:5222 0.0.0.0:* LISTEN
tcp 0 0 10.10.76.189:5222 10.10.76.189:32789 ESTABLISHED
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 ESTABLISHED
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 ESTABLISHED
[root@rlx-3-18 ~]# /etc/init.d/rhn-satellite stop
Shutting down rhn-satellite...
Stopping rhn-search...
Stopped rhn-search.
Stopping satellite-httpd: audit(1200931575.079:14): avc: denied { unlink } for pid=2720 comm="httpd" name="jk-runtime-status.2720.lock" dev=dm-0 ino=6357181 scontext=user_u:system_r:httpd_t tcontext=user_u:object_r:httpd_log_t tclass=file
[ OK ]
waiting for processes to exit
waiting for processes to exit
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Shutting down osa-dispatcher: [ OK ]
Shutting down rhn-database: [ OK ]
Shutting down Jabber router: [ OK ]
Done.
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]#
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
tcp 0 0 10.10.76.189:32789 10.10.76.189:5222 TIME_WAIT
tcp 0 0 10.10.76.189:5222 10.10.76.182:42740 FIN_WAIT2
[root@rlx-3-18 ~]# netstat -an | grep 5222
[root@rlx-3-18 ~]# netstat -an | grep 5222
[root@rlx-3-18 ~]#

Looks good. Tested by using the netstat commands above while bringing down the rhn-satellite service; no CLOSE_WAIT states appear.

5.1 Sat GA, so Closed for Current Release.
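For readers following the disconnect analysis in the comments above, here is a minimal Python sketch of the general idea: close the socket unconditionally on disconnect, so that a long-lived process that only sleeps between reconnect attempts does not keep old connections in CLOSE_WAIT. This is not the actual osad patch. The class `SketchClient`, its methods, and the helper `demo()` are hypothetical names invented for illustration; only the standard library `socket` calls are real.

```python
import socket


class SketchClient:
    """Hypothetical stand-in for a jabber client; not osad's real class."""

    def __init__(self, host, port):
        self._host = host
        self._port = port
        self._sock = None

    def connect(self):
        self._sock = socket.create_connection((self._host, self._port))

    def is_connected(self):
        return self._sock is not None

    def disconnect(self):
        # Always release the socket, even though the process stays alive
        # and merely sleeps before its next connection attempt.
        sock, self._sock = self._sock, None
        if sock is None:
            return
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            # The peer may already be gone; closing is still required.
            pass
        finally:
            sock.close()


def demo():
    """Simulate the server going away first, then disconnect the client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = SketchClient("127.0.0.1", port)
    client.connect()
    peer, _ = listener.accept()
    peer.close()         # server side closes; client socket would sit in CLOSE_WAIT
    client.disconnect()  # client closes its end, so the socket is released
    listener.close()
    return client.is_connected()
```

Calling `demo()` exercises the case described in the comments: the remote end closes first, and the client must still close its own socket instead of skipping the close because the process is alive.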