Bug 76146
Summary: | xinetd 2.3.9 causes hanging CLOSE_WAIT connections | | |
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Corey Shields <cshields> |
Component: | xinetd | Assignee: | Phil Knirsch <pknirsch> |
Status: | CLOSED ERRATA | QA Contact: | Mike McLean <mikem> |
Severity: | high | Priority: | high |
Version: | 7.1 | Hardware: | i686 |
OS: | Linux | Doc Type: | Bug Fix |
Last Closed: | 2002-12-02 20:37:28 UTC | | |
CC: | ae, a.gormanly, chris.ricker, daniel, dball, djh, drfickle, franz.sirl-kernel, gbailey, herrold, hjl, icon, jhcaiced, jos, jwright, k.georgiou, mattdm, menthos, me, mgb, michael.redinger, milan.kerslager, mmartinez, pb, rk, rvokal, samuel, shishz, wtogami | | |
Description
Corey Shields
2002-10-17 15:43:11 UTC
I saw the same thing with xinetd-2.3.9-0.73. My FTP mirror server would stop working after a period of time, and xinetd had to be restarted to make it work again. I downgraded to the previous xinetd version and it is working properly again. Major problem!

Not a fix, but if you're using vsftpd on your mirror, you can use vsftpd 1.1.2 in standalone mode. Works swimmingly :)

Yes, it happened to me with RH7.2 and 7.3. I upgraded xinetd, and after a while users could not connect to the POP server. I telneted to port 110 and found that, in addition to the normal login message, it also showed the PID of xinetd and something else; the information looked much like what you see when issuing a 'ps ax|grep xinet'. I had to issue 'service xinetd stop' TWICE to stop the service before starting it again. It would then work for a couple of hours, but after that it broke again. What I did in both cases was install the xinetd RPM from the 8.0 distribution. It installed without any complaint and has been working fine so far (2 days).

We've seen the DoS situation worsen somewhat under 2.3.9 (RH7.2 and RH7.3). Also new kinds of confusion, such as the wu-ftpd banner being prefixed by <86> and a snippet of what looks like xinetd syslog text.

We're currently backing down to xinetd-2.3.8 and hopefully this will fix this new problem. If anyone has time to check out xinetd-2.3.8 I'd greatly appreciate it. Otherwise we'll back down to 2.3.7 and be done with it (as that obviously seems to work). As soon as we're sure the new packages work we'll reissue the errata. This came as unexpectedly to us as to anyone else. Sorry for the inconvenience.
Read ya, Phil

Are y'all planning on epoch bumping to roll back?

Yep, all the packages now contain an Epoch of 2, so rollback should be automatic.
Read ya, Phil

How do I duplicate it? I have no problem with it. But my kernel does have a TCP patch.

Created attachment 81228
A kernel TCP patch
I was told this patch would be/was in the current 2.4.20 pre kernel.

It happens on my FTP mirror only after a few hours, so try creating and breaking connections repeatedly. At some point it should stop responding.

How many times do I have to try? I tried 20 ftp connections from RedHat 7.3 to xinetd 2.3.9. It is fine.

Maybe it is related to the "instances" or "per_source" directive. Here is my /etc/xinetd.d/vsftpd file:

    service ftp
    {
        disable = no
        socket_type = stream
        wait = no
        user = root
        server = /usr/sbin/vsftpd
        nice = 10
        instances = 18
        per_source = 2
    }

Otherwise, I suspect that only 20 connections isn't enough to trigger this. My mirror regularly gets thousands of connections in several hours.

*** Bug 76610 has been marked as a duplicate of this bug. ***

I really don't know where to find xinetd-2.3.8. Raw Hide has xinetd-2.3.7-2.i386.rpm but it seems too old. As this problem affects my whole set of servers (even those with low load that only use the time service inside xinetd), this should be solved for everyone ASAP.

*** Bug 76127 has been marked as a duplicate of this bug. ***

I set highest priority as this is a simple way to cause a DoS. There should be an official URL to allow at least testing IMHO.

*** Bug 76808 has been marked as a duplicate of this bug. ***

You may try to use my RPM package (based on xinetd from RH-8.0). I haven't had problems since I downgraded, but use it at your own risk :-)
ftp://ftp.linux.cz/pub/linux/people/milan_kerslager/RedHat-7.3/other/xinetd-2.3.7-2.i386.rpm

Any new word on this? I'd love to know when a new xinetd could be expected. I've got some people here kinda chomping at the bit.

In the meantime, try the 2.3.7-2 package from Rawhide. It works fine for me on my Red Hat 7.3 server.

My reason for asking is I'd like to not be in a version bump war with Red Hat if/when they release updates. If the ones in Rawhide, rebuilt, won't upgrade over the current errata then I have to epoch-bump them.
Then Red Hat releases their pkgs and I may have bumped too far. I don't want to do that if I can avoid it. Will the epoch DEFINITELY be 2?

Can't do a rpm -Uvh --oldpackage?

No, not on 100s of systems. We auto patch/upgrade - so things need to happen in an order, ideally.

The following technique seems to definitively replicate this bug:

(1) Create a new xinetd service for http-alt. This service will use netcat to access the regular http server on port 80 (start httpd up if it isn't already). My config file looks something like:

    [root@test191 ]# cat /etc/xinetd.d/httpd
    service http-alt
    {
        socket_type = stream
        wait = no
        user = nobody
        server = /usr/bin/nc
        server_args = -w 5 -n 127.0.0.1 80
        log_on_success += DURATION USERID
        log_on_failure += USERID
        nice = 10
        disable = no
    }

(2) Use apachebench to wail on the http-alt port. (Actually it doesn't take too much wailing to completely fry xinetd.)

    [mike@test114 mike]$ ab -n 15 -c 5 http://test191:8008/foo
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.116 $> apache-2.0
    Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

    Benchmarking test191 (be patient)...apr_poll: The timeout specified has expired (20507)
    Total of 9 requests completed

In case you are not familiar with apachebench syntax, this was *fifteen* requests, with a concurrency of 5. Only *nine* came back before xinetd died, leaving several tcp sockets in CLOSE_WAIT (not to mention several zombie netcat processes). If I downgrade to 2.3.7 (or 2.3.3 for that matter), then this test passes cleanly.

I may have spoken too soon. I think the problems I encountered may be a different bug (which does not seem to manifest with the 7.1 errata package in question). OTOH, this certainly seems like a reasonable way to put lots of load on xinetd.

I have seen the same behaviour - you have to enable "daytime" port (TCP/13).
When you do "telnet my.machine.com 13" it hangs forever although it should disconnect very fast after returning the correct date and time.

Why has this not been addressed for weeks! The entire Red Hat 7.x series is broken on many servers if people install this updated xinetd package. This situation is inexcusably bad!

I haven't got this entirely figured out, since I absolutely cannot reproduce the problem on a test box running 2.4.18-10 on Valhalla. However, I did see the problem on a production box on a friend's network. I got limited debugging because I had to fix it, but what I saw was the same as mgb and jpabuyer reported: syslog text. Specifically, I believe the text was the START message normally recorded by svc_log_success. There haven't been any relevant changes to the code in libs/src/xlog/, so I suspect that something is causing the syslog code to write to stdout or stderr. This is also the reason that no one sees any log entries after the service stops working. AFAIK, the application doesn't know what fd syslog() writes to, so maybe this is the result of a buffer overflow? Would anything else cause the syslog() function to start writing to the wrong fd?

http://videl.ics.hawaii.edu/~warren/xinetd-fix-test/
I made a test package of the latest devel snapshot of xinetd and so far this bug seems to be fixed. Please help to test this package and report problems here and on the xinetd mailing list.

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2002-196.html

*** Bug 77074 has been marked as a duplicate of this bug. ***

*** Bug 77773 has been marked as a duplicate of this bug. ***

*** Bug 78762 has been marked as a duplicate of this bug. ***

*** Bug 76506 has been marked as a duplicate of this bug. ***
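
For anyone checking whether a host shows the symptom in this report's summary, the stuck sockets can be counted per local port. What follows is only a rough Python sketch, not something taken from the thread: it assumes a Linux host that exposes /proc/net/tcp (IPv4 only), the function name `close_wait_by_port` is hypothetical, and the `SAMPLE` rows are made-up illustration data rather than output from an affected machine. The state code `08` is the kernel's value for CLOSE_WAIT.

```python
"""Count CLOSE_WAIT sockets per local port from /proc/net/tcp-style text."""
from collections import Counter

CLOSE_WAIT = "08"  # Linux kernel TCP state code for CLOSE_WAIT


def close_wait_by_port(proc_net_tcp_text):
    """Return a Counter mapping local port -> number of CLOSE_WAIT sockets."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        local_address, state = fields[1], fields[3]
        if state == CLOSE_WAIT:
            # local_address looks like "0100007F:0015" (hex IP : hex port)
            port = int(local_address.rsplit(":", 1)[1], 16)
            counts[port] += 1
    return counts


# Made-up rows for illustration only: one listener on port 21, two
# CLOSE_WAIT sockets on port 21, one established socket on port 110.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0015 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1234 1
   1: 0100007F:0015 0100007F:8A3C 08 00000000:00000000 00:00000000 00000000     0        0 1235 1
   2: 0100007F:0015 0100007F:8A3D 08 00000000:00000000 00:00000000 00000000     0        0 1236 1
   3: 0100007F:006E 0100007F:8A3E 01 00000000:00000000 00:00000000 00000000     0        0 1237 1
"""

if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample on non-Linux hosts
    for port, n in sorted(close_wait_by_port(text).items()):
        print(f"port {port}: {n} socket(s) in CLOSE_WAIT")
```

A CLOSE_WAIT count that keeps growing on a port served through xinetd (21, 110 or 13 in the reports above) would be consistent with the hang described here, though on its own it does not prove this particular bug is the cause.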