Bug 76146

Summary:

xinetd 2.3.9 causes hanging CLOSE_WAIT connections

Product:

[Retired] Red Hat Linux

Reporter:

Corey Shields <cshields>

Component:

xinetd

Assignee:

Phil Knirsch <pknirsch>

Status:

CLOSED ERRATA

QA Contact:

Mike McLean <mikem>

Severity:

high

Priority:

high

Version:

7.1

CC:

ae, a.gormanly, chris.ricker, daniel, dball, djh, drfickle, franz.sirl-kernel, gbailey, herrold, hjl, icon, jhcaiced, jos, jwright, k.georgiou, mattdm, menthos, me, mgb, michael.redinger, milan.kerslager, mmartinez, pb, rk, rvokal, samuel, shishz, wtogami

Hardware:

i686

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2002-12-02 20:37:28 UTC

Attachments:

Description	Flags
A kernel TCP patch	none

Description Corey Shields 2002-10-17 15:43:11 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020918

Description of problem:
We upgraded to xinetd 2.3.9 after the DoS security notice, and shortly afterwards
we started to notice tens, then hundreds, of connections that remained in
CLOSE_WAIT and wouldn't let go.  Pretty soon our open-files limit was reached and
the system started complaining.  According to lsof, it seemed that any network
service accessed through xinetd, including xinetd itself, would leave stale
CLOSE_WAIT connections as it ran.  We downgraded to what we were running
previously (2.3.7) and the problems went away.
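
A rough way to watch the pile-up described above without lsof is to count sockets in CLOSE_WAIT from the text of /proc/net/tcp. This is only a sketch under stated assumptions: a Linux host, IPv4 sockets only, and a helper name (`count_close_wait`) made up for illustration. The hex state code 08 is the kernel's value for CLOSE_WAIT in that file.

```python
CLOSE_WAIT = "08"  # hex state code that /proc/net/tcp uses for CLOSE_WAIT


def count_close_wait(proc_net_tcp_text):
    """Count CLOSE_WAIT sockets in the text of /proc/net/tcp (IPv4 only).

    Hypothetical helper for illustration; not part of xinetd or lsof.
    """
    count = 0
    lines = proc_net_tcp_text.splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        # Column order: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == CLOSE_WAIT:
            count += 1
    return count
```

Calling it periodically on the contents of /proc/net/tcp and logging the result should show whether the number keeps growing while xinetd runs.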

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. install xinetd-2.3.9-0.71.i386.rpm
2. start xinetd.

Actual Results:  connections remaining in the CLOSE_WAIT state started piling up

Additional info:

Comment 1 Warren Togami 2002-10-18 04:27:39 UTC

I saw the same thing with xinetd-2.3.9-0.73.  My FTP mirror server would stop
working after a period of time, and xinetd had to be restarted to make it work
again.  I downgraded to the previous xinetd version and it is working properly
again.

Major problem!

Comment 2 Seth Vidal 2002-10-18 13:31:23 UTC

not a fix, but if you're using vsftpd on your mirror, you can use vsftpd 1.1.2
in standalone mode. Works swimmingly :)

Comment 3 Need Real Name 2002-10-18 13:41:40 UTC

Yes, it happened to me with RH7.2 and 7.3. I upgraded xinetd, and after a while
users could not connect to the POP server.

I telneted to port 110 and found that, in addition to the normal login message,
it also showed the PID of xinetd and something else... the information looked
much like what you see when issuing a 'ps ax|grep xinet'. I had to issue a
'service xinetd stop' TWICE to stop the service, and then start it. It would
then work for a couple of hours, but after that it broke again.

What I did in both cases was install the xinetd RPM from the 8.0 distribution.
It installed without any complaint and has been working fine so far (2 days).

Comment 4 Mike Bird 2002-10-18 16:33:43 UTC

We've seen DoS situation worsen somewhat under 2.3.9 (RH7.2 and RH7.3).  Also
new kinds of confusion, such as wu-ftpd banner being prefixed by <86> and a
snippet of what looks like xinetd syslog text.

Comment 5 Phil Knirsch 2002-10-18 21:31:54 UTC

We're currently backing down to xinetd-2.3.8 and hopefully this will fix this
new problem.

If anyone has time to check out xinetd-2.3.8, I'd greatly appreciate it.
Otherwise we'll back down to 2.3.7 and be done with it (as that obviously seems
to work).

As soon as we're sure the new packages work, we'll reissue the errata. This came
as unexpectedly to us as to anyone else. Sorry for the inconvenience.

Read ya, Phil

Comment 6 Seth Vidal 2002-10-19 01:00:19 UTC

are y'all planning on epoch bumping to rollback?

Comment 7 Phil Knirsch 2002-10-19 06:49:40 UTC

Yep, all the packages now contain an Epoch of 2, so rollback should be automatic.

Read ya, Phil
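
For readers unfamiliar with why an Epoch bump makes the rollback automatic: rpm compares the epoch before the version and release, so a package with a higher epoch is treated as newer even if its version number is lower. The following is a simplified sketch of that ordering, not rpm's actual algorithm (which also handles alphabetic segments). The function names are hypothetical, and the epoch of 1 for the broken package is an assumption made only for the example; the thread states only that the new packages carry an Epoch of 2.

```python
def parse_numeric(version):
    """Split a purely numeric version such as '2.3.9' or '0.71' into ints."""
    return [int(part) for part in version.replace("-", ".").split(".")]


def is_upgrade(installed, candidate):
    """Return True if candidate sorts above installed.

    Both arguments are (epoch, version, release) tuples. Simplified model:
    epoch is compared first, then version, then release.
    """
    inst_key = (installed[0], parse_numeric(installed[1]), parse_numeric(installed[2]))
    cand_key = (candidate[0], parse_numeric(candidate[1]), parse_numeric(candidate[2]))
    return cand_key > inst_key


# Assumed epoch 1 for the broken 2.3.9 package (illustrative only).
broken = (1, "2.3.9", "0.71")
print(is_upgrade(broken, (1, "2.3.7", "2")))  # same epoch: older version loses
print(is_upgrade(broken, (2, "2.3.7", "2")))  # higher epoch: counts as an upgrade
```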

Comment 8 hjl 2002-10-21 01:40:09 UTC

How do I duplicate it? I have no problem with it. But
my kernel does have a TCP patch.

Comment 9 hjl 2002-10-21 01:41:52 UTC

Created attachment 81228 [details]
A kernel TCP patch

Comment 10 hjl 2002-10-21 01:42:52 UTC

I was told this patch would be/was in the current 2.4.20 pre
kernel.

Comment 11 Warren Togami 2002-10-21 02:32:07 UTC

It happens on my FTP mirror only after a few hours, so try creating and breaking
connections repeatedly.  At some point it should stop responding.

Comment 12 hjl 2002-10-21 02:48:41 UTC

How many times do I have to try? I tried 20 ftp
connections from RedHat 7.3 to xinetd 2.3.9. It
is fine.

Comment 13 Warren Togami 2002-10-21 03:54:30 UTC

Maybe it is related to the "instances" or "per_source" directive.  Here is my
/etc/xinetd.d/vsftpd file:

service ftp
{
        disable = no
        socket_type             = stream
        wait                    = no
        user                    = root
        server                  = /usr/sbin/vsftpd
        nice                    = 10
        instances               = 18
        per_source              = 2
}

Otherwise, I suspect that only 20 connections isn't enough to trigger this.  My
mirror regularly gets thousands of connections in several hours.
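
The "creating and breaking connections repeatedly" suggestion could be scripted along these lines. This is a minimal sketch, assuming a plain TCP connect and immediate close is enough to exercise xinetd; the function name and defaults are invented for illustration, and it does not speak FTP or honour the instances/per_source limits above.

```python
import socket


def churn_connections(host, port, attempts, timeout=5.0):
    """Open and immediately close `attempts` TCP connections.

    Returns the number of attempts that failed to connect (refused,
    timed out, or otherwise errored).
    """
    failures = 0
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connect, then close right away
        except OSError:
            failures += 1
    return failures
```

Running it with a few thousand attempts against the FTP port, and checking whether the failure count starts climbing, would approximate the load a busy mirror sees over several hours.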

Comment 14 Milan Kerslager 2002-10-25 07:47:36 UTC

*** Bug 76610 has been marked as a duplicate of this bug. ***

Comment 15 Milan Kerslager 2002-10-25 08:27:47 UTC

I really don't know where to find xinetd-2.3.8. Raw Hide has
xinetd-2.3.7-2.i386.rpm, but it seems too old. As this problem affects my whole
set of servers (even those with low load that only use the time service inside
xinetd), this should be solved for everyone ASAP.

Comment 16 Milan Kerslager 2002-10-25 08:33:24 UTC

*** Bug 76127 has been marked as a duplicate of this bug. ***

Comment 17 Milan Kerslager 2002-10-25 08:36:04 UTC

I set the highest priority, as this is a simple way to cause a DoS.
There should be an official URL to allow at least testing, IMHO.

Comment 18 Milan Kerslager 2002-10-28 16:48:07 UTC

*** Bug 76808 has been marked as a duplicate of this bug. ***

Comment 19 Milan Kerslager 2002-10-28 17:15:58 UTC

You may try my RPM package (based on xinetd from RH-8.0). I haven't had any
problems since I downgraded, but use it at your own risk :-)

ftp://ftp.linux.cz/pub/linux/people/milan_kerslager/RedHat-7.3/other/xinetd-2.3.7-2.i386.rpm

Comment 20 Seth Vidal 2002-10-28 18:55:28 UTC

Any news on this? I'd love to know when a new xinetd can be expected.

I've got some people here kinda chomping at the bit.

Comment 21 Warren Togami 2002-10-28 19:21:40 UTC

In the meantime, try the 2.3.7-2 package from Rawhide.  It works fine for me on
my Red Hat 7.3 server.

Comment 22 Seth Vidal 2002-10-28 19:31:59 UTC

My reason for asking is that I'd like to avoid a version bump war with Red Hat
if/when they release updates.

If the ones in Rawhide, rebuilt, won't upgrade over the current errata, then I
have to epoch-bump them. Then Red Hat releases their pkgs and I may have bumped
too far.

I don't want to do that if I can avoid it.
Will the epoch DEFINITELY be 2?

Comment 23 Warren Togami 2002-10-28 19:51:20 UTC

Can't do a rpm -Uvh --oldpackage?

Comment 24 Seth Vidal 2002-10-28 20:16:48 UTC

No, Not on 100s of systems. We auto patch/upgrade - so things need to happen in
an order, ideally.

Comment 25 Mike McLean 2002-10-29 16:46:34 UTC

The following technique seems to definitively replicate this bug:

(1) Create a new xinetd service for http-alt.  This service will use netcat to
access the regular http server on port 80 (start httpd up if it isn't already).
My config file looks something like:

[root@test191 ]# cat /etc/xinetd.d/httpd
service http-alt
{
        socket_type             = stream
        wait                    = no
        user                    = nobody
        server                  = /usr/bin/nc
        server_args             = -w 5 -n 127.0.0.1 80
        log_on_success          += DURATION USERID
        log_on_failure          += USERID
        nice                    = 10
        disable                 = no
}

(2) Use apachebench to wail on the http-alt port.  (Actually it doesn't take too
much wailing to completely fry xinetd).

[mike@test114 mike]$ ab -n 15 -c 5 http://test191:8008/foo
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.116 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking test191 (be patient)...apr_poll: The timeout specified has expired
(20507)
Total of 9 requests completed

In case you are not familiar with apachebench syntax, this was *fifteen*
requests, with a concurrency of 5.  Only *nine* came back before xinetd died,
leaving several tcp sockets in CLOSE_WAIT (not to mention several zombie netcat
processes). 

If I downgrade to 2.3.7 (or 2.3.3 for that matter), then this test passes cleanly.

Comment 26 Mike McLean 2002-10-30 19:05:37 UTC

I may have spoken too soon.  I think the problems I encountered may be a
different bug (which does not seem to manifest with the 7.1 errata package in
question).

OTOH, this certainly seems like a reasonable way to put lots of load on xinetd.

Comment 27 Need Real Name 2002-11-09 16:55:40 UTC

I have seen the same behaviour - you have to enable the "daytime" port (TCP/13).
When you do "telnet my.machine.com 13" it hangs forever, although it should
disconnect very quickly after returning the correct date and time.
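
The daytime check can be automated so that a hang shows up as a timeout instead of a stuck telnet session. A minimal sketch, assuming the service sends its text and then closes the connection; the function name and the 5-second timeout are arbitrary choices for the example.

```python
import socket


def probe_daytime(host, port=13, timeout=5.0):
    """Read from a daytime-style service until the server closes.

    Returns the received text, or None if the server neither sends more
    data nor closes within `timeout` seconds (the hang described above).
    """
    chunks = []
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            while True:
                data = conn.recv(1024)
                if not data:  # empty read: the server closed the connection
                    break
                chunks.append(data)
    except socket.timeout:
        return None
    return b"".join(chunks).decode("ascii", errors="replace")
```

A healthy service should return a date string almost immediately; a None result would correspond to the connection that never disconnects.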

Comment 28 Warren Togami 2002-11-19 03:45:45 UTC

Why has this not been addressed for weeks?  The entire Red Hat 7.x series is
broken on many servers if people install this updated xinetd package.  This
situation is inexcusably bad!

Comment 29 Gordon Messmer 2002-11-19 07:35:45 UTC

I haven't got this entirely figured out, since I absolutely cannot reproduce
the problem on a test box running 2.4.18-10 on Valhalla.

However, I did see the problem on a production box on a friend's network.  I got
limited debugging because I had to fix it, but what I saw was the same as
mgb and jpabuyer reported: syslog text.

Specifically, I believe the text was the START message normally recorded by
svc_log_success.  There haven't been any relevant changes to the code in
libs/src/xlog/, so I suspect that something is causing the syslog code to write
to stdout or stderr.  This is also the reason that no one sees any log entries
after the service stops working.

AFAIK, the application doesn't know what fd syslog() writes to, so maybe this is
the result of a buffer overflow?  Would anything else cause the syslog()
function to start writing to the wrong fd?
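
One general mechanism that can make a library write to an unexpected descriptor is descriptor-number reuse: POSIX hands out the lowest free descriptor number, so if a descriptor is closed while some code still remembers its number, the next open() or accept() can receive that same number. This is offered only as an illustration of the mechanism, not as the confirmed cause of this bug. The sketch uses a pipe and /dev/null as stand-ins; the function name is hypothetical.

```python
import os


def demonstrate_fd_reuse():
    """Show that a closed descriptor number is handed out again.

    Returns (closed_fd, reopened_fd); in a single-threaded process the
    two numbers are equal because the lowest free number is reused.
    """
    read_end, write_end = os.pipe()
    os.close(read_end)                            # number is now free
    reopened = os.open(os.devnull, os.O_WRONLY)   # takes the lowest free number
    try:
        return read_end, reopened
    finally:
        os.close(reopened)
        os.close(write_end)


closed_fd, reopened_fd = demonstrate_fd_reuse()
print(closed_fd, reopened_fd)
```

Any code still writing to the old number after such a reuse would end up writing to whatever object now holds it, for example a client socket instead of a log socket.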

Comment 30 Warren Togami 2002-11-24 00:59:12 UTC

http://videl.ics.hawaii.edu/~warren/xinetd-fix-test/
I made a test package of the latest devel snapshot of xinetd and so far this bug
seems to be fixed.  Please help to test this package and report problems here
and the xinetd mailing list.

Comment 31 Bill Nottingham 2002-12-02 20:37:29 UTC

An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2002-196.html

Comment 32 Bill Nottingham 2002-12-03 20:23:56 UTC

*** Bug 77074 has been marked as a duplicate of this bug. ***

Comment 33 Bill Nottingham 2002-12-03 20:35:19 UTC

*** Bug 77773 has been marked as a duplicate of this bug. ***

Comment 34 Bill Nottingham 2002-12-03 20:51:45 UTC

*** Bug 78762 has been marked as a duplicate of this bug. ***

Comment 35 Bill Nottingham 2002-12-03 20:55:03 UTC

*** Bug 76506 has been marked as a duplicate of this bug. ***