IBM Netfinity 5000 and 5500 servers, all SMP machines. The 5000s are dual P3,
the 5500 is a quad Xeon.
Periodically TCP sockets get stuck, be they IMAP (:143) or LDAP (:389), in
the CLOSE_WAIT or TIME_WAIT states. This happens on one of the 3 Web
servers (the 5000s) at a time. They pile up, exhausting the resources on the
file/mail server (the 5500) and making the site unreachable. We have roughly
250,000 email accounts and are expecting to reach over 1M email accounts in
the very near future.
Kernel Ver: 2.2.12-20smp. We've also tried .14 and 15pre10, even going so far
as to modify the header file to allow a greater number of TCP connections.
The email server we're using is Cyrus imapd-v1.6.22-2 and sasl-v1.5.15-2.
We've also upped /proc/sys/fs/file-max to 65535, but this is more to
accommodate the amount of traffic the servers experience.
Please help, there are tens of millions of $$ at stake, and there are
people here considering scrapping everything Linux!!
Firstly, TIME_WAIT is not an error. It's a protocol requirement. TCP requires
this to avoid the risk of data corruption with other sessions. I'm sure you'd
prefer intact email.
TIME_WAIT lasts 120 seconds. Thus with thousands of mail clients polling
heavily you would expect to see a lot of TIME_WAIT sockets, especially if
you have people using silly (e.g. 5 second) poll rates.
How many connections/minute is your IMAP running at?
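
As a rough back-of-the-envelope sketch of the point above: if every poll opens
and closes one TCP connection, and each closed connection leaves a socket in
TIME_WAIT for a fixed period on whichever end closes first, the steady-state
number of such sockets is roughly the connection rate times that period. The
client count and poll intervals below are made-up illustrative figures, not
measurements from this site, and the function name is hypothetical. The 120
seconds is simply the figure quoted above.

```python
def expected_time_wait(clients, poll_interval_s, time_wait_s=120):
    """Estimate the steady-state number of TIME_WAIT sockets.

    Assumes each poll is one short-lived TCP connection and that each
    closed connection stays in TIME_WAIT for time_wait_s seconds.
    """
    # connections per second = clients / poll_interval_s;
    # multiplied by the TIME_WAIT duration gives sockets outstanding.
    return round(clients * time_wait_s / poll_interval_s)


# Hypothetical example: 2,000 clients polling every 5 seconds.
print(expected_time_wait(2000, 5))    # 48000
# The same clients polling every 5 minutes instead.
print(expected_time_wait(2000, 300))  # 800
```

Under these assumptions the count scales inversely with the poll interval,
which is why the connections/minute figure matters for judging whether the
number of TIME_WAIT sockets is abnormal.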
OK, the system info you sent shows nothing at all out of the
ordinary in either configuration or setup. I see no obvious
reasons for problems.
I see two possible issues here:
1. Someone is, despite your claims otherwise, polling very
fast and running you out of resources.
2. You have a lot of large mailboxes and Cyrus is not using
maildir format. That can cause a huge amount of I/O and
memory usage reformatting mailboxes after changes.
Both are speculation. I'd need to look at netstat output to
judge further. Can you do
echo "1" >/proc/sys/net/ipv4/tcp_syncookies
and when the box gets loaded tell me if 'dmesg' and the logs show
any messages about sending SYN cookies.
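
For summarising the netstat output mentioned above, a minimal sketch along
these lines may help. It assumes the usual Linux `netstat -tan` column layout
(Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function
name is hypothetical and the sample rows are invented for illustration; they
are not output from the machines in this report.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP sockets per state in `netstat -tan` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and any row that is not a TCP line with a state column.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


# Made-up sample rows, for illustration only.
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.1:143            10.0.0.2:1045           TIME_WAIT
tcp        0      0 10.0.0.1:143            10.0.0.3:1101           TIME_WAIT
tcp        1      0 10.0.0.1:389            10.0.0.2:1046           CLOSE_WAIT
"""

for state, n in sorted(count_tcp_states(sample).items()):
    print(state, n)
```

On a live box the text could instead come from
`subprocess.run(["netstat", "-tan"], capture_output=True, text=True).stdout`,
taken while the server is under load.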
You appear to be running Cyrus imapd as a standalone daemon. Is this
a correct assumption on my part?
No response since March: closing