IBM Netfinity 5000 and 5500 servers, all SMP machines. The 5000s are dual P3; the 5500 is a quad Xeon. Periodically TCP sockets get stuck, be they IMAP (:143) or LDAP (:389), in the CLOSE_WAIT or TIME_WAIT states. This happens on one of the 3 Web servers (the 5000s) at a time. They pile up, exhausting the resources on the file/mail server (the 5500) and making the site unreachable. We have roughly 250,000 email accounts and are expecting to reach over 1M email accounts in the very near future. Kernel version: 2.2.12-20smp. We've also tried .14 and 15pre10, even going so far as to modify the header file to allow a greater number of TCP connections. The email server we're using is Cyrus imapd-v1.6.22-2 with sasl-v1.5.15-2. We've also upped /proc/sys/fs/file-max to 65535, but this is more to accommodate the amount of traffic the servers experience. Please help, there are 10's of millions of $$ at stake, and there are people here considering scrapping everything Linux!!
Firstly, TIME_WAIT is not an error. It's a protocol requirement: TCP requires it to avoid the risk of data corruption with other sessions. I'm sure you'd prefer intact email. TIME_WAIT lasts 120 seconds. Thus with thousands of mail clients polling heavily you would expect to see a lot of TIME_WAIT sockets, especially if you have people using silly (e.g. 5 second) poll rates. How many connections/minute is your IMAP running at?
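To put rough numbers on the point above, here is a minimal sketch of the arithmetic. It assumes every poll opens and closes a fresh connection, that the TIME_WAIT entry lands on the host being counted (it sits on whichever side closes first), and it uses the 120 second figure quoted above. The function name and the client counts are hypothetical, chosen only for illustration.

```python
def expected_time_wait(clients: int, poll_interval_s: float,
                       time_wait_s: float = 120.0) -> int:
    """Estimate the steady-state number of sockets sitting in TIME_WAIT.

    This is just arrival rate multiplied by how long each entry lingers.
    Assumption: one new TCP connection per client per poll.
    """
    connections_per_second = clients / poll_interval_s
    return round(connections_per_second * time_wait_s)


if __name__ == "__main__":
    # Hypothetical load: 1000 clients at three different poll intervals.
    for interval in (5, 60, 600):
        estimate = expected_time_wait(1000, interval)
        print(f"poll every {interval:>3}s -> about {estimate} TIME_WAIT sockets")
```

Under these assumptions a 5 second poll rate keeps tens of thousands of sockets in TIME_WAIT with nothing actually being wrong, which is why the connections/minute figure matters.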
OK, the system info you sent shows nothing at all out of the ordinary in either configuration or setup. I see no obvious reasons for problems. I see two possible issues here: 1. Someone is, despite your claims otherwise, polling very fast and running you out of resources. 2. You have a lot of large mailboxes and Cyrus is not using maildir format. That can cause a huge amount of I/O and memory usage reformatting mailboxes after changes. Both are speculation. I'd need to look at netstat output to judge further. Can you do echo "1" >/proc/sys/net/ipv4/tcp_syncookies and, when the box gets loaded, tell me if 'dmesg' and the logs show any messages about sending SYN cookies? You appear to be running Cyrus imapd as a standalone daemon. Is this a correct assumption on my part?
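When collecting the netstat output asked for above, a per-port, per-state tally is usually easier to read than the raw listing. The following is a rough sketch that parses text in the column layout printed by `netstat -tan`. The sample listing and its addresses are made up for illustration, and `count_states` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
from collections import Counter

# Hypothetical sample in the layout of `netstat -tan`; all addresses are invented.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:143            10.0.0.1:40112          CLOSE_WAIT
tcp        0      0 10.0.0.5:143            10.0.0.1:40113          CLOSE_WAIT
tcp        0      0 10.0.0.5:389            10.0.0.2:51200          TIME_WAIT
tcp        0      0 10.0.0.5:143            10.0.0.3:33001          ESTABLISHED
"""


def count_states(netstat_text: str) -> Counter:
    """Count TCP sockets keyed by (local port, state)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip the banner, the column header, and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        counts[(local_port, fields[5])] += 1
    return counts


if __name__ == "__main__":
    for (port, state), n in sorted(count_states(SAMPLE).items()):
        print(f"port {port:>5}  {state:<12} {n}")
```

On a live box the input would come from the real netstat output rather than `SAMPLE`. A pile of CLOSE_WAIT entries on port 143 would mean the remote end has closed and the local process has not yet closed its side, which is a different situation from the TIME_WAIT case.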
No response since March: closing