Red Hat Bugzilla – Bug 465152
thunderbird-2 IMAP timeout+repeat does DoS on server
Last modified: 2010-10-23 00:52:57 EDT
Escalated to Bugzilla from IssueTracker
Description of problem:
Our mail service has seen major issues with thunderbird-2, related to the fact that it implements a non-RFC IMAP-level timeout and resubmits pending commands. In the cases we have seen, the commands are usually large (and slow) "move lots of messages to some other folder" operations (i.e. IMAP "copy" and later "delete"), which are not idempotent: repeating "copy" just creates duplicates in the target folder.
This is an upstream bug, tracked (with some logs from us) at https://bugzilla.mozilla.org/show_bug.cgi?id=409259
For now, we have reasonably few SLC5/RHEL5 users - so this crops up mostly for the occasional user who has installed TBird-2 by himself. However, we are considering a mass migration over the next few months, and this would certainly kill our mail service for good.
The proposed upstream workaround (set a longer timeout via per-user configurations) is not viable - we at least would have to do this at system/RPM level - please discuss whether this is something you'd be willing to integrate into the next thunderbird-2 update (i.e. IMAP timeout of 24h instead of 1min).
But we'd also like to know whether other RH customers have reported this, and whether other workarounds exist (client or server-side). We'd also like you to add some weight to the upstream bug, if possible.
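The duplication described above can be illustrated with a toy model. This is only a sketch, not real IMAP: the function imap_copy and the file-based "mailboxes" are hypothetical stand-ins, and the point is simply that running a non-idempotent copy twice doubles the target.

```sh
#!/bin/sh
# Toy model, not real IMAP: "copy" appends every source message to the target.
# imap_copy and the file-based mailboxes are hypothetical.

imap_copy() {
    # $1 = source mailbox file, $2 = target mailbox file
    cat "$1" >> "$2"
}

workdir=$(mktemp -d)
src="$workdir/archive"
dst="$workdir/INBOX"
printf 'msg1\nmsg2\nmsg3\n' > "$src"
: > "$dst"

imap_copy "$src" "$dst"   # first attempt: server is slow, client stops waiting
imap_copy "$src" "$dst"   # client reconnects and resubmits the same command

src_count=$(wc -l < "$src" | tr -d ' ')
dst_count=$(wc -l < "$dst" | tr -d ' ')
echo "source: $src_count messages, target: $dst_count messages"
rm -rf "$workdir"
```

Every further resubmission adds another full set of duplicates, so the size of the target folder grows with each timeout.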
Steps to Reproduce:
This event sent from IssueTracker by jbastian [Support Engineering Group]
I was able to reproduce this fairly easily.
1. Built an IMAP server (RHEL 4.7 and dovecot on a 2-CPU system with 5GB RAM)
2. Created a test account on the server
3. Created a large mailbox for the test account with approximately 50,000
emails or about 400MB. Most emails were one-liners generated by a script,
but there were 20 emails with 10MB attachments.
4. Put an artificial load on the IMAP server to keep the disk busy
while true; do
    nice -19 dd if=/dev/zero of=/tmp/bigfile bs=1M count=512
done
(This took some experimentation to keep it busy enough, but not too busy.)
5. Ran Wireshark on both the IMAP server and the Thunderbird client (another
system on the same LAN)
6. Using Thunderbird on another system (same LAN), told it to move all 50,000
emails from INBOX to 'archive' folder and back again.
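The generator script from step 3 is not included in this report. A minimal sketch of one way to produce one-liner test messages follows, assuming the test account's mailbox is in mbox format; the function name make_test_mbox, the addresses, and the fixed dates are all made up for illustration.

```sh
#!/bin/sh
# Hypothetical generator for one-liner test mail in mbox format.
# Assumes the server reads mbox; names and addresses are illustrative.

make_test_mbox() {
    # $1 = output mbox file, $2 = number of messages to write
    i=1
    : > "$1"
    while [ "$i" -le "$2" ]; do
        {
            printf 'From tester@example.com Thu Jan  1 00:00:00 1970\n'
            printf 'From: tester@example.com\n'
            printf 'To: testuser@example.com\n'
            printf 'Subject: test message %d\n' "$i"
            printf 'Date: Thu, 01 Jan 1970 00:00:00 +0000\n'
            printf '\n'
            printf 'One-line body of test message %d\n' "$i"
            printf '\n'
        } >> "$1"
        i=$((i + 1))
    done
}

tmp=$(mktemp -d)
make_test_mbox "$tmp/INBOX" 5          # use 50000 for a mailbox like the one in step 3
count=$(grep -c '^From ' "$tmp/INBOX") # each message starts with one "From " separator line
echo "wrote $count messages"
rm -rf "$tmp"
```

The large-attachment messages mentioned in step 3 would have to be added separately.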
Moving the email back to the INBOX took a few minutes, and in the middle of it I noticed the Thunderbird status bar changed from
Moving messages to INBOX...
to
Sending login information...
and then back to
Moving messages to INBOX...
Shortly after that I killed the while loop from step 4.
When it was done, the 50,000 messages had turned into 100,000 in the INBOX (duplicates of every email).
I looked at the imap.request packets in Wireshark on the server and saw:
No.    Time        Protocol   Info
4      0.031186    IMAP       Request: 16 uid copy 137735:190322 "INBOX"
18     62.118568   IMAP       Request: 2 authenticate plain
1505   84.008062   IMAP       Request: 6 uid copy 137735:190322 "INBOX"
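In the capture above, the re-authentication comes about 62 seconds after the first copy, which is consistent with the 1-minute timeout mentioned in the description, and the same copy is then sent again under a new tag. A rough sketch for spotting such resubmissions in a text export of the packet list is shown below; find_resubmits is a hypothetical helper and assumes whitespace-separated columns exactly as listed here (No., Time, Protocol, "Request:", tag, command).

```sh
#!/bin/sh
# Hypothetical helper: report IMAP commands that appear more than once,
# ignoring the per-connection tag (field 5).

find_resubmits() {
    awk '$3 == "IMAP" && $4 == "Request:" {
        cmd = ""
        for (i = 6; i <= NF; i++) cmd = cmd (i > 6 ? " " : "") $i
        if (cmd in first) {
            printf "resubmitted after %.0f s: %s\n", $2 - first[cmd], cmd
        } else {
            first[cmd] = $2
        }
    }'
}

report=$(find_resubmits <<'EOF'
4 0.031186 IMAP Request: 16 uid copy 137735:190322 "INBOX"
18 62.118568 IMAP Request: 2 authenticate plain
1505 84.008062 IMAP Request: 6 uid copy 137735:190322 "INBOX"
EOF
)
echo "$report"
# prints: resubmitted after 84 s: uid copy 137735:190322 "INBOX"
```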
(In reply to comment #3)
> 6. Using Thunderbird on another system (same LAN), told it to move all 50,000
> emails from INBOX to 'archive' folder and back again.
I forgot to mention, this was thunderbird-220.127.116.11-1.el5.i386 on RHEL 5.2.
Created attachment 319260
logfile created by the below-described method
Hmm, I have run
export NSPR_LOG_MODULES NSPR_LOG_FILE
and tried to connect to the RH Zimbra IMAP server. It never got a connection (yum was taking up too much of the network bandwidth downloading 854MB of upgrades ;-)) and timed out. So I have this log, but I am not sure what I am looking at. Is this a reproduction of the bug?
Comment 6: probably not, judging by your log. You need the server to be busy with something (not just a slow network); Thunderbird will then drop the connection and resubmit the same command. If that takes equally long, it will drop the connection again, resubmit, and so on. This would be quite clear from the logs (see e.g. the log attached upstream, https://bugzilla.mozilla.org/attachment.cgi?id=340339)
We've added a small patch there to just bump up the timeout.
Note: to reproduce this more easily, you could just set the client-side timeout to something smaller than 60 seconds.
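A sketch of how the client-side timeout might be lowered for a test profile follows. The preference name mailnews.tcptimeout (in seconds) is an assumption here and is not named anywhere in this report, so check it against the upstream bug before relying on it; set_imap_timeout and the temporary profile directory are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: append a timeout override to user.js in a profile directory.
# "mailnews.tcptimeout" is an assumed preference name, not taken from this report.

set_imap_timeout() {
    # $1 = Thunderbird profile directory, $2 = timeout in seconds
    printf 'user_pref("mailnews.tcptimeout", %d);\n' "$2" >> "$1/user.js"
}

profile=$(mktemp -d)            # stand-in for a real profile directory
set_imap_timeout "$profile" 10
written=$(cat "$profile/user.js")
echo "$written"
rm -rf "$profile"
```

With a real profile, Thunderbird would need to be restarted to pick up the change, and a larger value would act as the per-user workaround discussed in the description.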
Let's call it X bug for a moment -- wonder what it gives us.
(In reply to comment #12)
> Let's call it X bug for a moment -- wonder what it gives us.
OK, not accepted, coming back to the original component.
Development Management has reviewed and declined this request. You may appeal
this decision by reopening this request.