Hi, I've got my own IMAP package which I think is based on yours. I'm using your imap-4.4-vfs.patch, imap-4.5-linux.patch, and imap-4.5-redhat.patch patches with imap-4.6 and a few patches of my own. I haven't touched the locking code, and I'm using the same few basic patches that your package used, so I expect that our packages have the same locking problem.

The problem is that imap attempts to create a dotfile in /var/spool/mail, and when that fails it sleeps for a second. Burning that second is a real pain when using a webmail client that may make two IMAP connections per mailbox listing.

Here is my system trace showing what was going on:

stat("/var/spool/mail/bacon", {st_mode=033252, st_size=0, ...}) = 0
lstat("/tmp/.305.29838", 0x7fffe3e4) = -1 ENOENT (No such file or directory)
open("/tmp/.305.29838", O_RDWR|O_CREAT|O_EXCL, 0666) = 4
flock(4, LOCK_EX|LOCK_NB) = 0
lstat("/tmp/.305.29838", {st_mode=033416, st_size=0, ...}) = 0
fstat(4, {st_mode=033416, st_size=0, ...}) = 0
chmod("/tmp/.305.29838", 0666) = 0
alarm(0) = 0
chmod("/tmp/.305.29838", 0666) = 0
getpid() = 25287
write(4, "25287\0", 6) = 6
ftruncate(4, 5) = 0
fsync(4) = 0
access("/var/spool/mail/bacon", W_OK) = 0
open("/var/spool/mail/bacon", O_RDWR) = 5
flock(5, LOCK_SH) = 0
lstat("/var/spool/mail/bacon.lock", 0x7fffd71c) = -1 ENOENT (No such file or directory)
time(NULL) = 943143797
open("/var/spool/mail/bacon.lock", O_WRONLY|O_CREAT|O_EXCL, 0666) = -1 EACCES (Permission denied) <0.000069>
stat("/var/spool/mail/bacon.lock", 0x7fffd740) = -1 ENOENT (No such file or directory)
stat("/etc/mlock", 0x7fffd740) = -1 ENOENT (No such file or directory)
SYS_175(0, 0x7fffd708, 0x7fffd688, 0x8, 0) = 0
SYS_174(0x11, 0, 0x7fffd48c, 0x8, 0x11) = 0
SYS_175(0x2, 0x7fffd688, 0, 0x8, 0x2) = 0
nanosleep(0x7fffd5f4, 0x7fffd5f4, 0x2abf11b4, 0x7fffd5f4, 0x7fffd708) = 0

This sleep is coming from the src/osdep/unix/env_unix.c:dotlock_lock function.
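For reference, the key point in the trace is that the lock file creation fails with EACCES (no permission to create files in the directory), not EEXIST (someone else holds the lock). A minimal sketch of telling those two cases apart might look like the following. The function and enum names are hypothetical and are not taken from the c-client source; the sketch only illustrates the distinction.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical result codes; these names are not from c-client. */
enum dotlock_result {
    DOTLOCK_ACQUIRED,    /* lock file created; we hold the lock */
    DOTLOCK_BUSY,        /* lock file already exists; waiting may help */
    DOTLOCK_UNAVAILABLE  /* cannot create lock files here; waiting is pointless */
};

/* Try once to create "<mailbox>.lock" atomically with O_CREAT|O_EXCL. */
enum dotlock_result try_dotlock(const char *mailbox)
{
    char lockname[4096];
    int n, fd;

    assert(mailbox != NULL);
    n = snprintf(lockname, sizeof lockname, "%s.lock", mailbox);
    if (n < 0 || n >= (int)sizeof lockname)
        return DOTLOCK_UNAVAILABLE;     /* name does not fit in the buffer */

    fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0666);
    if (fd >= 0) {
        close(fd);
        return DOTLOCK_ACQUIRED;
    }
    /* Only EEXIST means another process holds the lock. EACCES, ENOENT
       and similar errors will not go away by sleeping and retrying. */
    return (errno == EEXIST) ? DOTLOCK_BUSY : DOTLOCK_UNAVAILABLE;
}
```

With a classification like this, a caller would sleep and retry only on DOTLOCK_BUSY, and would fall through immediately on DOTLOCK_UNAVAILABLE, which is the case shown in the trace above.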
Other than this sleep being a pain, I don't think this agrees with what was said in bug id 3914 about your locking policy for mail delivery programs.

> --- Additional Comments From pbrown 07/06/99 11:35 ---
> Jeff, this shouldn't matter because none of our mailers use lockfiles,
> but instead the system-call locking functions, correct?
>
> --- Additional Comments From gafton 07/29/99 03:27 ---
> that is correct. all the mail delivery programs on a RH system are
> synced up to fcntl. this leaves out the mail delivery over nfs, but if
> you are delivering mail over nfs you are looking for trouble anyway
> (since even lockfiles will not protect you from that - they do not
> guarantee the atomicity of a lock in any way)

So, I don't think Red Hat has properly addressed the issue of how locks are dealt with in the imap package. I don't know enough to tell you how you should do it, but I think I can see a problem at least. BTW, you might want to read docs/locking.txt in the IMAP-4.5 tarball. The dotlocking stuff is really weird.

Anyway, now that I've said I don't know the solution, let me point you towards what I think the solution is. :-) I ran into an interesting patch in the calderasystems IMAP RPM version 4.5-2 maintained by edo (Ed Orcutt). He just makes it so that the dotlock_lock function is never called, and the mail file is flocked directly instead.

Before I ran into this patch, I actually built my own patch to set up IMAP so that it runs with the mail group as a saved gid; when it wants to dotlock a file in /var/spool/mail, it changes group to mail. My patch has the drawback that it will not work when a user directly runs /usr/sbin/imap for a pre-authenticated connection (imap over ssh, for example). It could be improved to work when it is not run as root, but I don't feel like doing that.

I'll send you both patches. You guys should know which kind of patch you want, depending on what your locking standard is.
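To illustrate the switch-to-mail-group idea, here is a rough sketch of creating a lock file while temporarily running with a different effective group. This is not the attached patch; the helper name is hypothetical, and it assumes the process is already permitted to assume the target group (for example because that gid is its saved set-group-ID).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create lockname exclusively while the effective
 * gid is lock_gid, then switch back to the previous effective gid.
 * Returns an open descriptor, or -1 with errno set.
 */
int create_lock_as_group(const char *lockname, gid_t lock_gid)
{
    gid_t old_gid = getegid();
    int fd, saved_errno;

    assert(lockname != NULL);
    if (setegid(lock_gid) != 0)
        return -1;                  /* not allowed to assume that group */

    fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0666);
    saved_errno = errno;

    if (setegid(old_gid) != 0) {
        /* Could not drop the group again: undo the lock and fail. */
        saved_errno = errno;
        if (fd >= 0) {
            close(fd);
            unlink(lockname);
            fd = -1;
        }
    }
    errno = saved_errno;
    return fd;
}
```

The design point is that the extra group privilege is held only around the single open() call, so the rest of the server keeps running with the user's own group.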
I'm not sure if your procmail package does dotlocking, but I have a vague memory that it does, because one time I messed up the /var/spool/mail permissions and got all kinds of locking errors from procmail in my qmail log file. You might want to check that.
Created attachment 11: the caldera flock instead of dotlock patch
Created attachment 12: my switch-to-mail-group-when-needed patch
My strace on procmail shows that it is clearly using dot locking (procmail-3.13.1-2 package from Red Hat 6.0). So, I think that I'm going to use my imap-4.6-mailgroup.patch patch for now.
I just realized that in the system trace, the line:

nanosleep(0x7fffd5f4, 0x7fffd5f4, 0x2abf11b4, 0x7fffd5f4, 0x7fffd708) = 0

does not by itself show that the sleep was actually a second. I know that the sleep was a second because:

(a) I ran strace with the -T option, which shows the time spent in each system call, but I edited this information out to try to make the strace fit in the width of the bug report (lots of word wrapping stinks).
(b) When I looked at the source code, I found a "sleep(1)" right where the sleep was coming from.
I think it is correct that imap try to do *both* flock locking and dotfile locking; there's no harm in doing both, and it's an extra layer of protection. What is *not* correct is the "sleep(1)" which causes significant delays when retrieving a small mailbox. I fixed that with this patch:

--- imap-4.7/src/osdep/unix/env_unix.c.orig Tue Jan 25 10:55:32 2000
+++ imap-4.7/src/osdep/unix//env_unix.c Tue Jan 25 10:55:49 2000
@@ -858,7 +858,7 @@
         break;
       }
       /* if failed to make lock file and retry OK */
-      if ((ld < 0) && base->lock) {
+      if ((ld < 0) && base->lock && base->lock[0]) {
         if (!(i%15)) { /* time to notify? */
           sprintf (tmp,"Mailbox %.80s is locked, will override in %d seconds...",
                    file,i);
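The effect of the added base->lock[0] test can be sketched in isolation. The structure below is a hypothetical stand-in, and it rests on an assumption not stated in this report: that lock is a character array holding the lock file name, and that the code empties it once it has given up on dot-locking. Under that assumption, testing base->lock alone is always true (an array never decays to a null pointer), so only the first character can say whether a retry makes sense.

```c
#include <assert.h>
#include <stddef.h>

#define LOCKNAME_MAX 1024   /* hypothetical buffer size */

/* Hypothetical stand-in for the structure the patch calls "base". */
struct lock_state {
    char lock[LOCKNAME_MAX];    /* lock file name; "" = dot-locking abandoned */
};

/* Decide whether the caller should sleep and retry creating the lock file. */
int should_sleep_and_retry(int ld, const struct lock_state *base)
{
    assert(base != NULL);
    /* ld < 0        : no lock file descriptor was obtained.
       base->lock[0] : there is still a lock file name worth retrying. */
    return (ld < 0) && (base->lock[0] != '\0');
}
```

With this check, a failed attempt whose name has been cleared falls straight through instead of paying for a sleep that cannot change the outcome.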
While I agree that for high-load IMAP servers waiting one second for a dotlock is less than ideal, in practice eliminating the delay will make the dotfile locking useless on such a server. You have a choice of either allowing imap to create the dotlock entries or disabling dotlocking entirely. Neither of these options is sane enough for the majority of users, so we'll have to go with whatever suits most of them.
Please look at the patch I submitted. I am not proposing that you remove dotlocking. I'm proposing that you fix a *bug in the code* which is causing the 1-second delay when the server already knows that the dotlocking has failed and it is going to proceed without it. There is no reason for the delay at that point.
I think that jik.ma.us is right. The behavior I was seeing was a sleep(1) _after_ the dotlocking attempt had failed for lack of permissions to create a file in the /var/spool/mail directory. This sleep(1) was not part of the internal dotlocking mechanics, as the dotlock attempt had already failed.

I don't think Red Hat really has it straight what their standard for locking the /var/spool/mail directory is. Here you are saying "let's not screw up the dotlocking", but in bug id 3914 you (gafton) and pbrown implied that flock/fcntl was the appropriate locking standard for a Red Hat /var/spool/mail. Right now it looks like you have _no_ locking happening between procmail delivery and imapd pickup. Imapd can't dotlock and procmail is dotlocking. No locking is a real problem. Please look further into this.

From the 02/17/00 19:41 reply, I don't think you fully understood the bug report. You need to be clear on these issues: (a) what is your locking standard, and (b) are you sure that every /var/spool/mail program uses this same standard?
+Right now it looks like you have _no_ locking happening between procmail
+delivery and imapd pickup. Imapd can't dotlock and procmail is dotlocking. No
+locking is a real problem.

That's not correct. Procmail is doing both fcntl and dot-locking, so there should be no problem. Here's output from "strace procmail -d jik":

execve("/usr/bin/procmail", ["procmail", "-d", "jik"], [/* 39 vars */]) = 0
...
link("/var/spool/mail/_rCC.yOYr4.jik.kamens.b", "/var/spool/mail/jik.lock") = 0
unlink("/var/spool/mail/_rCC.yOYr4.jik.kamens.b") = 0
...
open("/var/spool/mail/jik", O_WRONLY|O_APPEND|O_CREAT, 0667) = 4
...
fcntl(4, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=565, len=0}) = 0
...
fcntl(4, F_SETLK, {type=F_UNLCK, whence=SEEK_SET, start=565, len=0}) = 0
close(4) = 0
...
unlink("/var/spool/mail/jik.lock") = 0
...

This is with procmail-3.14-2. The unnecessary sleep(1) is still a bug which should be fixed, but I don't think the locking itself is broken.
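The sequence in that trace (link() to the .lock name, fcntl write lock, append, unlock, remove the .lock) can be sketched roughly as follows. This is not procmail's code; the helper name and temporary file naming are hypothetical, it makes a single attempt rather than retrying a busy dot-lock, and for simplicity it locks the whole file rather than starting at the current end offset as the trace does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append msg to mailbox while holding both a
   dot-lock and an fcntl lock. Returns 0 on success, -1 on failure. */
int append_with_both_locks(const char *mailbox, const char *msg)
{
    char tmp[4096], lock[4096];
    struct flock fl;
    size_t len;
    int n, fd, rc = -1;

    assert(mailbox != NULL && msg != NULL);
    n = snprintf(tmp, sizeof tmp, "%s.%ld.tmp", mailbox, (long)getpid());
    if (n < 0 || n >= (int)sizeof tmp)
        return -1;
    n = snprintf(lock, sizeof lock, "%s.lock", mailbox);
    if (n < 0 || n >= (int)sizeof lock)
        return -1;

    /* 1. Dot-lock: create a unique file, then link() it to the lock
          name. link() fails if the lock name already exists. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    if (link(tmp, lock) != 0) {
        unlink(tmp);
        return -1;                  /* someone else holds the dot-lock */
    }
    unlink(tmp);

    /* 2. Kernel lock: open the mailbox and take a blocking write lock. */
    fd = open(mailbox, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd >= 0) {
        memset(&fl, 0, sizeof fl);
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;               /* 0 means "through end of file" */
        if (fcntl(fd, F_SETLKW, &fl) == 0) {
            len = strlen(msg);
            if (write(fd, msg, len) == (ssize_t)len)
                rc = 0;
            fl.l_type = F_UNLCK;
            fcntl(fd, F_SETLK, &fl);
        }
        close(fd);
    }

    /* 3. Release the dot-lock last, mirroring the order in the trace. */
    unlink(lock);
    return rc;
}
```

Because the writer holds both kinds of lock, a reader that can only take the kernel lock (as imapd does when it cannot create the .lock file) is still excluded during the append, provided both sides use compatible kernel locking calls.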
Oh, okay.. my mistake on that. But I agree with you that the sleep still needs to be removed from imap, because there is really no use for it. If you run IMP, which makes three different (sequential) IMAP connections when you log in, it is a real drag.