Description of problem:
Installing packages appears to hang; however, strace reveals that rpm is not actually hanging, but running extremely slowly. After a _long_ time it does actually succeed. The slow calls are all to fsync, on a variety of different file descriptors.

Version-Release number of selected component (if applicable):
4.2-1

How reproducible:
Always

Steps to Reproduce:
1. Install an rpm package

Actual results:
Command takes several minutes.

Expected results:
Command takes a couple of seconds.

Additional info:
Note that this is NOT a lock issue. The effect is the same whether or not I remove the /var/lib/rpm/__db.00? files. I suspect that many of the people who are reporting rpm hangs didn't have the patience to wait long enough for rpm to complete normally and kill -9'd it, when actually it was still running, very slowly.

Executing "while true ; do sync ; done" in another window in parallel with rpm makes rpm run at normal speed. The effect of calling "sync" is immediately noticeable when stracing rpm. There are many calls to fsync on different file descriptors, and each one seems to hang for a second. When sync is called from another process, the calls to fsync do not hang.

Installing one package without calling sync took 4 minutes, and with sync 2 seconds, so the effect is quite dramatic. These times are reproducible and are not a result of caching.

It would seem that fsync is being used to ensure database integrity, but it takes some time before the synchronization actually happens, unless it is explicitly requested using the sync command.

For information, I am using kernel 2.4.20-20.9.
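The per-call fsync delay described above can also be measured outside rpm. The following is a minimal sketch, assuming Python 3 on a Unix-like system; `time_fsync_calls` and its parameters are hypothetical names used for illustration and are not part of rpm or strace. It writes a small block to a temporary file and times each fsync individually.

```python
import os
import tempfile
import time


def time_fsync_calls(count=5, payload=b"x" * 4096, directory=None):
    """Write `payload` then fsync, `count` times; return each fsync's duration in seconds."""
    durations = []
    # `directory` selects the filesystem under test (assumption: defaults to the system temp dir).
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        fd = tmp.fileno()
        for _ in range(count):
            os.write(fd, payload)              # dirty some data so fsync has work to do
            start = time.monotonic()
            os.fsync(fd)                       # the call that appeared slow under strace
            durations.append(time.monotonic() - start)
    return durations


if __name__ == "__main__":
    for index, seconds in enumerate(time_fsync_calls()):
        print(f"fsync {index}: {seconds:.4f} s")
```

Running it once on its own and once while "while true ; do sync ; done" runs in another window should show whether the same difference appears without rpm involved.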
Slow is in the eye of the beholder, and not a bug unless you supply comparative data from a different machine. Yes, fsync is called repeatedly. If it is slow, then I suggest looking at your disk performance and/or kernel buffer cache.
My point was that by running an _extra_ command, rpm became faster, which implies that something is definitely wrong. I am not talking about slow in absolute terms, but slow relative to itself: rpm running on its own versus rpm running alongside sync.

In fact, the problem seems to be with the NPTL patches in the kernel, so it is not an rpm bug. I was already beginning to suspect a kernel issue when I first posted (which is why I said which kernel I was using). Then I found that kernel-2.4.20-19.8 did not suffer from this problem. Now I have tried taking the 2.4.20-28.9 kernel and modifying the spec file to remove the NPTL patches, and that works too. With the NPTL patches, I get the same problem as with 2.4.20-20.9.

One thing I didn't notice when I first posted was that the problem only occurs if something is running niced (one of my users runs SETI).

So, the problem is definitely not an rpm problem, and rpm-4.2-1 is working fine with the kernel without NPTL. From that point of view it is closed. However, I would like to add that anyone who is running the NPTL kernel with something niced in the background might be fooled into thinking rpm was running slowly or, worse, think it had locked up altogether and kill it with kill -9, leaving all the locking problems behind. I suspect that a large proportion of the bugs reported as rpm bugs are actually due to the inclusion of NPTL in the kernel. So although this bug is not your responsibility, it may be causing most of the bug reports you are receiving.
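The niced-background-process condition can be checked without rpm or SETI. This is a rough sketch, assuming Python 3 on a Unix-like system; `NICED_SPINNER`, `average_fsync_seconds` and `compare_with_niced_load` are hypothetical names, and the busy loop is only a stand-in for a niced CPU-bound job. It compares average fsync time with and without such a job running.

```python
import os
import subprocess
import sys
import tempfile
import time

# Hypothetical stand-in for a niced CPU-bound job: lower priority, then spin.
NICED_SPINNER = "import os\nos.nice(19)\nwhile True:\n    pass\n"


def average_fsync_seconds(count=5):
    """Average duration of `count` fsync calls, each preceded by a 4 KiB write."""
    total = 0.0
    with tempfile.NamedTemporaryFile() as tmp:
        fd = tmp.fileno()
        for _ in range(count):
            os.write(fd, b"x" * 4096)
            start = time.monotonic()
            os.fsync(fd)
            total += time.monotonic() - start
    return total / count


def compare_with_niced_load(count=5):
    """Return (idle_average, loaded_average) fsync times in seconds."""
    idle = average_fsync_seconds(count)
    spinner = subprocess.Popen([sys.executable, "-c", NICED_SPINNER])
    try:
        time.sleep(0.2)                      # give the child time to start spinning
        loaded = average_fsync_seconds(count)
    finally:
        spinner.kill()                       # always stop the background job
        spinner.wait()
    return idle, loaded


if __name__ == "__main__":
    idle, loaded = compare_with_niced_load()
    print(f"idle: {idle:.4f} s  with niced load: {loaded:.4f} s")
```

A large gap between the two averages on an NPTL kernel, and none on a kernel without the NPTL patches, would be consistent with the behaviour described above; the sketch does not by itself prove where the delay comes from.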