Whilst upgrading my desktop to FC5, I noticed that Anaconda, mdX_raid1 and the kjournald daemons were spending a _lot_ of time in the D-state. Attaching strace to the installer, I saw a lot of fsync() or fdatasync() calls being made, which would seem to account for this. Admittedly, the files being installed need to be written, sync'd and then moved into place, but couldn't the sync'ing be batched? Rather than calling fsync() or fdatasync() on each file, couldn't anaconda just write all the files and then sync() the lot of them, then move them across? Or maybe it could do it in smaller batches. Also, when anaconda is in the D-state, it can't also be updating its display, though that's a minor matter as most people probably won't switch away from it.
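To make the suggestion concrete, here is a minimal sketch of the batched approach: write every file under a temporary name, flush once, then rename into place. It is not how rpm or anaconda actually work; `install_batched`, the `.tmp` suffix and the shape of `files` are invented for illustration. It also assumes sync() waits for the writes to finish, which Linux does but POSIX does not strictly promise.

```python
import os

def install_batched(files, dest_dir):
    """Write all files, flush once, then rename each into place.

    `files` maps a final file name to its bytes content (hypothetical shape).
    """
    staged = []
    for name, data in files.items():
        tmp_path = os.path.join(dest_dir, name + ".tmp")
        with open(tmp_path, "wb") as f:
            f.write(data)  # deliberately no per-file fsync()/fdatasync()
        staged.append((tmp_path, os.path.join(dest_dir, name)))

    # One flush for the whole batch instead of one per file.
    os.sync()

    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)  # rename over the final name
    return [final_path for _, final_path in staged]
```

Smaller batches would just mean calling this on slices of the file list, trading fewer syncs against how much work is lost if the machine dies mid-batch.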
Do you have actual measurements or just anecdotes from watching anaconda installs? Without actual measurements, it's very hard to say what problem is being solved, and entirely impossible to guess whether syncs can be batched or not. Anaconda not being able to update its display is a very different problem that needs to be fixed by putting the display on its own thread, with its own cpu resources, rather than using cpu cycles from the callback to update the display. You are correct that there is no reason why rpmlib's D state should prevent anaconda screen updates. That is an architectural, not implementation, problem.
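A rough sketch of that separation, for illustration only: the installer posts progress to a queue and a dedicated display thread consumes it, repainting on a timer while the installer is blocked. `run_install`, `install_one`, `render` and `tick` are hypothetical names, not anaconda or rpmlib APIs, and the sketch assumes the blocking call releases the interpreter lock so the display thread can actually run.

```python
import queue
import threading

def run_install(packages, install_one, render, tick=0.1):
    """Install on the calling thread; draw progress on a separate thread."""
    progress = queue.Queue()
    done = object()  # sentinel telling the display thread to stop

    def display_loop():
        last = None
        while True:
            try:
                item = progress.get(timeout=tick)
            except queue.Empty:
                # Installer is busy (possibly blocked on I/O): repaint anyway.
                if last is not None:
                    render(last)
                continue
            if item is done:
                return
            last = item
            render(item)

    ui = threading.Thread(target=display_loop)
    ui.start()
    try:
        total = len(packages)
        for index, pkg in enumerate(packages, start=1):
            install_one(pkg)  # may block for a long time
            progress.put((index, total, pkg))
    finally:
        progress.put(done)  # always stop the display thread
        ui.join()
```

The point of the design is that the callback only enqueues a small tuple; all drawing happens elsewhere.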
> Do you have actual measurements or just anecdotes from watching anaconda
> installs?

Unfortunately, I wasn't set up to measure anything as I didn't expect this problem (it was a one-off upgrade of my desktop box; not something I do regularly). I started investigating why it was being so slow when it proved to be taking hours to complete, despite the NFS server being a short distance away. I'll see what I can do to reproduce the problem, though I'll have to sort out another computer to do it on. I can probably get strace logs with timestamps from installer processes, but I'm not sure what else I can measure. Is there anything you'd care to suggest?
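One way such strace logs could be turned into numbers is sketched below. It assumes the log was captured with strace's -T option, which appends the time spent in each call in angle brackets, and totals that time for fsync(), fdatasync() and sync(). The function name and regular expression are my own, and calls that strace splits into "unfinished"/"resumed" pairs when following several processes are not counted.

```python
import re
from collections import defaultdict

# Matches e.g. "12:00:00.000200 fsync(7) = 0 <0.150000>" (strace -tt -T style).
SYNC_CALL = re.compile(
    r"\b(fsync|fdatasync|sync)\(.*?\)\s*=\s*-?\d+.*?<(\d+\.\d+)>\s*$"
)

def summarize_sync_time(lines):
    """Return {syscall: (call_count, total_seconds)} for sync-type calls."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = SYNC_CALL.search(line)
        if match is None:
            continue  # other syscalls, or split "unfinished/resumed" lines
        name, seconds = match.group(1), float(match.group(2))
        totals[name][0] += 1
        totals[name][1] += seconds
    return {name: (count, total) for name, (count, total) in totals.items()}
```

Feeding it an open log file, e.g. `summarize_sync_time(open(path))` with `path` pointing at the capture, and comparing the totals against the wall-clock time of the install would show how much of the wait the sync calls account for.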
Anecdotal and non-reproducible is impossible to fix. FWIW, rpm has --stats to identify time taken on various operations, all modes.
User pnasrat's account has been closed
Reassigning to owner after bugzilla made a mess, sorry about the noise...
The information we've requested above is required in order to review this problem report further and diagnose/fix the issue if it is still present. Since there have been no updates to the report in the thirty (30) days or more since we requested additional information, we're assuming the problem is either no longer present in the current Fedora release, or that there is no longer any interest in tracking the problem. Setting status to "INSUFFICIENT_DATA". If you still experience this problem after updating to our latest Fedora release and can provide the information previously requested, please feel free to reopen the bug report. Thank you in advance.