Description of problem:
rpm seems to be very, very slow at reading its database. Here are a couple of examples from my fairly long-lived, fragmented disk. Look at the various timed sequences:

# echo 3 >/proc/sys/vm/drop_caches
# time rpm -qa >/dev/null
real 0m47.770s
user 0m1.870s
sys 0m2.803s

# echo 3 >/proc/sys/vm/drop_caches
# time ( cat /var/lib/rpm/Packages >/dev/null ; rpm -qa >/dev/null )
real 0m8.470s

Same with already cached data:
# time rpm -qa >/dev/null
real 0m2.840s

After a discussion with Jindra (jnovy) I rebuilt the database:
rpm --rebuilddb

So now, again in the same order:

# echo 3 >/proc/sys/vm/drop_caches
# time rpm --nosignature --nodigest -qa >/dev/null
real 0m11.517s
user 0m0.480s
sys 0m0.560s

Quite an improvement, actually - though I was unaware I should rebuild the db from time to time to get better performance.

# echo 3 >/proc/sys/vm/drop_caches
# time ( cat /var/lib/rpm/Packages >/dev/null ; rpm --nosignature --nodigest -qa ) >/dev/null
real 0m5.087s
user 0m0.253s
sys 0m0.413s

Hmm, still more than 2x faster - the db reading must have quite some overhead.

# time rpm --nosignature --nodigest -qa >/dev/null
real 0m0.353s
user 0m0.257s
sys 0m0.090s

This looks fast and decent when the data are already cached and the database has been freshly rebuilt. I assume someone should track down why the disk reading is so slow. (On non-fragmented drives the differences are much bigger.)

Version-Release number of selected component (if applicable):
rpm-4.7.1-6.fc12.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I think part of the problem could be explained by bug 534146 - my database really needed a rebuild, which significantly reduced the read time.
The benchmarks of --rebuilddb and -qa are hardly indicative of rpmdb database performance, since they are both sequential accesses of all data. You have also failed to supply any metric other than wall clock time. And you have also failed to control for the memory pool caching (and tuning) used by Berkeley DB, only the kernel buffer cache. Claims of "overhead" should be backed up by using callgrind, which just isn't that hard to do.
Let's go into details:

1.) I've mentioned --rebuilddb purely because rpm gets much, much slower over time - I've not measured the time of --rebuilddb itself. Obviously either db4 or rpm is seriously fragmenting data over months of usage and gets slower. Is it documented that I should run --rebuilddb regularly to get decent performance??

2.) My metric is perfectly valid and ideally you should try it yourself. If there is a more than 100% speedup just from using cat, then either it should be a configurable option for rpm (so it would do this itself - i.e. a simple wrapper script could handle that) or the rpm read path should be improved - the 1st option is probably simpler to implement, the 2nd choice is the preferable long-term solution. Also, I believe you are probably slightly missing what I want to point out in this bugzilla. I've never read anything about tuning libdb to improve rpm performance and I'm not quite sure why I should - why isn't the default for rpm to go as fast as possible - why should the user tune libdb to get decent rpm speed?

3.) My "overhead" term here does not mean rpm 'burns' CPU - it actually nearly sleeps and waits all the time - because of some very low-performing file access pattern. It actually reminds me of mmap usage without readahead - but that's just a wild guess - I do not know the rpm sources (nor libdb4).

4.) I'll attach callgrind - but I'm not really sure it helps here - I may also provide output of oprofile, perf, or other tools you would like - I could even time-track disk access, though that will probably be quite a long trace, if you want it. But IMHO why don't you try it yourself - I've observed the same behavior on many RHEL and Fedora machines around me - so it's definitely not a problem of my personal laptop.

5.) Just a side note - when using rpm without the --nodigest option - it reveals a very high load from SHA512 invocation - so again my simple 'dumb' time test reveals:

sha512sum /var/lib/rpm/Packages - 0.46s (when cached)
cat >/dev/null - 0.04s (when cached)

but rpm with digesting takes 1.1s, and without digests (--nodigest --nosignature) 0.3s. That gives a 0.8s difference - it looks like sha512sum could checksum my 80MB Packages file nearly twice within that period of time - so again I may ask a simple question - what is calculated so heavily inside rpm's nss-softokn library?

6.) --nodigest reveals a lot of time being spent in rpm-4.7.1/lib/header.c:dataLength - just from a plain look there it appears to me that rpm actually spends most of its time scanning strings inside the binary file - why not store the string size along with the string - or use some indexes for this?

I'm not sure if that's the reason why yum is so slow - but I think it's part of the puzzle... If you need more details let me know.
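(A tiny illustration of the string-scanning cost mentioned in point 6, for anyone following along. The names here are made up for the example, not rpm's actual header layout; it only contrasts strlen()-walking a blob with reading a stored length.)

#include <stddef.h>
#include <string.h>

/* scan every NUL-terminated string in the blob -- O(total bytes) per query,
 * which is roughly what header.c:dataLength() has to do for string tags */
static size_t scanned_length(const char *blob, int count)
{
    size_t len = 0;
    for (int i = 0; i < count; i++)
        len += strlen(blob + len) + 1;      /* walk to each NUL */
    return len;
}

/* hypothetical alternative: keep the byte length next to the data,
 * so the same answer is a single read -- O(1) */
struct tag_entry {
    int count;
    size_t byte_len;
    const char *blob;
};

static size_t stored_length(const struct tag_entry *e)
{
    return e->byte_len;
}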
Created attachment 369328 [details]
callgrind of rpm -qa --nosignature --nodigest
Now you're talking ...

1) No one (certainly not me) said your reported results were invalid. I don't think you've analyzed the results correctly (see your callgrind measurements, dataLength is not I/O related per se). And it's premature to advertise a performance "fix" like running --rebuilddb periodically when the performance problem is poorly characterized and understood.

2) I did not say your metric was "invalid", read what I wrote. I have tried the results myself. I run callgrind on rpm at least weekly and already know (and have fixed) many performance problems in RPM. I did point out that there are other issues than I/O, and suggested callgrind, where I/O overhead is _NOT_ the issue that you have measured. I did point out that you have another level of caching that needs to be controlled for useful I/O metrics. And certainly having rpm run as fast as possible is my goal, I have no idea where you got the idea that any other goal is preferred.

3) If "sleeps and waits" is the issue (it's not, afaik), that was not at all clear from your wall clock benchmarks. And I most definitely know both the rpm and db4 sources; in fact I have achieved a measured (w callgrind) 14.6x performance increase @rpm5.org by running careful (better than wallclock) benchmarks. But that's not relevant here.

4) Stare at the numero uno piggy in the callgrind spewage. When you start to realize that serialization and marshalling is the issue, then you will begin to understand the performance issue.

5) I'm not sure how SHA512 is related other than through signatures, where --nosignature is the disabler. In all cases, verifying digests on header blobs is overhead unrelated to I/O performance and must be controlled for.

6) yum performance depends on many factors unrelated to rpm. But run benchmarks on yum if you wish to understand yum performance problems. Without measurements, feel free to claim anything you wish about the cause of yum's pathetic performance, your opinion is as good or bad as anyone else's.

And certainly I have no argument with you, nor anyone willing to run callgrind to verify issues ;-)
(In reply to comment #5)
> Now you're talking ...
>
> 1) No one (certainly not me) said your reported results were invalid. I don't
> think you've analyzed the results correctly (see your callgrind measurements,
> dataLength is not I/O related per se). And it's premature to advertise a
> performance "fix" like running --rebuilddb periodically when the performance
> problem is poorly characterized and understood.

I've not done any --rebuilddb analysis - I just wrote that the rpm -qa time before the rebuild was 47 seconds, and 12 seconds after it. I've not saved the older dataset for analysis as I was not expecting any problems. So it's just the fact that the speed on my machine improved ~4 times with --rebuilddb.

> 2) I did not say your metric was "invalid", read what I wrote. I have tried the
> results myself. I run callgrind on rpm at least weekly and already know (and
> have fixed) many performance problems in RPM. I did point out that there are
> other issues than I/O, and suggested callgrind, where I/O overhead is _NOT_ the
> issue that you have measured. I did point out that you have another level of
> caching that needs to be controlled for useful I/O metrics.

Do you flush disk buffers within your tests? The time when all data are buffered in memory is 'almost' acceptable (though there is still some headroom - but there might be limits from the DB format, which is probably nontrivial to change). My report is mainly about the moment when there are no data in memory - and thus a trivial query for installed packages takes 12 seconds.

> 3) If "sleeps and waits" is the issue (it's not, afaik), that was not at all
> clear from your wall clock benchmarks. And I most definitely know both the rpm
> and db4 sources; in fact I have achieved a measured (w callgrind) 14.6x
> performance increase @rpm5.org by running careful (better than wallclock)
> benchmarks. But that's not relevant here.

Is this rpm 4.7 going to be replaced by rpm 5 - or is that an unrelated project to Fedora's rpm package?

> 4) Stare at the numero uno piggy in the callgrind spewage. When you start to
> realize that serialization and marshalling is the issue, then you will begin
> to understand the performance issue.

As I've said - callgrind will not show I/O stalls.

> 5) I'm not sure how SHA512 is related other than through signatures, where
> --nosignature is the disabler. In all cases, verifying digests on header blobs
> is overhead unrelated to I/O performance and must be controlled for.

Sure, it's not related to slow disk reading - it's just what callgrind shows - and I've simply been curious how many memory chunks need to be checksummed for every simple rpm command - maybe it would be effective to use a short-term daemon to speed up repeated invocations (if the daemon keeps a lock on the database).

> 6) yum performance depends on many factors unrelated to rpm. But run benchmarks
> on yum if you wish to understand yum performance problems. Without measurements,
> feel free to claim anything you wish about the cause of yum's pathetic
> performance, your opinion is as good or bad as anyone else's.

Yeah - sure, python is a much bigger CPU eater in this case - but rpm is not negligible either...
I did not mean to imply you have done --rebuilddb benchmarks. My statement is true no matter what: neither -qa _NOR_ --rebuilddb are proper measurements of rpmdb "performance" because the access is sequential. So any claim of "inefficient" rpmdb I/O will only apply narrowly and incompletely.

> Is this rpm 4.7 going to be replaced by rpm 5 - or is that an unrelated project
> to Fedora's rpm package?

I quote myself: "irrelevant". The problems were the same as what is in your callgrind spewage because the code was largely the same. But feel free to not look for fixes in projects that are unrelated to Fedora. The measured callgrind speed-up after fixing dataLength and other issues was 14.6x @rpm5.org.

> As I've said - callgrind will not show I/O stalls.

Yes. Which is why I use callgrind to measure I/O "overhead" claims such as this one.

> Do you flush disk buffers within your tests?

Callgrind measurements are largely immune to buffer state as well. Yes, I flush buffers where needed or appropriate.

Are you claiming I/O stalls or not? You have not disclosed any measurement that directly shows stalls, only pointed out the possibility afaict. strace tstamps would be convincing. I have not seen stalls, or behavior indicative of I/O waits, while benchmarking RPM.

> Yeah - sure, python is a much bigger CPU eater in this case - but rpm is not
> negligible either...

LVM is not exactly svelte either. Reasoning from yum->python->lvm performance to claimed "inefficient database disk reading" for RPM based on the largeness of the code base is no measurement I understand.
(In reply to comment #7)
> I did not mean to imply you have done --rebuilddb benchmarks. My statement is
> true no matter what: neither -qa _NOR_ --rebuilddb are proper measurements
> of rpmdb "performance" because the access is sequential. So any claim of
> "inefficient" rpmdb I/O will only apply narrowly and incompletely.

Well, I still don't get what you mean by this - I've reported a problem I'm experiencing as a regular daily user. I see very slow behavior and I report it as a problem. It can easily be checked by anyone just by repeating the steps in my first post. If you think this bugzilla's title is not correct - propose a better name.

> Are you claiming I/O stalls or not? You have not disclosed any measurement
> that directly shows stalls, only pointed out the possibility afaict. strace
> tstamps would be convincing. I have not seen stalls, or behavior indicative
> of I/O waits, while benchmarking RPM.

Statistics, cached rpm:

perf stat -- rpm -qa >/dev/null

Performance counter stats for 'rpm -qa':
1119.839601 task-clock-msecs # 0.994 CPUs
137 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
3667 page-faults # 0.003 M/sec
2453079208 cycles # 2190.563 M/sec
4382415288 instructions # 1.786 IPC
27935906 cache-references # 24.946 M/sec
408832 cache-misses # 0.365 M/sec
1.126486505 seconds time elapsed

Statistics, uncached:

Performance counter stats for 'rpm -qa':
2073.234535 task-clock-msecs # 0.179 CPUs
2443 context-switches # 0.001 M/sec
1 CPU-migrations # 0.000 M/sec
3666 page-faults # 0.002 M/sec
2830241244 cycles # 1365.133 M/sec
4833644897 instructions # 1.708 IPC
37085859 cache-references # 17.888 M/sec
543550 cache-misses # 0.262 M/sec
11.552013713 seconds time elapsed

Statistics, cached without digest:

Performance counter stats for 'rpm -qa --nodigest --nosignature':
355.230563 task-clock-msecs # 0.990 CPUs
41 context-switches # 0.000 M/sec
1 CPU-migrations # 0.000 M/sec
3547 page-faults # 0.010 M/sec
778136260 cycles # 2190.510 M/sec
845519636 instructions # 1.087 IPC
20456887 cache-references # 57.588 M/sec
353068 cache-misses # 0.994 M/sec
0.358798949 seconds time elapsed

Statistics, uncached without digest:

Performance counter stats for 'rpm -qa --nodigest --nosignature':
1160.226932 task-clock-msecs # 0.099 CPUs
2322 context-switches # 0.002 M/sec
3 CPU-migrations # 0.000 M/sec
3546 page-faults # 0.003 M/sec
1111501957 cycles # 958.004 M/sec
1267144821 instructions # 1.140 IPC
28076070 cache-references # 24.199 M/sec
478383 cache-misses # 0.412 M/sec
11.668546646 seconds time elapsed

---
Short sample from strace -tt, cached - difference .003302s

15:12:49.660754 write(1, "dejavu-lgc-sans-mono-fonts-2.30-"..., 46) = 46
15:12:49.660860 pread(3, "\0\0\0\0\1\0\0\0\3142\0\0\0\0\0\0\3152\0\0\1\0\346\17\0\7\0\0\0J\0\1"..., 4096, 53264384) = 4096
...
15:12:49.663573 pread(3, "\0\0\0\0\1\0\0\0\3452\0\0\3442\0\0\0\0\0\0\1\0\252\f\0\7\0\0\0\0\0\0"..., 4096, 53366784) = 4096
15:12:49.663865 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
15:12:49.663953 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
15:12:49.664056 write(1, "gedit-2.28.0-1.fc12.x86_64\n", 27) = 27

----
Short sample from strace -tt, uncached - difference .013265s

15:13:27.990154 write(1, "dejavu-lgc-sans-mono-fonts-2.30-"..., 46) = 46
..
15:13:28.003419 write(1, "gedit-2.28.0-1.fc12.x86_64\n", 27) = 27

> > Yeah - sure, python is a much bigger CPU eater in this case - but rpm is not
> > negligible either...
> LVM is not exactly svelte either.
> Reasoning from yum->python->lvm performance to claimed "inefficient database
> disk reading" for RPM based on the largeness of the code base is no
> measurement I understand.

Not exactly sure what you mean here by LVM...
So you are seeing additional "overhead" (including context switches and cache misses) in what you are identifying as "uncached".

(aside)
Oddly, the --nodigest case is slower than when digests are calculated, for the "uncached" case. Likely statistical noise ...

But I do not see any immediately apparent I/O stall-and-wait anomaly in the measured behavior. Sure there are waits in the measurement, all "uncached" I/O will wait. But if I've misinterpreted the perf data, please point that out. What I lack is some comparison point to calibrate my expectations, so that I can notice when some measurement is not within a typical range.

What _IS_ different about rpmdb I/O (from other I/O cases) is that RPM (and Berkeley DB) goes to some lengths to ensure that data is written to disk. While "rpm -qa" is largely a read operation, populating the memory pool cache is a write operation. So you may wish to also control for the memory pool cache by doing
    rm -f /var/lib/rpm/__db*
if you really want to measure the components of "rpm -qa" I/O more carefully.

You are making the claim that rpm --rebuilddb is needed as part of normal maintenance, based on what you see as a normal user with a stopwatch. The behavior needs to be more carefully analyzed (imho) before that claim is justified (imho). Whether it is true (or not) that --rebuilddb is needed I cannot yet say. But I know from a decade of living without --rebuilddb for routine rpmdb performance maintenance that there's some other piece of this puzzle that needs to be understood first.

Yes, the title "inefficient database disk reading" is misleading. *shrug* The behavior needs to be better characterized and understood first.
Yes - the 0.1s difference in an 11s measurement is system noise - the time of sha512 summing is hidden in the I/O stalls (waiting for pages to be cached in).

I forgot to add these missing perf stats:

uncached cat

Performance counter stats for 'cat /var/lib/rpm/Packages':
123.237846 task-clock-msecs # 0.035 CPUs
1131 context-switches # 0.009 M/sec
1 CPU-migrations # 0.000 M/sec
160 page-faults # 0.001 M/sec
211787797 cycles # 1718.529 M/sec
191481355 instructions # 0.904 IPC
3519474 cache-references # 28.558 M/sec
61659 cache-misses # 0.500 M/sec
3.478775266 seconds time elapsed

cached cat

Performance counter stats for 'cat /var/lib/rpm/Packages':
49.098802 task-clock-msecs # 0.848 CPUs
5 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
159 page-faults # 0.003 M/sec
107537451 cycles # 2190.226 M/sec
54117805 instructions # 0.503 IPC
1520101 cache-references # 30.960 M/sec
62342 cache-misses # 1.270 M/sec
0.057930420 seconds time elapsed

Here you can see a much higher throughput.

Also please note - I'm not doing code analysis - I'm the bug reporter ;)
Cat(1) measurements help, but as I tried to point out, this is a database, and databases have different needs and different I/O patterns than cat(1) does. And any solution (once the problem is characterized) will likely be rather different too.

You are hardly a typical user, most of whom have no idea what callgrind is or does, or what an I/O stall issue is. Please stop pretending that you are "just a user" with a stopwatch. ;-)

(aside)
Note that there has been a historical issue with Berkeley DB db-4.1.x (iirc, it's been years) with I/O stalls and waits. The issue was tied to select vs poll behavior. There's a bugzilla report and a perl script reproducer generating 5000 records in Berkeley DB on (iirc) approximately a RHEL4 time frame.

Which is why I ask for details like: can you document I/O stalls & waits as part of a performance problem?

Note also that I have never heard of or seen an issue with rpmdb performance degradation over time until now. Which means that something (other than RPM and Berkeley DB, which haven't changed much at all) is likely relevant to what you are seeing.
(In reply to comment #11)
> Cat(1) measurements help, but as I tried to point out, this is a database,
> and databases have different needs and different I/O patterns than
> cat(1) does. And any solution (once the problem is characterized) will likely
> be rather different too.
>
> You are hardly a typical user, most of whom have no idea
> what callgrind is or does, or what an I/O stall issue is. Please
> stop pretending that you are "just a user" with a stopwatch. ;-)

It's not about pretending - it's about having many other tasks on my hands...

> (aside)
> Note that there has been a historical issue with Berkeley DB db-4.1.x
> (iirc, it's been years) with I/O stalls and waits. The issue was tied to
> select vs poll behavior. There's a bugzilla report and a perl script
> reproducer generating 5000 records in Berkeley DB on (iirc) approximately a
> RHEL4 time frame.
>
> Which is why I ask for details like: can you document I/O stalls & waits as
> part of a performance problem?

Which I could only do with a local recompilation with special debug hinting. I simply thought that the package developers would have a better idea where the problem might be buried. Is this behavior (from comment 1) similar with rpm5, or is this only an rpm 4.x thing?

> Note also that I have never heard of or seen an issue with rpmdb performance
> degradation over time until now. Which means that something (other than
> RPM and Berkeley DB, which haven't changed much at all) is likely
> relevant to what you are seeing.

Maybe users do not bother to report it :)? Maybe they consider a 12-second query time still acceptable? But as I said - I've tested it on several machines and experienced approximately the same results.

I should probably note that my machine is an F8 installation continuously upgraded, and is usually quite fresh Rawhide. In the past I've been forced to run --rebuilddb just once, after some serious problem. So maybe during the life cycle of a single Fedora release the database stays 'fast' enough?
No time - perfectly understood. Let's not argue ;-)

I have not ever seen performance degradation over time doing daily development (and I rarely do --rebuilddb). But when I do measurements, I typically control for behavior within the measurement, which may have missed I/O degradation over time.

We both agree that after --rebuilddb (and controlling for --nodigest and --nosignature) rpm -qa performance is reasonable. The issue for me is solely whether --rebuilddb is routinely needed for maintenance (it's not, as of yesterday, afaik). If there is performance degradation, it can (and, in code that I develop, will) likely be avoided by changing the implementation.

The 12 (or 18 or 48 or ...) sec behavior for rpm -qa can be lived with no matter what. It's not like "rpm -qa" hangs or takes an hour. There are definitely rpm -qa speed-ups available by avoiding loading headers entirely. The entire "rpm -qa" spewage could be cached with a tstamp and just blasted at the user whenever needed. But that solution just bandaids deeper problems that may need to be analyzed and solved more carefully.

(aside)
Note that rpmdb performance @rpm5.org is now transactionally protected to have ACID behavior, and /var/lib/rpm/Packages is going to be eliminated and replaced with an entirely different store and schema than what is used in rpm4. So far the I/O performance is approximately (within a factor of 2x) status quo ante. But -qa will be reworked to be faster, so that I don't have to sort out --nodigest --nosignature and other factors when analyzing problem reports.
After looking around a bit for time degradation issues with Berkeley DB (there are no visible reports), this thought occurs to me:

There's an optimization for DB_HASH access added in db-4.6.x. As always, Berkeley DB provides backward compatibility, but perhaps the older DB_HASH layout is no longer optimal. So the (hypothesized) effect of doing --rebuilddb is a layout change that improves performance, not necessarily a repair of time degradation or fragmentation.

I have no easy way to confirm the hypothesis. But RPM's use of Berkeley DB is quite straightforward and hasn't changed since forever. And I expect -- because of wide usage -- that a time degradation of Berkeley DB I/O performance would have been reported _SOMEWHERE_.

Does that sound consistent with what you are seeing?
Well, just from a plain look at the strace and pread's nearly 'random' seek positions, I think the reason for this very slow access is most probably pretty clear.

In the attachment you will find a C program - it contains an array of offsets taken directly from the strace pread calls. I've spent some time writing it - thus I hope it will be usable. It demonstrates everything. Enjoy
Created attachment 369493 [details]
Demonstration C code
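(Since the attachment itself isn't inlined here, a minimal sketch of the replay idea it implements: issue pread(2) calls at offsets captured from strace and time them. The two offsets shown come from the strace excerpt in comment 8; the real program carries the full harvested list.)

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static const off_t offsets[] = { 53264384, 53366784 /* , ... full list elided */ };

int main(void)
{
    char buf[4096];
    struct timespec t0, t1;
    size_t n = sizeof(offsets) / sizeof(offsets[0]);
    int fd = open("/var/lib/rpm/Packages", O_RDONLY);

    if (fd < 0) { perror("open"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < n; i++)
        if (pread(fd, buf, sizeof(buf), offsets[i]) < 0)
            perror("pread");
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("%zu reads in %.3f s\n", n,
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    close(fd);
    return 0;
}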
Hmm, regarding your comment 14 - I assume there should be some testing code available that simulates heavy rpm usage over a long period of time - i.e. installing and removing packages with lots of files? If there isn't, it would probably be a wise idea to create such a test script?
Sure, a simulated "heavy load" is the only credible test for whether there is a degradation in rpmdb I/O behavior over time. There are plainly and simply too many other factors - file system type, kernel version, Berkeley DB version, RPM implementation, etc etc - to compare results meaningfully. I've already asked privately for opinions re "time degradation" from users who I know have long-running rpm implementations and whose opinion I trust. (I trust your report and methodology, but the analysis is a bit premature so far. Nothing personal, jmho ;-) And a test script isn't that hard, I have several.

Still, I'm going to be surprised if there is performance degradation over time with RPM+BDB. Degradation is a very different issue than whether rpmdb I/O is optimal with "rpm -qa"; I already know zillions of inefficiencies in rpmdb I/O handling, and have been actively addressing those issues since db-4.8.24 was released in September.
Well - the issue doesn't need to actually be degradation - it might eventually have been some 'buggy' version of rpm or libdb4 in the past. I hadn't been looking into the issue - until I got curious why it takes ~47sec for a simple query. I'm mostly using yum - which is slow by design - thus I've not been checking rpm in depth.
We both agree that 47sec for "rpm -qa" is too long. What isn't clear yet is the root cause. No matter what, we both agree that --rebuilddb leads to better performance.

But degradation is my primary concern. If Berkeley DB degrades over time, then a "fix" should be attempted.

BTW, tnx for the block trace. I can almost see some patterns that I can map into Berkeley DB and RPM code.

FYI, you need to control for whether "rpm -qa" is run as root (or not). If run as root, then a dbenv is opened, and there is a memory pool cache that is interposed. Your program lists blocks solely for Packages afaict. There should be additional I/O occurring to the memory pool as well.

With a sequential access like "rpm -qa", cache blow-out is inevitable.

There's also a page size tunable. rpm has traditionally used 512b pages because that optimizes locking granularity: locks are per page, so large pages are likelier to lead to lock contention because large pages are likelier to overlap. There was no significant effect of changing the mempool page size on I/O performance when I looked (not recently, but I believe ffesti saw a similar "No effect." within the last year).

There is an issue of rpmdb fragmentation on ext4 reported by sandeen. In empty chroots, an rpmdb is the most fragmented file. When one considers that an rpmdb is just about the only database in use in chroot installs, and is certainly the most active file(s) during a chroot install, the presence of fragmentation should surprise no one.

Whether fallocate (or equivalent) could/should be used to address rpmdb fragmentation is not yet clear. Certainly fallocate would reduce rpmdb fragmentation, but there are other issues, such as (on linux) fallocate not being available on older systems and introducing a run-time dependency on both kernel and glibc implementations; and (on non-linux) fallocate possibly not being reliably available even if it is in POSIX. I have an implementation half done in RPM, but there's no reason to rush to fallocate until there are clear performance reasons to do so. So far it's just the presence of fragmentation, not any reliable measurement of improved/degraded performance, that is being reported.
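(On the fallocate point above: a short sketch of what preallocating the rpmdb backing file could look like, using posix_fallocate(3) as the portable spelling. This only illustrates the idea being discussed, not the half-done implementation mentioned in the comment.)

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* ask the filesystem for the whole file up front, so it can hand out
 * (mostly) contiguous extents instead of growing the file piecemeal */
int preallocate(const char *path, off_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    int rc = posix_fallocate(fd, 0, size);   /* returns 0 or an errno value */
    if (rc)
        fprintf(stderr, "posix_fallocate: error %d\n", rc);
    close(fd);
    return rc ? -1 : 0;
}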
(In reply to comment #20)
> What isn't clear yet is the root cause. No matter what,
> we both agree that --rebuilddb leads to better performance.
>
> But degradation is my primary concern. If Berkeley DB
> degrades over time, then a "fix" should be attempted.

Well, I've no idea how the DB works here with my ext3 filesystem. My system partition is around ~8GB and usually has around 0.7G of free space. But as you like plain numbers and I like my simple 'wall clock' experiments, let's dig a bit deeper here :).

I have a 1GB 'play' partition for experiments (usually for lvm :)). Thus a completely fresh format of an ext2/ext3/ext4 fs was used for the following test. `hdparm -t` gives 30MB/s for this test partition.

Using a non-fragmented 80MB file (first column uncached, second cached):

1. pread()     ~7.5s   0.06s
2. mmap()      ~5.8s   0.10s
3. mmap() ADV  ~2.8s   0.07s
4. `cat`       ~2.8s   0.06s

Timing was nearly the same for all extX. (Note - `cat` is slightly faster as it is not doing memcpy - thus mmap with just reading the pages has the same or better speed - I may attach an updated source file if needed.)

So obviously pread() is by far the worst way to do this job. Let's continue with the experiments.

The weirdly fragmented DB file /var/lib/rpm/Packages on my system drive results in an actually pretty slow read speed - a plain `cp` of this Packages file to Packages.copy reveals this (`hdparm -t` of this system drive gives 43MB/s):

original uncached `cat` of Packages     3.7s
copied uncached `cat` of Packages.copy  2.0s - wow, 55% faster

So all this suggests some possible code updates:

1. pread() could be replaced with mmap() - probably a pretty easy change - I think it might be optional - i.e. `rpm --use-mmap` - and if there were no problems and users were happy, it might be switched to the default. mmap should probably also result in a significantly smaller memory footprint for the application.

2. From time to time a full copy of the Packages file should probably be made - to defragment the strange file layout in the filesystem - note that this Packages file is just 1 day old from a fresh --rebuilddb and only a few packages were modified since.

3. Hmm, just wondering whether using plain small ASCII files could get any worse than this Berkeley DB (i.e. the /var/lib/dpkg/info way).

4. Wait till all users switch to SSD - and apply only 1.) to save memory ;)

> BTW, tnx for the block trace. I can almost see some patterns
> that I can map into Berkeley DB and RPM code.
>
> FYI, you need to control for whether "rpm -qa" is run as root (or not).
> If run as root, then a dbenv is opened, and there is a memory pool
> cache that is interposed. Your program lists blocks solely
> for Packages afaict. There should be additional I/O occurring to the
> memory pool as well.

Yep - 4 appearances of pread() seem to be from a different file descriptor in my strace. But they probably have a very small impact on the total time.

Just another simple strace check of pread() appearances for a simple small rpm file installation - it looks like there are some ~10000 pread() calls there as well - slightly fewer than for -qa - but still a pretty high number.

> With a sequential access like "rpm -qa", cache blow-out is inevitable.

rpm -i seems to be doing a not-so-different job after all...

> There is an issue of rpmdb fragmentation on ext4 reported by sandeen.

Yep - revealed by my plain simple experiment as well, and I'm running ext3.

> In empty chroots, an rpmdb is the most fragmented ...
> Whether fallocate (or equivalent) could/should be used to address
> rpmdb fragmentation is not yet clear.
> Certainly fallocate would reduce rpmdb fragmentation, but there are other
> issues ...

I think large DB files should probably be split into smaller pieces according to their usage. Then for the most common tasks only a small amount of data would need to be loaded. Data for some hardly ever used commands like rpm -q --changelog (which I'm still wondering why it is part of the DB and not stored somewhere in a /usr/share/doc/pkg/changelog file) could be loaded from a separate DB.

Anyway, take these just as my ideas - nothing you should probably worry about. Maybe there is a way to improve Berkeley DB to handle all of this still in one file...
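(For reference, a rough sketch of the kind of harness behind the pread/mmap/"mmap ADV" comparison earlier in this comment - the actual attached program is not reproduced here, and the 4 KiB stride, MAP_PRIVATE and madvise() hints are assumptions of this sketch, not rpm or Berkeley DB internals.)

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile unsigned long sink;   /* keep the compiler from dropping the reads */

/* case 1: a pread() loop, 4 KiB at a time */
static void read_with_pread(int fd, off_t size)
{
    char buf[4096];
    for (off_t off = 0; off < size; off += (off_t)sizeof(buf)) {
        ssize_t n = pread(fd, buf, sizeof(buf), off);
        for (ssize_t i = 0; i < n; i++)
            sink += (unsigned char)buf[i];
    }
}

/* cases 2 and 3: plain mmap(), optionally with madvise() hints ("mmap ADV") */
static void read_with_mmap(int fd, off_t size, int advise)
{
    char *p = mmap(NULL, (size_t)size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return;
    if (advise) {
        madvise(p, (size_t)size, MADV_SEQUENTIAL);
        madvise(p, (size_t)size, MADV_WILLNEED);
    }
    for (off_t i = 0; i < size; i++)
        sink += (unsigned char)p[i];
    munmap(p, (size_t)size);
}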
> But as you like plain numbers and I like my simple 'wall clock' experiments,
> let's dig a bit deeper here :).

I like KISS. The only issue with wall clock is that it is sometimes difficult to analyze and compare. Otherwise I like wallclock a lot ...

> 1. pread() could be replaced with mmap()

Sure. But that's a deep change to using Berkeley DB, which has numerous consequences. Likely better is just to use the existing mpool handling in BDB, with a cache size appropriate for the data being cached, and with double buffering removed using O_DIRECT. But there's a balance that will be needed between transactionally protected data and I/O performance; the balance can only come with experience.

> 2. From time to time a full copy of the Packages file should probably be made

Easy to say but very hard to automate a ~100Mb copy with bullet-proofing. Lusers need to design their own backups.

FYI: Packages (and the header blobs within) are gonna be eliminated @rpm5.org by the end of the year and replaced with mmap(2) onto a secondary store of /some/path/*.rpm. The issue for transactional logging is reducing the size of the logs. But that also means that all metadata must be stored in indices so that headerLoad() is avoided.

> 3. Hmm, just wondering whether using plain small ASCII files

Please do the math. Hash/btree access is _ALWAYS_ superior to flat files for anything but toy in-memory cases. Or Oracle would not exist.

> 4. Wait till all users switch to SSD

Well, NAND has its own I/O performance (and failure) issues, rather different than DASD. See bz #529948 if you want to see s-l-o-o-w rpmdb performance.

> I think large DB files should probably be split into smaller pieces ...

Yup. Possible with db-4.8.24, not older versions.

> Anyway, take these just as my ideas - nothing you should probably worry

What, me worry? ;-)

All very sane suggestions, actively being implemented @rpm5.org. E.g. signature/digest verification of header blobs (which are already PROT_READ protected) was removed this morning.
(In reply to comment #22)
> > 1. pread() could be replaced with mmap()
> Sure. But that's a deep change to using Berkeley DB, which has numerous
> consequences.

Going from pread() -> mmap() should be fairly simple compared to switching to anything like memory pools. An easy way would be to mmap the DB at the beginning and, instead of pread(), just call memcpy(). A second step would be to throw away those calls and access the memory directly.

> Likely better is just to use the existing mpool handling in BDB, with a
> cache size appropriate for the data being cached, and with double buffering
> removed using O_DIRECT. But there's a balance that will be needed between

O_DIRECT is a performance killer unless you know the access pattern much better than the Linux kernel could guess.

> > 3. Hmm, just wondering whether using plain small ASCII files
> Please do the math. Hash/btree access is _ALWAYS_ superior
> to flat files for anything but toy in-memory cases. Or Oracle would not exist.

You already have those btrees in the filesystem - so the math still works. And with btrfs you even have Oracle in there ;)
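(A minimal sketch of the "mmap once, memcpy instead of pread" first step described above - a hypothetical drop-in wrapper, not a patch against Berkeley DB.)

#include <string.h>
#include <sys/types.h>

struct mapped_db {
    char   *base;   /* start of a PROT_READ mapping of the whole file */
    size_t  size;   /* mapping length */
};

/* same contract as pread(2), but served from the existing mapping */
ssize_t mapped_pread(const struct mapped_db *db, void *buf, size_t count, off_t offset)
{
    if (offset < 0 || (size_t)offset >= db->size)
        return 0;
    if (count > db->size - (size_t)offset)
        count = db->size - (size_t)offset;
    memcpy(buf, db->base + offset, count);
    return (ssize_t)count;
}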
(In reply to comment #22)
> > 2. From time to time a full copy of the Packages file should probably be made
> Easy to say but very hard to automate a ~100Mb copy with bullet-proofing.
> Lusers need to design their own backups.

Forgot to add one piece of information: I've just checked that --rebuilddb already creates such a 'weird' slow file. Are files with holes used? I think it's actually pretty nontrivial to fragment a file this much. As a quick hack - maybe even a plain copy right after --rebuilddb could make things better for a lot of people...
We are tuning different aspects of I/O performance.

E.g. create a /var/lib/rpm/DB_CONFIG file with these 2 lines:

set_cachesize 0 67108864 4
set_mp_mmapsize 268435456

The 1st line permits a 64Mb cache split into 4 regions; the 2nd line permits up to 256Mb to be memory mapped.

Prime the cache by running, as root:
rpm -qa

Show cache hits by doing:
cd /var/lib/rpm
/usr/lib/rpm/db_stat -m

Adding -Z will rezero the counters.

The above gives a rather different I/O trace than what you have reported with strace.

Yes, O_DIRECT with Berkeley DB likely knows better than the kernel ;-)

And "quick hacks" tend to get forgotten. You have no idea how many "quick hacks" there are in RPM that no one has a clue about.

I have no problem whatsoever using the available I/O performance in a linux kernel. But tuning a database != a tuna fish.
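(The same two DB_CONFIG lines expressed through the Berkeley DB C API, for anyone reading along - a minimal sketch of an environment open, not how rpm actually wires up its dbenv.)

#include <db.h>
#include <stdio.h>

int open_tuned_env(DB_ENV **envp, const char *home)
{
    DB_ENV *env;
    int rc = db_env_create(&env, 0);
    if (rc)
        return rc;

    env->set_cachesize(env, 0, 67108864, 4);    /* set_cachesize 0 67108864 4 */
    env->set_mp_mmapsize(env, 268435456);       /* set_mp_mmapsize 268435456  */

    rc = env->open(env, home, DB_CREATE | DB_INIT_MPOOL, 0644);
    if (rc) {
        fprintf(stderr, "dbenv open: %s\n", db_strerror(rc));
        env->close(env, 0);
        return rc;
    }
    *envp = env;
    return 0;
}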
Created attachment 369559 [details]
rpm -qa strace

Note the complete absence of pread(2) or any I/O from Packages with the DB_CONFIG lines as described.
(In reply to comment #26)
> Created an attachment (id=369559) [details]
> rpm -qa strace
>
> Note the complete absence of pread(2) or any I/O from Packages
> with the DB_CONFIG lines as described.

Good - now I can see that your trace works through mmap, while I could not get mmap working even with a 1GB mp_mmapsize.

It looks like Jindra is going to recheck the code - he's found some comments in the sources about AC_FUNC_MMAP from 2007... I'll wait for a package rebuild.

I like that we are finally moving somewhere :)
Sadly, rpm has been deliberately crippled to avoid mmap(2) when the mapped region is larger than a limit (that is too small imho).

(aside)
The issue way back when was to avoid the appearance of large numbers in top(1) displays that bothered lusers who reported "bugs". And then there's sparse /var/log/lastlog, which has all sorts of hilarious hysteria with RPM. There's hardly any need to package /var/log/lastlog.

(another aside)
Prelinking (through a pipe to prelink --undo) has never been implemented correctly in RPM either. The need to verify a digest on unprelinked libraries forces prelinking detection and a prelink --undo helper rather than using mmap(2) directly to calculate library file digests. While prelink could do the digest check for --md5/--sha1, the recent change to SHA256 for file digests causes RPM to use I/O rather than mmap(2).

But the important rpmdb I/O questions for me are:
1) Does Berkeley DB performance degrade over time?
2) Is there a demonstrable/measurable need for fallocate?

Sure, unfragmented files have less overhead than fragmented files. But the performance gain needs to be balanced against the implementation cost. I have yet to hear of any credible performance gain measurements for "fragmented" rpmdbs; no one has bothered.
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
(In reply to comment #28)
> Sadly, rpm has been deliberately crippled to avoid mmap(2)
> when the mapped region is larger than a limit (that is too small imho).

Well, I'm not really sure there is some 'deliberate' crippling - but there is surely a lack of testing and performance checking.

> (aside)
> The issue way back when was to avoid the appearance of
> large numbers in top(1) displays that bothered lusers who
> reported "bugs". And then there's sparse /var/log/lastlog,
> which has all sorts of hilarious hysteria with RPM. There's
> hardly any need to package /var/log/lastlog.

How lastlog gets connected with rpm is probably beyond my imagination. But top should actually be showing much better numbers, as RSS should actually be smaller with mmap (when written the right way) and only the VIRT size gets bigger, which should not bother any user ;)

> (another aside)
> Prelinking (through a pipe to prelink --undo) has never been implemented
> correctly

I've got prelinking disabled, because it certainly must eat more CPU/Watts to compute dependencies than it could ever save during the lifetime of those valid results between updates... ;)

> But the important rpmdb I/O questions for me are:
> 1) Does Berkeley DB performance degrade over time?
> 2) Is there a demonstrable/measurable need for fallocate?
> Sure, unfragmented files have less overhead than fragmented files. But
> the performance gain needs to be balanced against the implementation cost.
> I have yet to hear of any credible performance gain measurements for
> "fragmented" rpmdbs; no one has bothered.

Again, I assume this is something for internal testing of the rpm tool itself, to simulate long-term heavy usage and check whether performance goes down. As mentioned in comment #2 I was probably not alone with this problem, but I've not made a copy of the rpm dir before rebuilding :( so I can hardly provide anything better than the observation that even after --rebuilddb, reading the file runs at about 50% of the speed of a 'defragmented' copy.

Also, when you say BDB could know better than Linux how to access the data - that could only be true if you used a separate partition for such a DB file. But when the data are stored on a filesystem like ext3, or any other advanced fs which has its own fragmentation, I doubt BDB could have any decent algorithm to handle this case.

And I should probably also mention that a few upgrades were not properly finished because of 'various' rawhide faults - but usually that was handled properly via yum-complete-transaction and package-cleanup --dupes. But eventually this might have led over time to the increased size, because such invalid transactions are still kept in the DB - again just a wild guess...?
> Well, I'm not really sure there is some 'deliberate' crippling - but there is
> surely a lack of testing and performance checking.

Surely you exclude present company. Have fun!

> I've got prelinking disabled, because it certainly must eat more CPU/Watts to
> compute dependencies than it could ever save during the lifetime of those
> valid results between updates... ;)

If RPM had any choice, prelink would never have been implemented. Note that RPM per se is hardly to blame for how it is used. Dependencies are input; if present, they will be checked. Otherwise go tune your file system with
rm -rf /
I guarantee "rm -rf /" tuning will be higher performing, for all kernel versions and file systems, than any other possible "tuning".

I'm going to assume
1) no BDB degradation over time, and
2) fallocate is not needed
until I hear otherwise. Both 1) and 2) can be fixed any time anyone chooses, outside of RPM.
(In reply to comment #31)
> > Well, I'm not really sure there is some 'deliberate' crippling - but there is
> > surely a lack of testing and performance checking.
>
> Surely you exclude present company. Have fun!

Well, as for me - I'm only trying to help resolve my bugzilla...

> If RPM had any choice, prelink would never have been implemented. Note that
> RPM per se is hardly to blame for how it is used. Dependencies are input; if
> present, they will be checked. Otherwise go tune your file system with
> rm -rf /
> I guarantee "rm -rf /" tuning will be higher performing, for all kernel
> versions and file systems, than any other possible "tuning".

Not really sure how rm -rf / relates to my plain, easy-to-see wall-clock experiments.

> I'm going to assume
> 1) no BDB degradation over time, and
> 2) fallocate is not needed
> until I hear otherwise. Both 1) and 2) can be fixed any time anyone chooses,
> outside of RPM.

Sure - I cannot provide you the exact code line numbers where the problem is - but just to show some numbers:

Before today's upgrade my 'Packages' file after --rebuilddb was something like 71MB. After an upgrade of approx. 900 packages fc12->fc13, this file is now over 97MB. Well, don't ask me what is in those extra 26MB; I've no idea what's even in those 71MB, as that's approx. 40KB per package.

I'm only providing the numbers for your statement 1).
> Before today's upgrade my 'Packages' file after --rebuilddb was something like
> 71MB. After an upgrade of approx. 900 packages fc12->fc13, this file is now
> over 97MB. Well, don't ask me what is in those extra 26MB; I've no idea what's
> even in those 71MB, as that's approx. 40KB per package.

SIZE != PERFORMANCE

Performance for hashes (in fact) requires pre-allocated free space in buckets in order to retain performance. Similarly for btrees: the costly operation for btree access is recursing upwards, splitting pages.

Running "db_stat -m" would show statistics, if anyone had a clue. But sure, run "ls -l" and use wall clock as measurements all you want and claim degradation "objectively".
Well, let's do more tests & numbers after today's upgrade - db_stat -m will be attached as well.

Please note - this bugzilla is about fixing rpm-4.7 - if it works for rpm5 that's good for your tool - but it will not solve the problem with my Fedora Rawhide installation.

I should also mention that my measurements are done on an unloaded machine and repeated several times - so it's not a one-time experiment but an easily repeatable thing - and as such the measurements ARE time consuming. I'll mainly provide wall clock time - as that's what counts for the user - he must wait that long - and experiences a noticeable response delay. As has been mentioned several times in this thread - CPU time is not all that high - it's not the best it could be - but it's in a tolerable range for now (<1s).

1.) uncached 'rpm -qa'
The original time before the yum upgrade was around 11.5 seconds under the same conditions 2 days ago. Now over 16.2 seconds - the number of packages increased due to various fc12->fc13 deps from 1886 to 1911 - possibly a few of them could be removed again. Anyway, I consider this number to be nearly the same.
Around 15.5s with __db.* removed (by default 4 of them are created).

2.) uncached 'rpm -qa --nodigest --nosignature'
around 14.5s without __db*
around 15.6s with __db*

3.) uncached 'cat Packages'
3.3s

4.) cached 'rpm -qa'
0.84s - nearly the same with or without __db*

5.) cached 'rpm -qa --nodigest --nosignature'
0.37s - nearly the same with or without __db*

An interesting conclusion from the experiments above is that those cache files actually make the rpm tool run slower.

Ok, and now something completely different: using the DB_CONFIG from comment 25.

6.) uncached 'rpm -qa'
~16.2s without __db* - quite similar to case 1.)
BUT ~12s if the __db* files are there (and there are 7 __db files now)

7.) uncached 'rpm -qa --nodigest --nosignature'
~15.5s without __db*
~12s with __db*

8.) cached 'rpm -qa'
1.1s without __db*
0.92s with __db*

9.) cached 'rpm -qa --nodigest --nosignature'
0.60s without __db*
0.44s with __db*

Ok - this leads to another conclusion - that using more __db* files seems to be a winning strategy only for the first-run case, when no files are in memory. In the other cases there is a noticeable performance penalty.

And now let me do the final experiment:

10.) uncached 'rpm -qa --nodigest --nosignature' after --rebuilddb
~12 seconds - so we are back at the original number from comment 1.

There are a few more interesting things from strace to be mentioned here:

a.) mmap is used to process the __db* files - but NOT the Packages file.
b.) with the DB_CONFIG file:
rm -f __db*
rpm -qa
rpm -qa <--- no longer reads the Packages file with pread() and uses the __db* index files instead - this is different from the default DB_CONFIG-less configuration. However - the speed is actually lower except for the uncached case.

For this test I did the --rebuilddb in a separate dir copy - thus I'll continue to use the original 100MB Packages file to see if there will be some more degradation over time of usage.
Created attachment 372467 [details]
db_stat -m

This is db_stat for rpm -qa with the __db* files and DB_CONFIG. Let me know which other db_stat outputs, and under which circumstances, you'd like to see.
Not interested in seeing anything. It's your bug with rpm-4.7 and Fedora.

My sole interest was your claim that Berkeley DB I/O performance and space degrade over time. I cannot tell that from your wallclock measurements.
One last note ...

db_stat -m claims 80M for the cache, you claim 100M for Packages.

And your cache hits are ~60%, not 100%.

Increasing the cache to accommodate the working set would seem to be useful.
(In reply to comment #37)
> One last note ...
>
> db_stat -m claims 80M for the cache, you claim 100M for Packages.
>
> And your cache hits are ~60%, not 100%.
>
> Increasing the cache to accommodate the working set would seem to be useful.

Well - sure, there is no problem increasing the cache size - the question is how to do it efficiently (and again - why should the RPM user actually care about the DB cache size - at least the rpm man page hides DB_CONFIG from users like me ;)

Let's get back to it: increasing the cache size doesn't improve the initial problem - and IMHO actually makes later cached processing slower - though only in the range of a couple of milliseconds - but it is measurable and consistent across multiple measurements.

I admit the time for --nodigest --nosignature goes down to 10 seconds for the uncached __db* case - but let me repeat that the wallclock experiment from comment 1 clearly shows that even with cachesize 0, doing 'cat Packages' followed by the rpm command outperforms this solution by a large factor.

IMHO the key point is to enable full mmap usage, preferably with MAP_POPULATE if available - and I'm getting the feeling that I'll need to look into this myself... :(

As for BDB degradation - if my simple case is not a sign of degradation (i.e. DB size growing by 30% and getting slower by approximately the same factor) then of course there is no problem to care about ;)
Created attachment 372649 [details]
db_stat -m, 1st run with bigger cache, --nodigest --nosignature
Created attachment 372650 [details]
db_stat -m, 2nd run with bigger cache, --nodigest --nosignature
Created attachment 372652 [details]
db_stat -m, 3rd run with bigger cache, --nodigest --nosignature
There are various heuristics that can be implemented to choose a cache size based on available memory and the (a priori known) working set size. The important point is that Berkeley DB has a resource cap that is honored. How the resource cap is chosen involves factors other than performance.

DB_CONFIG is most definitely documented (and has nothing whatsoever to do with RPM):

http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/env_db_config.html

Alas, RPM started using Berkeley DB before DB_CACHE was implemented and has never bothered to change.

Why is MAP_POPULATE preferred?

I've missed the 30% performance decrease. What is needed is some hint about the underlying cause in order to attempt a fix. I don't argue with wall clock. OTOH, I can't fix anything based solely on wall clock; the underlying cause needs to be identified.

No matter what: rpm --rebuilddb "fixes" degradation. And the degradation isn't orders of magnitude in either size or performance.
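(One possible heuristic of the kind described above, purely as an illustration: cap the cache at the smaller of the rpmdb working set and a fraction of free RAM. The 16 MiB floor and the one-quarter fraction are arbitrary choices of this sketch, not anything rpm or Berkeley DB actually does.)

#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysinfo.h>

size_t pick_cachesize(const char *packages_path)
{
    struct stat st;
    struct sysinfo si;
    size_t want = (size_t)16 << 20;              /* 16 MiB floor */

    if (stat(packages_path, &st) == 0 && (size_t)st.st_size > want)
        want = (size_t)st.st_size;               /* working set ~ size of Packages */

    if (sysinfo(&si) == 0) {
        size_t quarter = (size_t)si.freeram * si.mem_unit / 4;
        if (quarter > ((size_t)16 << 20) && want > quarter)
            want = quarter;                      /* never more than 1/4 of free RAM */
    }
    return want;
}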
Apologies. DB_CONFIG was intended, not DB_CACHE.
Ah, fault-ahead is why MAP_POPULATE is important.

Note that RPM achieved a 10% increase in upgrade performance (measured by wallclock ;-) by changing mmap(2) flags in 3 places, as well as by using a larger buffer (16Mb, but 256Kb had most of the gain) for zlib, way back when.

Linux kernels have an astonishing amount of read-ahead bandwidth available, and it often goes unused.
(In reply to comment #42)
> There are various heuristics that can be implemented
> to choose a cache size based on available memory
> and the (a priori known) working set size. The important point is that
> Berkeley DB has a resource cap that is honored. How the resource
> cap is chosen involves factors other than performance.
>
> DB_CONFIG is most definitely documented (and has
> nothing whatsoever to do with RPM):
>
> http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/env_db_config.html

Sure, I googled this easily - it's just that until you provided the hint in comment 25 I had no idea I should actually care about it - that's why I'd like either some hint in the rpm man page that the user should play with this setting, or for rpm to provide reasonably good defaults.

> Alas, RPM started using Berkeley DB before DB_CACHE was implemented
> and has never bothered to change.
>
> Why is MAP_POPULATE preferred?

mmap without this flag is actually quite slow - as there is no read-ahead - and unless rpm used a separate smart thread for this, it would lose a lot of performance (as has been shown in comment 16).

> I've missed the 30% performance decrease. What is needed is some hint
> about the underlying cause in order to attempt a fix. I don't argue
> with wall clock. OTOH, I can't fix anything based solely on wall clock;
> the underlying cause needs to be identified.
>
> No matter what: rpm --rebuilddb "fixes" degradation. And the degradation
> isn't orders of magnitude in either size or performance.

Well - yes - the degradation isn't all that big - but it all adds up - and makes the upgrade time quite horrible - yes, yum is the king here - but I'd like to start somewhere and see some progress ;)
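(A sketch of the MAP_POPULATE point: pre-fault the mapping so the first sequential pass doesn't stall page by page, falling back to madvise(MADV_WILLNEED) where the flag isn't available. The fallback strategy is an assumption of this sketch, not existing rpm behavior.)

#include <sys/mman.h>

void *map_whole_file(int fd, size_t size)
{
#ifdef MAP_POPULATE
    /* newer kernels: populate the page tables (and trigger read-ahead) up front */
    void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
    if (p != MAP_FAILED)
        return p;
#endif
    /* older systems: map lazily, then at least ask for read-ahead */
    void *q = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (q != MAP_FAILED)
        (void)madvise(q, size, MADV_WILLNEED);
    return q;
}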
(In reply to comment #44)
> Ah, fault-ahead is why MAP_POPULATE is important.
>
> Note that RPM achieved a 10% increase in upgrade performance
> (measured by wallclock ;-) by changing mmap(2) flags in 3 places,
> as well as by using a larger buffer (16Mb, but 256Kb had most of the gain)
> for zlib, way back when.
>
> Linux kernels have an astonishing amount of read-ahead
> bandwidth available, and it often goes unused.

Great - you noticed it yourself before I finished my comment ;) The flag is relatively new - and not available on older systems.
Sadly, in almost all cases for RPM, MAP_POPULATE for read-ahead doesn't help much. Decompression and digest checking tend to dominate CPU usage, and it's write performance (where madvise(MADV_DONTNEED) was the win) that tends to dominate I/O performance.

My guess is that MAP_POPULATE benefits would mostly be seen as quicker startup for simple queries. But there are other factors, such as redundant lookups, and the marshalling issues you see in the callgrind, that likely make MAP_POPULATE unimportant.

But, by all means, show me the wall clock moving faster ;-)
Well, that's the problem - I'd enjoy seeing rpm dominate my CPU :) but so far it takes 0.5s and the rest is iowait in the uncached case. Once it is CPU bound (for rpm-4.7), that would be the right time to focus on other factors :)
Created attachment 372657 [details]
rpmdigest --alldigests /bin/bash

hehe. Hint: a single #pragma and the -fopenmp build option sped rpmdigest up by 7x. That's where MAP_POPULATE will be a win.

PIGZ/PBZIP2 parallel I/O are already staged for deployment; mutexes on all RPM objects needed to be stabilized first. Stability was achieved mid-summer.
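(Guesswork at the shape of a "single #pragma + -fopenmp" change for --alldigests: hash the same mapped buffer with several algorithms, one per thread. OpenSSL's EVP API is used here purely for illustration - rpmdigest's real internals are not shown. Build with something like: cc -fopenmp ... -lcrypto)

#include <openssl/evp.h>
#include <stdio.h>

void all_digests(const unsigned char *buf, size_t len)
{
    const EVP_MD *algos[] = { EVP_md5(), EVP_sha1(), EVP_sha256(), EVP_sha512() };
    enum { N = sizeof(algos) / sizeof(algos[0]) };
    unsigned char md[N][EVP_MAX_MD_SIZE];
    unsigned int mdlen[N];

    /* each algorithm walks the buffer independently, so they parallelize trivially */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        EVP_Digest(buf, len, md[i], &mdlen[i], algos[i], NULL);

    for (int i = 0; i < N; i++)
        printf("digest %d: %u bytes\n", i, mdlen[i]);
}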
FYI: http://rpm5.org/community/rpm-devel/4014.html Queries with patterns but still loading headers. When I lose the headerLoad(), I expect approx an order of magnitude faster, and perhaps 3 orders of magnitude less data to read, for rpm -qa. Course that won't help "rpm dominating your CPU" ;-)
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Still applies to rawhide - unsure why nobody cares...
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Rpm >= 4.10 (in Fedora >= 18) has significant optimizations to header loading (in particular from the rpmdb), which is one of the bigger speed bottlenecks; the rest tends to be in Berkeley DB internals over which rpm has little say. Since there is no single actual *bug* here, I'm considering the case closed.
Well, I'm not convinced this has ever been fixed - since even today - still being a T61 user, but now with an SSD and a throughput of ~250MB/s - I get this:

# echo 3 >/proc/sys/vm/drop_caches
# time rpm --nosignature --nodigest -qa | wc -l
3194

real 0m8.205s
user 0m0.697s
sys 0m2.817s

# time rpm --nosignature --nodigest -qa | wc -l
3194

real 0m0.680s
user 0m0.333s
sys 0m0.337s

As can be seen - the second read is significantly faster.

Combine cat & rpm:

# echo 3 >/proc/sys/vm/drop_caches
# time ( cat /var/lib/rpm/Packages >/dev/null; rpm --nosignature --nodigest -qa | wc -l )
3194

real 0m2.080s
user 0m0.357s
sys 0m0.860s

# rpm -qa rpm
rpm-4.11.0.1-4.fc20.x86_64

But being an SSD user I'm not really depressed by the performance of rpm/yum nowadays - I'm rather stressed by the pointless rewrites of multiple GB of essentially the same data that come with the frequent rawhide mass rebuild upgrades...