Bug 504579 - [RFE]: re-use hashes for files with the same device and inode numbers
[RFE]: re-use hashes for files with the same device and inode numbers
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: coreutils (Show other bugs)
rawhide
All Linux
low Severity low
: ---
: ---
Assigned To: Ondrej Vasik
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-08 06:50 EDT by Daniel Mach
Modified: 2012-12-11 07:24 EST (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-12-11 07:24:39 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Daniel Mach 2009-06-08 06:50:40 EDT
When md5sum is run on bunch of files, some of them might be hardlinked and I don't see any reason to compute hash on the same content again.

It should be possible to implement hash caching without significant code change:
- create a new function which will handle (dev, ino) -> hash cache
- call it instead of original function
- on cache hit, return cached hash
- otherwise call original function, compute hash, store it to cache and return
Comment 1 Ondrej Vasik 2009-06-08 10:31:21 EDT
It's quite easy to handle such thing with short wrapper shell script(just get info about file dev/inode (ls/find/stat/whatever) , sort it by device and inode, and call md5/shaxxxsum just for the case that device and inode differs from previous file (they are sorted, so you have hardlinks with same sums in the row, you just need to remember only last one file)). At the moment md5/shaxxxsum utitilities are not calling stat(2) - just fopen(3) - and in file desriptor structure there is AFAIK no information about dev/inode of the file - so IMHO adding cache/stat/dynamicmemoryallocation will make compact code of md5sum.c much more difficult to read for quite a small benefit for common users. I'll check the upstream opinion before working on it...

Note You need to log in before you can comment on or make changes to this bug.