Bug 589821

Summary: Disable caching of program locations in the hash table?
Product: [Fedora] Fedora
Reporter: Matt McCutchen <matt>
Component: bash
Assignee: Roman Rakus <rrakus>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 12
CC: rrakus, tsmetana
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Last Closed: 2010-05-19 19:40:35 UTC

Description Matt McCutchen 2010-05-07 01:49:44 UTC
Bash caches the locations of programs in a hash table to reduce the amount of path searching.  The hash table can be manipulated via the BASH_CMDS variable and the "hash" built-in command.
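
As a small illustration of the two interfaces mentioned above, here is a sketch that clears the table, triggers one path search, and reads the cached location back both ways. It assumes bash 4.0 or later (for BASH_CMDS) and uses "ls" purely as an example command.

```bash
#!/usr/bin/env bash
# Start from an empty hash table.
hash -r

# Running an external command by name makes bash search PATH
# and remember where it found the program.
ls / > /dev/null

# Read the cached location back through the BASH_CMDS array...
cached_via_array=${BASH_CMDS[ls]}

# ...and through the hash built-in (-t prints the remembered path).
cached_via_builtin=$(hash -t ls)

echo "BASH_CMDS[ls] = $cached_via_array"
echo "hash -t ls    = $cached_via_builtin"
```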

This cache can improve performance by reducing the amount of path searching, but it can also cause problems if it gets out of date.  If a program's location is cached and a new copy is installed earlier in the PATH, bash will continue to use the old copy until "hash -r" is run.  If the new copy is then uninstalled, attempts to execute that program name will fail rather than fall back to the old version, again until "hash -r" is run.  (The first issue is an unavoidable property of the caching mechanism, while the second could be fixed.)
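
The first problem can be reproduced with a short script. This is only a sketch: the program name "demo-prog" and the two throwaway directories are made up for the demonstration, and it should be run under bash.

```bash
#!/usr/bin/env bash
# Throwaway directories standing in for an earlier and a later PATH entry.
tmp=$(mktemp -d)
mkdir "$tmp/early" "$tmp/late"
PATH="$tmp/early:$tmp/late:$PATH"

# Install the "old" copy in the later directory and run it once in the
# current shell, so that its location is cached.
printf '#!/bin/sh\necho old\n' > "$tmp/late/demo-prog"
chmod +x "$tmp/late/demo-prog"
demo-prog > "$tmp/first.out"

# Install a "new" copy earlier in PATH. The cached location still wins.
printf '#!/bin/sh\necho new\n' > "$tmp/early/demo-prog"
chmod +x "$tmp/early/demo-prog"
demo-prog > "$tmp/stale.out"

# Clearing the table forces a fresh path search, which finds the new copy.
hash -r
demo-prog > "$tmp/fresh.out"

first=$(cat "$tmp/first.out")
stale=$(cat "$tmp/stale.out")
fresh=$(cat "$tmp/fresh.out")
echo "before install: $first, after install: $stale, after hash -r: $fresh"
rm -rf "$tmp"
```

The program is run with its output redirected to a file rather than inside $(...), because a command substitution runs in a subshell and would not update the parent shell's hash table.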

On modern Fedora systems, actually searching the path is so fast that I wonder whether the performance benefits of leaving the cache enabled still outweigh the problems.  I ran a time trial on my Dell Latitude D620 running Fedora 12, using "true", which is in /bin, the 10th directory on my PATH:

$ time (enable -n true; for i in {1..10000}; do hash -r; /bin/true; done)

real	0m15.155s
user	0m3.671s
sys	0m12.713s
$ time (enable -n true; for i in {1..10000}; do hash -r; true; done)

real	0m16.156s
user	0m3.916s
sys	0m13.857s

("hash -r" is run in the second case to force the path search, and in the first case to ensure a fair comparison.)  The extra time to do 10000 path searches is about one second, so one path search is about 0.1 ms.  What do you think?
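
For reference, the arithmetic behind that estimate can be written out as follows. It simply takes the difference of the two "real" times above and divides by the number of iterations; a single pair of runs like this is only a rough measurement.

```bash
#!/usr/bin/env bash
# "real" times from the two runs above, in seconds.
without_search=15.155
with_search=16.156
iterations=10000

# Difference per iteration, converted to milliseconds.
per_search_ms=$(awk -v a="$with_search" -v b="$without_search" -v n="$iterations" \
    'BEGIN { printf "%.4f", (a - b) / n * 1000 }')
echo "approx. cost of one path search: ${per_search_ms} ms"
```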

Comment 1 Roman Rakus 2010-05-07 11:10:26 UTC
Try enabling the "checkhash" shell option (shopt -s checkhash). From the man page:
If this is set, Bash checks that a command found in the hash table exists before trying to execute it. If a hashed command no longer exists, a normal path search is performed.
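
A minimal sketch of that behaviour, using a made-up program name "demo-prog" and temporary directories (bash only, since shopt is a bash built-in):

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir "$tmp/early" "$tmp/late"
PATH="$tmp/early:$tmp/late:$PATH"

# Two copies of a hypothetical program; the one in "early" is found
# first and its location is cached.
printf '#!/bin/sh\necho early\n' > "$tmp/early/demo-prog"
printf '#!/bin/sh\necho late\n'  > "$tmp/late/demo-prog"
chmod +x "$tmp/early/demo-prog" "$tmp/late/demo-prog"
demo-prog > /dev/null

# Remove the cached copy. With checkhash set, bash notices that the
# cached file is gone and searches PATH again instead of failing.
rm "$tmp/early/demo-prog"
shopt -s checkhash
demo-prog > "$tmp/fallback.out"

fallback=$(cat "$tmp/fallback.out")
echo "with checkhash: $fallback"
rm -rf "$tmp"
```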

Comment 2 Matt McCutchen 2010-05-07 15:35:20 UTC
Please consider the merits of my original proposal.

"checkhash" only addresses the failure to run a removed program, not the failure to find a newly installed program.

When I looked in the man page, I noticed another option "hashall" that can be turned off to disable the caching completely.  I did a new timing test using "hashall", with results similar to the previous test:

$ time (enable -n true; for i in {1..10000}; do true; done)

real	0m14.399s
user	0m3.148s
sys	0m12.817s
$ time (enable -n true; set +o hashall; for i in {1..10000}; do true; done)

real	0m16.709s
user	0m3.706s
sys	0m14.814s
$ time (enable -n true; for i in {1..10000}; do true; done)

real	0m15.623s
user	0m3.436s
sys	0m13.965s
$ time (enable -n true; set +o hashall; for i in {1..10000}; do true; done)

real	0m16.891s
user	0m4.119s
sys	0m14.719s
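
Apart from timing, the functional effect of turning "hashall" off can be checked with a sketch like the one below. The program name "demo-prog" and the directories are invented for the demonstration.

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir "$tmp/early" "$tmp/late"
PATH="$tmp/early:$tmp/late:$PATH"

# Turn off location caching for this shell (equivalent to "set +h").
set +o hashall

# Only the "old" copy exists, in the later PATH directory.
printf '#!/bin/sh\necho old\n' > "$tmp/late/demo-prog"
chmod +x "$tmp/late/demo-prog"
demo-prog > "$tmp/before.out"

# A new copy installed earlier in PATH is picked up on the very next
# call, because every invocation performs a fresh path search.
printf '#!/bin/sh\necho new\n' > "$tmp/early/demo-prog"
chmod +x "$tmp/early/demo-prog"
demo-prog > "$tmp/after.out"

before=$(cat "$tmp/before.out")
after=$(cat "$tmp/after.out")
echo "before install: $before, after install: $after"
rm -rf "$tmp"
```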

Unfortunately, there's no way to disable "hashall" system-wide without patching bash.  The SHELLOPTS environment variable can only turn options on, not off (pretty poor design).
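
The SHELLOPTS limitation can be observed with a quick check like the following sketch. The option "braceexpand" is just a placeholder value here; the point is that the list omits "hashall".

```bash
#!/usr/bin/env bash
# Start a child bash with SHELLOPTS in its environment. The list
# deliberately leaves out hashall.
state=$(env SHELLOPTS=braceexpand bash -c \
    'if shopt -qo hashall; then echo on; else echo off; fi')

# hashall is still enabled in the child: options listed in SHELLOPTS
# are switched on, but options missing from it are not switched off.
echo "hashall in child shell: $state"
```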

Comment 3 Roman Rakus 2010-05-19 19:40:35 UTC
There can be some not-so-fast hardware, for example a CD or an NFS mount. I'm not willing to change the current behaviour (and upstream will not be either).