Red Hat Bugzilla – Bug 467508
ls slower, due to capabilities
Last modified: 2008-10-27 12:31:08 EDT
Description of problem:
I'm not even 100% sure I should log this, but I recently got a coreutils update, and I've seen ls be "slow" a couple of times on large-ish directories. The configuration for ls specifically is:
/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs
...now doing an strace shows lots of these calls between the write()s:
capget(0x20071026, 0, NULL) = -1 EFAULT (Bad address)
getxattr("20020513h.gif", "security.capability"..., 0x7fff4e8701e0, 20) = -1 ENODATA (No data available)
...generally being 4 sets of calls (matching the 4 rows of output for each line), and this can be so slow that you can watch the terminal scroll each line upwards ... of course that's for the "first" attempt, if you then try it immediately afterwards it's fast (presumably the capabilities for each inode are cached). Also even from a cold cache POV, it's not always slow.
As I said I'm not 100% sure this is an ls bug, and not a kernel bug or an ext3 "capabilities are slow" feature ... it's just I've only just started seeing it since the coreutils update, so I thought I'd mention it.
Version-Release number of selected component (if applicable):
coreutils.x86_64 6.10-33.fc9 installed
(In reply to comment #0)
Thank you for the report.
> The configuration for ls specifically is:
> /bin/ls --color=auto --sort=version -F -T 0 -ABFbhs
Why so many parameters? (-F is even there twice.) A minimal example would be nice. Can you give me the output of this (1st run, 2nd run, old ls, new ls - 4 cases together)?
time ls -U1 --color > /dev/null # without sort, without output to terminal
Though I could not reproduce this bug, this is a really good point. It should be possible to turn off capability checking by unsetting the ca attribute in $LS_COLORS (in the same way as symlink validity checking). I will try to propose this one-line patch to upstream.
It's actually a couple of aliases, so I just type "l" (ell) ... so I followed the aliases and pasted together what ls will see.
I hadn't realized -F was in my aliases twice :)
Doing the timing is somewhat hard because I have to wait until the directory is out of cache again, to get the "slow" version. However first thing in the morning is a good time for at least one cold run, so (both new ls):
/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs 0.01s user 0.15s system 3% cpu 4.345 total
/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs 0.01s user 0.07s system 45% cpu 0.181 total
...doing the same with old-ls will be particularly hard, as I don't have that installed anymore.
(In reply to comment #2)
You can try the patch proposed to coreutils upstream:
It should behave as "old ls" if capabilities checking is disabled by $LS_COLORS.
By disabled you mean that:
dircolors -b ~/.dir_colors | perl -nle 'print for split ":"'
...doesn't have anything starting with "ca=", yeh?
I can try that, although a test rpm would be easier?:)
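A shell-only way to run the same check as the perl one-liner above could look like the sketch below. The function name ca_status is hypothetical. Treating an empty value, "0" or "00" as "disabled" is an assumption based on what later comments in this bug report (adding "ca=:" or "CAPABILITY 00" made the getxattr calls go away).

```shell
# Hypothetical helper: report how the "ca" (capability) entry is set
# in an LS_COLORS-style string passed as $1.
ca_status() {
    # Split on ':' and keep the last entry that starts with "ca=".
    ca_entry=$(printf '%s\n' "$1" | tr ':' '\n' | grep '^ca=' | tail -n 1)
    case "$ca_entry" in
        '')                 echo "unset" ;;     # no ca= entry at all
        'ca='|'ca=0'|'ca=00') echo "disabled" ;;  # assumed to switch the check off
        *)                  echo "colored: ${ca_entry#ca=}" ;;
    esac
}
```

For example, ca_status "$LS_COLORS" checks the current environment. An "unset" result matters because, as discussed further down, a missing entry does not turn the check off.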
(In reply to comment #4)
> By disabled you mean that:
> dircolors -b ~/.dir_colors | perl -nle 'print for split ":"'
I am not familiar with perl, this works for me:
$ eval `dircolors -b | sed 's/ca=[^:]*:/ca=:/'`
> ...doesn't have anything starting with "ca=", yeh?
It is not set in /etc/DIR_COLORS on F-9 because of tcsh Bug 457342, but you can get this attribute from dircolors utility.
> I can try that, although a test rpm would be easier?:)
Test rpm will be available soon, we are still waiting for upstream reaction...
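The sed command above only helps when a "ca=" entry is already present. A slightly more defensive variant is sketched here: it blanks an existing entry and appends "ca=:" when there is none. The function name disable_ca is hypothetical, and the assumption that an empty "ca=" value disables the check comes from this thread, not from the coreutils documentation.

```shell
# Hypothetical helper: print the LS_COLORS value given as $1 with the
# capability entry blanked, adding "ca=:" if no such entry exists.
disable_ca() {
    printf '%s\n' "$1" | awk -F: '
    {
        out = ""; seen = 0
        for (i = 1; i <= NF; i++) {
            field = $i
            if (field == "") continue            # skip empty fields (e.g. trailing colon)
            if (field ~ /^ca=/) { field = "ca="; seen = 1 }
            out = out field ":"
        }
        if (!seen) out = out "ca=:"              # no entry yet: append a blank one
        print out
    }'
}
```

Typical use would be LS_COLORS=$(disable_ca "$LS_COLORS"); export LS_COLORS, followed by another strace run to confirm that the getxattr calls are gone.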
If I run just dircolors on Fedora 9 I get a "ca=" line. I have my own ~/.dir_colors file though, which doesn't have that.
Feel free to post the test rpm, when it's ready and I'll try that. Thanks.
fwiw, maybe more useful for some kernel guys, stracing some cold dirs. I can see that it's the getxattr("blah", "security.capability", ...) calls that are taking all the time.
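For anyone wanting to repeat this measurement, a rough sketch follows. It assumes the trace was taken with strace -T, which appends the time spent in each call as "<seconds>" at the end of the line. The function name sum_getxattr_time is hypothetical.

```shell
# Hypothetical helper: read `strace -T` output on stdin and total the time
# spent in getxattr-style calls (the pattern also matches lgetxattr).
sum_getxattr_time() {
    awk '
    /getxattr\(/ && match($0, /<[0-9.]+>$/) {
        # Strip the surrounding "<" and ">" and add the duration.
        total += substr($0, RSTART + 1, RLENGTH - 2)
        calls++
    }
    END { printf "%d calls, %.6f s\n", calls, total }'
}
```

One possible invocation is strace -T -e trace=getxattr ls --color some_dir 2>&1 >/dev/null | sum_getxattr_time, where some_dir is a placeholder. The trace goes to stderr, so the redirection order matters.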
Roughly one call in every 5 takes about 0.015 of a second, with the rest taking around 0.00018, so if you have 2,000 files in a directory you quickly get a lot of latency ((2000 / 5) * 0.015).
This implies, to me, that the kernel isn't doing the correct amount of readahead on the getxattr data (ie. not growing the number of xattrs it tries to read at once).
Predictably after the first cold run, all the getxattr calls are in the 0.00005 - 0.00015 range.
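The back-of-the-envelope estimate above can be worked through as follows. This is only a sketch using the figures quoted in this comment (one call in five at about 0.015 s, the rest at about 0.00018 s). The function name estimate_cold_latency is hypothetical.

```shell
# Hypothetical helper: estimate cold-cache getxattr latency for $1 files,
# assuming 1 call in 5 is slow (0.015 s) and the others are fast (0.00018 s).
estimate_cold_latency() {
    awk -v n="$1" 'BEGIN {
        slow = (n / 5) * 0.015            # the (n / 5) * 0.015 term from the comment
        fast = (n - n / 5) * 0.00018      # remaining calls at the fast rate
        printf "slow=%.3f fast=%.3f total=%.3f\n", slow, fast, slow + fast
    }'
}

estimate_cold_latency 2000   # prints: slow=6.000 fast=0.288 total=6.288
```

So for a 2,000-file directory the slow calls alone account for roughly 6 seconds, and the fast calls add comparatively little.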
The patch was accepted by upstream:
You can test the rpm from this scratch build:
Feel free to open/clone the bug against libcap/kernel if you think there is a performance problem. This patch is only a way to stop capabilities checking, not to make it run faster.
Since it was accepted by upstream, I have built it in Rawhide as coreutils-6.12-16.fc10, with the possibility of disabling the capability check (and its ls performance impact), so that it is available for F-10. As Kamil said, this is only a workaround for coreutils - feel free to open a bug against libcap/kernel about this strange, unpredictable performance issue.
coreutils.x86_64 6.10-33.1.fc9 installed
...doesn't work for me, strace still shows it doing getxattr calls.
% echo $COLORS
% dircolors -b ~/.dir_colors | perl -nle 'print for split ":"' | fgrep ca
Could you please attach the full content of $LS_COLORS?
Thanks. Could you try to add "ca=:" to the list and check with strace again?
yeh, adding that manually fixed it.
Fixed coreutils package is available for F-10, closing NEXTRELEASE.
So I just upgraded coreutils and util-linux-ng from rawhide, and it still has the same problem ... if you don't configure ca as off it still does the capabilities check?
Is the default for ca= not being there that it does/means something?
The default behaviour is to color files that have capabilities, and this causes the performance impact. The workaround was only about making it possible to disable the capability coloring... I guess you still have a ~/.dir_colors or a changed /etc/DIR_COLORS (a %config(noreplace) file) without a defined capability color, and therefore the default from dircolors is used. Add the line "CAPABILITY 00" there and it should be ok...
Maybe we want to change the default then?
Yeh, I have a ~/.dir_colors ... my worry is that other people will have these too, also not see any colour differences (almost no files have capabilities) and see the perf. impact.
I've added the CAPABILITY 00 in there now, so it wfm at least :)
Also random idea ... capabilities only make sense on exec, yeh? ... so how about we only check them for files with (mode & 0111) ?
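The executable-bit test suggested here (a bitwise AND of the file mode with 0111) can be illustrated in shell as below. This is only a sketch of the idea, not how ls implements anything. The function name needs_cap_check is hypothetical.

```shell
# Hypothetical helper: succeed if the octal mode in $1 (e.g. 0755) has any
# execute bit set, i.e. (mode & 0111) != 0. The leading zero makes the
# shell's arithmetic treat the number as octal.
needs_cap_check() {
    [ $(( $1 & 0111 )) -ne 0 ]
}

needs_cap_check 0755 && echo "0755: would check capabilities"
needs_cap_check 0644 || echo "0644: would skip the getxattr call"
```

With GNU stat, the mode of a real file could be fed in as needs_cap_check "0$(stat -c %a "$file")", where $file is a placeholder.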
As adding capability displaying to ls was requested internally by Red Hat security tool guys, I would like to keep it default. You are right that checking only on executables does make sense and I'm sure it will reduce performance impact even when not cached. What do you think, Kamil?
I think it would increase the performance, but at the cost of usability. Non-executable files with capabilities are an unusual case which should be visible at first glance.
I did some investigation about the "performance impact", here are my results (coreutils-6.12-16.fc10):
# umount /home && mount /home
# mount | fgrep home
/dev/mapper/VolGroup00-LogVol02 on /home type ext3 (rw,usrquota,grpquota)
$ time ~/cvs/coreutils/devel/coreutils-6.12/src/ls -U1 --color huge_dir >/dev/null
               | 1st run | 2nd run | 3rd run |
default colors |  24.4s  |  0.8s   |  0.8s   |
ca=00          |  23.0s  |  0.4s   |  0.4s   |
James, can you give me such results from your system?
> Non-executable files with capabilities are an unusual case which should be visible at first glance.
I'm not disagreeing that's unusual ... but it is painful given the amount of work required, and that the unusual case is a noop. And there are other unusual cases which aren't checked/flagged.
As for timings, see the time results in comment #2 and the further explanation in comment #7 ... this is very consistent for me, doing capability checking adds roughly 0.015 of a second for every 5 files in a directory. A couple of directories I ls in have > 1,000 files in them ... at which point you can "see" that the ls output is slow (instead of being "instant" it looks like a fast printer).
So you have completely different results. If we consider only the 1st run (not cached) it is only about 5% slower with checking of capabilities on my box. I forgot to note that huge_dir contains 100,000 files.
Maybe you can ping me on #fedora-devel, the communication would be faster ;-)
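To make the comparison between the two sets of numbers concrete, the relative overhead can be computed from the timing table above. This is a sketch only. The function name overhead_pct is hypothetical, and it measures the slowdown relative to the ca=00 run.

```shell
# Hypothetical helper: percentage slowdown of a run with capability checking
# ($1, seconds) relative to a run without it ($2, seconds).
overhead_pct() {
    awk -v checked="$1" -v baseline="$2" 'BEGIN {
        printf "%.1f\n", (checked - baseline) / baseline * 100
    }'
}

overhead_pct 24.4 23.0   # 1st (cold) run: prints 6.1
overhead_pct 0.8 0.4     # 2nd (cached) run: prints 100.0
```

On those figures the cold run is in the same ballpark as the "about 5%" quoted above, while the cached runs take twice as long with checking enabled, although the absolute difference there is only 0.4s.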