Bug 467508 - ls slower, due to capabilities
Summary: ls slower, due to capabilities
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: coreutils
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-10-17 20:29 UTC by James Antill
Modified: 2008-10-27 16:31 UTC (History)

Fixed In Version: coreutils-6.12-16.fc10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-10-24 10:22:30 UTC
Type: ---
Embargoed:



Description James Antill 2008-10-17 20:29:51 UTC
Description of problem:
 I'm not even 100% sure I should log this, but I recently got a coreutils update, and I've seen ls be "slow" a couple of times on large-ish directories. The configuration for ls specifically is:

/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs

...now doing an strace shows lots of these calls between the write()s:

capget(0x20071026, 0, NULL)             = -1 EFAULT (Bad address)
getxattr("20020513h.gif", "security.capability"..., 0x7fff4e8701e0, 20) = -1 ENODATA (No data available)

...generally being 4 sets of calls (matching the 4 rows of output for each line), and this can be so slow that you can watch the terminal scroll each line upwards. Of course, that's for the "first" attempt; if you then try it again immediately afterwards it's fast (presumably the capabilities for each inode are cached). Also, even from a cold cache POV, it's not always slow.
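
One way to put a number on this is to count the capability lookups in an strace log. The following is a minimal sketch, assuming the trace was saved to a file with strace's -o option; the function name count_cap_lookups and the log path in the comment are hypothetical.

```shell
# Hypothetical helper: count getxattr() lookups of "security.capability"
# in an strace log file.
# The log could be produced with something like:
#   strace -o /tmp/ls.trace ls --color=auto some_dir > /dev/null
count_cap_lookups() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'getxattr(.*"security\.capability"' "$1"
}
```

Comparing the count against the number of directory entries shows how many files triggered a lookup.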

 As I said I'm not 100% sure this is an ls bug, and not a kernel bug or an ext3 "capabilities are slow" feature ... it's just I've only just started seeing it since the coreutils update, so I thought I'd mention it.

Version-Release number of selected component (if applicable):
coreutils.x86_64                      6.10-33.fc9                      installed

Comment 1 Kamil Dudka 2008-10-20 13:20:57 UTC
(In reply to comment #0)
Thank you for the report.

> The configuration for ls specifically is:
> 
> /bin/ls --color=auto --sort=version -F -T 0 -ABFbhs
Why so many parameters? (-F even appears twice.) A minimal example would be nice. Can you give me the output of this (1st run, 2nd run, old ls, new ls - 4 cases together)?
time ls -U1 --color > /dev/null # without sort, without output to terminal

Though I could not reproduce this bug, this is a really good point. It should be possible to turn off capability checking by unsetting the ca attribute in $LS_COLORS (in the same way as symlink validity checking). I will try to propose this one-line patch to upstream.

Comment 2 James Antill 2008-10-20 13:44:01 UTC
It's actually a couple of aliases, so I just type "l" (ell) ... so I followed the aliases to paste together what ls will see.
 I hadn't realized -F was in my aliases twice :)

Doing the timing is somewhat hard because I have to wait until the directory is out of cache again, to get the "slow" version. However first thing in the morning is a good time for at least one cold run, so (both new ls):

/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs  0.01s user 0.15s system 3% cpu 4.345 total

/bin/ls --color=auto --sort=version -F -T 0 -ABFbhs  0.01s user 0.07s system 45% cpu 0.181 total

...doing the same with old-ls will be particularly hard, as I don't have that installed anymore.

Comment 3 Kamil Dudka 2008-10-20 15:39:11 UTC
(In reply to comment #2)
You can try the patch proposed to coreutils upstream:
http://lists.gnu.org/archive/html/bug-coreutils/2008-10/msg00242.html

It should behave as "old ls" if capabilities checking is disabled by $LS_COLORS.

Comment 4 James Antill 2008-10-20 16:24:14 UTC
 By disabled you mean that:

dircolors -b ~/.dir_colors | perl -nle 'print for split ":"'

...doesn't have anything starting with "ca=", yeh?

 I can try that, although a test rpm would be easier?:)

Comment 5 Kamil Dudka 2008-10-20 16:42:27 UTC
(In reply to comment #4)
>  By disabled you mean that:
> 
> dircolors -b ~/.dir_colors | perl -nle 'print for split ":"'
I am not familiar with perl, this works for me:
$ eval `dircolors -b | sed 's/ca=[^:]*:/ca=:/'`

> ...doesn't have anything starting with "ca=", yeh?
It is not set in /etc/DIR_COLORS on F-9 because of tcsh Bug 457342, but you can get this attribute from the dircolors utility.

>  I can try that, although a test rpm would be easier?:)
A test rpm will be available soon; we are still waiting for upstream's reaction...

Comment 6 James Antill 2008-10-20 17:13:17 UTC
If I run just dircolors on Fedora 9 I get a "ca=" line. I have my own ~/.dir_colors file though, which doesn't have that.

Feel free to post the test rpm, when it's ready and I'll try that. Thanks.

Comment 7 James Antill 2008-10-21 21:32:57 UTC
fwiw, maybe more useful for some kernel guys, stracing some cold dirs. I can see that it's the getxattr("blah", "security.capability", ...) calls that are taking all the time.
 Roughly one call in every 5 takes about 0.015 of a second, with the rest taking around 0.00018, so if you have 2,000 files in a directory you quickly get a lot of latency ((2000 / 5) * 0.015).
 This implies, to me, that the kernel isn't doing the correct amount of readahead on the getxattr data (ie. not growing the number of xattrs it tries to read at once).

Predictably, after the first cold run, all the getxattr calls are in the 0.00005 - 0.00015 range.
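
The cold-cache estimate above can be worked through explicitly. This is a rough sketch under the stated assumption that exactly one call in five is slow and the rest are fast; the function name estimate_cold_latency is hypothetical, and the timings are the ones quoted in this comment.

```shell
# Rough cold-cache estimate from the per-call timings quoted above.
# Assumption: exactly one getxattr call in five is slow, the rest are fast.
estimate_cold_latency() {
    # $1 = number of files, $2 = seconds per slow call, $3 = seconds per fast call
    LC_ALL=C awk -v n="$1" -v slow="$2" -v fast="$3" 'BEGIN {
        slow_calls = n / 5
        fast_calls = n - slow_calls
        printf "%.1f\n", slow_calls * slow + fast_calls * fast
    }'
}

estimate_cold_latency 2000 0.015 0.00018   # prints 6.3
```

For 2,000 files the slow calls alone account for (2000 / 5) * 0.015 = 6 seconds; the fast calls add roughly another 0.3.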

Comment 8 Kamil Dudka 2008-10-22 10:45:30 UTC
The patch was accepted by upstream:
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=f3f1ccfd871ee395e7fafc051c1b7dedb39fdfc9

You can test the rpm from this scratch build:
http://koji.fedoraproject.org/koji/taskinfo?taskID=895188

Feel free to open/clone the bug against libcap/kernel if you think there is a performance problem. This patch is only a way to stop capabilities checking, not to make it run faster.

Comment 9 Ondrej Vasik 2008-10-22 11:53:29 UTC
Since the patch was accepted by upstream, I have built it in Rawhide as coreutils-6.12-16.fc10, so that the possibility of disabling the capability check (and its performance impact on ls) is there for F-10. As Kamil said, this is only a workaround in coreutils - feel free to open a bug against libcap/kernel about this strange, unpredictable performance issue.

Comment 10 James Antill 2008-10-22 13:44:39 UTC
Installed Packages
coreutils.x86_64                     6.10-33.1.fc9                     installed

...doesn't work for me, strace still shows it doing getxattr calls.

% echo $COLORS                            
/home/james/.dir_colors
% dircolors -b ~/.dir_colors | perl -nle 'print for split ":"' | fgrep ca
%

Comment 11 Kamil Dudka 2008-10-23 09:49:16 UTC
Could you please attach the full content of $LS_COLORS?

Comment 12 James Antill 2008-10-23 13:33:39 UTC
no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.pl=01;32:*.py=01;32:*.csh=01;32:*.conf=32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:

Comment 13 Kamil Dudka 2008-10-23 14:57:05 UTC
Thanks. Could you try to add "ca=:" to the list and check with strace again?

Comment 14 James Antill 2008-10-23 17:42:39 UTC
yeh, adding that manually fixed it.
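
For an $LS_COLORS value that has no "ca=" entry at all, like the one in comment 12, a substitution on an existing entry has nothing to match, so the entry has to be appended. A minimal sketch, assuming a POSIX shell; the function name disable_ca is hypothetical.

```shell
# Hypothetical helper: return an LS_COLORS-style string with an empty "ca=" entry.
disable_ca() {
    # Split on ":", drop empty fields and any existing ca=... entry, rejoin.
    rest=$(printf '%s\n' "$1" | tr ':' '\n' | grep -v -e '^ca=' -e '^$' | paste -sd ':' -)
    # Append the empty entry that switched the lookups off in this report.
    printf '%s:ca=:\n' "$rest"
}

# Possible use in the current shell:
#   LS_COLORS=$(disable_ca "$LS_COLORS"); export LS_COLORS
```

Dropping any existing entry before appending keeps a single "ca=" in the list.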

Comment 15 Kamil Dudka 2008-10-24 10:22:30 UTC
A fixed coreutils package is available for F-10, closing NEXTRELEASE.

Comment 16 James Antill 2008-10-24 13:21:06 UTC
So I just upgraded coreutils and util-linux-ng from rawhide, and it still has the same problem ... if you don't configure ca as off it still does the capabilities check?
 Is the default, when ca= isn't there, that it still does the check?

Comment 17 Ondrej Vasik 2008-10-24 14:55:58 UTC
Default behaviour is to have colored capability - and this causes the performance impact. The workaround was only about making it possible to disable the colorized capability... I guess you still have ~/.dir_colors or a changed /etc/DIR_COLORS (%config(noreplace) file) without a defined capability color - and therefore the default from dircolors is used. Add the line "CAPABILITY 00" there and it should be ok...
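
Adding that line can be scripted so that running it twice does not duplicate the entry. A minimal sketch; the function name add_capability_off is hypothetical, and it deliberately leaves any CAPABILITY line that is already present untouched.

```shell
# Hypothetical helper: append "CAPABILITY 00" to a dircolors file
# unless the file already has a CAPABILITY line.
add_capability_off() {
    if ! grep -q '^[[:space:]]*CAPABILITY[[:space:]]' "$1" 2>/dev/null; then
        printf 'CAPABILITY 00\n' >> "$1"
    fi
}

# Possible use (the path is the one from this report):
#   add_capability_off ~/.dir_colors
#   eval "$(dircolors -b ~/.dir_colors)"
```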

Comment 18 James Antill 2008-10-24 15:08:30 UTC
Maybe we want to change the default then?

Yeh, I have a ~/.dir_colors ... my worry is that other people will have these too, also not see any colour differences (almost no files have capabilities) and see the perf. impact.
I've added the CAPABILITY 00 in there now, so it wfm at least :)

Also random idea ... capabilities only make sense on exec, yeh? ... so how about we only check them for files with (mode & 0111) ?
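
The test for "any execute bit set" is a bitwise AND of the file mode with octal 0111. A minimal sketch of that check in shell, assuming GNU stat (its -c '%a' format prints the permission bits in octal); the function name has_exec_bit is hypothetical and this is not how ls itself is implemented.

```shell
# Hypothetical helper: succeed if any execute bit (0111) is set on a file.
# Assumes GNU stat, whose -c '%a' prints the permission bits in octal.
has_exec_bit() {
    mode=$(stat -c '%a' -- "$1") || return 2
    # A leading 0 makes the shell read the value as octal.
    [ $(( 0$mode & 0111 )) -ne 0 ]
}

# Possible use: only look up capabilities for files that pass the test.
#   has_exec_bit some_file && echo "would check capabilities"
```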

Comment 19 Ondrej Vasik 2008-10-24 16:10:04 UTC
As adding capability displaying to ls was requested internally by Red Hat security tool guys, I would like to keep it default. You are right that checking only on executables does make sense and I'm sure it will reduce performance impact even when not cached. What do you think, Kamil?

Comment 20 Kamil Dudka 2008-10-27 15:46:06 UTC
I think it would increase the performance, but at the cost of usability. Non-executable files with capabilities are an unusual case which should be visible at first glance.

I did some investigation about the "performance impact", here are my results (coreutils-6.12-16.fc10):

# umount /home && mount /home
# mount | fgrep home
/dev/mapper/VolGroup00-LogVol02 on /home type ext3 (rw,usrquota,grpquota)

$ time ~/cvs/coreutils/devel/coreutils-6.12/src/ls -U1 --color huge_dir >/dev/null

               | 1st run | 2nd run | 3rd run |
---------------|---------|---------|---------|
default colors |  24.4s  |   0.8s  |   0.8s  |
         ca=00 |  23.0s  |   0.4s  |   0.4s  |

James, can you give me such results from your system?

Comment 21 James Antill 2008-10-27 16:15:44 UTC
> Non-executable files with capabilities are unusual case which should be visible at first glance.

I'm not disagreeing that's unusual ... but it is painful given the amount of work required, and that the unusual case is a noop. And there are other unusual cases which aren't checked/flagged.

As for timings, see the time results in comment #2 and the further explanation in comment #7 ... this is very consistent for me, doing capability checking adds roughly 0.015 of a second for every 5 files in a directory. A couple of directories I ls in have > 1,000 files in them ... at which point you can "see" that the ls output is slow (instead of being "instant" it looks like a fast printer).

Comment 22 Kamil Dudka 2008-10-27 16:31:08 UTC
So you have completely different results. If we consider only the 1st run (not cached) it is only about 5% slower with checking of capabilities on my box. I forgot to note that huge_dir contains 100,000 files.
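
The relative slowdown can be computed directly from the table in comment 20. A small sketch, with the hypothetical function name slowdown_pct, measuring the run with default colors against the ca=00 run:

```shell
# Relative slowdown, in percent, of a run with the capability check
# compared to a run without it.
slowdown_pct() {
    # $1 = seconds with the check, $2 = seconds without it
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
        printf "%.1f\n", (a - b) / b * 100
    }'
}

slowdown_pct 24.4 23.0   # 1st run (not cached): prints 6.1
slowdown_pct 0.8 0.4     # 2nd and 3rd runs (cached): prints 100.0
```

On those numbers the check costs roughly 6% on a cold cache, while the cached runs take twice as long in relative terms but only 0.4s more in absolute terms.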

Maybe you can ping me on #fedora-devel, the communication would be faster ;-)

