Bug 1704353 - Parallelize /usr/lib/rpm/check-buildroot
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Packaging Maintenance Team
QA Contact: Fedora Extras Quality Assurance
Reported: 2019-04-29 15:42 UTC by Denys Vlasenko
Modified: 2019-06-13 12:44 UTC (History)
Last Closed: 2019-06-13 12:44:21 UTC
Type: Bug

Description Denys Vlasenko 2019-04-29 15:42:06 UTC
The kernel's rpm build process executes this command at a certain stage:

/usr/lib/rpm/check-buildroot

which is a shell script. At this point, the kernel build tree has ~60 thousand files, and the script greps almost all of them:

find "$RPM_BUILD_ROOT" \! \( \
    -name '*.pyo' -o -name '*.pyc' -o -name '*.elc' -o -name '.packlist' \
    \) -type f -print0 | \
    LANG=C xargs -0r grep -F "$RPM_BUILD_ROOT" >$tmp

This runs grep on a single CPU, which is wasteful: our build machines usually have more than 10 CPUs.

The proposed improvement is to use xargs -P<num_cpus>, along these lines:

find "$RPM_BUILD_ROOT" \! \( \
    -name '*.pyo' -o -name '*.pyc' -o -name '*.elc' -o -name '.packlist' \
    \) -type f -print0 | \
    LANG=C xargs -0r -P`nproc` grep -F "$RPM_BUILD_ROOT" >$tmp
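
As a rough illustration of the same idea in a form that can be tried on any directory, here is a minimal sketch. The function name scan_for_path is hypothetical and not part of rpm; the -l flag, the trailing sort, and the single-job fallback are additions for this example only. GNU xargs documents that output from parallel children arrives in an indeterminate order, so the sketch prints only matching file names and sorts them to get a stable result.

```shell
#!/bin/sh
# Sketch only: scan_for_path is a hypothetical helper, not part of rpm.
# Prints, in sorted order, the files under $1 whose contents include
# the literal string $1 (the same test check-buildroot applies).
scan_for_path() {
    root=$1
    # Assumption: fall back to one job when nproc is not available.
    jobs=$(nproc 2>/dev/null || echo 1)
    find "$root" -type f -print0 |
        LANG=C xargs -0r -P "$jobs" grep -l -F -e "$root" |
        sort
}
```

Calling scan_for_path "$RPM_BUILD_ROOT" would list the offending files; whether a sorted list of names is an acceptable replacement for the matching lines written to $tmp is a separate question that this sketch does not settle.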


Testing on an 8-thread Skylake CPU:

I ran both scripts on my /usr tree (17 GB, does not fit into the RAM cache):
                Before:
17:23:50 UD.............. io:934m    0 forks:  11
17:23:55 UD.............. io:911m 300k forks:  18
17:24:00 S............... io:368m    0 forks:  45
17:24:05 S............... io:483m    0 forks:  32
17:24:10 U............... io:380m 184k forks:  37
17:24:15 UD.............. io:506m  64k forks:  69
17:24:20 SU.............. io:442m    0 forks:  78
17:24:25 UD.............. io:1.0g    0 forks:   3
17:24:30 UD.............. io:1.1g    0 forks:   3
17:24:35 UD.............. io:978m    0 forks:  23
17:24:40 UD.............. io:1.2g  36k forks:   2
17:24:45 UD.............. io:1.2g  72k forks:   4
17:24:50 UD.............. io:1.1g    0 forks:  11
17:24:55 UD.............. io:1.1g    0 forks:   6
17:25:00 UD.............. io:952m    0 forks:  14
17:25:05 ................ io:281m    0 forks:   2
real    1m15.273s
user    0m35.866s
sys     0m9.906s
                After:
17:25:25 SUUDDD.......... io:1.6g    0 forks:  29
17:25:30 SSUUDDD......... io:1.7g    0 forks: 177
17:25:35 SUUDDDD......... io:2.1g    0 forks:  85
17:25:40 SUUUDDDD........ io:2.0g    0 forks:  26
17:25:45 SUUUDDDD........ io:2.1g 4096 forks:  17
17:25:50 SUUDDD.......... io:1.9g  56k forks:  13
real    0m28.508s
user    0m36.859s
sys     0m10.688s

And on /usr/lib (9.2 GB, completely cached in RAM before the test):
                Before:
17:28:31 U............... io:   0    0 forks:   4
17:28:32 UU.............. io:   0    0 forks:   1
17:28:33 UU.............. io:   0  16k forks:   0
17:28:34 UU.............. io:   0    0 forks:   2
17:28:35 UU.............. io: 56k  80k forks:   7
17:28:36 UU.............. io:   0 132k forks:   1
17:28:37 UU.............. io:   0    0 forks:  19
17:28:38 UU.............. io:   0    0 forks:   2
17:28:39 UU.............. io:   0    0 forks:   1
17:28:40 UU.............. io:   0    0 forks:   2
17:28:41 UU.............. io:   0    0 forks:   1
17:28:42 UU.............. io:   0    0 forks:  10
17:28:43 UU.............. io:   0    0 forks:   1
17:28:44 UU.............. io:   0    0 forks:   1
17:28:45 UU.............. io:   0    0 forks:   2
17:28:46 UU.............. io:   0    0 forks:   1
17:28:47 UU.............. io:   0    0 forks:   1
17:28:48 UU.............. io:   0    0 forks:  14
real    0m17.147s
user    0m15.670s
sys     0m1.555s
                After:
17:28:54 SUUUUUUUUUUUUUU. io:   0    0 forks:  33
17:28:55 SUUUUUUUUUUUUUUU io:   0    0 forks:  17
17:28:56 SUUUUUUUUUU..... io:   0    0 forks:  16
real    0m3.205s
user    0m19.408s
sys     0m1.869s
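
To put the wall-clock numbers above side by side, the "real" times can be converted to seconds and divided. This is a small sketch; the helper name speedup is made up for illustration.

```shell
#!/bin/sh
# speedup BEFORE AFTER: print the wall-clock ratio with two decimals.
# (Hypothetical helper; both arguments are "real" times in seconds.)
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

speedup 75.273 28.508   # /usr, not cached: 1m15.273s vs 0m28.508s -> 2.64
speedup 17.147 3.205    # /usr/lib, cached: 0m17.147s vs 0m3.205s  -> 5.35
```

So the uncached run finishes about 2.6 times faster and the fully cached run about 5.4 times faster.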

Comment 1 Florian Festi 2019-04-30 15:32:15 UTC
Upstream PR: https://github.com/rpm-software-management/rpm/pull/687

Comment 2 Panu Matilainen 2019-06-13 12:44:21 UTC
Fixed in rawhide as of rpm >= 4.14.90. Thanks for the suggestion and sample code!

