Bug 1773216
| Summary: | speed up /usr/lib/rpm/redhat/brp-mangle-shebangs | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Denys Vlasenko <dvlasenk> |
| Component: | redhat-rpm-config | Assignee: | Florian Festi <ffesti> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | rawhide | CC: | ajax, ffesti, fweimer, igor.raits, john.j5live, jonathan, j, mjw, packaging-team-maint, pmatilai, pmoravco, praiskup, vmukhame |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-12-03 14:35:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This script also has another issue: unsafe/incorrect command expansion:
trim() {
printf '%s' "$*"
}
...
read shebang_line < "$f" || :
orig_shebang=$(trim $(echo "$shebang_line" | grep -Po "#!\K.*" || echo))
The "trimming", i.e. replacement of multiple spaces and removal of leading and trailing spaces, is achieved because "trim $(cmd)" construct has an unquoted $(), which is subject to word splitting.
BUT!
It is also subject to glob expansion - any ?s and *s will be attempted to be expanded as well - definitely NOT what we want!
I propose to replace this code
read shebang_line < "$f" || :
orig_shebang=$(trim $(echo "$shebang_line" | grep -Po "#!\K.*" || echo))
shebang="$orig_shebang"
if [ -n "$exclude_shebangs" ]; then
echo "$shebang" | grep -q -E "$exclude_shebangs" && continue
fi
if [ -n "$exclude_shebangs_from" ]; then
echo "$shebang" | grep -q -E -f "$exclude_shebangs_from" && continue
fi
if [ -z "$shebang" ]; then
echo >&2 "*** WARNING: $f is executable but has empty or no shebang, removing executable bit"
chmod -x "$f"
touch -d "$ts" "$f"
continue
elif [ -n "${shebang##/*}" ]; then
echo >&2 "*** ERROR: $f has shebang which doesn't start with '/' ($shebang)"
fail=1
continue
fi
with a code which avoids the expansion issue, and which does not spawn any subprocesses for string manipulations (at the cost of being more elaborate):
read shebang_line < "$f"
orig_shebang="${shebang_line#\#!}"
if [ "$orig_shebang" = "$shebang_line" ]; then
echo >&2 "*** WARNING: $f is executable but has no shebang, removing executable bit"
chmod -x "$f"
touch -d "$ts" "$f"
continue
fi
# Trim spaces
while shebang="${orig_shebang// / }"; [ "$shebang" != "$orig_shebang" ]; do
orig_shebang="$shebang"
done
# Treat "#! /path/to " as "#!/path/to"
orig_shebang="${orig_shebang# }"
shebang="$orig_shebang"
if [ -z "$shebang" ]; then
echo >&2 "*** WARNING: $f is executable but has empty shebang, removing executable bit"
chmod -x "$f"
touch -d "$ts" "$f"
continue
fi
while we are at it, running "stat" tool for every file: ts=$(stat -c %y "$f") is not necessary, can improve this by running it only when we know we will be using $ts value (it is used to "fix up" mtime of the files if we modify them). Moving to redhat-rpm-config where brp-mangle-shebangs comes from. Nice speed-ups. The rpm-team has other priorities at this time though, the best way to drive this forward is to submit a PR on redhat-rpm-config. Yes, please send PR to redhat-rpm-config. (In reply to Igor Gnatenko from comment #4) > Yes, please send PR to redhat-rpm-config. I don't understand the above. Do you want me to open another BZ/ticket/issue/whatever somewhere else? Where? redhat-rpm-config is special as the Fedora distgit is the actual upstream repository. So changes to redhat-rpm-config can be made my directly open pull requests against the Fedora package. PR merged now, closing. Thanks for the patch! |
For kernel builds, /usr/lib/rpm/redhat/brp-mangle-shebangs runs for 25 seconds. It's mainly caused by this loop executing "file" and "grep" for every file in the tree: fail=0 while IFS= read -r -d $'\0' f; do file -N --mime-type "$f" | grep -q -P ".+(?=: text/)" || continue ... done < <(find -executable -type f -print0) exit $fail The following replacement runs one "file" and one "grep" process to examine all files: find -executable -type f ! -path '*:*' ! -path $'*\n*' \ | file -N --mime-type -f - \ | grep -P ".+(?=: text/)" \ | { fail=0 while IFS= read -r line; do f=${line%%:*} ... done exit $fail } The complicated "find" invocation is skipping any files with colons or newlines in their names. Tested by me to work, and completes in 3 seconds.