Bug 2038818
| Summary: | systemctl restart ipset extremely slow restoring large saved set | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | sobolev | |
| Component: | ipset | Assignee: | Phil Sutter <psutter> | |
| Status: | CLOSED WONTFIX | QA Contact: | Jiri Peska <jpeska> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 8.1 | CC: | jpeska, sbrivio, todoleza | |
| Target Milestone: | rc | Keywords: | Triaged | |
| Target Release: | 8.8 | Flags: | pm-rhel:
mirror+
|
|
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2043008 2047275 2052926 (view as bug list) | Environment: | ||
| Last Closed: | 2023-07-10 07:28:09 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2043008 | |||
|
Description
sobolev
2022-01-10 08:17:47 UTC
Additional info:
Shell script /usr/libexec/ipset/ipset.start-stop from ipset-service-7.1-1.el8.noarch contains line
awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
3 execve (bash, md5sum and head) per set entry is extremely inefficient.
Proposed patch:
--- ipset.start-stop.orig 2019-11-08 20:09:27.000000000 +0300
+++ ipset.start-stop 2022-01-10 01:30:05.908317776 +0300
@@ -257,7 +257,13 @@
CLEAN_FILES="${CLEAN_FILES} ${mangled}"
chmod 600 "${mangled}"
- awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
+ cat "${merged}" > "${mangled}"
+ IFS='|'
+ for set in ${conflicts}; do
+ new_name=$(echo "${salt}${set}" | md5sum | head -c31)
+ sed -i -r -e "s/^(add|create) $set /\1 $new_name /" "${mangled}"
+ done
+ unset IFS
if ! ipset_restore "${mangled}"; then
err "Failed to restore configured sets"
exit 1
Hi! Thanks for your report. I tried to reproduce the problem, but don't quite see it. I'm testing with a huge set with over 20M entries: | # wc -l /etc/sysconfig/ipset.d/testset.set | 20137833 /etc/sysconfig/ipset.d/testset.set Restoring it takes over a minute: | # time ipset -f /etc/sysconfig/ipset.d/testset.set -\! restore | | real 1m7.409s | user 0m39.397s | sys 0m27.218s But the service script is not much slower: | # time systemctl restart ipset | | real 1m15.218s | user 0m0.006s | sys 0m0.007s Do you see a bigger difference between plain ipset restore and service restart with your data? (In reply to sobolev from comment #1) > Additional info: > Shell script /usr/libexec/ipset/ipset.start-stop from > ipset-service-7.1-1.el8.noarch contains line > > awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo > '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' > "${merged}" > "${mangled}" > > 3 execve (bash, md5sum and head) per set entry is extremely inefficient. That's not the case: awk grabs lines beginning with 'add' or 'create' and manipulates only those. I assume you have few sets and many elements, not vice versa, right? > Proposed patch: > > --- ipset.start-stop.orig 2019-11-08 20:09:27.000000000 +0300 > +++ ipset.start-stop 2022-01-10 01:30:05.908317776 +0300 > @@ -257,7 +257,13 @@ > CLEAN_FILES="${CLEAN_FILES} ${mangled}" > chmod 600 "${mangled}" > > - awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; > system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; > next} {print}' "${merged}" > "${mangled}" > + cat "${merged}" > "${mangled}" > + IFS='|' > + for set in ${conflicts}; do > + new_name=$(echo "${salt}${set}" | md5sum | head -c31) > + sed -i -r -e "s/^(add|create) $set /\1 $new_name /" > "${mangled}" > + done > + unset IFS > if ! ipset_restore "${mangled}"; then > err "Failed to restore configured sets" > exit 1 Did you test this patch? Does it have an effect on service restart time? Cheers, Phil Hi. The problem occurs only on service restart and only if the set is used by someone (for example, iptables). Restart can be divided into "stop" and "start". In this case, when the script executes a "stop", it cannot delete the set, since it is in use. When starting, seeing that a set with the same name already exists, the script creates a temporary set with a unique name containing the md5 sum and adds elements to it. And then it performs ipset swap. Adding elements to a temporary set is very slow due to >> 3 execve (bash, md5sum and head) per set entry is extremely inefficient. > That's not the case: awk grabs lines beginning with 'add' or 'create' and > manipulates only those. I assume you have few sets and many elements, not vice > versa, right? The problem occurs with one hash:ip set with ~200000 entries. Set is used by iptables rule during service restart. > Did you test this patch? Does it have an effect on service restart time? Yes, I applied this patch on our linux router and it significantly speed up the script (from ~15 minutes to ~10 seconds). The patch is not very efficient because it does inplace sed separately for each set in ${conflicts} (it would be better to use the sed script in a temporary file in one pass), but it works. OK, so I missed the fact that the problematic code does not run for
unreferenced sets and also that elements are prefixed "add" in save files - I
can only attribute this to coffee deficit. :(
Looking at the code, I found the whole checksum approach a bit too much, so I
got rid of it. Here's the diff:
------------------------------------8<-----------------------------------------
--- a/ipset.start-stop
+++ b/ipset.start-stop
@@ -235,41 +235,23 @@ load() {
return
fi
- # Find a salt for md5sum that makes names of saved sets unique
- salt=0
- while true; do
- unique=1
- IFS="
-"
- for set in $(${IPSET_BIN} list -n -t); do
- if grep -q "^create $(echo "${salt}${set}" | md5sum | head -c31) " "${
merged}"; then
- unique=0
- break
- fi
- done
- unset IFS
- [ ${unique} -eq 1 ] && break
- salt=$((salt + 1))
+ # Find a free prefix for temporary set names
+ pfx="__tmp$((idx = 0))_"
+ while ${IPSET_BIN} list -n -t | grep -q "^${pfx}"; do
+ pfx="__tmp$((idx += 1))_"
done
# Add sets, mangling names for conflicting sets
- mangled="$(mktemp -q /tmp/ipset.XXXXXX)"
- CLEAN_FILES="${CLEAN_FILES} ${mangled}"
- chmod 600 "${mangled}"
-
- awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
- if ! ipset_restore "${mangled}"; then
+ sed -i -E -e "s/^(add|create) (${conflicts}) /\1 ${pfx}\2 /g" "${merged}"
+ if ! ipset_restore "${merged}"; then
err "Failed to restore configured sets"
exit 1
fi
- rm "${mangled}"
- CLEAN_FILES="${CLEAN_FILES%* ${mangled}}"
-
# Swap and delete old sets
IFS='|'
for set in ${conflicts}; do
- mangled="$(echo "${salt}${set}" | md5sum | head -c31)"
+ mangled="${pfx}${set}"
------------------------------------8<-----------------------------------------
Please give it a try if you find time to.
Stefano, what's your take on this? Was there a specific reason requiring the
checksumming in the first place?
Hi. I tested the patch in a VM. [root@ol8-test ~]# ipset list -t Name: test1 Type: hash:ip Revision: 4 Header: family inet hashsize 131072 maxelem 300000 Size in memory: 5235048 References: 1 Number of entries: 214372 Name: test2 Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 168 References: 1 Number of entries: 1 Name: test3 Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 216 References: 0 Number of entries: 2 [root@ol8-test ~]# iptables -S FORWARD -P FORWARD ACCEPT -A FORWARD -m set --match-set test1 dst -A FORWARD -m set --match-set test2 dst [root@ol8-test ~]# /usr/libexec/ipset/ipset.start-stop save [root@ol8-test ~]# time /usr/libexec/ipset/ipset.start-stop start real 18m15,710s user 7m17,382s sys 13m0,186s [root@ol8-test ipset]# cd /usr/libexec/ipset [root@ol8-test ipset]# patch -p1 < /root/bug2038818.patch patching file ipset.start-stop [root@ol8-test ipset]# time /usr/libexec/ipset/ipset.start-stop start real 0m2,250s user 0m1,153s sys 0m0,929s [root@ol8-test ipset]# ipset list -t Name: test1 Type: hash:ip Revision: 4 Header: family inet hashsize 131072 maxelem 300000 Size in memory: 5217768 References: 1 Number of entries: 214372 Name: test2 Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 168 References: 1 Number of entries: 1 Name: test3 Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 216 References: 0 Number of entries: 2 [root@ol8-test ipset]# time systemctl restart ipset real 0m2,213s user 0m0,011s sys 0m0,013s It seems you can change the status to fixed/verified. ) Thanks a lot! Thanks for verifying! Sorry for the delay, (In reply to Phil Sutter from comment #4) > Stefano, what's your take on this? It looks good to me, except that I'm not sure what happens if the original set names (without the suffix) are close to 31 characters long. > Was there a specific reason requiring the > checksumming in the first place? Yes, the reason was that we need to find unique names, but they can't exceed 31 characters -- that's why I didn't go with a prefix. I'm not sure there's a problem with the prefix, but I think it should be checked. (In reply to Stefano Brivio from comment #7) > (In reply to Phil Sutter from comment #4) > > Stefano, what's your take on this? > > It looks good to me, except that I'm not sure what happens if the original > set names (without the suffix) are close to 31 characters long. > > > Was there a specific reason requiring the > > checksumming in the first place? > > Yes, the reason was that we need to find unique names, but they can't exceed > 31 characters -- that's why I didn't go with a prefix. I'm not sure there's > a problem with the prefix, but I think it should be checked. Thanks for clarifying. I agree, my change breaks that corner-case for no good reason. I reviewed the options again and went with a patch based on the reporter's one in comment 1: --- a/ipset.start-stop +++ b/ipset.start-stop @@ -257,7 +257,13 @@ load() { CLEAN_FILES="${CLEAN_FILES} ${mangled}" chmod 600 "${mangled}" - awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}" + cat "${merged}" > "${mangled}" + IFS='|' + for set in ${conflicts}; do + new_name=$(echo "${salt}${set}" | md5sum | head -c31) + echo "s/^(add|create) $set /\1 $new_name /" + done | sed -i -r -f - "${mangled}" + unset IFS if ! ipset_restore "${mangled}"; then err "Failed to restore configured sets" exit 1 The only change is to call sed just once, passing the substitution commands on stdin. WDYT, is this fine or did I miss something? (In reply to Phil Sutter from comment #8) > Thanks for clarifying. I agree, my change breaks that corner-case for no good > reason. I reviewed the options again and went with a patch based on the > reporter's one in comment 1: > > --- a/ipset.start-stop > +++ b/ipset.start-stop > @@ -257,7 +257,13 @@ load() { > CLEAN_FILES="${CLEAN_FILES} ${mangled}" > chmod 600 "${mangled}" > > - awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; > system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; > next} {print}' "${merged}" > "${mangled}" > + cat "${merged}" > "${mangled}" > + IFS='|' > + for set in ${conflicts}; do > + new_name=$(echo "${salt}${set}" | md5sum | head -c31) > + echo "s/^(add|create) $set /\1 $new_name /" > + done | sed -i -r -f - "${mangled}" > + unset IFS > if ! ipset_restore "${mangled}"; then > err "Failed to restore configured sets" > exit 1 > > The only change is to call sed just once, passing the substitution commands > on > stdin. WDYT, is this fine or did I miss something? It looks good to me! Note that I haven't tested it though. (In reply to Stefano Brivio from comment #9) > It looks good to me! Note that I haven't tested it though. Thanks for the review! I tested it with two sets in use, sed seems fine with how I call it - good enough for CI I'd say. :) Due to missing QE capacity, this won't make it into RHEL8.6 - sorry for any inconveniences this causes. Missed RHEL8.7, proposing for RHEL8.8. After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened. |