RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 2038818 - systemctl restart ipset extremely slow restoring large saved set
Summary: systemctl restart ipset extremely slow restoring large saved set
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: ipset
Version: 8.1
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: 8.8
Assignee: Phil Sutter
QA Contact: Jiri Peska
URL:
Whiteboard:
Depends On:
Blocks: 2043008
TreeView+ depends on / blocked
 
Reported: 2022-01-10 08:17 UTC by sobolev
Modified: 2023-07-10 07:28 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2043008 2047275 2052926 (view as bug list)
Environment:
Last Closed: 2023-07-10 07:28:09 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-107255 0 None None None 2022-01-10 08:22:20 UTC

Description sobolev 2022-01-10 08:17:47 UTC
Description of problem:
I have large saved hash:ip set (>200000 entries).
systemctl restart ipset works more than 10 minutes and timed out.

Version-Release number of selected component (if applicable): ipset-service-7.1-1.el8.noarch


How reproducible:


Steps to Reproduce:
1. Create set with 200000+ entries
2. Save it - /usr/libexec/ipset/ipset.start-stop save
3. systemctl restart ipset

Actual results:
systemctl timed out on service start.


Expected results:
Set restored. Quickly.

Comment 1 sobolev 2022-01-10 08:18:24 UTC
Additional info:
Shell script /usr/libexec/ipset/ipset.start-stop from ipset-service-7.1-1.el8.noarch contains line

awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"

3 execve (bash, md5sum and head) per set entry is extremely inefficient.

Proposed patch:

--- ipset.start-stop.orig       2019-11-08 20:09:27.000000000 +0300
+++ ipset.start-stop    2022-01-10 01:30:05.908317776 +0300
@@ -257,7 +257,13 @@
        CLEAN_FILES="${CLEAN_FILES} ${mangled}"
        chmod 600 "${mangled}"
 
-       awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
+       cat "${merged}" > "${mangled}"
+       IFS='|'
+       for set in ${conflicts}; do
+               new_name=$(echo "${salt}${set}" | md5sum | head -c31)
+               sed -i -r -e "s/^(add|create) $set /\1 $new_name /" "${mangled}"
+       done
+       unset IFS
        if ! ipset_restore "${mangled}"; then
                err "Failed to restore configured sets"
                exit 1

Comment 2 Phil Sutter 2022-01-19 16:41:32 UTC
Hi!

Thanks for your report. I tried to reproduce the problem, but don't quite see it. I'm testing with a huge set with over 20M entries:

| # wc -l /etc/sysconfig/ipset.d/testset.set
| 20137833 /etc/sysconfig/ipset.d/testset.set

Restoring it takes over a minute:

| # time ipset -f /etc/sysconfig/ipset.d/testset.set -\! restore
| 
| real	1m7.409s
| user	0m39.397s
| sys	0m27.218s

But the service script is not much slower:

| # time systemctl restart ipset
| 
| real	1m15.218s
| user	0m0.006s
| sys	0m0.007s

Do you see a bigger difference between plain ipset restore and service restart
with your data?

(In reply to sobolev from comment #1)
> Additional info:
> Shell script /usr/libexec/ipset/ipset.start-stop from
> ipset-service-7.1-1.el8.noarch contains line
> 
> awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo
> '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}'
> "${merged}" > "${mangled}"
> 
> 3 execve (bash, md5sum and head) per set entry is extremely inefficient.

That's not the case: awk grabs lines beginning with 'add' or 'create' and
manipulates only those. I assume you have few sets and many elements, not vice
versa, right?

> Proposed patch:
> 
> --- ipset.start-stop.orig       2019-11-08 20:09:27.000000000 +0300
> +++ ipset.start-stop    2022-01-10 01:30:05.908317776 +0300
> @@ -257,7 +257,13 @@
>         CLEAN_FILES="${CLEAN_FILES} ${mangled}"
>         chmod 600 "${mangled}"
>  
> -       awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1;
> system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print;
> next} {print}' "${merged}" > "${mangled}"
> +       cat "${merged}" > "${mangled}"
> +       IFS='|'
> +       for set in ${conflicts}; do
> +               new_name=$(echo "${salt}${set}" | md5sum | head -c31)
> +               sed -i -r -e "s/^(add|create) $set /\1 $new_name /"
> "${mangled}"
> +       done
> +       unset IFS
>         if ! ipset_restore "${mangled}"; then
>                 err "Failed to restore configured sets"
>                 exit 1

Did you test this patch? Does it have an effect on service restart time?

Cheers, Phil

Comment 3 sobolev 2022-01-19 17:36:43 UTC
Hi.

The problem occurs only on service restart and only if the set is used by someone (for example, iptables).
Restart can be divided into "stop" and "start". In this case, when the script executes a "stop", it cannot delete the set, since it is in use.
When starting, seeing that a set with the same name already exists, the script creates a temporary set with a unique name containing the md5 sum and adds elements to it.
And then it performs ipset swap.
Adding elements to a temporary set is very slow due to

>> 3 execve (bash, md5sum and head) per set entry is extremely inefficient.

> That's not the case: awk grabs lines beginning with 'add' or 'create' and
> manipulates only those. I assume you have few sets and many elements, not vice
> versa, right?

The problem occurs with one hash:ip set with ~200000 entries. Set is used by iptables rule during service restart.

> Did you test this patch? Does it have an effect on service restart time?

Yes, I applied this patch on our linux router and it significantly speed up the script (from ~15 minutes to ~10 seconds).

The patch is not very efficient because it does inplace sed separately for each set in ${conflicts} (it would be better to use the sed script in a temporary file in one pass), but it works.

Comment 4 Phil Sutter 2022-01-20 13:37:12 UTC
OK, so I missed the fact that the problematic code does not run for
unreferenced sets and also that elements are prefixed "add" in save files - I
can only attribute this to coffee deficit. :(

Looking at the code, I found the whole checksum approach a bit too much, so I
got rid of it. Here's the diff:

------------------------------------8<-----------------------------------------
--- a/ipset.start-stop
+++ b/ipset.start-stop
@@ -235,41 +235,23 @@ load() {
                return
        fi
 
-       # Find a salt for md5sum that makes names of saved sets unique
-       salt=0
-       while true; do
-               unique=1
-               IFS="
-"
-               for set in $(${IPSET_BIN} list -n -t); do
-                       if grep -q "^create $(echo "${salt}${set}" | md5sum | head -c31) " "${
merged}"; then
-                               unique=0
-                               break
-                       fi
-               done
-               unset IFS
-               [ ${unique} -eq 1 ] && break
-               salt=$((salt + 1))
+       # Find a free prefix for temporary set names
+       pfx="__tmp$((idx = 0))_"
+       while ${IPSET_BIN} list -n -t | grep -q "^${pfx}"; do
+               pfx="__tmp$((idx += 1))_"
        done
 
        # Add sets, mangling names for conflicting sets
-       mangled="$(mktemp -q /tmp/ipset.XXXXXX)"
-       CLEAN_FILES="${CLEAN_FILES} ${mangled}"
-       chmod 600 "${mangled}"
-
-       awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
-       if ! ipset_restore "${mangled}"; then
+       sed -i -E -e "s/^(add|create) (${conflicts}) /\1 ${pfx}\2 /g" "${merged}"
+       if ! ipset_restore "${merged}"; then
                err "Failed to restore configured sets"
                exit 1
        fi
 
-       rm "${mangled}"
-       CLEAN_FILES="${CLEAN_FILES%* ${mangled}}"
-
        # Swap and delete old sets
        IFS='|'
        for set in ${conflicts}; do
-               mangled="$(echo "${salt}${set}" | md5sum | head -c31)"
+               mangled="${pfx}${set}"
------------------------------------8<-----------------------------------------

Please give it a try if you find time to.

Stefano, what's your take on this? Was there a specific reason requiring the
checksumming in the first place?

Comment 5 sobolev 2022-01-21 14:46:15 UTC
Hi.

I tested the patch in a VM.

[root@ol8-test ~]# ipset list -t
Name: test1
Type: hash:ip
Revision: 4
Header: family inet hashsize 131072 maxelem 300000
Size in memory: 5235048
References: 1
Number of entries: 214372

Name: test2
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 168
References: 1
Number of entries: 1

Name: test3
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 216
References: 0
Number of entries: 2

[root@ol8-test ~]# iptables -S FORWARD
-P FORWARD ACCEPT
-A FORWARD -m set --match-set test1 dst 
-A FORWARD -m set --match-set test2 dst

[root@ol8-test ~]# /usr/libexec/ipset/ipset.start-stop save

[root@ol8-test ~]# time /usr/libexec/ipset/ipset.start-stop start

real    18m15,710s
user    7m17,382s
sys     13m0,186s


[root@ol8-test ipset]# cd /usr/libexec/ipset

[root@ol8-test ipset]# patch -p1 < /root/bug2038818.patch 
patching file ipset.start-stop

[root@ol8-test ipset]# time /usr/libexec/ipset/ipset.start-stop start

real    0m2,250s
user    0m1,153s
sys     0m0,929s

[root@ol8-test ipset]# ipset list -t
Name: test1
Type: hash:ip
Revision: 4
Header: family inet hashsize 131072 maxelem 300000
Size in memory: 5217768
References: 1
Number of entries: 214372

Name: test2
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 168
References: 1
Number of entries: 1

Name: test3
Type: hash:ip
Revision: 4
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 216
References: 0
Number of entries: 2


[root@ol8-test ipset]# time systemctl restart ipset

real    0m2,213s
user    0m0,011s
sys     0m0,013s

It seems you can change the status to fixed/verified. )

Thanks a lot!

Comment 6 Phil Sutter 2022-01-21 15:57:50 UTC
Thanks for verifying!

Comment 7 Stefano Brivio 2022-01-28 15:07:29 UTC
Sorry for the delay,

(In reply to Phil Sutter from comment #4)
> Stefano, what's your take on this?

It looks good to me, except that I'm not sure what happens if the original set names (without the suffix) are close to 31 characters long.

> Was there a specific reason requiring the
> checksumming in the first place?

Yes, the reason was that we need to find unique names, but they can't exceed 31 characters -- that's why I didn't go with a prefix. I'm not sure there's a problem with the prefix, but I think it should be checked.

Comment 8 Phil Sutter 2022-01-31 14:39:32 UTC
(In reply to Stefano Brivio from comment #7)
> (In reply to Phil Sutter from comment #4)
> > Stefano, what's your take on this?
> 
> It looks good to me, except that I'm not sure what happens if the original
> set names (without the suffix) are close to 31 characters long.
> 
> > Was there a specific reason requiring the
> > checksumming in the first place?
> 
> Yes, the reason was that we need to find unique names, but they can't exceed
> 31 characters -- that's why I didn't go with a prefix. I'm not sure there's
> a problem with the prefix, but I think it should be checked.

Thanks for clarifying. I agree, my change breaks that corner-case for no good
reason. I reviewed the options again and went with a patch based on the
reporter's one in comment 1:

--- a/ipset.start-stop
+++ b/ipset.start-stop
@@ -257,7 +257,13 @@ load() {
        CLEAN_FILES="${CLEAN_FILES} ${mangled}"
        chmod 600 "${mangled}"
 
-       awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1; system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print; next} {print}' "${merged}" > "${mangled}"
+       cat "${merged}" > "${mangled}"
+       IFS='|'
+       for set in ${conflicts}; do
+               new_name=$(echo "${salt}${set}" | md5sum | head -c31)
+               echo "s/^(add|create) $set /\1 $new_name /"
+       done | sed -i -r -f - "${mangled}"
+       unset IFS
        if ! ipset_restore "${mangled}"; then
                err "Failed to restore configured sets"
                exit 1

The only change is to call sed just once, passing the substitution commands on
stdin. WDYT, is this fine or did I miss something?

Comment 9 Stefano Brivio 2022-01-31 15:24:09 UTC
(In reply to Phil Sutter from comment #8)
> Thanks for clarifying. I agree, my change breaks that corner-case for no good
> reason. I reviewed the options again and went with a patch based on the
> reporter's one in comment 1:
> 
> --- a/ipset.start-stop
> +++ b/ipset.start-stop
> @@ -257,7 +257,13 @@ load() {
>         CLEAN_FILES="${CLEAN_FILES} ${mangled}"
>         chmod 600 "${mangled}"
>  
> -       awk '/^(add|create) ('"${conflicts}"')/ { printf "%s ",$1;
> system("echo '${salt}'" $2 " | md5sum | head -c31"); $1=""; $2=""; print;
> next} {print}' "${merged}" > "${mangled}"
> +       cat "${merged}" > "${mangled}"
> +       IFS='|'
> +       for set in ${conflicts}; do
> +               new_name=$(echo "${salt}${set}" | md5sum | head -c31)
> +               echo "s/^(add|create) $set /\1 $new_name /"
> +       done | sed -i -r -f - "${mangled}"
> +       unset IFS
>         if ! ipset_restore "${mangled}"; then
>                 err "Failed to restore configured sets"
>                 exit 1
> 
> The only change is to call sed just once, passing the substitution commands
> on
> stdin. WDYT, is this fine or did I miss something?

It looks good to me! Note that I haven't tested it though.

Comment 10 Phil Sutter 2022-01-31 15:46:37 UTC
(In reply to Stefano Brivio from comment #9)
> It looks good to me! Note that I haven't tested it though.

Thanks for the review! I tested it with two sets in use, sed seems fine with
how I call it - good enough for CI I'd say. :)

Comment 11 Phil Sutter 2022-01-31 16:39:11 UTC
Due to missing QE capacity, this won't make it into RHEL8.6 - sorry for any inconveniences this causes.

Comment 12 Phil Sutter 2022-08-31 17:14:26 UTC
Missed RHEL8.7, proposing for RHEL8.8.

Comment 14 RHEL Program Management 2023-07-10 07:28:09 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.


Note You need to log in before you can comment on or make changes to this bug.