Bug 2030476
| Summary: | Kernel 4.18.0-348.2.1 secpath_cache memory leak involving strongswan tunnel | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | kegbeach <ryan> | |
| Component: | kernel | Assignee: | Xin Long <lxin> | |
| kernel sub component: | Networking | QA Contact: | Jianlin Shi <jishi> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | urgent | |||
| Priority: | high | CC: | cye, hasingh, jiji, jishi, kzhang, lxin, mleitner, mtesar, nmurray, nyelle, pabeni, pasteur, prpatel, skamboj, sukulkar, xmu, yuma | |
| Version: | 8.5 | Keywords: | Triaged, ZStream | |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | |
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | kernel-4.18.0-358.el8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2047427 (view as bug list) | Environment: | ||
| Last Closed: | 2022-05-10 15:09:30 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2047427 | |||
|
Description
kegbeach
2021-12-08 21:42:00 UTC
(In reply to kegbeach from comment #0)
> Description of problem: ...
>
> Steps to Reproduce:
> 1. setup strongswan site to site tunnel
> 2. initiate connection and pass traffic
> 3. keep an eye out on avail memory and watch secpath_cache steadily increase
> using slabtop or cat /proc/slabinfo | grep secpath_cache

As we don't really use strongswan on RHEL, but libreswan instead, can you please provide a reproducer that also covers how to install and configure strongswan on RHEL to reproduce this issue, if you want us to investigate it? Also, have you tested it on an upstream kernel? Thanks.

I have installed a libreswan site-to-site tunnel in place of strongswan, and after a few hours the system starts losing memory with the exact same symptoms. Stopping libreswan (systemctl stop ipsec) halts secpath_cache from increasing, and upon restarting the tunnel it begins to increase again. This is telling me the problem is not strongswan itself but something in the kernel. If you have a testbed for libreswan, this will successfully reproduce the issue, but if you would still like the configs I will send them over.

The only kernel I have tried that has the issue is 4.18.0-348.2.1, so if there is a more recent kernel, can you point me in the right direction and I will re-test. Thanks

(In reply to kegbeach from comment #2)
> I have installed a libreswan site-to-site tunnel in place of strongswan, and
> after a few hours the system starts losing memory with the exact same
> symptoms. Stopping libreswan (systemctl stop ipsec) halts secpath_cache from
> increasing, and upon restarting the tunnel it begins to increase again. This
> is telling me the problem is not strongswan itself but something in the
> kernel. If you have a testbed for libreswan, this will successfully reproduce
> the issue, but if you would still like the configs I will send them over.
>
> The only kernel I have tried that has the issue is 4.18.0-348.2.1, so if
> there is a more recent kernel, can you point me in the right direction and I
> will re-test.

I can reproduce it with 'ip xfrm' cmds now, it's indeed a kernel problem. Thanks for reporting it.

This leak was introduced by:
commit acc00ba5d8d48f8749572597b051b3e7ba9ab3ff
Author: Paolo Abeni <pabeni>
Date: Mon Sep 13 12:32:20 2021 +0200
net: re-initialize slow_gro flag at gro_list_prepare time
The leaked object was created in:
[<000000004241fc10>] kmem_cache_alloc+0x156/0x390
[<0000000053d8cf53>] secpath_dup+0x23/0x1d0
[<00000000a5fa59b1>] secpath_set+0x9f/0x160
[<00000000266babc4>] xfrm_input+0x29c/0x2850
[<0000000080081871>] xfrm4_esp_rcv+0x9f/0x190
[<000000004b63ecc5>] ip_protocol_deliver_rcu+0x5ae/0x7d0
[<000000005accc408>] ip_local_deliver_finish+0x222/0x330
[<0000000073afae7a>] ip_local_deliver+0x1a0/0x410
[<000000003af25303>] ip_rcv+0xa7d/0x123d
[<00000000b58cea8c>] __netif_receive_skb_core+0x2051/0x3330
[<000000007186f64a>] netif_receive_skb_internal+0xed/0x340
[<000000000c4ddbf8>] napi_gro_receive+0x27f/0x3c0
Before the packet arrives in xfrm_input(), skb->slow_gro is already 1. xfrm_input() then calls gro_cells_receive() to start GRO again. However, in gro_list_prepare(), slow_gro is reset to 0 because the skb's sk, dst, active_extensions and nfct are all NULL. Later, when napi_skb_finish() calls napi_skb_free_stolen_head(), skb->sp is supposed to be freed in skb_ext_put(), but it is not, because slow_gro is 0.
I'm thinking of fixing this by also considering skb->sp when setting slow_gro in gro_list_prepare():
diff --git a/net/core/dev.c b/net/core/dev.c
index d3f3336d3edf..0c87487f93b2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5659,7 +5659,7 @@ static void gro_list_prepare(struct napi_struct *napi, struct sk_buff *skb)
/* RHEL-only: out-of-tree drivers build vs prior release don't set
* correctly the slow_gro flag, re-initialize it here
*/
- skb->slow_gro = !!(skb->sk || skb->_skb_refdst ||
+ skb->slow_gro = !!(skb->sk || skb->_skb_refdst || skb->sp ||
#ifdef CONFIG_SKB_EXTENSIONS
skb->active_extensions ||
#endif
Thanks.
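The failure mode described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the `Skb` class, the `consider_sp` switch, and the helper names are hypothetical stand-ins for the fields and functions named in this report, and the nfct check is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skb:
    """Toy stand-in for the sk_buff fields named in the report."""
    sk: Optional[object] = None
    refdst: Optional[object] = None   # models skb->_skb_refdst
    sp: Optional[object] = None       # models skb->sp (the secpath)
    active_extensions: int = 0
    slow_gro: bool = False


def gro_list_prepare(skb: Skb, consider_sp: bool) -> None:
    """Re-initialise slow_gro; consider_sp=True models the proposed fix."""
    skb.slow_gro = bool(
        skb.sk or skb.refdst or (consider_sp and skb.sp) or skb.active_extensions
    )


def free_stolen_head(skb: Skb) -> bool:
    """Release the secpath only on the slow path; return True if it leaked."""
    if skb.slow_gro:
        skb.sp = None
    return skb.sp is not None


def esp_packet_leaks(consider_sp: bool) -> bool:
    """Walk one ESP packet through the second GRO pass."""
    skb = Skb(sp=object(), slow_gro=True)  # state after xfrm_input() set the secpath
    gro_list_prepare(skb, consider_sp)     # GRO restarted via gro_cells_receive()
    return free_stolen_head(skb)
```

In this model `esp_packet_leaks(False)` reports a leaked secpath because the flag is cleared while `sp` is still set, and `esp_packet_leaks(True)` does not, which mirrors the one-line change in the diff.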
(In reply to Xin Long from comment #6)
> I'm thinking to fix this by also considering skb->sp when set slow_gro in
> gro_list_prepare():
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d3f3336d3edf..0c87487f93b2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5659,7 +5659,7 @@ static void gro_list_prepare(struct napi_struct *napi,
> struct sk_buff *skb)
> /* RHEL-only: out-of-tree drivers build vs prior release don't set
> * correctly the slow_gro flag, re-initialize it here
> */
> - skb->slow_gro = !!(skb->sk || skb->_skb_refdst ||
> + skb->slow_gro = !!(skb->sk || skb->_skb_refdst || skb->sp ||
> #ifdef CONFIG_SKB_EXTENSIONS
> skb->active_extensions ||
> #endif

I double checked the relevant code paths, and I think the above fix is the correct one! Thanks for catching it!

Tested with the following steps:
client:
systemctl stop NetworkManager
ip addr add 192.168.4.2/24 dev ens1f0
ip link set ens1f0 up
ip xfrm state add src 192.168.4.2 dst 192.168.4.1 spi 0x1001 proto esp enc aes 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f mode tunnel sel src 192.168.4.2 dst 192.168.4.1
ip xfrm state add src 192.168.4.1 dst 192.168.4.2 spi 0x1000 proto esp enc aes 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f mode tunnel sel src 192.168.4.1 dst 192.168.4.2
ip xfrm policy add dir out src 192.168.4.2 dst 192.168.4.1 tmpl src 192.168.4.2 dst 192.168.4.1 proto esp mode tunnel
ip xfrm policy add dir in src 192.168.4.1 dst 192.168.4.2 tmpl src 192.168.4.1 dst 192.168.4.2 proto esp mode tunnel level use
cat /proc/slabinfo | grep secpath_cache
netserver
server:
systemctl stop NetworkManager
ip addr add 192.168.4.1/24 dev ens1f0
ip link set ens1f0 up
ip xfrm state add src 192.168.4.1 dst 192.168.4.2 spi 0x1000 proto esp enc aes 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f mode tunnel sel src 192.168.4.1 dst 192.168.4.2
ip xfrm state add src 192.168.4.2 dst 192.168.4.1 spi 0x1001 proto esp enc aes 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f mode tunnel sel src 192.168.4.2 dst 192.168.4.1
ip xfrm policy add dir out src 192.168.4.1 dst 192.168.4.2 tmpl src 192.168.4.1 dst 192.168.4.2 proto esp mode tunnel
ip xfrm policy add dir in src 192.168.4.2 dst 192.168.4.1 tmpl src 192.168.4.2 dst 192.168.4.1 proto esp mode tunnel level use
ping 192.168.4.2 -c 1
netperf -H 192.168.4.2 -t TCP_STREAM -l 120
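While the netperf run is in progress, the `cat /proc/slabinfo | grep secpath_cache` check can be repeated automatically. The following is a minimal sketch, assuming the slabinfo layout shown in this report (cache name in the first column, active object count in the second); all function names are hypothetical and not part of any existing tool.

```python
import time
from typing import Callable, List


def active_objects(slabinfo_text: str, cache: str = "secpath_cache") -> int:
    """Return the active-object count (second column) for `cache`, or 0 if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache:
            return int(fields[1])
    return 0


def read_slabinfo() -> str:
    """Read /proc/slabinfo; this normally requires root."""
    with open("/proc/slabinfo") as f:
        return f.read()


def sample(reader: Callable[[], str], count: int, interval_s: float) -> List[int]:
    """Take `count` readings of the cache, `interval_s` seconds apart."""
    readings = []
    for i in range(count):
        readings.append(active_objects(reader()))
        if i + 1 < count:
            time.sleep(interval_s)
    return readings


def grows_steadily(readings: List[int]) -> bool:
    """True if every reading is strictly larger than the previous one."""
    return len(readings) > 1 and all(b > a for a, b in zip(readings, readings[1:]))
```

On the client, something like `grows_steadily(sample(read_slabinfo, count=6, interval_s=10))` would take six readings over about a minute. A strictly increasing series under traffic is only a hint of a leak, since a healthy cache can also grow briefly before levelling off.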
reproduced on 4.18.0-348.2.1:
+ cat /proc/slabinfo
+ grep secpath_cache
secpath_cache 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
+ netserver
after server run:
[root@wsfd-advnetlab19 bz2030476]# cat /proc/slabinfo | grep secpath_cache
secpath_cache 6181120 6181120 128 32 1 : tunables 0 0 0 : slabdata 193160 193160 0
[root@wsfd-advnetlab19 bz2030476]# uname -a
Linux wsfd-advnetlab19.anl.lab.eng.bos.redhat.com 4.18.0-348.2.1.el8_5.x86_64 #1 SMP Mon Nov 8 13:30:15 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
Verified on 4.18.0-358:
[root@wsfd-advnetlab19 bz2030476]# uname -a
Linux wsfd-advnetlab19.anl.lab.eng.bos.redhat.com 4.18.0-358.el8.x86_64 #1 SMP Tue Dec 28 11:15:35 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
after server run:
[root@wsfd-advnetlab19 bz2030476]# cat /proc/slabinfo | grep secpath_cache
secpath_cache 2400 2464 128 32 1 : tunables 0 0 0 : slabdata 77 77 0
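To put the two slabinfo readings in perspective, the object counts can be converted to memory. This is a rough sketch that assumes the usual slabinfo column order (name, active objects, total objects, object size in bytes) and ignores per-slab overhead; the helper names are hypothetical.

```python
from typing import NamedTuple


class SlabLine(NamedTuple):
    name: str
    active_objs: int
    num_objs: int
    objsize: int


def parse_slab_line(line: str) -> SlabLine:
    """Parse the first four columns of a /proc/slabinfo data line."""
    name, active, total, size = line.split()[:4]
    return SlabLine(name, int(active), int(total), int(size))


def footprint_bytes(slab: SlabLine) -> int:
    """Memory taken by allocated objects: num_objs * objsize."""
    return slab.num_objs * slab.objsize


# The two secpath_cache lines quoted in this report.
leaky = parse_slab_line(
    "secpath_cache 6181120 6181120 128 32 1 : tunables 0 0 0 : slabdata 193160 193160 0"
)
fixed = parse_slab_line(
    "secpath_cache 2400 2464 128 32 1 : tunables 0 0 0 : slabdata 77 77 0"
)

print(footprint_bytes(leaky) / 2**20)  # 754.53125 (MiB)
print(footprint_bytes(fixed) / 2**10)  # 308.0 (KiB)
```

By this estimate the same test left roughly 755 MiB in secpath_cache on 4.18.0-348.2.1 and about 308 KiB on 4.18.0-358.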
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1988

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days