Created attachment 1422750
Fix for the issue
kernel-3.10.0-862.el7 doesn't work with Libreswan IPsec when the connection has ikev2=insist set and no esp= option, and when the kernel supports aesni.
Symptoms: A normal ping over IPsec works. If you increase the ping size, for example with ping -s 1400, packet flow stops and does not recover.
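To narrow down the payload size at which the tunnel stops passing traffic, a sweep like the following could be used. This is only a sketch: the helper names and the size list are made up for illustration, and it assumes the iputils ping options -c, -W and -s.

```python
import subprocess


def build_ping_cmd(host, payload_size, count=3, timeout_s=2):
    """Build an iputils ping command line for one payload size."""
    return ["ping", "-c", str(count), "-W", str(timeout_s),
            "-s", str(payload_size), host]


def first_failing_size(host, sizes, run=subprocess.run):
    """Return the first payload size whose ping exits non-zero, or None.

    `run` is injectable so the sweep can be exercised without a network.
    """
    for size in sizes:
        result = run(build_ping_cmd(host, size),
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            return size
    return None
```

Calling first_failing_size(peer, [56, 512, 1000, 1400]) with the address of the remote tunnel end would report the first size that fails. Since the tunnel does not recover once triggered, it has to be restarted before another sweep.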
/proc/net/xfrm_stat shows the following error:
After diffing kernel-3.10.0-693.21.1.el7 against kernel-3.10.0-862.el7 and checking the changes against the upstream Linux kernel, I found an incorrect change that causes this issue: ivsize for aes(gcm) was changed from 12 to 16.
This breaks compatibility with previous kernels, the upstream kernel, and every other implementation of aes_gcm.
Note: even two machines both running kernel-3.10.0-862.el7 are affected, so two machines with the 862 kernel can't talk to each other with aes_gcm if the aesni-intel kernel module is loaded.
This is a really bad issue and requires an immediate fix. It breaks all IPsec tunnels using ikev2 with default esp options.
I'm really worried about how this will affect us as Libreswan upstream when CentOS 7 updates to the 7.5 kernel.
Please make sure a fixed kernel update is available before that.
My patch doesn't fix the whole issue. Even with the patch applied I can trigger the same issue, so something less obvious is broken with aesni aes-gcm and IPsec.
(In reply to Tuomo Soini from comment #3)
> My fix patch doesn't fix whole issue. Even with patch applies I can trigger
> same issue so there is something less obvious broken with aesni aes-gcm and
Do you mean the patch changing ivsize back to 12? If so, after applying it, did you see the same issue or a different one? And did you reproduce it the same way, with a bigger ICMP packet size?
Btw, could you also share the config options you used to create the IPsec tunnel? That way we can test in almost the same environment.
Created attachment 1422909
Config to test the issue.
We tested with the ivsize patch and it looked like it fixed the issue, but only because I erroneously used too small a ping when testing the patched kernel. So ivsize is not the only fix required.
The VM on one end of the tunnel needs to support aesni. When you send big packets over the tunnel, traffic in both directions stops working. In our test case the machine called "fi" has the aes flag in /proc/cpuinfo.
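For anyone setting up a test VM, the two preconditions mentioned so far (aes flag in /proc/cpuinfo, aesni-intel module loaded) could be checked roughly like this. The function names are invented for this sketch, and it assumes the module shows up in /proc/modules as aesni_intel, which will not be the case if the driver is built into the kernel.

```python
def cpu_has_aes(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo text lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def module_loaded(modules_text, name="aesni_intel"):
    """True if /proc/modules text has an entry whose first field is `name`."""
    return any(line.split()[0] == name
               for line in modules_text.splitlines() if line.strip())
```

On a live system the inputs would come from open("/proc/cpuinfo").read() and open("/proc/modules").read().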
There is no esp= setting because aes_gcm is the default algorithm when ikev2=insist is used.
Any traffic that uses big packets (ping -s 1400, wget over the tunnel, scp over the tunnel, etc.) causes an immediate lockup of the tunnel and a visible error in
A ping running in the background also stops when the problem is triggered.
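To see which xfrm counter moves when the tunnel locks up, /proc/net/xfrm_stat can be snapshotted before and after sending the big packets and the two snapshots compared. A minimal sketch, assuming the usual "Name value" line format; the function names are made up here and it does not reproduce the actual error from this report.

```python
def parse_xfrm_stat(text):
    """Parse /proc/net/xfrm_stat text into {counter_name: int}."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Each counter line is expected to be a name followed by a number.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def changed_counters(before, after):
    """Return {name: delta} for counters that increased between snapshots."""
    return {name: value - before.get(name, 0)
            for name, value in after.items()
            if value > before.get(name, 0)}
```

Usage would be: read the file once, run the large ping, read it again, and pass both parsed snapshots to changed_counters.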
Originally we had a hard time reproducing this because most of our testing VMs don't have the aes instruction set available.
The problem is also specific to aes_gcm; for example, setting esp=aes128-sha2_512 works around the issue.
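As an illustration of that workaround, a conn section with an explicit esp= line might look roughly like the fragment below. The conn name, addresses and authby value are placeholders for this sketch and are not taken from the attached config; only ikev2=insist and esp=aes128-sha2_512 come from this report.

```
# Hypothetical example, not the attached test config
conn example-tunnel
    left=192.0.2.1
    right=198.51.100.1
    authby=secret
    ikev2=insist
    # Explicit non-GCM ESP proposal to avoid the aes_gcm default
    esp=aes128-sha2_512
    auto=start
```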
Triggering the problem doesn't require IPv6, but I kept the config as close to the original one as possible.
Just to record the results: Paul Wouters informed us by email that you tested the patch proposed by Sabrina Dubroca, that it solved the issue, and that the issue was not related to ivsize as you first thought (comment #7). Is that right?
Correct, although I did not test with the original ivsize of 16.
We also verified that aes_gcm128 was not affected by the issue.
(In reply to Tuomo Soini from comment #9)
> Correct. While I did not test with original ivsize 16.
> We also tested that aes_gcm128 was not affected by the issue.
Right, well, the ivsize should not affect the results in this case.
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing
Patch(es) available on kernel-3.10.0-875.el7
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.