Bug 1568167 - crypto aesni-intel aes(gcm) is broken for IPsec
Summary: crypto aesni-intel aes(gcm) is broken for IPsec
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.5
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Bruno Meneguele
QA Contact: xmu
URL:
Whiteboard:
Depends On:
Blocks: 1570537
Reported: 2018-04-16 22:01 UTC by Tuomo Soini
Modified: 2019-02-26 20:27 UTC (History)

Fixed In Version: kernel-3.10.0-875.el7
Doc Text:
Clone Of:
Clones: 1570537
Environment:
Last Closed: 2018-10-30 09:05:48 UTC


Attachments
Fix for the issue (590 bytes, patch)
2018-04-16 22:01 UTC, Tuomo Soini
Config to test the issue. (460 bytes, text/plain)
2018-04-17 05:51 UTC, Tuomo Soini


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:3083 None None None 2018-10-30 09:07:56 UTC

Description Tuomo Soini 2018-04-16 22:01:32 UTC
Created attachment 1422750
Fix for the issue

kernel-3.10.0-862.el7 doesn't work with Libreswan IPsec when the connection has ikev2=insist set and no esp= option, and the kernel supports aesni.

Symptoms: Normal ping over IPsec works. If you increase the ping size, for example with ping -s 1400, the packet flow stops and does not recover.


/proc/net/xfrm_stat shows the following error:

XfrmInStateProtoError           397
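
For anyone checking a machine for the same symptom, here is a minimal sketch that lists the non-zero counters in /proc/net/xfrm_stat. It assumes the file consists of "Name value" lines like the one above; the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict


def parse_xfrm_stat(text: str) -> Dict[str, int]:
    """Parse 'Name value' lines (the /proc/net/xfrm_stat layout) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly a name followed by a number.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def nonzero_counters(text: str) -> Dict[str, int]:
    """Return only the counters whose value is greater than zero."""
    return {name: value for name, value in parse_xfrm_stat(text).items() if value > 0}
```

On an affected host this could be called as nonzero_counters(open("/proc/net/xfrm_stat").read()); a growing XfrmInStateProtoError value matches the symptom reported here.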

After diffing kernel-3.10.0-693.21.1.el7 against kernel-3.10.0-862.el7 and comparing the changes with the upstream Linux kernel, I found an incorrect change that causes this issue: ivsize for aes(gcm) was changed from 12 to 16.

This breaks compatibility with previous kernels, the upstream kernel, and every other implementation of aes_gcm.
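
For context on why 12 is the size other implementations expect: RFC 4106 builds the 12-octet AES-GCM nonce for ESP from a 4-octet salt (taken from the keying material) followed by the 8-octet IV carried in each packet. The sketch below only illustrates that layout. The function and constant names are hypothetical, and this is neither the kernel code nor the attached patch.

```python
SALT_LEN = 4         # implicit salt from the keying material (RFC 4106)
EXPLICIT_IV_LEN = 8  # IV transmitted in every ESP packet (RFC 4106)
GCM_NONCE_LEN = SALT_LEN + EXPLICIT_IV_LEN  # 12 octets


def build_gcm_nonce(salt: bytes, explicit_iv: bytes) -> bytes:
    """Concatenate salt and per-packet IV into the 12-octet GCM nonce."""
    if len(salt) != SALT_LEN:
        raise ValueError("salt must be %d octets" % SALT_LEN)
    if len(explicit_iv) != EXPLICIT_IV_LEN:
        raise ValueError("explicit IV must be %d octets" % EXPLICIT_IV_LEN)
    return salt + explicit_iv
```

Two peers that disagree about this length cannot be expected to interoperate. Note that later comments in this report conclude that ivsize was not the actual root cause.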

Note: even two machines both running kernel-3.10.0-862.el7 are affected, so two machines with the 862 kernel can't talk to each other with aes_gcm if the aesni-intel kernel module is loaded.

This is a really bad issue and requires an immediate fix. It breaks all IPsec tunnels using ikev2 with the default esp options.

I'm really worried about how this will affect us as Libreswan upstream when CentOS 7 updates to the 7.5 kernel.

Please make sure a fixed kernel update is available before that.

Comment 3 Tuomo Soini 2018-04-16 22:23:45 UTC
My patch doesn't fix the whole issue. Even with the patch applied I can trigger the same issue, so something less obvious is broken with aesni aes-gcm and IPsec.

Comment 4 Bruno Meneguele 2018-04-16 22:46:47 UTC
(In reply to Tuomo Soini from comment #3)
> My patch doesn't fix the whole issue. Even with the patch applied I can
> trigger the same issue, so something less obvious is broken with aesni
> aes-gcm and IPsec.

Do you mean the patch changing ivsize back to 12? If so, after applying it, did you see the same issue or a different one? And did you reproduce it the same way, with a bigger ICMP packet size?

Comment 5 Bruno Meneguele 2018-04-16 22:55:14 UTC
Btw, could you also share the config options you used to create the IPsec tunnel? That way we can test in almost the same environment.

Comment 6 Tuomo Soini 2018-04-17 05:51:01 UTC
Created attachment 1422909
Config to test the issue.

Comment 7 Tuomo Soini 2018-04-17 06:08:41 UTC
We tested with the ivsize patch and it looked like it fixed the issue, because I erroneously used too small a ping when testing the patched kernel. So ivsize is not the only fix required.

The VM on one end of the tunnel needs to support aesni. When you send big packets over the tunnel, traffic in both directions stops working. In our test case the machine called "fi" has the aes flag in /proc/cpuinfo.
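
To check whether a test machine meets these conditions, something like the following sketch could be used. It assumes the usual "flags : ..." line in /proc/cpuinfo and that the aesni-intel module appears as aesni_intel in /proc/modules; the function names are hypothetical.

```python
def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the whole token so that unrelated flags containing
        # the substring "aes" are not counted.
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def module_loaded(modules_text: str, name: str = "aesni_intel") -> bool:
    """Return True if /proc/modules text has a line for the given module."""
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )
```

Both functions take the file contents as a string, so they can be fed open("/proc/cpuinfo").read() and open("/proc/modules").read() on the machine under test.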

There is no esp= setting because aes_gcm is the default algorithm when ikev2=insist is used.

Any traffic which uses big packets (ping -s 1400, wget over the tunnel, scp over the tunnel, etc.) causes an immediate lockup of the tunnel and a visible error in /proc/net/xfrm_stat.

A ping running in the background when the problem is triggered also stops.
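
One way to make the error visible during a reproduction run is to compare the counters before and after sending the large packets. This is a rough sketch assuming the "Name value" layout of /proc/net/xfrm_stat; the helper names are hypothetical.

```python
from typing import Dict


def snapshot(text: str) -> Dict[str, int]:
    """Turn the text of /proc/net/xfrm_stat into a name -> value mapping."""
    pairs = (line.split() for line in text.splitlines())
    return {p[0]: int(p[1]) for p in pairs if len(p) == 2}


def increased(before_text: str, after_text: str) -> Dict[str, int]:
    """Return the counters that grew between two snapshots, with the delta."""
    before = snapshot(before_text)
    after = snapshot(after_text)
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value > before.get(name, 0)
    }
```

Read the file once, run the large ping or transfer over the tunnel, read it again, and pass both strings to increased(); on an affected kernel the result would be expected to include XfrmInStateProtoError.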

Originally we had a hard time reproducing this because most of our testing VMs don't have the aes instruction set available.

The problem is also specific to aes_gcm; for example, setting esp=aes128-sha2_512 works around the issue.

Triggering the problem doesn't require IPv6, but I created the config as close to the original one as possible.

Comment 8 Bruno Meneguele 2018-04-18 17:59:03 UTC
Hi Tuomo,

just to log the results: Paul Wouters informed us by email that you've tested the patch proposed by Sabrina Dubroca, that it solved the issue, and that the issue wasn't related to the actual ivsize as you first thought (comment #7). Is that right?

Comment 9 Tuomo Soini 2018-04-18 18:23:54 UTC
Correct, although I did not test with the original ivsize 16.

We also tested that aes_gcm128 was not affected by the issue.

Comment 10 Bruno Meneguele 2018-04-18 20:37:47 UTC
(In reply to Tuomo Soini from comment #9)
> Correct, although I did not test with the original ivsize 16.
> 
> We also tested that aes_gcm128 was not affected by the issue.

Right, well, the ivsize should not affect the results in this case.
Thanks!

Comment 13 Bruno Meneguele 2018-04-22 20:36:04 UTC
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 16 Bruno Meneguele 2018-04-23 13:51:44 UTC
Patch(es) available on kernel-3.10.0-875.el7

Comment 21 errata-xmlrpc 2018-10-30 09:05:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3083

