Bug 1741775

Summary: crypto_rfc4543_encrypt: BUG: sleeping function called from invalid context at kernel/rtmutex.c:883
Product: Red Hat Enterprise Linux 7 Reporter: xmu
Component: kernel-rtAssignee: Tom Rix <trix>
kernel-rt sub component: IPSec/Crypto QA Contact: xmu
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bhu, haliu, jan.public, lgoncalv, mstowell, network-qe, qzhao, trix, williams
Version: 7.7   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-rt-3.10.0-1110.rt56.1072.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-31 19:49:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1655690, 1715542    

Description xmu 2019-08-16 06:10:47 UTC
Description of problem:
[ 3083.769052] BUG: sleeping function called from invalid context at kernel/rtmutex.c:883
[ 3083.769054] in_atomic(): 1, irqs_disabled(): 0, pid: 17908, name: ping
[ 3083.769055] 1 lock held by ping/17908:
[ 3083.769064]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff93b69d4f>] raw_sendmsg+0x66f/0xdb0
[ 3083.769067] CPU: 9 PID: 17908 Comm: ping Kdump: loaded Not tainted 3.10.0-1070.rt56.1030.el7.x86_64.debug #1
[ 3083.769068] Hardware name: IBM System x3650 M4 -[7915I33]-/00W2665, BIOS -[VVE124AUS-1.30]- 11/21/2012
[ 3083.769068] Call Trace:
[ 3083.769075]  [<ffffffff93c4b2be>] dump_stack+0x19/0x1b
[ 3083.769080]  [<ffffffff934db4ef>] __might_sleep+0x12f/0x1f0
[ 3083.769084]  [<ffffffff93c54aa4>] rt_spin_lock+0x24/0x60
[ 3083.769089]  [<ffffffff935fb910>] get_page_from_freelist+0x570/0xe20
[ 3083.769098]  [<ffffffff935fc456>] __alloc_pages_nodemask+0x296/0xcd0
[ 3083.769104]  [<ffffffff93527f31>] ? __lock_acquire+0xcb1/0x1d80
[ 3083.769114]  [<ffffffff936519de>] alloc_pages_current+0x10e/0x1a0
[ 3083.769116]  [<ffffffff935f56a4>] ? __get_free_pages+0x14/0x40
[ 3083.769120]  [<ffffffff935f56a4>] __get_free_pages+0x14/0x40
[ 3083.769123]  [<ffffffff93660f93>] kmalloc_order_trace+0x33/0x1b0
[ 3083.769126]  [<ffffffff9365e205>] ? get_partial_node.isra.39+0xe5/0x1a0
[ 3083.769129]  [<ffffffff93661519>] __kmalloc+0x409/0x4a0
[ 3083.769132]  [<ffffffff934e29bd>] ? get_parent_ip+0xd/0x50
[ 3083.769143]  [<ffffffffc07282e8>] gcmaes_encrypt.isra.5.constprop.11+0x98/0x4b0 [aesni_intel]
[ 3083.769148]  [<ffffffff937f7f7a>] ? sg_init_table+0x1a/0x40
[ 3083.769151]  [<ffffffff937f8ba4>] ? sg_init_one+0x24/0xa0
[ 3083.769153]  [<ffffffff937f7f7a>] ? sg_init_table+0x1a/0x40
[ 3083.769159]  [<ffffffffc0728753>] __generic_gcmaes_encrypt+0x53/0x70 [aesni_intel]
[ 3083.769163]  [<ffffffff93c5adcf>] ? add_preempt_count+0x1f/0x60
[ 3083.769188]  [<ffffffffc072748d>] generic_gcmaes_encrypt+0x5d/0x150 [aesni_intel]
[ 3083.769194]  [<ffffffffc07e2a87>] crypto_rfc4543_encrypt+0x57/0xb0 [gcm]
[ 3083.769200]  [<ffffffff9378b10b>] seqiv_aead_givencrypt+0xdb/0x140
[ 3083.769207]  [<ffffffffc050415c>] esp_output+0x3dc/0x7c6 [esp4]
[ 3083.769221]  [<ffffffff93bb8ae4>] xfrm_output_resume+0x184/0xc30
[ 3083.769225]  [<ffffffff93c5adcf>] ? add_preempt_count+0x1f/0x60
[ 3083.769229]  [<ffffffff9352373e>] ? put_lock_stats.isra.30+0xe/0x40
[ 3083.769232]  [<ffffffff93523ecc>] ? lock_release_holdtime.part.31+0xdc/0x1a0
[ 3083.769236]  [<ffffffff93bb960c>] xfrm_output+0x5c/0x1f0
[ 3083.769241]  [<ffffffff93ba6fc2>] xfrm4_output_finish+0x32/0x40
[ 3083.769243]  [<ffffffff93ba6cb7>] __xfrm4_output+0x47/0x1d0
[ 3083.769246]  [<ffffffff93ba7033>] xfrm4_output+0x63/0x180
[ 3083.769250]  [<ffffffff93ba6c70>] ? xfrm4_prepare_output+0x40/0x40
[ 3083.769255]  [<ffffffff93b388bb>] ip_local_out_sk+0x3b/0xe0
[ 3083.769258]  [<ffffffff93b3c836>] ip_send_skb+0x16/0x50
[ 3083.769261]  [<ffffffff93b3c8a3>] ip_push_pending_frames+0x33/0x40
[ 3083.769264]  [<ffffffff93b6a0df>] raw_sendmsg+0x9ff/0xdb0
[ 3083.769270]  [<ffffffff9352373e>] ? put_lock_stats.isra.30+0xe/0x40
[ 3083.769272]  [<ffffffff93523ecc>] ? lock_release_holdtime.part.31+0xdc/0x1a0
[ 3083.769277]  [<ffffffff93c5adcf>] ? add_preempt_count+0x1f/0x60
[ 3083.769279]  [<ffffffff9352373e>] ? put_lock_stats.isra.30+0xe/0x40
[ 3083.769281]  [<ffffffff93523ecc>] ? lock_release_holdtime.part.31+0xdc/0x1a0
[ 3083.769284]  [<ffffffff93b7ea90>] ? inet_sendmsg+0xc0/0x230
[ 3083.769287]  [<ffffffff93b7ead6>] inet_sendmsg+0x106/0x230
[ 3083.769289]  [<ffffffff93b7e9d5>] ? inet_sendmsg+0x5/0x230
[ 3083.769293]  [<ffffffff93aadd96>] sock_sendmsg+0xb6/0xf0
[ 3083.769297]  [<ffffffff9352373e>] ? put_lock_stats.isra.30+0xe/0x40
[ 3083.769301]  [<ffffffff93622fe1>] ? __might_fault+0x61/0xc0
[ 3083.769303]  [<ffffffff93529d48>] ? lock_release_non_nested+0x308/0x360
[ 3083.769312]  [<ffffffff93aae4f4>] SYSC_sendto+0x124/0x1d0
[ 3083.769318]  [<ffffffff9352373e>] ? put_lock_stats.isra.30+0xe/0x40
[ 3083.769322]  [<ffffffff93516409>] ? current_kernel_time+0x69/0xd0
[ 3083.769325]  [<ffffffff93526e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 3083.769328]  [<ffffffff93526efd>] ? trace_hardirqs_on+0xd/0x10
[ 3083.769331]  [<ffffffff93ab0d6e>] SyS_sendto+0xe/0x10
[ 3083.769334]  [<ffffffff93c5fa12>] system_call_fastpath+0x25/0x2a
[ 3083.769339]  [<ffffffff93c5f955>] ? system_call_after_swapgs+0xa2/0x13a
[ 3083.769342] ---------------------------
[ 3083.769343] | preempt count: 00000001 ]
[ 3083.769344] | 1-level deep critical section nesting:
[ 3083.769344] ----------------------------------------
[ 3083.769349] .. [<ffffffffc072747b>] .... generic_gcmaes_encrypt+0x4b/0x150 [aesni_intel]
[ 3083.769352] .....[<ffffffffc07e2a87>] ..   ( <= crypto_rfc4543_encrypt+0x57/0xb0 [gcm])
[ 3083.769352] 


# cat ipsec_netns_rfc4543.sh 
#!/bin/bash
set -x

# Test topo with one network
#         br0
# Host A --|-- Host B
#   0.1          0.2
# 2000::1     2000::2
MASK[4]="24"
BR0_IP[4]="192.168.0.254"
HA_IP[4]="192.168.0.1"
HB_IP[4]="192.168.0.2"

MASK[6]="64"
BR0_IP[6]="2000::254"
HA_IP[6]="2000::1"
HB_IP[6]="2000::2"
HA="ip netns exec ha"
HB="ip netns exec hb"

# clean env
ip netns del ha
ip netns del hb
modprobe -r veth
ip link show br0 && ip link set dev br0 down && ip link del dev br0

# start setup
ip netns add ha
ip netns add hb

ip link add br0 type bridge || brctl addbr br0
ip link add ha_veth0 netns ha type veth peer name ha_veth0_br
ip link add hb_veth0 netns hb type veth peer name hb_veth0_br

ip link set ha_veth0_br master br0 || brctl addif br0 ha_veth0_br
ip link set hb_veth0_br master br0 || brctl addif br0 hb_veth0_br

ip link set br0 up
ip link set ha_veth0_br up
ip link set hb_veth0_br up
$HA ip link set lo up
$HA ip link set ha_veth0 up
$HB ip link set lo up
$HB ip link set hb_veth0 up

# MTU testing
#$HB ip link set hb_veth0 mtu 1300
spi1='0x100'
spi2='0x101'
PROTO="proto esp"
#ALG="aead rfc4106(gcm(aes)) 0x8a60673197134fca0959564327a61f943accde0a 128"
ALG="aead rfc4543(gcm(aes)) 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128"
IPSEC_MODE="transport"
for TEST_VER in 4 6 ; do
	# set ip
	ip addr add ${BR0_IP[$TEST_VER]}/${MASK[$TEST_VER]} dev br0
	$HA ip addr add ${HA_IP[$TEST_VER]}/${MASK[$TEST_VER]} dev ha_veth0
	$HB ip addr add ${HB_IP[$TEST_VER]}/${MASK[$TEST_VER]} dev hb_veth0

	# set ipsec
	$HA ip xfrm state add src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} spi $spi1 $PROTO $ALG mode $IPSEC_MODE sel src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]}
	$HA ip xfrm state add src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} spi $spi2 $PROTO $ALG mode $IPSEC_MODE sel src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]}
	$HA ip xfrm policy add dir out src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} tmpl src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} $PROTO mode $IPSEC_MODE
	$HA ip xfrm policy add dir in src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} tmpl src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} $PROTO mode $IPSEC_MODE level use

	$HB ip xfrm state add src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} spi $spi2 $PROTO $ALG mode $IPSEC_MODE sel src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]}
	$HB ip xfrm state add src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} spi $spi1 $PROTO $ALG mode $IPSEC_MODE sel src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]}
	$HB ip xfrm policy add dir out src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} tmpl src ${HB_IP[$TEST_VER]} dst ${HA_IP[$TEST_VER]} $PROTO mode $IPSEC_MODE
	$HB ip xfrm policy add dir in src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} tmpl src ${HA_IP[$TEST_VER]} dst ${HB_IP[$TEST_VER]} $PROTO mode $IPSEC_MODE level use

	$HB netserver
	sleep 1
	$HA ping -$TEST_VER -i 0.4 -c 5 ${HB_IP[$TEST_VER]}
	$HA netperf -H ${HB_IP[$TEST_VER]} -t TCP_STREAM -f m
	$HA netperf -H ${HB_IP[$TEST_VER]} -t UDP_STREAM -f m -- -m 32768 -s 128K -S 128K
	$HA netperf -H ${HB_IP[$TEST_VER]} -t SCTP_STREAM -f m -- -m 32768 -s 128K -S 128K
done
set +x

Version-Release number of selected component (if applicable):
3.10.0-1070.rt56.1030.el7.x86_64.debug

How reproducible:
always

Steps to Reproduce:
1. # sh -x ./ipsec_netns_rfc4543.sh
2.
3.

Actual results:


Expected results:


Additional info:
a beaker job for ipsec on kernel-rt-debug-3.10.0-1070.rt56.1030.el7 : https://beaker.engineering.redhat.com/recipes/7244686#task97813466,task97813467,task97813468,task97813472,task97813473,task97813474,task97813475,task97813476,task97813477,task97813481,task97813482,task97813483,task97813491,task97813492,task97813493
please check the dmesg whrn run RFC4106, RFC4543(ipv4 and ipv6).
For rfc4106:bz1700169

Comment 10 Tom Rix 2019-11-24 15:30:05 UTC
*** Bug 1700169 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2020-03-31 19:49:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1070