Bug 1887805

Summary: RT kernel nodes report BUG: using smp_processor_id() in preemptible [00000000] code: handler18/6040
Product: Red Hat Enterprise Linux 8
Reporter: Marius Cornea <mcornea>
Component: kernel-rt
Assignee: Networking Services Kernel Team bug triage <nst-kernel-bugs>
kernel-rt sub component: Networking
QA Contact: Jianwen Ji <jiji>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: urgent
Priority: urgent
CC: atragler, bbreard, bhu, imcleod, jligon, kzhang, mleitner, nstielau, william.caban
Version: 8.2
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-13 21:15:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2020-10-13 11:52:22 UTC
Description of problem:

RT kernel nodes report:

[ 1309.725742] BUG: using smp_processor_id() in preemptible [00000000] code: handler20/6042
[ 1309.725747] caller is flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
[ 1309.725749] CPU: 3 PID: 6042 Comm: handler20 Not tainted 4.18.0-193.24.1.rt13.74.el8_2.dt1.x86_64 #1
[ 1309.725750] Hardware name: Red Hat KVM, BIOS 0.0.0 02/06/2015
[ 1309.725751] Call Trace:
[ 1309.725755]  dump_stack+0x5c/0x80
[ 1309.725757]  check_preemption_disabled+0xc4/0xd0
[ 1309.725762]  flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
[ 1309.725767]  ovs_flow_tbl_lookup+0x3b/0x60 [openvswitch]
[ 1309.725772]  ovs_flow_cmd_new+0x2d8/0x430 [openvswitch]
[ 1309.725777]  ? do_execute_actions+0xc4/0xa20 [openvswitch]
[ 1309.725786]  genl_family_rcv_msg+0x1d7/0x410
[ 1309.725790]  ? migrate_enable+0x123/0x3a0
[ 1309.725794]  genl_rcv_msg+0x47/0x8c
[ 1309.725797]  ? __kmalloc_node_track_caller+0xff/0x2e0
[ 1309.725799]  ? genl_family_rcv_msg+0x410/0x410
[ 1309.725850]  netlink_rcv_skb+0x4c/0x120
[ 1309.725853]  genl_rcv+0x24/0x40
[ 1309.725855]  netlink_unicast+0x197/0x230
[ 1309.725858]  netlink_sendmsg+0x204/0x3d0
[ 1309.725862]  sock_sendmsg+0x4c/0x50
[ 1309.725864]  ___sys_sendmsg+0x29f/0x300
[ 1309.725867]  ? migrate_enable+0x123/0x3a0
[ 1309.725870]  ? ep_send_events_proc+0x8a/0x1f0
[ 1309.725872]  ? ep_scan_ready_list.constprop.23+0x237/0x260
[ 1309.725874]  ? rt_spin_unlock+0x23/0x40
[ 1309.725877]  ? ep_poll+0x1b3/0x390
[ 1309.725880]  ? __fget+0x72/0xa0
[ 1309.725882]  __sys_sendmsg+0x57/0xa0
[ 1309.725886]  do_syscall_64+0x87/0x1a0
[ 1309.725889]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 1309.725891] RIP: 0033:0x7f58bd32bb07
[ 1309.725894] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 eb ec ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 24 ed ff ff 48
[ 1309.725895] RSP: 002b:00007f58b3f9ca80 EFLAGS: 00003293 ORIG_RAX: 000000000000002e
[ 1309.725898] RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007f58bd32bb07
[ 1309.725899] RDX: 0000000000000000 RSI: 00007f58b3f9cb10 RDI: 000000000000001c
[ 1309.725900] RBP: 00007f58b3f9cb10 R08: 0000000000000000 R09: 00007f58a80153e0
[ 1309.725901] R10: 00007f58a80149b8 R11: 0000000000003293 R12: 0000000000000000
[ 1309.725902] R13: 00007f58b3f9ebd0 R14: 00007f58b3f9cfc0 R15: 00007f58b3f9cb10


Version-Release number of selected component (if applicable):

4.18.0-193.24.1.rt13.74.el8_2.dt1.x86_64
4.6.0-rc.3

How reproducible:
100%

Steps to Reproduce:

1. Deploy via the baremetal IPI flow using 4.6.0-rc.3 with the following MachineConfig:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: realtime-worker
spec:
  kernelType: realtime

2. After the deployment completes, log in to one of the workers and check the console or dmesg.
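
For anyone checking several workers, the dmesg check in step 2 can be scripted. The following is a minimal sketch, not part of the original report: the helper names (find_preempt_bugs, summarize) and the regular expressions are illustrative assumptions based only on the log format shown in this bug. The sample text is copied from the traces in this report.

```python
import re
from collections import Counter

# First line of each report, e.g.
# [ 1309.725742] BUG: using smp_processor_id() in preemptible [00000000] code: handler20/6042
BUG_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: using smp_processor_id\(\) in preemptible "
    r"\[[0-9a-f]+\] code: (?P<comm>[^/\s]+)/(?P<pid>\d+)\s*$"
)
# The "caller is ..." line that follows it, with an optional [module] suffix.
CALLER_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+caller is (?P<sym>\S+)(?: \[(?P<mod>[^\]]+)\])?\s*$"
)


def find_preempt_bugs(log_text):
    """Return one dict per smp_processor_id() BUG report in dmesg-style text."""
    reports = []
    for line in log_text.splitlines():
        bug = BUG_RE.match(line)
        if bug:
            reports.append({
                "ts": float(bug["ts"]),
                "comm": bug["comm"],
                "pid": int(bug["pid"]),
                "caller": None,
                "module": None,
            })
            continue
        caller = CALLER_RE.match(line)
        # Attach the caller to the most recent report that lacks one.
        if caller and reports and reports[-1]["caller"] is None:
            reports[-1]["caller"] = caller["sym"]
            reports[-1]["module"] = caller["mod"]
    return reports


def summarize(reports):
    """Count reports per (module, caller) pair."""
    return Counter((r["module"], r["caller"]) for r in reports)


# Lines copied from the traces in this bug report.
SAMPLE = """\
[ 1309.725742] BUG: using smp_processor_id() in preemptible [00000000] code: handler20/6042
[ 1309.725747] caller is flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
[ 1309.725749] CPU: 3 PID: 6042 Comm: handler20 Not tainted 4.18.0-193.24.1.rt13.74.el8_2.dt1.x86_64 #1
[ 1446.262496] BUG: using smp_processor_id() in preemptible [00000000] code: handler19/6041
[ 1446.262507] caller is flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
"""

for (module, caller), count in summarize(find_preempt_bugs(SAMPLE)).items():
    print(f"{count} report(s): {caller} [{module}]")
```

To use it on a live node, pass the captured dmesg output as the string argument instead of SAMPLE.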


Actual results:

The kernel continuously reports the BUG below:

[ 1446.262496] BUG: using smp_processor_id() in preemptible [00000000] code: handler19/6041
[ 1446.262507] caller is flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
[ 1446.262510] CPU: 2 PID: 6041 Comm: handler19 Not tainted 4.18.0-193.24.1.rt13.74.el8_2.dt1.x86_64 #1
[ 1446.262511] Hardware name: Red Hat KVM, BIOS 0.0.0 02/06/2015
[ 1446.262512] Call Trace:
[ 1446.262523]  dump_stack+0x5c/0x80
[ 1446.262528]  check_preemption_disabled+0xc4/0xd0
[ 1446.262534]  flow_lookup.isra.15+0x2c/0xf0 [openvswitch]
[ 1446.262540]  ovs_flow_tbl_lookup+0x3b/0x60 [openvswitch]
[ 1446.262544]  ovs_flow_cmd_new+0x2d8/0x430 [openvswitch]
[ 1446.262548]  ? __switch_to_asm+0x35/0x70
[ 1446.262550]  ? __switch_to_asm+0x41/0x70
[ 1446.262551]  ? __switch_to_asm+0x35/0x70
[ 1446.262563]  genl_family_rcv_msg+0x1d7/0x410
[ 1446.262569]  ? migrate_enable+0x123/0x3a0
[ 1446.262574]  genl_rcv_msg+0x47/0x8c
[ 1446.262579]  ? __kmalloc_node_track_caller+0xff/0x2e0
[ 1446.262581]  ? genl_family_rcv_msg+0x410/0x410
[ 1446.262584]  netlink_rcv_skb+0x4c/0x120
[ 1446.262587]  genl_rcv+0x24/0x40
[ 1446.262590]  netlink_unicast+0x197/0x230
[ 1446.262593]  netlink_sendmsg+0x204/0x3d0
[ 1446.262599]  sock_sendmsg+0x4c/0x50
[ 1446.262602]  ___sys_sendmsg+0x29f/0x300
[ 1446.262604]  ? ___sys_recvmsg+0x15e/0x1e0
[ 1446.262610]  ? ep_scan_ready_list.constprop.23+0x237/0x260
[ 1446.262612]  ? rt_spin_unlock+0x23/0x40
[ 1446.262618]  ? ep_poll+0x1b3/0x390
[ 1446.262623]  ? __fget+0x72/0xa0
[ 1446.262626]  __sys_sendmsg+0x57/0xa0
[ 1446.262631]  do_syscall_64+0x87/0x1a0
[ 1446.262634]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 1446.262637] RIP: 0033:0x7f58bd32bb07
[ 1446.262640] Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 eb ec ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 24 ed ff ff 48
[ 1446.262641] RSP: 002b:00007f58b9f8aa80 EFLAGS: 00003293 ORIG_RAX: 000000000000002e
[ 1446.262643] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007f58bd32bb07
[ 1446.262645] RDX: 0000000000000000 RSI: 00007f58b9f8ab10 RDI: 0000000000000040
[ 1446.262646] RBP: 00007f58b9f8ab10 R08: 0000000000000000 R09: 00007f58b9f8c390
[ 1446.262647] R10: 00000000e2990db8 R11: 0000000000003293 R12: 0000000000000000
[ 1446.262648] R13: 00007f58b9f8c338 R14: 00007f58b9f8afb0 R15: 00007f58b9f8ab10


Expected results:

No BUG messages are reported.

Additional info:

Comment 1 Marius Cornea 2020-10-13 11:53:11 UTC
Found an upstream patch which seems related - https://lkml.org/lkml/2020/10/9/515

Comment 2 Micah Abbott 2020-10-13 14:26:59 UTC
Looks like a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1886109

Assigning to kernel-rt team for confirmation

Comment 3 Marcelo Ricardo Leitner 2020-10-13 21:15:56 UTC
(In reply to Micah Abbott from comment #2)
> Looks like a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1886109
> 
> Assigning to kernel-rt team for confirmation

Not kernel-rt here, but right you are :-)
Thanks.

*** This bug has been marked as a duplicate of bug 1886109 ***