Bug 2231195
| Summary: | ice driver: kernel panic when running ovs dpdk pvp performance case of ovs3.2 | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | liting <tli> |
| Component: | openvswitch | Assignee: | Timothy Redaelli <tredaelli> |
| openvswitch sub component: | ovs-dpdk | QA Contact: | qding |
| Status: | NEW --- | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | ctrautma, jhsiao, ktraynor |
| Version: | FDP 23.F | ||
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
liting
2023-08-11 01:48:03 UTC
Access to the 740-57 system and create sosreport, but it is hung when setting up plugins.
[root@dell-per740-57 ~]# sos report --all-logs --batch
sosreport (version 4.5.5)
This command will collect diagnostic and configuration information from
this Red Hat Enterprise Linux system and installed applications.
An archive containing the collected information will be generated in
/var/tmp/sos.iroatpia and may be provided to a Red Hat support
representative.
Any information provided to Red Hat will be treated in accordance with
the published support policies at:
Distribution Website : https://www.redhat.com/
Commercial Support : https://access.redhat.com/
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.
No changes will be made to system configuration.
Setting up archive ...
Setting up plugins ...
Reboot the system, it cannot reboot successfully, following is the console log.
[ **] A stop job is running for D-Bus Sys… Message Bus (14min 4s / no limit)
[43872.215389] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-... } 5329644 jiffies s: 1917 root: 0x1/.
[43872.226351] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x80/.
[43872.234009] Task dump for CPU 7:
[43872.237240] task:kworker/u96:0 state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00004008
[43872.247147] Workqueue: netns cleanup_net
[43872.251075] Call Trace:
[43872.253527] <TASK>
[43872.255633] ? cleanup_net+0x1d2/0x380
[43872.259385] ? process_one_work+0x1e2/0x3b0
[43872.263570] ? worker_thread+0x50/0x3a0
[43872.267409] ? rescuer_thread+0x390/0x390
[43872.271423] ? kthread+0xd6/0x100
[43872.274743] ? kthread_complete_and_exit+0x20/0x20
[43872.279536] ? ret_from_fork+0x1f/0x30
[ OK ] Stopped D-Bus System Message Bus.
[ OK ] Closed D-Bus System Message Bus Socket.
[ OK ] Stopped target System Initialization.
[43937.752574] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-... } 5395180 jiffies s: 1917 root: 0x1/.
[43937.763535] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x80/.
[43937.771191] Task dump for CPU 7:
[43937.774425] task:kworker/u96:0 state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00004008
[43937.784332] Workqueue: netns cleanup_net
[43937.788257] Call Trace:
[43937.790708] <TASK>
[43937.792817] ? cleanup_net+0x1d2/0x380
[43937.796570] ? process_one_work+0x1e2/0x3b0
[43937.800757] ? worker_thread+0x50/0x3a0
[43937.804596] ? rescuer_thread+0x390/0x390
[43937.808607] ? kthread+0xd6/0x100
[43937.811925] ? kthread_complete_and_exit+0x20/0x20
[43937.816721] ? ret_from_fork+0x1f/0x30
[43937.820474] </TASK>
[44003.289758] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-... } 5460716 jiffies s: 1917 root: 0x1/.
[44003.300717] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x80/.
[44003.308378] Task dump for CPU 7:
[44003.311610] task:kworker/u96:0 state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00004008
[44003.321514] Workqueue: netns cleanup_net
[44003.325440] Call Trace:
[44003.327894] <TASK>
[44003.330002] ? cleanup_net+0x1d2/0x380
[44003.333752] ? process_one_work+0x1e2/0x3b0
[44003.337942] ? worker_thread+0x50/0x3a0
[44003.341778] ? rescuer_thread+0x390/0x390
[44003.345790] ? kthread+0xd6/0x100
[44003.349109] ? kthread_complete_and_exit+0x20/0x20
[44003.353905] ? ret_from_fork+0x1f/0x30
[44003.357658] </TASK>
[44016.982936] rcu: INFO: rcu_preempt self-detected stall on CPU
[44016.988680] rcu: 7-....: (37320832 ticks this GP) idle=349/1/0x4000000000000000 softirq=485496/485496 fqs=9331035
[44016.999109] (t=37372962 jiffies g=5329081 q=1208124 ncpus=48)
[44017.004937] NMI backtrace for cpu 7
[44017.008430] CPU: 7 PID: 11 Comm: kworker/u96:0 Kdump: loaded Tainted: G I -------- --- 5.14.0-284.27.1.el9_2.x86_64 #1
[44017.020504] Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.12.2 07/09/2021
[44017.028069] Workqueue: netns cleanup_net
[44017.031997] Call Trace:
[44017.034450] <IRQ>
[44017.036470] dump_stack_lvl+0x34/0x48
[44017.040134] nmi_cpu_backtrace.cold+0x30/0x6f
[44017.044495] ? lapic_can_unplug_cpu+0x80/0x80
[44017.048855] nmi_trigger_cpumask_backtrace+0xef/0x110
[44017.053907] trigger_single_cpu_backtrace+0x2a/0x31
[44017.058784] rcu_dump_cpu_stacks+0xa7/0xe4
[44017.062887] print_cpu_stall.cold+0x4f/0x17d
[44017.067158] check_cpu_stall+0xe9/0x240
[44017.070997] rcu_pending+0x26/0x190
[44017.074489] rcu_sched_clock_irq+0x3d/0x180
[44017.078677] update_process_times+0x8c/0xc0
[44017.082862] tick_sched_handle+0x22/0x60
[44017.086787] tick_sched_timer+0x65/0x80
[44017.090627] ? tick_sched_do_timer+0xa0/0xa0
[44017.094900] __hrtimer_run_queues+0x127/0x2c0
[44017.099261] hrtimer_interrupt+0xfc/0x210
[44017.103274] __sysvec_apic_timer_interrupt+0x5c/0x110
[44017.108327] sysvec_apic_timer_interrupt+0x6d/0x90
[44017.113119] </IRQ>
[44017.115225] <TASK>
[44017.117329] asm_sysvec_apic_timer_interrupt+0x16/0x20
[44017.122471] RIP: 0010:__rcu_read_lock+0x1d/0x30
[44017.127003] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 65 48 8b 14 25 40 8f 01 00 8b 82 f4 08 00 00 83 c0 01 89 82 f4 08 00 00 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44
[44017.145748] RSP: 0018:ffffb4f0003c7d68 EFLAGS: 00000202
[44017.150975] RAX: 0000000000000002 RBX: 0000000000000001 RCX: 0000000000000001
[44017.158107] RDX: ffff8daa04780000 RSI: ffffb4f0003c7e08 RDI: ffffffffac704180
[44017.165240] RBP: ffffb4f0003c7e08 R08: 0000000000000238 R09: 0000000020000000
[44017.172372] R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffffffffaadbf6d0
[44017.179506] R13: ffffffffffffffff R14: ffffffffade17da0 R15: ffffffffade17d54
[44017.186638] ? xas_find+0x1c0/0x1c0
[44017.190133] xa_find+0x5f/0xe0
[44017.193193] ? xas_find+0x1c0/0x1c0
[44017.196684] devlinks_xa_find_get.constprop.0+0x32/0x90
[44017.201910] devlink_pernet_pre_exit+0x3c/0xf0
[44017.206357] ? mutex_lock+0xe/0x30
[44017.209763] cleanup_net+0x1d2/0x380
[44017.213344] process_one_work+0x1e2/0x3b0
[44017.217354] worker_thread+0x50/0x3a0
[44017.221022] ? rescuer_thread+0x390/0x390
[44017.225032] kthread+0xd6/0x100
[44017.228180] ? kthread_complete_and_exit+0x20/0x20
[44017.232972] ret_from_fork+0x1f/0x30
[44017.236552] </TASK>
[44068.826942] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-... } 5526252 jiffies s: 1917 root: 0x1/.
[44068.837904] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x80/.
[44068.845562] Task dump for CPU 7:
[44068.848794] task:kworker/u96:0 state:R running task stack: 0 pid: 11 ppid: 2 flags:0x00004008
[44068.858700] Workqueue: netns cleanup_net
[44068.862626] Call Trace:
[44068.865079] <TASK>
[44068.867184] ? cleanup_net+0x1d2/0x380
[44068.870938] ? process_one_work+0x1e2/0x3b0
[44068.875124] ? worker_thread+0x50/0x3a0
[44068.878962] ? rescuer_thread+0x390/0x390
[44068.882974] ? kthread+0xd6/0x100
[44068.886294] ? kthread_complete_and_exit+0x20/0x20
[44068.891088] ? ret_from_fork+0x1f/0x30
[44068.894841] </TASK>
I clicked "reboot" in the beaker, it can reboot successfully. And I create sosreport and attach it. I rerun 3 jobs and has no panic, so this issue does not always reproduce. https://beaker.engineering.redhat.com/jobs/8177222 https://beaker.engineering.redhat.com/jobs/8181689 https://beaker.engineering.redhat.com/jobs/8178632 Run one more job, still has no panic. https://beaker.engineering.redhat.com/jobs/8189074 |