Bug 2215022

Summary: [NMCI] crash in ovs_mtu test
Product: Red Hat Enterprise Linux 9 Reporter: Vladimir Benes <vbenes>
Component: NetworkManager Assignee: Gris Ge <fge>
Status: CLOSED ERRATA QA Contact: Vladimir Benes <vbenes>
Severity: high Docs Contact:
Priority: unspecified    
Version: 9.3 CC: bgalvani, ferferna, fpokryvk, lrintel, rkhan, sfaye, sukulkar, till
Target Milestone: rc Keywords: Regression, Triaged
Target Release: --- Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: NetworkManager-1.43.11-1.el9 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-07 08:38:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vladimir Benes 2023-06-14 13:34:41 UTC
Description of problem:
When running the ovs_mtu test, NetworkManager crashes intermittently.


Version-Release number of selected component (if applicable):
1.43.9-1.el9

How reproducible:
not sure

Steps to Reproduce:
1. Run the ovs_mtu NMCI test

Actual results:
crash

Expected results:
no crash

Additional info:
           PID: 533335 (NetworkManager)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 5 (TRAP)
     Timestamp: Tue 2023-06-13 12:26:41 EDT (13s ago)
  Command Line: /usr/sbin/NetworkManager --no-daemon
    Executable: /usr/sbin/NetworkManager
 Control Group: /system.slice/NetworkManager.service
          Unit: NetworkManager.service
         Slice: system.slice
       Boot ID: 575311ff1a6640cd99170140abfccd13
    Machine ID: 22a6a62f096f4281b5344430fafb2790
      Hostname: beaker-networkmanager-gitlab-trigger-test-upstream-4669
       Storage: /var/lib/systemd/coredump/core.NetworkManager.0.575311ff1a6640cd99170140abfccd13.533335.1686673601000000.zst (present)
  Size on Disk: 864.5K
       Message: Process 533335 (NetworkManager) of user 0 dumped core.
                
                Stack trace of thread 533335:
                #0  0x00007f63483725a7 g_logv (libglib-2.0.so.0 + 0x5a5a7)
                #1  0x00007f6348372863 g_log (libglib-2.0.so.0 + 0x5a863)
                #2  0x000055b37658cbdc _nm_g_return_if_fail_warning (NetworkManager + 0x197bdc)
                #3  0x000055b3766562be _hw_addr_set (NetworkManager + 0x2612be)
                #4  0x000055b376658483 _set_state_full (NetworkManager + 0x263483)
                #5  0x000055b376661aa3 nm_device_state_changed (NetworkManager + 0x26caa3)
                #6  0x000055b3766671a8 nm_device_set_unmanaged_by_flags (NetworkManager + 0x2721a8)
                #7  0x000055b376445c47 nm_manager_stop (NetworkManager + 0x50c47)
                #8  0x00007f634803feb0 __libc_start_call_main (libc.so.6 + 0x3feb0)
                #9  0x00007f634803ff60 __libc_start_main_impl (libc.so.6 + 0x3ff60)
                #10 0x000055b376446ae5 _start (NetworkManager + 0x51ae5)
                
                Stack trace of thread 533336:
                #0  0x00007f634814296f __GI___poll (libc.so.6 + 0x14296f)
                #1  0x00007f63483c203c g_main_context_poll (libglib-2.0.so.0 + 0xaa03c)
                #2  0x00007f634836a5f3 g_main_context_iteration (libglib-2.0.so.0 + 0x525f3)
                #3  0x00007f634836a641 glib_worker_main (libglib-2.0.so.0 + 0x52641)
                #4  0x00007f634839b582 g_thread_proxy (libglib-2.0.so.0 + 0x83582)
                #5  0x00007f634809f832 start_thread (libc.so.6 + 0x9f832)
                #6  0x00007f634803f450 __clone3 (libc.so.6 + 0x3f450)
                
                Stack trace of thread 533337:
                #0  0x00007f634814296f __GI___poll (libc.so.6 + 0x14296f)
                #1  0x00007f63483c203c g_main_context_poll (libglib-2.0.so.0 + 0xaa03c)
                #2  0x00007f634836c483 g_main_loop_run (libglib-2.0.so.0 + 0x54483)
                #3  0x00007f63485bee1a gdbus_shared_thread_func (libgio-2.0.so.0 + 0x110e1a)
                #4  0x00007f634839b582 g_thread_proxy (libglib-2.0.so.0 + 0x83582)
                #5  0x00007f634809f832 start_thread (libc.so.6 + 0x9f832)
                #6  0x00007f634803f450 __clone3 (libc.so.6 + 0x3f450)
                
                Stack trace of thread 533338:
                #0  0x00007f634803ee5d syscall (libc.so.6 + 0x3ee5d)
                #1  0x00007f63483bbb6c g_cond_wait_until (libglib-2.0.so.0 + 0xa3b6c)
                #2  0x00007f634833d071 g_async_queue_pop_intern_unlocked (libglib-2.0.so.0 + 0x25071)
                #3  0x00007f634833d1f6 g_async_queue_timeout_pop (libglib-2.0.so.0 + 0x251f6)
                #4  0x00007f634839e519 g_thread_pool_wait_for_new_pool (libglib-2.0.so.0 + 0x86519)
                #5  0x00007f634839b582 g_thread_proxy (libglib-2.0.so.0 + 0x83582)
                #6  0x00007f634809f832 start_thread (libc.so.6 + 0x9f832)
                #7  0x00007f634803f450 __clone3 (libc.so.6 + 0x3f450)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) Red Hat Enterprise Linux 10.2-11.el9
Reading symbols from /usr/sbin/NetworkManager...
Reading symbols from /usr/lib/debug/usr/sbin/NetworkManager-1.43.9-1.el9.x86_64.debug...
[New LWP 533335]
[New LWP 533336]
[New LWP 533337]
[New LWP 533338]
Missing separate debuginfo for /usr/lib64/NetworkManager/1.43.9-1.el9/libnm-device-plugin-ovs.so
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/fc/78a05e09f3b1aa5e825435bcc718dd1bae7871.debug
Missing separate debuginfo for /usr/lib64/NetworkManager/1.43.9-1.el9/libnm-device-plugin-wifi.so
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/39/fcae642d5b3b2bfd5b89e71f2f73851c917b3d.debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/NetworkManager --no-daemon'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  g_logv (log_domain=0x55b37668d568 "nm", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=<optimized out>) at ../glib/gmessages.c:1413
1413		  g_private_set (&g_log_depth, GUINT_TO_POINTER (depth));
[Current thread is 1 (Thread 0x7f634709d500 (LWP 533335))]
(gdb) #0  g_logv (log_domain=0x55b37668d568 "nm", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=<optimized out>) at ../glib/gmessages.c:1413
#1  0x00007f6348372863 in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at ../glib/gmessages.c:1451
#2  0x000055b37658cbdc in _nm_g_return_if_fail_warning (log_domain=<optimized out>, line=<optimized out>, file=<optimized out>) at ./src/libnm-glib-aux/nm-gassert-patch.h:25
#3  nm_platform_link_set_address (self=<optimized out>, ifindex=0, address=0x7ffd54ba7270, length=6) at src/libnm-platform/nm-platform.c:1862
#4  0x000055b3766562be in _hw_addr_set (self=0x55b376e066b0, addr=0x55b376e20800 "7A:F1:21:AD:A1:3E", operation=0x55b37669aa8a "reset", detail=0x55b3766ba89f "unmanage") at src/core/devices/nm-device.c:16831
#5  0x000055b376658483 in _set_state_full (self=0x55b376e066b0, state=<optimized out>, reason=<optimized out>, quitting=<optimized out>) at src/core/devices/nm-device.c:16140
#6  0x000055b376661aa3 in nm_device_state_changed (reason=NM_DEVICE_STATE_REASON_REMOVED, state=NM_DEVICE_STATE_UNMANAGED, self=0x55b376e066b0) at src/core/devices/nm-device.c:16400
#7  _set_unmanaged_flags (self=0x55b376e066b0, flags=<optimized out>, set_op=<optimized out>, allow_state_transition=<optimized out>, now=<optimized out>, reason=<optimized out>) at src/core/devices/nm-device.c:14778
#8  0x000055b3766671a8 in nm_device_set_unmanaged_by_flags (reason=<optimized out>, set_op=NM_UNMAN_FLAG_OP_SET_UNMANAGED, flags=NM_UNMANAGED_QUITTING, self=<optimized out>) at src/core/devices/nm-device.c:14817
#9  nm_device_set_unmanaged_by_quitting (self=<optimized out>) at src/core/devices/nm-device.c:14950
#10 remove_device (self=0x55b376c8e000, device=<optimized out>, quitting=1) at src/core/nm-manager.c:2286
#11 0x000055b376445c47 in nm_manager_stop (self=0x55b376c8e000) at src/core/nm-manager.c:7905
#12 main (argc=<optimized out>, argv=<optimized out>) at src/core/main.c:530
(gdb) quit
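
Triage note: in the backtrace above, frame #3 (nm_platform_link_set_address) is entered with ifindex=0 and sits directly below _nm_g_return_if_fail_warning, and g_logv is logging at G_LOG_LEVEL_CRITICAL. That is consistent with a failed precondition check being treated as fatal, although the report itself does not state the root cause. The following is a minimal Python sketch for picking such frames out of a gdb backtrace. The helper names (parse_frames, frames_with_arg) and the regular expression are illustrative assumptions, not part of NetworkManager or gdb, and the three frame lines are copied from this report.

```python
import re

# Frames #2-#4 copied verbatim from the gdb backtrace in this report.
BACKTRACE = """\
#2  0x000055b37658cbdc in _nm_g_return_if_fail_warning (log_domain=<optimized out>, line=<optimized out>, file=<optimized out>) at ./src/libnm-glib-aux/nm-gassert-patch.h:25
#3  nm_platform_link_set_address (self=<optimized out>, ifindex=0, address=0x7ffd54ba7270, length=6) at src/libnm-platform/nm-platform.c:1862
#4  0x000055b3766562be in _hw_addr_set (self=0x55b376e066b0, addr=0x55b376e20800 "7A:F1:21:AD:A1:3E", operation=0x55b37669aa8a "reset", detail=0x55b3766ba89f "unmanage") at src/core/devices/nm-device.c:16831
"""

# Assumed frame layout: "#N  [0xADDR in ]func (args) at file:line".
# Inlined frames (such as #3 here) have no address, so that part is optional.
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) "
    r"\((?P<args>.*)\) at (?P<loc>\S+)$"
)


def parse_frames(text):
    """Return one dict (num, func, args, loc) per recognised frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def frames_with_arg(frames, name, value):
    """Return frames whose argument list contains exactly name=value."""
    needle = f"{name}={value}"
    return [f for f in frames if needle in re.split(r",\s*", f["args"])]


if __name__ == "__main__":
    for frame in frames_with_arg(parse_frames(BACKTRACE), "ifindex", "0"):
        print(frame["num"], frame["func"], frame["loc"])
```

Matching on the exact name=value token (rather than a substring) avoids false hits such as ifindex=0 matching ifindex=10 in a longer trace.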

Comment 7 Vladimir Benes 2023-07-17 20:14:13 UTC
The ovs_mtu test is running happily under RHEL 9.3 again.

Comment 9 errata-xmlrpc 2023-11-07 08:38:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6585