Bug 1547472

Summary: [abrt] [faf] libteam: unknown function(): /usr/bin/teamd killed by 11
Product: Red Hat Enterprise Linux 7 Reporter: Vladimir Benes <vbenes>
Component: libteamAssignee: Xin Long <lxin>
Status: CLOSED DUPLICATE QA Contact: Network QE <network-qe>
Severity: low Docs Contact:
Priority: low    
Version: 7.5-AltCC: mleitner, ralongi, sukulkar
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/ddd8b99e9e94d397a8b442e0285b0ac1be58e638/
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-02 11:39:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vladimir Benes 2018-02-21 12:07:58 UTC
This bug has been created based on an anonymous crash report requested by the package maintainer.

Report URL: http://faf.lab.eng.brq.redhat.com/faf/reports/bthash/ddd8b99e9e94d397a8b442e0285b0ac1be58e638/

not sure if the crash is clear from backtrace, but I can definitelly dig deeper to NM test suite to find reproducer.

Comment 2 Marcelo Ricardo Leitner 2018-02-23 13:58:40 UTC
I'm guessing this was an internal error on glibc (vsnprintf). Maybe it couldn't allocate enough memory.

Comment 3 Xin Long 2018-02-26 12:05:40 UTC
(gdb) select-frame 5
(gdb) p port->ifinfo
$1 = (struct team_ifinfo *) 0x0

static bool __team_port_str(struct team_port *port,
                            char **pbuf, size_t *pbufsiz)
{
        uint32_t ifindex = team_get_port_ifindex(port);
        struct team_ifinfo *ifinfo = team_get_port_ifinfo(port); <--- NULL

        return __buf_append(...
                            team_get_ifinfo_ifname(ifinfo), <--- crash

Introduced by:
commit 046fb6ba0aec8246075b18d787daec43201566fa
Author: Antti Tiainen <atiainen>
Date:   Mon Feb 6 15:41:05 2017 +0200

    libteam: resynchronize ifinfo after lost RTNLGRP_LINK notifications

team_port's ifinfo should not be null, but ifinfo's destroy is earlier than team_port, so we have to delay it, the fix could be something like:

 static void ifinfo_destroy(struct team_ifinfo *ifinfo)
 {
-       if (ifinfo->linked && ifinfo->port)
-               port_unlink(ifinfo->port);
        list_del(&ifinfo->list);
+       if (ifinfo->linked && ifinfo->port)
+               return;
        free(ifinfo);
 }

@@ -470,6 +470,9 @@ void ifinfo_unlink(struct team_ifinfo *ifinfo)
 {
        ifinfo->port = NULL;
        ifinfo->linked = false;
+
+       if (list_empty(&ifinfo->list))
+               free(ifinfo);
 }

after which, we can also do the similar thing for port and option's cleanup as the commit does for ifinfo, and remove them from elsewhere, it would be a nice improvement.
@@ -240,6 +240,13 @@ int check_call_change_handlers(struct team_handle *th,
                ifinfo_destroy_removed(th);
                ifinfo_clear_changed(th);
        }
+
+       if (call_type_mask & TEAM_PORT_CHANGE)
+               port_list_cleanup_last_state(th);
+
+       if (call_type_mask & TEAM_OPTION_CHANGE)
+               option_list_cleanup_last_state(th);
+
        th->change_handler.pending_type_mask &= ~call_type_mask;
        return err;
 }

Comment 4 Marcelo Ricardo Leitner 2018-02-26 12:34:48 UTC
(In reply to Xin Long from comment #3)
> (gdb) select-frame 5
> (gdb) p port->ifinfo
> $1 = (struct team_ifinfo *) 0x0
> 
> static bool __team_port_str(struct team_port *port,
>                             char **pbuf, size_t *pbufsiz)
> {
>         uint32_t ifindex = team_get_port_ifindex(port);
>         struct team_ifinfo *ifinfo = team_get_port_ifinfo(port); <--- NULL
> 
>         return __buf_append(...
>                             team_get_ifinfo_ifname(ifinfo), <--- crash

That's interesting, and odd. As it had __buf_append() in the call stack, I assumed that that had worked. I wonder how this call stack could happen then, because that expression must be evaluated before the function call.

Comment 5 Xin Long 2018-02-26 13:03:52 UTC
(In reply to Marcelo Ricardo Leitner from comment #4)
> (In reply to Xin Long from comment #3)
> > (gdb) select-frame 5
> > (gdb) p port->ifinfo
> > $1 = (struct team_ifinfo *) 0x0
> > 
> > static bool __team_port_str(struct team_port *port,
> >                             char **pbuf, size_t *pbufsiz)
> > {
> >         uint32_t ifindex = team_get_port_ifindex(port);
> >         struct team_ifinfo *ifinfo = team_get_port_ifinfo(port); <--- NULL
> > 
> >         return __buf_append(...
> >                             team_get_ifinfo_ifname(ifinfo), <--- crash
> 
> That's interesting, and odd. As it had __buf_append() in the call stack, I
> assumed that that had worked. I wonder how this call stack could happen
> then, because that expression must be evaluated before the function call.
Maybe ifinfo->ifname (offset) was just a value as the parameter, team_get_ifinfo_ifname didn't yet dereference it (access the memory), until vsnprintf/strlen did that.

Comment 6 Xin Long 2018-07-02 11:39:06 UTC

*** This bug has been marked as a duplicate of bug 1563155 ***