Bug 1633672

Summary: Go panic if GRE interfaces are present on the node
Product: OpenShift Container Platform Reporter: raffaele spazzoli <rspazzol>
Component: Networking Assignee: Casey Callendrello <cdc>
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DUPLICATE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: aos-bugs, ricarril
Version: 3.9.0   
Target Milestone: ---   
Target Release: 3.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-08 09:59:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description raffaele spazzoli 2018-09-27 14:00:05 UTC
Description of problem:

The node service could not restart after I created GRE interfaces. It fails with a Go panic. Here is part of the error output:

Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: I0927 09:53:50.445282    6253 node.go:149] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with con
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com systemd[1]: Started Kubernetes systemd probe.
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com systemd[1]: Starting Kubernetes systemd probe.
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: panic: runtime error: index out of range
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: goroutine 1 [running]:
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: panic(0x4d31aa0, 0xf16e9b0)
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /usr/lib/golang/src/runtime/panic.go:540 +0x45e fp=0xc4210820a8 sp=0xc421082000 pc=0x42eb7e
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: runtime.panicindex()
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /usr/lib/golang/src/runtime/panic.go:28 +0x5e fp=0xc4210820c8 sp=0xc4210820a8 pc=0x42d67e
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/vendor/github.com/vishvananda/netlink.parseGretapData(0xf1c8600, 0xc42080d8c0, 0xc4212a0a0
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/vendor/github.co
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/vendor/github.com/vishvananda/netlink.LinkDeserialize(0x0, 0xc42129a548, 0x548, 0xab8, 0xf
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/vendor/github.co
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/vendor/github.com/vishvananda/netlink.(*Handle).LinkList(0xf588920, 0x573af0d, 0x1, 0xc421
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/vendor/github.co
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/vendor/github.com/vishvananda/netlink.LinkList(0x2, 0x573af0d, 0x1, 0xc421208b32, 0x2)
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/vendor/github.co
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/pkg/network/node.GetLinkDetails(0xc421208b40, 0xd, 0xc421208b40, 0xd, 0x0, 0x0, 0xc42124ef
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/pkg/network/node
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/pkg/network/node.(*OsdnNodeConfig).setNodeIP(0xc420d51520, 0x28, 0xf1b5600)
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: /builddir/build/BUILD/atomic-openshift-git-0.0c9824a/_output/local/go/src/github.com/openshift/origin/pkg/network/node
Sep 27 09:53:50 app-node-2.raffa1.casl-contrib.osp.rht-labs.com atomic-openshift-node[6253]: github.com/openshift/origin/pkg/network/node.New(0xc420d51520, 0xc420d51520, 0x5756582, 0xd, 0xc420d2a4e0)


Here are the interfaces present on my node:

[root@app-node-2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:87:9d:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.13/24 brd 192.168.99.255 scope global noprefixroute dynamic eth0
       valid_lft 85446sec preferred_lft 85446sec
    inet6 fe80::f816:3eff:fe87:9d2c/64 scope link 
       valid_lft forever preferred_lft forever
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 86:b5:b8:cf:cf:48 brd ff:ff:ff:ff:ff:ff
4: tun0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 86:18:dd:81:fc:1f brd ff:ff:ff:ff:ff:ff
5: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 26:3d:eb:34:9e:6a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::243d:ebff:fe34:9e6a/64 scope link 
       valid_lft forever preferred_lft forever
6: gre0@NONE: <NOARP> mtu 1476 qdisc noop state DOWN group default qlen 1000
    link/gre 0.0.0.0 brd 0.0.0.0
7: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
8: gre_sys@NONE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc pfifo_fast master ovs-system state UNKNOWN group default qlen 1000
    link/ether 42:66:12:18:16:d4 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::4066:12ff:fe18:16d4/64 scope link 
       valid_lft forever preferred_lft forever
9: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0e:e2:a6:23:94:44 brd ff:ff:ff:ff:ff:ff
10: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:00:30:e7:05 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever



Version-Release number of selected component (if applicable):


How reproducible:

Not sure.

Comment 1 Ricardo Carrillo Cruz 2019-05-16 13:44:11 UTC
This looks like a netlink library bug.
I will check upstream to see whether it's a known issue; otherwise it should be easy to reproduce locally.
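
For context on the failure class: the trace in the description ends in netlink.parseGretapData with "index out of range". The sketch below is a hypothetical, standard-library-only illustration of how a parser can hit that kind of panic when it reads a fixed-width field from a payload that is shorter than expected, and how a length check turns the panic into an ordinary error. The attr type and all function names here are made up for illustration; this is not the vishvananda/netlink code, and nothing in this bug establishes that a short payload is the actual cause.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// attr is a hypothetical stand-in for one decoded attribute
// (type plus raw payload). It is not a type from the netlink library.
type attr struct {
	Type  uint16
	Value []byte
}

var errShortAttr = errors.New("attribute payload too short")

// parseKeyUnchecked reads a 4-byte big-endian field without checking
// the payload length first. A payload shorter than 4 bytes makes the
// runtime panic with an index-out-of-range error.
func parseKeyUnchecked(a attr) uint32 {
	return binary.BigEndian.Uint32(a.Value)
}

// parseKeyChecked validates the length and returns an error instead
// of panicking when the payload is too short.
func parseKeyChecked(a attr) (uint32, error) {
	if len(a.Value) < 4 {
		return 0, errShortAttr
	}
	return binary.BigEndian.Uint32(a.Value), nil
}

// tryParse calls the unchecked parser and converts a panic into an
// error, so one malformed attribute does not take down the caller.
func tryParse(a attr) (key uint32, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return parseKeyUnchecked(a), nil
}

func main() {
	short := attr{Type: 1, Value: []byte{0x00, 0x01}}
	full := attr{Type: 1, Value: []byte{0x00, 0x00, 0x00, 0x2a}}

	for _, a := range []attr{short, full} {
		key, err := parseKeyChecked(a)
		fmt.Printf("checked:   key=%d err=%v\n", key, err)

		key, err = tryParse(a)
		fmt.Printf("unchecked: key=%d err=%v\n", key, err)
	}
}
```

With the unchecked parser, a single short payload is enough to abort the whole process unless something recovers from the panic, which would be consistent with the node service failing to start while it lists links.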

Comment 2 Ricardo Carrillo Cruz 2019-05-17 09:50:25 UTC
I'm unable to reproduce.
I created a little Go program that calls LinkList to see if it would fail when GRE interfaces are present, but it works for me:

<SNIP>
[ricky@ricky-laptop netlinkgre]$ cat linklist.go 
package main

import (
        "fmt"
        "github.com/vishvananda/netlink"
)

func main() {
        ll, err := netlink.LinkList()

        if err != nil {
                panic("Couldn't get linklist")
        }

        for _, l := range ll {
                if l.Attrs().Name == "gretap0" {
                        fmt.Printf("%+v", l)
                }
        }
}
[ricky@ricky-laptop netlinkgre]$ ip link show gretap0
43: gretap0@NONE: <BROADCAST,MULTICAST> mtu 1462 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
[ricky@ricky-laptop netlinkgre]$ go run linklist.go 
&{LinkAttrs:{Index:43 MTU:1462 TxQLen:1000 Name:gretap0 HardwareAddr: Flags:broadcast|multicast RawFlags:4098 ParentIndex:0 MasterIndex:0 Namespace:<nil> Alias: Statistics:0xc0000b7764 Promisc:0 Xdp:0xc00001a5e0 EncapType:ether Protinfo:<nil> OperState:down NetNsID:0 NumTxQueues:0 NumRxQueues:0 GSOMaxSize:65536 GSOMaxSegs:65535 Vfs:[]} IKey:0 OKey:0 EncapSport:0 EncapDport:0 Local:0.0.0.0 Remote:0.0.0.0 IFlags:0 OFlags:0 PMtuDisc:0 Ttl:0 Tos:0 EncapType:0 EncapFlags:0 Link:0 FlowBased:false}[ricky@ricky-laptop netlinkgre]$
</SNIP>

Can you please let me know what version of vishvananda/netlink is installed?

Thanks

Comment 3 Ricardo Carrillo Cruz 2019-09-13 09:28:44 UTC
This looks like a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1751458 ; I will close it accordingly when that bug goes through QA.

Comment 4 Ricardo Carrillo Cruz 2019-11-08 09:59:14 UTC

*** This bug has been marked as a duplicate of bug 1751458 ***