Description of problem:
We are occasionally seeing the teaming daemon fail to start after a reboot on CentOS 7 VMs. Here is an example (from /var/log/messages.minor):

Oct 6 23:36:21 ****** ovs-ctl[623]: Starting ovsdb-server [ OK ]
Oct 6 23:36:22 ****** ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=7.6.2
Oct 6 23:36:22 ****** ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.3.1 "external-ids:system-id=\"47ff9309-5609-47e0-819c-b9055b25edbb\"" "system-type=\"CentOS\"" "system-version=\"7.1.1503-Core\""
Oct 6 23:36:22 ****** ovs-ctl[623]: Configuring Open vSwitch system IDs [ OK ]
Oct 6 23:36:22 ****** network[733]: Bringing up loopback interface: [ OK ]
Oct 6 23:36:22 ****** kernel: [ 6.158533] gre: GRE over IPv4 demultiplexor driver
Oct 6 23:36:22 ****** systemd[1]: Starting system-teamd.slice.
Oct 6 23:36:22 ****** systemd[1]: Created slice system-teamd.slice.
Oct 6 23:36:22 ****** systemd[1]: Starting Team Daemon for device bond0...
Oct 6 23:36:22 ****** kernel: [ 6.199635] openvswitch: Open vSwitch switching datapath
Oct 6 23:36:22 ****** ovs-ctl[623]: Inserting openvswitch module [ OK ]
Oct 6 23:36:22 ****** kernel: [ 6.338577] device ovs-system entered promiscuous mode
Oct 6 23:36:22 ****** kernel: [ 6.340086] openvswitch: netlink: Unknown key attribute (type=62, max=21).
Oct 6 23:36:22 ****** kernel: [ 6.385293] device br-ex entered promiscuous mode
Oct 6 23:36:22 ****** kernel: [ 6.426511] device br-int entered promiscuous mode
Oct 6 23:36:22 ****** teamd[857]: Failed to get interface information list.
Oct 6 23:36:22 ****** teamd[857]: Failed to init interface information list.
Oct 6 23:36:22 ****** teamd[857]: Team init failed.
Oct 6 23:36:22 ****** teamd[857]: teamd_init() failed.
Oct 6 23:36:22 ****** teamd[857]: Failed: Invalid argument
Oct 6 23:36:22 ****** systemd[1]: teamd@bond0.service: main process exited, code=exited, status=1/FAILURE
Oct 6 23:36:22 ****** network[733]: Bringing up interface bond0: Job for teamd@bond0.service failed. See 'systemctl status teamd@bond0.service' and 'journalctl -xn' for details.
Oct 6 23:36:22 ****** kernel: [ 6.433515] device br-tun entered promiscuous mode
Oct 6 23:36:22 ****** systemd[1]: Unit teamd@bond0.service entered failed state.
Oct 6 23:36:22 ****** ovs-ctl[623]: Starting ovs-vswitchd [ OK ]
Oct 6 23:36:22 ****** network[733]: [FAILED]
Oct 6 23:36:22 ****** ovs-ctl[623]: Enabling remote OVSDB managers [ OK ]

Version-Release number of selected component (if applicable):
teamd-1.15-1.el7.centos.x86_64
libteam-1.15-1.el7.centos.x86_64

How reproducible:
Only happens occasionally, not reproducible on demand

Steps to Reproduce:
1. reboot a VM
2. after reboot teamd fails to start with error "Failed to get interface information list."

Actual results:

Expected results:

Additional info:
Investigation has shown that teamd fails because the libteam code in ifinfo.c does not handle the NLE_DUMP_INTR error returned by nl_recvmsgs (part of libnl3).
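
For readers unfamiliar with NLE_DUMP_INTR: libnl3 reports it when a netlink dump was interrupted because the kernel's data changed while the dump was in progress. In the log above, Open vSwitch devices are being created at the same moment teamd starts, which would be consistent with that. A common way for a caller to cope is to discard the partial result and repeat the dump. The sketch below only illustrates that retry pattern and is not the libteam code or its fix; DumpInterrupted, get_ifinfo_list, make_flaky_dump, and the retry limit are all hypothetical names and values chosen for illustration.

```python
class DumpInterrupted(Exception):
    """Hypothetical stand-in for libnl3's NLE_DUMP_INTR error."""


def get_ifinfo_list(dump_once, max_retries=5):
    """Repeat a dump until it completes without being interrupted.

    dump_once is a hypothetical callable that returns the full interface
    list or raises DumpInterrupted. max_retries is an arbitrary bound so
    that a permanently failing dump cannot loop forever.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return dump_once()
        except DumpInterrupted:
            # A partial dump may be inconsistent, so nothing from it is kept.
            if attempt == max_retries:
                raise


def make_flaky_dump(interruptions):
    """Build a fake dump that is interrupted a fixed number of times."""
    state = {"left": interruptions}

    def dump_once():
        if state["left"] > 0:
            state["left"] -= 1
            raise DumpInterrupted()
        return ["lo", "eth0", "ovs-system", "br-ex", "br-int"]

    return dump_once


if __name__ == "__main__":
    # Two interrupted attempts, then a clean dump on the third try.
    print(get_ifinfo_list(make_flaky_dump(2)))
```

Without the retry, the first interruption would propagate straight to the caller, which matches the "Failed to get interface information list." failure seen at boot.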
Adding upstream maintainer to CC.
Hi, can you provide the commands used to start the VM and the network config file from the guest?
This is fixed by: https://github.com/jpirko/libteam/commit/8e44b17159522e6afecd64a507cdfae3ed341257
(In reply to Jiri Pirko from comment #4)
> This is fixed by:
> https://github.com/jpirko/libteam/commit/8e44b17159522e6afecd64a507cdfae3ed341257

OK, thanks, Jiri.
Fix is prepared for 7.3. Flagging 7.2.z as there is no workaround for this issue.
Oops, this should really be MODIFIED, as libteam has been updated to 1.23, which contains that commit.
Verified on:
libteam-1.23-1.el7.x86_64
teamd-1.23-1.el7.x86_64

Ran test multiple times.

LOG:
:: [ PASS ] :: Command 'virsh reboot vm1' (Expected 0, got 0)
:: [ LOG ] :: Duration: 2m 54s
:: [ LOG ] :: Assertions: 12 good, 0 bad
:: [ PASS ] :: RESULT: Start VM and upgrade kernel

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [ LOG ] :: Test
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [ LOG ] :: Output of 'vmsh run_cmd vm1 'ip a | grep team0:'':
:: [ LOG ] :: --------------- OUTPUT START ---------------
:: [ LOG ] :: spawn virsh console vm1
:: [ LOG ] ::
:: [ LOG ] :: Connected to domain vm1
:: [ LOG ] ::
:: [ LOG ] :: Escape character is ^]
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] :: Red Hat Enterprise Linux Server 7.2 Beta (Maipo)
:: [ LOG ] ::
:: [ LOG ] :: Kernel 3.10.0-451.el7.x86_64 on an x86_64
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] :: localhost login: root
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] :: Password:
:: [ LOG ] ::
:: [ LOG ] :: Last login: Tue Jun 28 10:02:56 on ttyS0
:: [ LOG ] ::
:: [ LOG ] :: [root@localhost ~]# ip a | grep team0:
:: [ LOG ] ::
:: [ LOG ] :: 4: team0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
:: [ LOG ] ::
:: [ LOG ] :: [root@localhost ~]# echo $?
:: [ LOG ] ::
:: [ LOG ] :: 0
:: [ LOG ] ::
:: [ LOG ] :: [root@localhost ~]# logout
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] :: Red Hat Enterprise Linux Server 7.2 Beta (Maipo)
:: [ LOG ] ::
:: [ LOG ] :: Kernel 3.10.0-451.el7.x86_64 on an x86_64
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] ::
:: [ LOG ] :: localhost login:
:: [ LOG ] ::
:: [ LOG ] :: --------------- OUTPUT END ---------------
:: [ PASS ] :: Command 'vmsh run_cmd vm1 'ip a | grep team0:'' (Expected 0, got 0)
:: [ PASS ] :: There should not be an error and Team should initialise without errors (Assert: '0' should equal '0')
:: [ LOG ] :: Output of 'ping -c 5 192.168.1.22':
:: [ LOG ] :: --------------- OUTPUT START ---------------
:: [ LOG ] :: PING 192.168.1.22 (192.168.1.22) 56(84) bytes of data.
:: [ LOG ] :: 64 bytes from 192.168.1.22: icmp_seq=1 ttl=64 time=0.328 ms
:: [ LOG ] :: 64 bytes from 192.168.1.22: icmp_seq=2 ttl=64 time=0.118 ms
:: [ LOG ] :: 64 bytes from 192.168.1.22: icmp_seq=3 ttl=64 time=0.192 ms
:: [ LOG ] :: 64 bytes from 192.168.1.22: icmp_seq=4 ttl=64 time=0.118 ms
:: [ LOG ] :: 64 bytes from 192.168.1.22: icmp_seq=5 ttl=64 time=0.112 ms
:: [ LOG ] ::
:: [ LOG ] :: --- 192.168.1.22 ping statistics ---
:: [ LOG ] :: 5 packets transmitted, 5 received, 0% packet loss, time 3999ms
:: [ LOG ] :: rtt min/avg/max/mdev = 0.112/0.173/0.328/0.083 ms
:: [ LOG ] :: --------------- OUTPUT END ---------------
:: [ PASS ] :: Command 'ping -c 5 192.168.1.22' (Expected 0, got 0)
:: [ LOG ] :: Duration: 13s
:: [ LOG ] :: Assertions: 3 good, 0 bad
:: [ PASS ] :: RESULT: Test
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2219.html