Bug 1550097

Summary: iproute: setting MAC with ip-link fails
Product: Red Hat Enterprise Linux 7
Reporter: Eric Garver <egarver>
Component: iproute
Assignee: Phil Sutter <psutter>
Status: CLOSED ERRATA
QA Contact: Jaroslav Aster <jaster>
Severity: urgent
Priority: urgent
Version: 7.5
CC: atragler, egarver, jaster, lmiksik, omoris, psutter, sbrivio, szidek
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: iproute-4.11.0-14.el7
Last Closed: 2018-04-10 14:31:17 UTC
Type: Bug
Bug Blocks: 1457439

Description Eric Garver 2018-02-28 13:49:37 UTC
Description of problem:

See bug 1501418 comment 11.

Setting the MAC of an interface fails with the following: 

  # ip link set dev p0 address "f0:00:00:01:01:01"
  Invalid address length 6 - must be 42401 bytes

This was introduced with iproute-4.11.0-9.el7.

Marking urgent as it blocks/breaks a lot of OVS scenarios and tests.

Comment 3 Phil Sutter 2018-02-28 15:43:03 UTC
Hi Eric,

(In reply to Eric Garver from comment #0)

I can't reproduce this on my local RHEL7 VM. Maybe it is a kernel issue? With which kernel version are you able to reproduce the problem?

Does this happen only with veth type interfaces or others as well?

Thanks, Phil

Comment 4 Eric Garver 2018-02-28 20:27:14 UTC
(In reply to Phil Sutter from comment #3)
> I can't reproduce this on my local RHEL7 VM. Maybe it is a kernel issue?
> With which kernel version are you able to reproduce the problem?
> 
> Does this happen only with veth type interfaces or others as well?

I was also unable to reproduce it outside of the OVS testsuite, so I went looking for suspicious code in iproute and found a use-after-free in nl_get_ll_addr_len(). The tb array is filled with pointers into the dynamically allocated answer buffer, but tb[IFLA_ADDRESS] is dereferenced after answer has been freed.

This was introduced by

  86bf43c7c2fd ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")

which was backported in iproute-4.11.0-9.
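
The ordering problem described above can be sketched in C as follows. This is a minimal illustration, not the iproute source: struct attr_hdr, fake_answer() and addr_len_fixed() are hypothetical stand-ins, and the buggy ordering appears only inside a comment so that nothing here reads freed memory.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a netlink attribute header
 * (an assumption for illustration, not the real struct rtattr). */
struct attr_hdr {
    unsigned short len;   /* header plus payload length, in bytes */
    unsigned short type;
};

#define ATTR_HDR_LEN ((int)sizeof(struct attr_hdr))

/* Hypothetical stand-in for a kernel reply: allocate a buffer holding one
 * attribute whose payload is addr_len bytes. The caller owns the buffer. */
static struct attr_hdr *fake_answer(int addr_len)
{
    struct attr_hdr *answer = calloc(1, sizeof(*answer) + (size_t)addr_len);

    if (!answer)
        return NULL;
    answer->len = (unsigned short)(ATTR_HDR_LEN + addr_len);
    return answer;
}

/* Buggy ordering, shown for contrast only and never compiled:
 *
 *     struct attr_hdr *answer = fake_answer(n);
 *     struct attr_hdr *tb = answer;      // tb points INTO answer
 *     free(answer);
 *     return tb->len - ATTR_HDR_LEN;     // use-after-free
 */

/* Safe ordering: copy the length out of the buffer, then free it. */
static int addr_len_fixed(int n)
{
    struct attr_hdr *answer = fake_answer(n);
    struct attr_hdr *tb = answer;   /* still points into answer */
    int len;

    if (!answer)
        return -1;
    len = tb->len - ATTR_HDR_LEN;   /* read while the buffer is valid */
    free(answer);
    return len;
}
```

With a 6-byte payload, addr_len_fixed(6) returns 6, the length a MAC address needs. The only difference from the buggy ordering is that the read happens before free().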

Comment 5 Eric Garver 2018-02-28 20:43:52 UTC
(In reply to Eric Garver from comment #4)
> I was also unable to reproduce it outside of the OVS testsuite. So I went
> looking for suspicious code in iproute and found a use after free in
> nl_get_ll_addr_len(). tb array is filled from the dynamically allocated
> answer. But tb[IFLA_ADDRESS] is accessed after answer is freed.
> 
> This was introduced by
> 
>   86bf43c7c2fd ("lib/libnetlink: update rtnl_talk to support malloc buff at
> run time")
> 
> which was backported in iproute-4.11.0-9

ElectricFence and gdb agree.

$ cat bz1550097.sh 
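#!/bin/sh
set -e

# Clean up leftovers from a previous run; ignore errors if nothing exists.
ip link delete ovs-p0 || true
ip netns delete at_ns0 || true

# Create a namespace and a veth pair, then move one end (p0) into the namespace.
ip netns add at_ns0
ip link add p0 type veth peer name ovs-p0
ip link set p0 netns at_ns0
ip link set dev ovs-p0 up

# Configure p0 inside the namespace, then trigger the bug by setting its MAC.
ip netns exec at_ns0 ip addr add 10.1.1.1/24 dev p0
ip netns exec at_ns0 ip link set dev p0 up
ip netns exec at_ns0 ef ip link set dev p0 address f0:00:00:01:01:01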

Note the "ef" in the last command.

$ sudo sh bz1550097.sh

  Electric Fence 2.2.2 Copyright (C) 1987-1999 Bruce Perens <bruce>
/bin/ef: line 20: 24650 Segmentation fault      (core dumped) ( export LD_PRELOAD=libefence.so.0.0; exec "$@" )

$ gdb -c core.24650  /usr/sbin/ip
...
Core was generated by `ip link set dev p0 address f0 00 00 01 01 01'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000041db1c in nl_get_ll_addr_len ()
Missing separate debuginfos, use: debuginfo-install iproute-4.11.0-9.el7.x86_64
(gdb) bt
#0  0x000000000041db1c in nl_get_ll_addr_len ()
#1  0x000000000041e5dc in iplink_parse ()
#2  0x0000000000420146 in iplink_modify ()
#3  0x00000000004207b2 in do_iplink ()
#4  0x0000000000408384 in do_cmd ()
#5  0x0000000000407e97 in main ()

(gdb) layout asm
...
   |0x41db00 <nl_get_ll_addr_len+160>       callq  0x448890 <parse_rtattr_flags>
   |0x41db05 <nl_get_ll_addr_len+165>       cmpq   $0x0,0x18(%rsp)
   |0x41db0b <nl_get_ll_addr_len+171>       mov    0x8(%rsp),%rdi
   |0x41db10 <nl_get_ll_addr_len+176>       je     0x41db40 <nl_get_ll_addr_len+224>
   |0x41db12 <nl_get_ll_addr_len+178>       callq  0x405e60 <free@plt>
   |0x41db17 <nl_get_ll_addr_len+183>       mov    0x18(%rsp),%rax
  >|0x41db1c <nl_get_ll_addr_len+188>       movzwl (%rax),%eax
   |0x41db1f <nl_get_ll_addr_len+191>       sub    $0x4,%eax
   |0x41db22 <nl_get_ll_addr_len+194>       mov    0x598(%rsp),%rdx
   |0x41db2a <nl_get_ll_addr_len+202>       xor    %fs:0x28,%rdx
   |0x41db33 <nl_get_ll_addr_len+211>       jne    0x41db53 <nl_get_ll_addr_len+243>
   |0x41db35 <nl_get_ll_addr_len+213>       add    $0x5a8,%rsp
   |0x41db3c <nl_get_ll_addr_len+220>       retq
...
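
The bogus length in the original error message fits this listing. Assuming the value ip printed is the result of the 16-bit load at +188 followed by the "sub $0x4" at +191, the raw value read from freed memory would be 42401 + 4 = 42405 = 0xA5A5. A small sketch of that arithmetic in C; the helper names are hypothetical and not part of iproute:

```c
/* Constant subtracted by "sub $0x4,%eax" in the listing above. */
#define ATTR_HDR_LEN 4u

/* Recover the raw 16-bit value that was read from memory, given the
 * length that ip printed in its error message. */
static unsigned raw_len_from_reported(unsigned reported)
{
    return reported + ATTR_HDR_LEN;
}

/* Return 1 if both bytes of a 16-bit value are identical, as they would be
 * if the memory had been overwritten with a single fill byte. */
static int is_repeated_byte(unsigned v)
{
    return ((v >> 8) & 0xffu) == (v & 0xffu);
}
```

Here raw_len_from_reported(42401) gives 0xA5A5, and is_repeated_byte() returns 1 for it. Two identical bytes would be consistent with the freed buffer having been overwritten with a fill byte of 0xA5, for example by an allocator debugging feature. That is an assumption: this report does not say how the test environment was configured.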

Comment 6 Phil Sutter 2018-03-01 09:44:09 UTC
Hi Eric,

(In reply to Eric Garver from comment #5)
> (In reply to Eric Garver from comment #4)
[...]
> > I was also unable to reproduce it outside of the OVS testsuite. So I went
> > looking for suspicious code in iproute and found a use after free in
> > nl_get_ll_addr_len(). tb array is filled from the dynamically allocated
> > answer. But tb[IFLA_ADDRESS] is accessed after answer is freed.
> > 
> > This was introduced by
> > 
> >   86bf43c7c2fd ("lib/libnetlink: update rtnl_talk to support malloc buff at
> > run time")
> > 
> > which was backported in iproute-4.11.0-9

Oh, I see! That also explains why some builds don't expose the issue - it
simply depends on how the code was compiled.

> ElectricFence and gdb agree.

Thanks for analyzing the issue!

Patch sent upstream: https://marc.info/?l=linux-netdev&m=151989693717192&w=2

Comment 17 errata-xmlrpc 2018-04-10 14:31:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0815