Bug 1817236

Summary: non_virtual_ip fails with unexpected keyword argument
Product: OpenShift Container Platform
Reporter: Ben Nemec <bnemec>
Component: Machine Config Operator
Assignee: Ben Nemec <bnemec>
Status: CLOSED ERRATA
QA Contact: Victor Voronkov <vvoronko>
Severity: high
Docs Contact:
Priority: high
Version: 4.4
CC: smilner, vvoronko
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Some routes could not be handled by the non_virtual_ip parsing code.
Consequence: Services that need to be configured with the non-virtual IP will fail.
Fix: Make the non_virtual_ip parsing code more flexible.
Result: Routes are parsed and services are configured correctly.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-13 17:23:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ben Nemec 2020-03-25 21:49:37 UTC
Description of problem: In an IPv6 deployment, non_virtual_ip may fail with the following stack trace:

Traceback (most recent call last):
  File "/usr/local/bin/non_virtual_ip", line 235, in <module>
    main()
  File "/usr/local/bin/non_virtual_ip", line 220, in main
    iface_addrs = list(interface_addrs(filters))
  File "/usr/local/bin/non_virtual_ip", line 163, in interface_addrs
    for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
  File "/usr/local/bin/non_virtual_ip", line 163, in <genexpr>
    for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
  File "/usr/local/bin/non_virtual_ip", line 81, in from_line
    return cls(**attrs)
TypeError: __init__() got an unexpected keyword argument 'expires'

This is because there is a route that looks like this:

fd00:1101::/64 dev enp1s0 proto kernel metric 256 expires 5sec pref medium

non_virtual_ip does not know how to handle the expires field so it fails when it gets to this route.
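
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the script's actual code: StrictRoute and its field list are hypothetical stand-ins for a class whose constructor accepts only a fixed set of keyword arguments, and the key/value pairing in from_line is an assumption about how the line gets tokenized.

```python
# Hypothetical stand-in for a route class with a fixed set of fields.
# The real V6Route in non_virtual_ip may be structured differently.
class StrictRoute:
    def __init__(self, destination, dev=None, proto=None, metric=None, pref=None):
        self.destination = destination
        self.dev = dev
        self.proto = proto
        self.metric = metric
        self.pref = pref

    @classmethod
    def from_line(cls, line):
        tokens = line.split()
        # Assumption: the first token is the destination and the remaining
        # tokens form key/value pairs ("dev enp1s0", "proto kernel", ...).
        attrs = dict(zip(tokens[1::2], tokens[2::2]))
        return cls(tokens[0], **attrs)


line = "fd00:1101::/64 dev enp1s0 proto kernel metric 256 expires 5sec pref medium"

try:
    StrictRoute.from_line(line)
    error = None
except TypeError as exc:
    # 'expires' is not a constructor parameter, so cls(**attrs) raises.
    error = str(exc)

print(error)
```

Any route attribute the constructor does not list by name triggers the same TypeError, which is why a single unexpected field is enough to stop the whole script.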

I'm not sure how common this is in regular deployments, since I'm working on some semi-related changes that may be triggering it more often. Either way, it is a valid route, so we should make sure that the script can handle it.

Comment 1 Ben Nemec 2020-03-26 21:38:03 UTC
After fixing the expires problem, I'm seeing another failure on the following route:

default proto ra metric 100 \   nexthop via fe80::69e0:8df7:4351:4eb6 dev enp1s0 weight 1 \     nexthop via fe80::f620:b5b3:612a:8446 dev enp1s0 weight 1 \     nexthop via fe80::b449:f968:42ce:eef5 dev enp1s0 weight 1 pref medium

This one fails because of the backslashes in the route, which are standing in for line breaks. The nexthop entries aren't handled correctly either, so I think we might want to allow this class to take arbitrary params and just ignore the ones we don't care about.
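
One way to sketch the "ignore what we don't care about" idea is shown below. This is an illustration under stated assumptions, not the change that was eventually merged: parse_route and KNOWN_FIELDS are hypothetical names, and the choice of which fields matter is a guess. The sketch drops the backslashes, then scans adjacent token pairs and keeps only the first occurrence of each known key, so fields such as expires, nexthop, via, and weight are skipped.

```python
# Assumed set of fields the caller cares about; everything else is ignored.
KNOWN_FIELDS = ("dev", "proto", "metric", "pref")


def parse_route(line):
    """Parse one route line into a dict, ignoring unknown fields."""
    # Backslashes stand in for line breaks in multi-line routes, so treat
    # them as plain whitespace before tokenizing.
    tokens = line.replace("\\", " ").split()
    route = {"destination": tokens[0]}
    # Walk adjacent token pairs; a known key followed by any token is taken
    # as key/value. Only the first occurrence of each key is kept, so a
    # multipath route reports the device of its first nexthop.
    for key, value in zip(tokens[1:], tokens[2:]):
        if key in KNOWN_FIELDS and key not in route:
            route[key] = value
    return route


expiring = parse_route(
    "fd00:1101::/64 dev enp1s0 proto kernel metric 256 expires 5sec pref medium"
)
multipath = parse_route(
    "default proto ra metric 100 \\   nexthop via fe80::69e0:8df7:4351:4eb6 "
    "dev enp1s0 weight 1 \\     nexthop via fe80::f620:b5b3:612a:8446 "
    "dev enp1s0 weight 1 pref medium"
)

print(expiring)
print(multipath)
```

The pair scan is deliberately simple: it would misread a value that happens to equal a known key name, which a real implementation would need to guard against.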

Comment 3 Ben Nemec 2020-04-09 17:31:38 UTC
The fix for this ended up going in as part of https://github.com/openshift/machine-config-operator/pull/1601

Comment 4 Ben Nemec 2020-04-09 17:32:04 UTC
*** Bug 1821950 has been marked as a duplicate of this bug. ***

Comment 5 Victor Voronkov 2020-04-20 11:41:25 UTC
Verified on 4.5.0-0.nightly-2020-04-14-031010 by adding a multi-line route to the master:
sudo ip -6 r add f000::/64 proto ra metric 100 nexthop via fe80::1 dev enp5s0 nexthop via fe80::2 dev enp5s0

# ip -o -6 r
f000::/64 proto ra metric 100 \	nexthop via fe80::1 dev enp5s0 weight 1 \	nexthop via fe80::2 dev enp5s0 weight 1 pref medium
...


[core@master-0-0 ~]$ non_virtual_ip fd2e:6f44:5dd8:c956::2
Filtering out Address(127.0.0.1/8, dev=lo) due to it having host scope
Filtering out Address(::1/128, dev=lo) due to it having host scope
Filtering out Address(fd2e:6f44:5dd8::5/128, dev=enp5s0) due to it being deprecated
Filtering out Address(fd2e:6f44:5dd8::2/128, dev=enp5s0) due to it being deprecated
Checking V6Route(f000::/64, dev=enp5s0) for Address(fd2e:6f44:5dd8::11a/128, dev=enp5s0)
Checking V6Route(fd2e:6f44:5dd8::/64, dev=enp5s0) for Address(fd2e:6f44:5dd8::11a/128, dev=enp5s0)
Is fd2e:6f44:5dd8:c956::2 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fd2e:6f44:5dd8:: and fd2e:6f44:5dd8:0:ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fd2e:6f44:5dd8::11a and fd2e:6f44:5dd8::11a
Is fd2e:6f44:5dd8:c956::2 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fd99:: and fd99::ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fd01:0:0:2:: and fd01::2:ffff:ffff:ffff:ffff
Is fd2e:6f44:5dd8:c956::2 between fe80:: and fe80::ffff:ffff:ffff:ffff

[core@master-0-0 ~]$ cat /etc/resolv.conf
# Generated by KNI resolv prepender NM dispatcher script
search ocp-edge-cluster-0.qe.lab.redhat.com
nameserver fd2e:6f44:5dd8:0:0:0:0:2
nameserver fe80::5054:ff:fe49:76e3%enp5s0
nameserver fd2e:6f44:5dd8::1

sudo journalctl -b | grep prepender
...
Apr 20 09:12:06 master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com root[3861282]: NM resolv-prepender triggered by enp4s0 dhcp6-change.
Apr 20 09:12:06 master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[3861275]: <13>Apr 20 09:12:06 root: NM resolv-prepender triggered by enp4s0 dhcp6-change.
Apr 20 09:12:06 master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com root[3861283]: NM resolv-prepender: Prepending 'nameserver fd2e:6f44:5dd8:0:0:0:0:2' to /etc/resolv.conf (other nameservers from /var/run/NetworkManager/resolv.conf)
Apr 20 09:12:06 master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[3861275]: <13>Apr 20 09:12:06 root: NM resolv-prepender: Prepending 'nameserver fd2e:6f44:5dd8:0:0:0:0:2' to /etc/resolv.conf (other nameservers from /var/run/NetworkManager/resolv.conf)

Comment 8 errata-xmlrpc 2020-07-13 17:23:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409