Bug 1821950 - nodeip-configuration.service fails when multiple ipv6 default routes are installed
Keywords:
Status: CLOSED DUPLICATE of bug 1817236
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.5.0
Assignee: Ben Nemec
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-07 22:51 UTC by Marius Cornea
Modified: 2020-04-09 18:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-09 17:32:04 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1616 | 0 | None | closed | Bug 1817594: [release-4.4] Nodeip retry on failure | 2021-01-13 16:40:59 UTC

Description Marius Cornea 2020-04-07 22:51:14 UTC
Description of problem:

nodeip-configuration.service fails when multiple ipv6 default routes are installed.

[root@openshift-worker-0 core]# ip -6 r
::1 dev lo proto kernel metric 256 pref medium
2620:52:0:2e1d::80/121 dev ens1f1 proto ra metric 102 pref medium
fd00:1101::1c dev ens1f0 proto kernel metric 101 pref medium
fd00:1101::/64 dev ens1f0 proto ra metric 101 pref medium
fe80::/64 dev enp1s0f4u4 proto kernel metric 100 pref medium
fe80::/64 dev ens1f0 proto kernel metric 101 pref medium
fe80::/64 dev ens1f1 proto kernel metric 102 pref medium
fe80::/64 dev ens1f2 proto kernel metric 103 pref medium
fe80::/64 dev ens1f3 proto kernel metric 104 pref medium
default via fe80::5000:8c34:2d2f:55a0 dev ens1f0 proto ra metric 101 pref medium
default proto ra metric 102 
	nexthop via fe80::200:5eff:fe00:201 dev ens1f1 weight 1 
	nexthop via fe80::d207:ca01:5521:2700 dev ens1f1 weight 1 
	nexthop via fe80::2e21:3101:55e3:8f00 dev ens1f1 weight 1 pref medium

[root@openshift-worker-0 core]# systemctl restart nodeip-configuration.service
Job for nodeip-configuration.service failed because the control process exited with error code.
See "systemctl status nodeip-configuration.service" and "journalctl -xe" for details.


[root@openshift-worker-0 core]# systemctl status nodeip-configuration.service
● nodeip-configuration.service - Writes IP address configuration so that kubelet and crio services select a valid node IP
   Loaded: loaded (/etc/systemd/system/nodeip-configuration.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-04-07 22:48:48 UTC; 10s ago
  Process: 8885 ExecStart=/usr/local/bin/nodeip-finder 10.46.29.199 (code=exited, status=1/FAILURE)
 Main PID: 8885 (code=exited, status=1/FAILURE)
      CPU: 65ms

Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]:     for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]:   File "/var/usrlocal/bin/non_virtual_ip", line 163, in <genexpr>
Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]:     for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]:   File "/var/usrlocal/bin/non_virtual_ip", line 81, in from_line
Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]:     return cls(**attrs)
Apr 07 22:48:48 openshift-worker-0 nodeip-finder[8885]: TypeError: __init__() got an unexpected keyword argument '\'
Apr 07 22:48:48 openshift-worker-0 systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=1/FAILURE
Apr 07 22:48:48 openshift-worker-0 systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
Apr 07 22:48:48 openshift-worker-0 systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.
Apr 07 22:48:48 openshift-worker-0 systemd[1]: nodeip-configuration.service: Consumed 65ms CPU time



[root@openshift-worker-0 core]# /usr/local/bin/nodeip-finder 10.46.29.199
Filtering out Address(127.0.0.1/8, dev=lo) due to it having host scope
Filtering out Address(::1/128, dev=lo) due to it having host scope
Checking V6Route(fd00:1101::/64, dev=ens1f0) for Address(fd00:1101::1c/128, dev=ens1f0)
Traceback (most recent call last):
  File "/usr/local/bin/nodeip-finder", line 73, in <module>
    main()
  File "/usr/local/bin/nodeip-finder", line 54, in main
    first: non_virtual_ip.Address = first_candidate_addr(api_vip)
  File "/usr/local/bin/nodeip-finder", line 31, in first_candidate_addr
    iface_addrs = list(non_virtual_ip.interface_addrs(filters))
  File "/var/usrlocal/bin/non_virtual_ip", line 163, in interface_addrs
    for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
  File "/var/usrlocal/bin/non_virtual_ip", line 163, in <genexpr>
    for route in (V6Route.from_line(rline) for rline in route_out.splitlines()):
  File "/var/usrlocal/bin/non_virtual_ip", line 81, in from_line
    return cls(**attrs)
TypeError: __init__() got an unexpected keyword argument '\'
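
Note on the traceback: the unexpected keyword argument '\' suggests the parser met a backslash token while pairing up keys and values. `ip -o` (oneline mode) replaces line feeds with '\', so a multipath default route with several nexthop entries would produce such tokens. That the script calls ip in this mode is an inference from the error, not something shown in this report. A minimal sketch of a parser that tolerates both the multi-line and the oneline form might look like the following. The names parse_routes, parse_pairs and KNOWN_KEYS are hypothetical and are not taken from non_virtual_ip.

```python
from typing import Dict, List

# Hypothetical set of recognised keys; the real V6Route fields are not shown here.
KNOWN_KEYS = {"via", "dev", "proto", "metric", "pref", "weight"}


def parse_pairs(tokens: List[str]) -> Dict[str, str]:
    """Pair up key/value tokens, skipping anything that is not a known key."""
    attrs: Dict[str, str] = {}
    i = 0
    while i < len(tokens):
        if tokens[i] in KNOWN_KEYS and i + 1 < len(tokens):
            attrs[tokens[i]] = tokens[i + 1]
            i += 2
        else:
            i += 1  # unknown token (for example a stray '\'): ignore it
    return attrs


def parse_routes(output: str) -> List[dict]:
    """Parse `ip -6 route` output, attaching nexthop lines to the preceding route."""
    routes: List[dict] = []
    # Treat the '\' separator used by `ip -o` like a line break.
    for raw in output.replace("\\", "\n").splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "nexthop":
            if routes:
                routes[-1]["nexthops"].append(parse_pairs(tokens[1:]))
            continue
        routes.append(
            {
                "destination": tokens[0],
                "attrs": parse_pairs(tokens[1:]),
                "nexthops": [],
            }
        )
    return routes


# The two default routes from the `ip -6 r` output above.
SAMPLE = (
    "default via fe80::5000:8c34:2d2f:55a0 dev ens1f0 proto ra metric 101 pref medium\n"
    "default proto ra metric 102 \n"
    "\tnexthop via fe80::200:5eff:fe00:201 dev ens1f1 weight 1 \n"
    "\tnexthop via fe80::d207:ca01:5521:2700 dev ens1f1 weight 1 \n"
    "\tnexthop via fe80::2e21:3101:55e3:8f00 dev ens1f1 weight 1 pref medium\n"
)

if __name__ == "__main__":
    for route in parse_routes(SAMPLE):
        print(route["destination"], route["attrs"], len(route["nexthops"]))
```

Skipping unknown tokens instead of passing every pair to a constructor is a design choice in this sketch: an unexpected route shape then degrades to missing attributes rather than a TypeError.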


If I remove the "default proto ra metric 102" route, the script runs fine:

[root@openshift-worker-0 core]# ip -6 r del default proto ra metric 102

[root@openshift-worker-0 core]# /usr/local/bin/nodeip-finder 10.46.29.199
Filtering out Address(127.0.0.1/8, dev=lo) due to it having host scope
Filtering out Address(::1/128, dev=lo) due to it having host scope
Checking V6Route(fd00:1101::/64, dev=ens1f0) for Address(fd00:1101::1c/128, dev=ens1f0)
Is 10.46.29.199 between fd00:1101:: and fd00:1101::ffff:ffff:ffff:ffff
Is 10.46.29.199 between fd00:1101::1c and fd00:1101::1c
Is 10.46.29.199 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is 10.46.29.199 between 10.46.29.128 and 10.46.29.255
Is 10.46.29.199 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is 10.46.29.199 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is 10.46.29.199 between fe80:: and fe80::ffff:ffff:ffff:ffff
Is 10.46.29.199 between 16.1.15.0 and 16.1.15.3
Is 10.46.29.199 between fe80:: and fe80::ffff:ffff:ffff:ffff
VIP Subnet 10.46.29.128/25
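
The "Is ... between ..." lines above amount to a subnet containment check of the API VIP against each candidate network. A rough sketch of that check using Python's standard ipaddress module is shown below. The function name vip_subnet is hypothetical, and the candidate list is reconstructed from the address ranges printed in the output, so this illustrates the idea and is not the nodeip-finder code.

```python
import ipaddress
from typing import Iterable, Optional, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def vip_subnet(vip: str, candidates: Iterable[str]) -> Optional[Network]:
    """Return the first candidate network that contains the VIP, or None."""
    vip_addr = ipaddress.ip_address(vip)
    for cidr in candidates:
        net = ipaddress.ip_network(cidr, strict=False)
        if net.version != vip_addr.version:
            continue  # an IPv4 VIP can never fall inside an IPv6 range
        # Same test as "is VIP between the first and last address of the network"
        if net[0] <= vip_addr <= net[-1]:
            return net
    return None


# Networks reconstructed from the ranges in the log output above (assumption).
CANDIDATES = [
    "fd00:1101::/64",
    "fd00:1101::1c/128",
    "fe80::/64",
    "10.46.29.128/25",
    "16.1.15.0/30",
]

if __name__ == "__main__":
    print("VIP Subnet", vip_subnet("10.46.29.199", CANDIDATES))
```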


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-04-04-025830

How reproducible:
100%

Comment 1 Ben Nemec 2020-04-09 16:44:49 UTC
This should have been fixed in 4.4 by https://github.com/openshift/machine-config-operator/pull/1616.

Comment 2 Ben Nemec 2020-04-09 17:32:04 UTC

*** This bug has been marked as a duplicate of bug 1817236 ***

