Bug 1825991

Summary: vsphere ipi: workers fails with nodeip-configuration service
Product: OpenShift Container Platform
Reporter: Joseph Callen <jcallen>
Component: Installer
Assignee: Russell Teague <rteague>
Installer sub component: openshift-installer
QA Contact: jima
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: unspecified
CC: jima
Version: 4.5
Target Milestone: ---
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The worker template file vsphere-non-virtual-ip.yaml conflicts with the common template file of the same name, causing the nodeip-configuration service to fail on workers.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-13 17:29:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Joseph Callen 2020-04-20 16:09:13 UTC
Version:

OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE="registry.svc.ci.openshift.org/ocp/release:4.5.0-0.ci-2020-04-20-144713"

Description of problem:

[root@jcallen-fvfnt-worker-cfpdw ~]# journalctl -fu nodeip-configuration
-- Logs begin at Mon 2020-04-20 15:34:23 UTC. --
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]: Traceback (most recent call last):
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]:   File "/usr/local/bin/nodeip-finder", line 105, in <module>
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]:     main()
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]:   File "/usr/local/bin/nodeip-finder", line 82, in main
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]:     except (non_virtual_ip.AddressNotFoundException, non_virtual_ip.SubnetNotFoundException):
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw nodeip-finder[1335]: AttributeError: module 'non_virtual_ip' has no attribute 'AddressNotFoundException'
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=1/FAILURE
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.
Apr 20 15:37:33 jcallen-fvfnt-worker-cfpdw systemd[1]: nodeip-configuration.service: Consumed 78ms CPU time


https://github.com/openshift/machine-config-operator/blob/master/templates/worker/00-worker/vsphere/files/vsphere-non-virtual-ip.yaml

https://github.com/openshift/machine-config-operator/blob/master/templates/common/vsphere/units/nodeip-configuration.yaml
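A quick way to see which names a deployed copy of the module lacks is to compare it against the names that nodeip-finder reads in the tracebacks in this report. The following is only a sketch: `missing_names` and `EXPECTED_NAMES` are hypothetical helpers, not part of nodeip-finder, and the `deployed` object is a stand-in built in memory rather than the real file from a worker.

```python
import types

# Names that the tracebacks in this report show nodeip-finder reading
# from non_virtual_ip. The list is illustrative, not exhaustive.
EXPECTED_NAMES = (
    "Address",
    "non_deprecated",
    "AddressNotFoundException",
    "SubnetNotFoundException",
)


def missing_names(module, expected=EXPECTED_NAMES):
    """Return the expected names that the given module does not define."""
    return [name for name in expected if not hasattr(module, name)]


# Hypothetical stand-in for the file that actually landed on the worker:
# it imports without error but defines only one of the expected names.
deployed = types.ModuleType("non_virtual_ip")
deployed.Address = object

print(missing_names(deployed))
```

On a real worker, the same check could be run against the imported module instead of the stand-in; a non-empty result would point at a mismatch between the script and the module it imports.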

Comment 3 Joseph Callen 2020-04-20 20:05:51 UTC
Please review this PR as well; it may also need to be changed at some point.
https://github.com/openshift/machine-config-operator/pull/1659

Comment 4 jima 2020-04-21 01:00:05 UTC
The issue is reproduced on the QE side when manually installing vSphere IPI with nightly build 4.5.0-0.nightly-2020-04-20-224257:
$ sudo journalctl -u nodeip-configuration.service
-- Logs begin at Tue 2020-04-21 00:27:01 UTC, end at Tue 2020-04-21 00:54:47 UTC. --
Apr 21 00:28:47 jimaipi-g8bhw-worker-dlk78 systemd[1]: Starting Writes IP address configuration so that kubelet and crio services select a valid node IP...
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: Processing CustomAction for target
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   parser = 140010777447112
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   values = '136.144.52.198'
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   option_string = None
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: Traceback (most recent call last):
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   File "/usr/local/bin/nodeip-finder", line 79, in main
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:     first: non_virtual_ip.Address = first_candidate_addr(args.target)
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   File "/usr/local/bin/nodeip-finder", line 32, in first_candidate_addr
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:     non_virtual_ip.non_deprecated,
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: AttributeError: module 'non_virtual_ip' has no attribute 'non_deprecated'
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: During handling of the above exception, another exception occurred:
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: Traceback (most recent call last):
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   File "/usr/local/bin/nodeip-finder", line 105, in <module>
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:     main()
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:   File "/usr/local/bin/nodeip-finder", line 82, in main
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]:     except (non_virtual_ip.AddressNotFoundException, non_virtual_ip.SubnetNotFoundException):
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 nodeip-finder[1602]: AttributeError: module 'non_virtual_ip' has no attribute 'AddressNotFoundException'
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=1/FAILURE
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.
Apr 21 00:28:48 jimaipi-g8bhw-worker-dlk78 systemd[1]: nodeip-configuration.service: Consumed 109ms CPU time
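The two chained tracebacks above follow from how Python evaluates an `except` clause: the tuple of exception classes is only built once an exception is already being handled, so a module that lacks those classes fails a second time. The sketch below reproduces that behaviour under stated assumptions: `stale` is a hypothetical empty stand-in for the conflicting module, and `find_first` and `describe_failure` are illustrative names, not the real nodeip-finder code.

```python
import types

# Hypothetical stand-in for the conflicting module: it imports fine but
# defines none of the names the caller expects.
stale = types.ModuleType("non_virtual_ip")


def find_first(module):
    try:
        # First failure: the attribute is missing from the module.
        return module.non_deprecated
    except (module.AddressNotFoundException, module.SubnetNotFoundException):
        # Never reached with the stale module: building the tuple above
        # raises a second AttributeError while the first is being handled.
        return None


def describe_failure(module):
    """Return (final error message, message of the error it replaced)."""
    try:
        find_first(module)
    except AttributeError as exc:
        return str(exc), str(exc.__context__)
    return None


final_error, original_error = describe_failure(stale)
print(final_error)
print(original_error)
```

The error that systemd reports last names `AddressNotFoundException`, while the underlying problem (the missing `non_deprecated`) is only visible in the chained context, which matches the order of the two tracebacks in the journal.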

Comment 7 jima 2020-04-25 07:54:47 UTC
Verified the fix on nightly build 4.5.0-0.nightly-2020-04-25-034022; the verification passed.

Comment 8 errata-xmlrpc 2020-07-13 17:29:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409