Bug 2003915 - pre-network-manager-config failed due to timeout when static config is used
Summary: pre-network-manager-config failed due to timeout when static config is used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Infrastructure Operator
Version: rhacm-2.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.4
Assignee: Michael Filanov
QA Contact: Chad Crum
Docs Contact: Derek
URL:
Whiteboard:
Depends On:
Blocks: 2014084
Reported: 2021-09-14 07:05 UTC by Jatan Malde
Modified: 2022-08-01 18:56 UTC (History)
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-11 18:33:32 UTC
Target Upstream Version:
Embargoed:
ccrum: qe_test_coverage-
ming: rhacm-2.4+
jmalde: needinfo-
jmalde: needinfo-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | open-cluster-management backlog issues 17237 | 0 | None | None | None | 2021-10-14 15:49:50 UTC
Github | openshift assisted-service pull 2763 | 0 | None | Merged | OCPBUGSM-35106: pre-network-manager-config failed due to timeout when static config is used | 2021-10-14 11:20:11 UTC
Red Hat Issue Tracker | MGMTBUGSM-100 | 0 | None | None | None | 2022-02-17 15:53:52 UTC
Red Hat Product Errata | RHSA-2021:4618 | 0 | None | None | None | 2021-11-11 18:33:58 UTC

Description Jatan Malde 2021-09-14 07:05:30 UTC
Version:

$ openshift-install version
Installer used from cloud.redhat.com 

Platform:

baremetal

What happened?

Single-node OCP installation with a static IP address sometimes fails because the static network configuration is not applied when the node boots for the first time from the discovery ISO image. Since DHCP is not used and the static address setup failed, the node has no external connectivity at all. The specific failure is a timeout of the pre-network-manager-config service, which is the service that should apply the static network config to NetworkManager. In the ignition file, the timeout for the service is set to 10s, and no retries or dependencies on any other service are defined, so when the service fails once, the node never gets network connectivity.


Sep 08 09:20:55 localhost sh[1970]: changing security context of '/var/usrlocal/sbin'
Sep 08 09:20:55 localhost auditd[2055]: No plugins found, not dispatching events
Sep 08 09:20:55 localhost auditd[2055]: Init complete, auditd 3.0 listening for events (startup state enable)
Sep 08 09:20:57 localhost systemd[1]: Started RHCOS Fix SELinux Labeling For /usr/local/sbin.
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: start operation timed out. Terminating.
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: Main process exited, code=killed, status=15/TERM
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: Failed with result 'timeout'.
Sep 08 09:21:01 localhost systemd[1]: Failed to start Prepare network manager config content.
Sep 08 09:21:03 localhost systemd[1]: Started Run update-ca-trust.
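
For anyone triaging similar serial logs, the failing unit can be picked out mechanically. The following is only a minimal sketch: the helper name `find_failed_units` is hypothetical, and the regular expression assumes the journal line format seen in the excerpt above (`<unit>: Failed with result '<result>'.`).

```python
import re

# Matches systemd journal lines such as:
#   systemd[1]: foo.service: Failed with result 'timeout'.
# Assumption: the log uses the same line format as the excerpt in this report.
FAILED_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Failed with result '(?P<result>[^']+)'"
)


def find_failed_units(log_text):
    """Return a list of (unit, result) pairs for each systemd failure line."""
    return [(m.group("unit"), m.group("result")) for m in FAILED_RE.finditer(log_text)]


# Lines copied from the log excerpt in this report.
sample = """\
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: start operation timed out. Terminating.
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: Main process exited, code=killed, status=15/TERM
Sep 08 09:21:01 localhost systemd[1]: pre-network-manager-config.service: Failed with result 'timeout'.
Sep 08 09:21:01 localhost systemd[1]: Failed to start Prepare network manager config content.
"""

failed = find_failed_units(sample)
print(failed)
```

A result of `timeout` for pre-network-manager-config.service is the signature of this bug; other result values would point to a different failure.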


What did you expect to happen?

The pre-network-manager-config service should work fine and the static configuration should be set up.

How to reproduce it (as minimally and precisely as possible)?

Create a 4.8 cluster on cloud.redhat.com using the Assisted Installer and select the checkbox for single-node OpenShift.

Anything else we need to know?

When the timeout is set to 30 seconds, the configuration is successfully applied and the node network is set up.
Attaching the patched ignition file and also full serial logs from the machine.
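
As a rough illustration of this kind of workaround (not the attached file itself), the timeout can be raised by editing the unit's contents inside the ignition JSON. This sketch relies on the Ignition `systemd.units[].name` and `contents` fields; the helper name `bump_unit_timeout`, the example unit contents, and the assumption that the unit sets its timeout through `TimeoutSec=` or `TimeoutStartSec=` are illustrative and not taken from the real discovery ignition.

```python
import re

UNIT_NAME = "pre-network-manager-config.service"

# Assumption: the unit defines its start timeout with TimeoutSec= or
# TimeoutStartSec=; the real unit in the discovery ignition may differ.
TIMEOUT_RE = re.compile(r"^(Timeout(?:Start)?Sec)=\S+$", re.MULTILINE)


def bump_unit_timeout(ignition, seconds=30):
    """Rewrite the timeout of UNIT_NAME in place; return how many lines changed."""
    changed = 0
    for unit in ignition.get("systemd", {}).get("units", []):
        if unit.get("name") != UNIT_NAME or "contents" not in unit:
            continue
        unit["contents"], n = TIMEOUT_RE.subn(rf"\g<1>={seconds}", unit["contents"])
        changed += n
    return changed


# Hypothetical, heavily trimmed ignition config used only for illustration.
example = {
    "ignition": {"version": "3.1.0"},
    "systemd": {
        "units": [
            {
                "name": UNIT_NAME,
                "enabled": True,
                "contents": "[Service]\nType=oneshot\nTimeoutSec=10\nExecStart=/bin/true\n",
            }
        ]
    },
}

changed = bump_unit_timeout(example, seconds=30)
print(changed)
print(example["systemd"]["units"][0]["contents"])
```

For a real file, the dictionary would be loaded with `json.load` and written back with `json.dump` before the ISO is regenerated. A return value of 0 means no timeout line was found and the unit should be inspected by hand.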

Comment 2 Nick Carboni 2021-09-22 17:44:13 UTC
This is a potential issue with the assisted installer running in cloud.redhat.com so it doesn't qualify as an OCP blocker.

Comment 4 Alexander Chuzhoy 2021-10-05 13:29:11 UTC
Hi Jatan,
What model is the machine where you see this issue?

Comment 6 Alexander Chuzhoy 2021-10-07 16:54:13 UTC
IMHO this looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=2002059
Duplicate?

Comment 12 Crystal Chun 2021-10-26 19:30:28 UTC
PR for 2.4 has been merged https://github.com/openshift/assisted-service/pull/2773 

Can we get this verified and closed?

Comment 16 errata-xmlrpc 2021-11-11 18:33:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.4 images and security updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4618

Comment 17 Osher De Paz 2021-11-14 16:36:49 UTC
Hi!
Just wanted to let everyone know that console.redhat.com/cloud.redhat.com now contains the relevant fix, in version v1.0.27.3.

