Bug 1991738 - Assisted installer: machine-config-server refusing to serve config to pool "worker"
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: assisted-installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Igal Tsoiref
QA Contact: Udi Kalifon
URL:
Whiteboard: AI-Team-Core
Depends On:
Blocks:
Reported: 2021-08-09 20:43 UTC by Lars Kellogg-Stedman
Modified: 2023-09-15 01:13 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-17 17:20:32 UTC
Target Upstream Version:
Embargoed:



Links
System                 ID             Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  AITRIAGE-1215  0        None      None    None     2021-08-16 13:53:05 UTC

Description Lars Kellogg-Stedman 2021-08-09 20:43:13 UTC
I'm trying to install an OCP 4.8.2 bare metal cluster using the
assisted installer. The discovery and install prep went fine, but then
the install got stuck; the nodes are all "waiting for ignition" or
otherwise failing to contact the machine config service on port 22623.

Looking at the logs for the machine config service, I see:

    I0809 19:41:38.194289       1 bootstrap.go:37] Version: v4.8.0-202107011817.p0.git.29813c8.assembly.stream-dirty (29813c845a4a3ee8e6856713c585aca834e0bf1e)
    I0809 19:41:38.194406       1 api.go:65] Launching server on :22624
    I0809 19:41:38.194433       1 api.go:65] Launching server on :22623
    I0809 19:47:13.693042       1 api.go:110] Pool master requested by address:"128.52.62.137:42374" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
    I0809 19:47:13.693079       1 bootstrap_server.go:66] reading file "/etc/mcs/bootstrap/machine-pools/master.yaml"
    I0809 19:47:13.694091       1 bootstrap_server.go:86] reading file "/etc/mcs/bootstrap/machine-configs/rendered-master-7cc4ff6b0c0903a34703c4c86d9ff87e.yaml"
    I0809 19:47:18.098980       1 api.go:110] Pool worker requested by address:"128.52.62.244:59586" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
    E0809 19:47:18.099059       1 api.go:129] couldn't get config for req: {worker 0xc00060e200}, error: refusing to serve bootstrap configuration to pool "worker"
    I0809 19:47:18.116060       1 api.go:110] Pool worker requested by address:"128.52.62.134:60782" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"

And then that final error message just repeats forever.
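
To see how often each pool is requested and refused, a log excerpt like the one above can be tallied with a short script. This is only a sketch: the two regular expressions are assumptions derived from the lines shown in this report, and the function name `summarize_mcs_log` is made up for illustration.

```python
import re
from collections import Counter

# Assumed patterns, derived only from the log lines quoted in this report.
REQUEST_RE = re.compile(r'Pool (\S+) requested by address:"([^"]+)"')
REFUSAL_RE = re.compile(r'refusing to serve bootstrap configuration to pool "([^"]+)"')


def summarize_mcs_log(lines):
    """Count config requests and refusals per pool (hypothetical helper)."""
    requests = Counter()
    refusals = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            requests[match.group(1)] += 1
        match = REFUSAL_RE.search(line)
        if match:
            refusals[match.group(1)] += 1
    return requests, refusals


# Lines copied from the log excerpt above.
SAMPLE = '''
I0809 19:47:13.693042       1 api.go:110] Pool master requested by address:"128.52.62.137:42374" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
I0809 19:47:18.098980       1 api.go:110] Pool worker requested by address:"128.52.62.244:59586" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
E0809 19:47:18.099059       1 api.go:129] couldn't get config for req: {worker 0xc00060e200}, error: refusing to serve bootstrap configuration to pool "worker"
I0809 19:47:18.116060       1 api.go:110] Pool worker requested by address:"128.52.62.134:60782" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
'''.strip().splitlines()

requests, refusals = summarize_mcs_log(SAMPLE)
print("requests:", dict(requests))
print("refusals:", dict(refusals))
```

Feeding it the full pod log instead of `SAMPLE` would show whether any pool other than `master` was ever served.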

I am going to try aborting and restarting the install, so I won't be able to service requests for additional information from the cluster.

Comment 1 Lars Kellogg-Stedman 2021-08-13 20:02:56 UTC
After resetting the install and trying again, we're getting a very similar error:

    E0813 20:01:08.704977       1 api.go:129] couldn't get config for req: {worker 0xc00063f280}, error: refusing to serve bootstrap configuration to pool "worker"
    I0813 20:01:08.757313       1 api.go:110] Pool worker requested by address:"128.52.62.231:42004" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"

Comment 2 Nir Magnezi 2021-08-16 12:29:43 UTC
Hi Lars,

Thank you for reporting this.
Can you please share your cluster-id so we can take a closer look?

Comment 4 Igal Tsoiref 2021-08-17 08:05:56 UTC
@lars this is expected behavior. Workers will not join on the bootstrap control plane.
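
The logs in this report are consistent with that: the bootstrap-time server answered the `master` request and refused `worker`. A minimal Python model of such a gate is sketched below. It is an illustration only, not the machine-config-server source; the function name `get_bootstrap_config` and the dictionary of rendered configs are hypothetical.

```python
def get_bootstrap_config(pool, rendered_configs):
    """Hypothetical model: during bootstrap, only the master pool is served."""
    if pool != "master":
        # Mirrors the error text seen in the logs in this report.
        raise PermissionError(
            f'refusing to serve bootstrap configuration to pool "{pool}"'
        )
    return rendered_configs[pool]


# Placeholder data standing in for the rendered master config.
configs = {"master": {"ignition": {"version": "3.2.0"}}}

print(get_bootstrap_config("master", configs))
try:
    get_bootstrap_config("worker", configs)
except PermissionError as err:
    print("error:", err)
```

Under this model, worker requests keep failing for as long as the bootstrap-time server is the one answering on port 22623.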

Comment 5 Lars Kellogg-Stedman 2021-08-17 11:59:36 UTC
Igal,

What are you talking about? This results in all the nodes getting stuck in the "waiting for ignition" state and failing to install. This is absolutely not "expected behavior" unless we expect the assisted installer to always fail and be completely useless as a product.

I'm happy to hop on a call to look at this issue, but the bug is blocking a major project and should be fixed rather than closed.

Comment 7 Red Hat Bugzilla 2023-09-15 01:13:25 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

