Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1866344

Summary: [Assisted 4.5] When using the wrong boot cd - the error in the logs is very unclear
Product: OpenShift Container Platform
Reporter: Udi Kalifon <ukalifon>
Component: assisted-installer
Assignee: Ori Amizur <oamizur>
assisted-installer sub component: discovery-agent
QA Contact: Yuri Obshansky <yobshans>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: high
Priority: high
CC: alazar, aos-bugs
Version: 4.5
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-25 14:59:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Udi Kalifon 2020-08-05 11:53:10 UTC
Description of problem:
I booted with the wrong ISO and looked in the agent's logs to see why the hosts were not registering. All I saw was "Could not find motherboard serial number" and "Host will stop trying to register":

Aug 05 09:31:43 master-0-0 systemd[1]: Started agent.service.
Aug 05 09:31:43 master-0-0 agent[1971]: time="05-08-2020 09:31:43" level=warning msg="Could not find motherboard serial number" file="machine_uuid_scanner.go:81"
Aug 05 09:31:44 master-0-0 agent[1971]: time="05-08-2020 09:31:44" level=warning msg="Host will stop trying to register" file="register_node.go:39" request_id=755c189d-42bd-4557-8641-f48f58eec455

The errors make no sense. The log should make it clear that the agent got a 404 because no cluster exists with the ID it was given. Instead, the messages suggest that the problem lies in detecting the machine's serial number, which should not be a reason to refuse to register the host.


Version-Release number of selected component (if applicable):
Release tag
    latest
Assisted Installer UI version
    quay.io/ocpmetal/ocp-metal-ui:521cd8e4e07dc3aff5dd48bca8cbca0ae3244c8a
Assisted Installer
    quay.io/ocpmetal/assisted-installer:latest
Assisted Installer Controller
    quay.io/ocpmetal/assisted-installer-controller:latest
Assisted Installer Service
    quay.io/ocpmetal/bm-inventory:latest
Discovery Agent
    quay.io/ocpmetal/agent:latest
Ignition Manifests and Kubeconfig Generate
    quay.io/ocpmetal/ignition-manifests-and-kubeconfig-generate:latest
Image Builder
    quay.io/ocpmetal/installer-image-build:latest


How reproducible:
100%


Steps to Reproduce:
1. Create a cluster and download the ISO
2. Delete the cluster and create a new one
3. Use the ISO from the non-existent cluster to boot the hosts
4. Log in to the hosts with ssh, and check the messages in "sudo journalctl -u agent"


Actual results:
The logs don't give a hint of the actual problem


Expected results:
The logs should show a 404 Not Found error and specify the URL and cluster ID that were attempted.
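
As an illustration of the kind of message requested here, a minimal Go sketch follows. The `registrationError` type, its fields, and the example URL are hypothetical and are not taken from the agent's source; only `net/http` and `fmt` from the standard library are used.

```go
package main

import (
	"fmt"
	"net/http"
)

// registrationError is a hypothetical error type (not from the agent's code)
// that carries everything the log line should mention.
type registrationError struct {
	URL        string // URL the agent tried to register against
	ClusterID  string // cluster ID baked into the boot ISO
	StatusCode int    // HTTP status returned by the service
}

// Error renders the status code, its standard text, the URL and the cluster ID.
func (e *registrationError) Error() string {
	return fmt.Sprintf("failed to register host: %s returned %d %s (cluster ID %s)",
		e.URL, e.StatusCode, http.StatusText(e.StatusCode), e.ClusterID)
}

func main() {
	// Placeholder values for illustration only.
	err := &registrationError{
		URL:        "https://assisted.example.com/clusters/1234/hosts",
		ClusterID:  "1234",
		StatusCode: http.StatusNotFound,
	}
	fmt.Println(err)
}
```

A message of this shape names the failing request directly, so a reader of the journal does not have to infer the cause from the unrelated serial number warning.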


Additional info:
The errors about the serial number may be related to another bug and may be there regardless, even when the registration succeeds.

Comment 1 Ronnie Lazar 2020-08-10 08:45:15 UTC
oamizur, I think we talked about removing the "Could not find motherboard serial number" and "Host will stop trying to register" messages.
Also, please make sure that we log errors when the agent is not able to register to the assisted-service.

Comment 3 Ronnie Lazar 2020-08-25 09:27:20 UTC
When we output the message about the host not being able to register to the assisted-service, we should also output why it cannot register.
The assisted-service has this information, and we should pass it back to the agent so that it is printed to the logs.

Comment 4 Ronnie Lazar 2020-08-25 14:59:13 UTC

*** This bug has been marked as a duplicate of bug 1870047 ***