Bug 1889273 - Improve the generic error message 'connect: no route to host'
Summary: Improve the generic error message 'connect: no route to host'
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: aos-install
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-19 08:50 UTC by Michael Burman
Modified: 2020-10-20 14:03 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-20 14:03:15 UTC
Target Upstream Version:



Description Michael Burman 2020-10-19 08:50:43 UTC
Improve the generic error message 'connect: no route to host' 

OCP installations on RHV frequently fail with the generic error message:
no route to host

This error is too generic to be useful: the real cause of the failure can't be understood from it.
We should improve it and give the user a meaningful error message.

For example, my installation is currently failing with:
DEBUG Still waiting for the Kubernetes API: Get "https://<hostname>:6443/version?timeout=32s": dial tcp 10.x.x.x:6443: connect: no route to host 
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://<hostname>:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp 10.x.x.x:6443: connect: no route to host 
DEBUG Fetching Bootstrap SSH Key Pair...           
DEBUG Loading Bootstrap SSH Key Pair...            
DEBUG Using Bootstrap SSH Key Pair loaded from state file 
DEBUG Reusing previously-fetched Bootstrap SSH Key Pair 
DEBUG Fetching Install Config...                   
DEBUG Loading Install Config...                    
DEBUG   Loading SSH Key...                         
DEBUG   Using SSH Key loaded from state file       
DEBUG   Loading Base Domain...                     
DEBUG     Loading Platform...                      
DEBUG     Using Platform loaded from state file    
DEBUG   Using Base Domain loaded from state file   
DEBUG   Loading Cluster Name...                    
DEBUG     Loading Base Domain...                   
DEBUG     Loading Platform...                      
DEBUG   Using Cluster Name loaded from state file  
DEBUG   Loading Pull Secret...                     
DEBUG   Using Pull Secret loaded from state file   
DEBUG   Loading Platform...                        
DEBUG Using Install Config loaded from state file  
DEBUG Reusing previously-fetched Install Config    
INFO Pulling debug logs from the bootstrap machine 
DEBUG Added /tmp/bootstrap-ssh263426912 to installer's internal agent 
DEBUG Added /root/.ssh/id_rsa to installer's internal agent 
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client: failed to use the provided keys for authentication: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
FATAL Bootstrap failed to complete: failed waiting for Kubernetes API: Get "https://<hostname>:6443/version?timeout=32s": dial tcp 10.x.x.x:6443: connect: no route to host

Bootstrap VM:

Error: error pulling image "registry.svc.ci.openshift.org/ocp/release@sha256:9663f178a9a5bf87fad0d4e2dabeaef32110d4c9c3d400eededd4f6bff5109fc": unable to pull registry.svc.ci.openshift.org/ocp/release@sha256:9663f178a9a5bf87fad0d4e2dabeaef32110d4c9c3d400eededd4f6bff5109fc: unable to pull image: Error initializing source docker://registry.svc.ci.openshift.org/ocp/release@sha256:9663f178a9a5bf87fad0d4e2dabeaef32110d4c9c3d400eededd4f6bff5109fc: Error reading manifest sha256:9663f178a9a5bf87fad0d4e2dabeaef32110d4c9c3d400eededd4f6bff5109fc in registry.svc.ci.openshift.org/ocp/release: unauthorized: authentication required

Version:
openshift-install-linux-4.6.0-0.nightly-2020-10-03-051134

Platform:
ovirt/RHV

Please specify:
* IPI 

What happened?
The install failed with a 'no route to host' error message.

What did you expect to happen?
A clear and meaningful error message.
The current error is not helpful and might mislead the user about the real issue.
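
As a rough illustration of the kind of message that would help, here is a minimal Python sketch that maps a low-level connect error to a hint about likely causes. It is not the installer's code: the function name `explain_connect_error`, the `HINTS` table, and the hint wording are all hypothetical.

```python
import errno

# Hypothetical hint table: the wording is illustrative only and is not
# output produced by openshift-install.
HINTS = {
    errno.EHOSTUNREACH: (
        "the API address could not be reached; the bootstrap machine may "
        "never have brought the API up, so check the bootstrap logs "
        "(for example for image pull failures)"
    ),
    errno.ECONNREFUSED: (
        "the host answered but nothing is listening on the API port yet"
    ),
}


def explain_connect_error(exc: OSError) -> str:
    """Return a friendlier message for a low-level connect error.

    Unknown errors are returned unchanged so no information is lost.
    """
    hint = HINTS.get(exc.errno)
    if hint is None:
        return str(exc)
    # Keep the original error text so the low-level detail is still visible.
    return f"{hint} (original error: {exc})"
```

With something like this, an error carrying `errno.EHOSTUNREACH` would be reported together with a pointer to the bootstrap logs instead of only 'no route to host'.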

How to reproduce it (as minimally and precisely as possible)?
It reproduces 100% of the time at the moment.
The issue was shown to the development team.
FYI, I have masked the hostname and IPs in the description.

Comment 2 Scott Dodson 2020-10-20 14:03:15 UTC
The API is the Installer's only view into the target cluster. If that API never becomes available, the fallback is to generate a bootstrap log bundle. We're working to ensure that the log bundle includes the information necessary to clearly identify pull secret errors like yours.

This is tracked in https://issues.redhat.com/browse/CORS-1533 as a feature.
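
To make the idea concrete, here is a minimal Python sketch of scanning bootstrap log text for known failure signatures. It is only an assumption about how such a check could look, not the design tracked in CORS-1533. The `SIGNATURES` table and the `diagnose` function are hypothetical; the only signature used is the `unauthorized: authentication required` string from the bootstrap VM error quoted in the description.

```python
from typing import List

# Hypothetical signature table. The key is the substring seen in the
# bootstrap VM error in this report; the hint text is illustrative.
SIGNATURES = {
    "unauthorized: authentication required": (
        "the registry rejected the image pull; check that the pull secret "
        "is valid for the release image registry"
    ),
}


def diagnose(log_text: str) -> List[str]:
    """Return one hint per known signature found in the log text."""
    hints: List[str] = []
    for line in log_text.splitlines():
        for signature, hint in SIGNATURES.items():
            # Report each hint once, in order of first appearance.
            if signature in line and hint not in hints:
                hints.append(hint)
    return hints
```

Feeding the bootstrap VM error from the description into `diagnose` would return the pull secret hint, while a log with no known signature returns an empty list.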

