Bug 1850149

Summary: Error in Scheduler logs: didn't match expected format "openstack:///InstanceID"
Product: OpenShift Container Platform Reporter: Andy Bartlett <andbartl>
Component: Installer Assignee: Mike Fedosin <mfedosin>
Installer sub component: OpenShift on OpenStack QA Contact: David Sanz <dsanzmor>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: abodhe, aos-bugs, m.andre, mfedosin, palonsor, pprinett, rkshirsa
Version: 4.3.0   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The InstanceID() function returned its result in an inconsistent format. The function can get the instance ID either from metadata or by sending a request to the server. In the latter case, the result always has a '/' prefix, which is the correct format.
Consequence: If the instance ID was fetched from metadata, the system could not verify that the node exists and failed with the error.
Fix: The result obtained from metadata now includes the '/' prefix, too.
Result: The system can always verify that a node exists, since the instance ID format is correct in all cases.
Story Points: ---
Clone Of:
: 1893156 (view as bug list) Environment:
Last Closed: 2020-10-27 16:09:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1893156    

Description Andy Bartlett 2020-06-23 15:28:37 UTC
Description of problem:

There appears to be an issue in the code in openstack_instances.go.

It is inconsistent about the number of "/" characters it expects in the provider ID, which causes an error like this:

E0619 08:43:46.603838       1 node_lifecycle_controller.go:171] error checking if node prod-6fpdq-worker-0 exists: ProviderID "openstack://f2990d2e-8f63-4e08-85d2-106345456400" didn't match expected format "openstack:///InstanceID"

The issue seems to be in this section:

    // instanceIDFromProviderID splits a provider's id and return instanceID.
    // A providerID is build out of '${ProviderName}:///${instance-id}'which contains ':///'.
    // See cloudprovider.GetInstanceProviderID and Instances.InstanceID.
    func instanceIDFromProviderID(providerID string) (instanceID string, err error) {
        // If Instances.InstanceID or cloudprovider.GetInstanceProviderID is changed, the regexp should be changed too.   <-------- CHECK THIS
        var providerIDRegexp = regexp.MustCompile(`^` + ProviderName + `:///([^/]+)$`)

        matches := providerIDRegexp.FindStringSubmatch(providerID)
        if len(matches) != 2 {
            return "", fmt.Errorf("ProviderID \"%s\" didn't match expected format \"openstack:///InstanceID\"", providerID)
        }
        return matches[1], nil
    }
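
To see why the two-slash form is rejected, here is a minimal, self-contained sketch of the same check. It is not the upstream file: ProviderName is assumed to be "openstack" (inferred from the error message), and the regexp is compiled once at package level for convenience.

```go
package main

import (
	"fmt"
	"regexp"
)

// ProviderName is an assumption taken from the error message in the log.
const ProviderName = "openstack"

// The pattern requires exactly ":///" followed by an ID without slashes.
var providerIDRegexp = regexp.MustCompile(`^` + ProviderName + `:///([^/]+)$`)

// instanceIDFromProviderID extracts the instance ID from a provider ID of
// the form "openstack:///InstanceID" and returns an error otherwise.
func instanceIDFromProviderID(providerID string) (string, error) {
	matches := providerIDRegexp.FindStringSubmatch(providerID)
	if len(matches) != 2 {
		return "", fmt.Errorf("ProviderID %q didn't match expected format \"openstack:///InstanceID\"", providerID)
	}
	return matches[1], nil
}

func main() {
	providerIDs := []string{
		"openstack:///f2990d2e-8f63-4e08-85d2-106345456400", // three slashes: accepted
		"openstack://f2990d2e-8f63-4e08-85d2-106345456400",  // two slashes: rejected
	}
	for _, providerID := range providerIDs {
		instanceID, err := instanceIDFromProviderID(providerID)
		fmt.Printf("%s -> id=%q err=%v\n", providerID, instanceID, err)
	}
}
```

The second input is the provider ID from the log line above; it fails because the pattern demands a third "/" before the instance ID.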


Version-Release number of selected component (if applicable):

OpenShift 4.4 on OpenStack Stein

How reproducible:

Difficult; some clusters are affected, some are not.


Comment 4 Pierre Prinetti 2020-07-02 14:32:47 UTC
Currently being worked on by the team.

Comment 13 Mike Fedosin 2020-09-14 13:00:34 UTC
The bug has been fixed in 4.6 by https://github.com/openshift/kubernetes/pull/343
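
The Doc Text for this bug describes the fix as adding the '/' prefix to instance IDs obtained from metadata. The sketch below only illustrates that idea and is not the code from the pull request: normalizeInstanceID and providerIDFor are hypothetical helpers, and joining the provider name and instance ID with "://" is an assumption inferred from the expected format in the error message.

```go
package main

import (
	"fmt"
	"strings"
)

// providerName is an assumption taken from the error message in the log.
const providerName = "openstack"

// normalizeInstanceID (hypothetical) returns the ID with exactly one
// leading "/", whether or not the input already had one.
func normalizeInstanceID(instanceID string) string {
	return "/" + strings.TrimLeft(instanceID, "/")
}

// providerIDFor (hypothetical) joins the provider name and the normalized
// instance ID with "://", which yields the "openstack:///InstanceID" form.
func providerIDFor(instanceID string) string {
	return providerName + "://" + normalizeInstanceID(instanceID)
}

func main() {
	// ID as it might come from metadata (no prefix) and from the server ("/" prefix).
	fmt.Println(providerIDFor("f2990d2e-8f63-4e08-85d2-106345456400"))
	fmt.Println(providerIDFor("/f2990d2e-8f63-4e08-85d2-106345456400"))
}
```

With the prefix normalized, both inputs produce the same three-slash provider ID, so the format check quoted in the description would accept either source.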

Comment 15 David Sanz 2020-09-21 08:21:24 UTC
Verified on 4.6.0-0.nightly-2020-09-21-030155

No reference to node_lifecycle_controller.go:171 in the masters' journal.

Comment 24 errata-xmlrpc 2020-10-27 16:09:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

Comment 28 Red Hat Bugzilla 2023-09-14 06:02:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days