Bug 2097691
Summary: | [vsphere] failed to create cluster if datacenter is embedded in a Folder | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | mheppler |
Component: | Installer | Assignee: | OCP Installer <ocp-installer> |
Installer sub component: | openshift-installer | QA Contact: | jima |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | unspecified | CC: | bscott, daliu, dhuynh, efried, gekis, jima, ocp-installer, padillon, rbost, rdossant, rspagnol |
Version: | 4.11 | Flags: | efried: needinfo-, efried: needinfo- |
Target Milestone: | --- | ||
Target Release: | 4.12.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
* Previously, when installing a cluster on vSphere using a datacenter that is embedded inside a folder, the installation program could not locate the datacenter object, causing the installation to fail. In this update, the installation program can traverse the directory that contains the datacenter object, allowing the installation to succeed. (link:https://bugzilla.redhat.com/show_bug.cgi?id=2097691[*BZ2097691*])
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2023-01-17 19:50:02 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2110482 |
Description
mheppler
2022-06-16 10:17:53 UTC
@efried Could you help to take a look? Sorry for the delay.

Can we please involve the installer team here? (I'll say in general if the same version of hive succeeds on one Z and fails on another, it'll be less likely to be a hive problem, and thus more expeditious to start with OCP engineering.) I did a quick skim based on the error message. This *might* be related to https://github.com/openshift/installer/pull/5773 (backport of https://github.com/openshift/installer/pull/5673). Regardless, it looks like the author of that PR may be the SME in this space, and a good person to consult.

@rbost would you mind having a look?

It looks like the installer is failing at the following line, which changed in 4.10.11 in the PR that Eric mentioned in the previous comment: https://github.com/openshift/installer/blob/release-4.10/pkg/asset/installconfig/vsphere/client.go#L88-L93 In the original pull request we acknowledged the line as a risk and did not change it, since similar lines were used elsewhere (and we hadn't heard reports of failure for those similar lines of code). Given this bug report, we probably need to address it! I've reviewed the case and see that the customer does indeed have a Datacenter embedded in a Folder, which would cause the error. Leaving needinfo.

Thanks @efried and @rbost. I will transfer the issue to the installer team.

In one QE env, we also reproduced the same issue on 4.11: when the datacenter is embedded in a folder, cluster deployment fails because the installer is unable to find the expected vSphere cluster.

FATAL failed to fetch Terraform Variables: failed to fetch dependency of "Terraform Variables": failed to generate asset "Platform Provisioning Check": platform.vsphere.network: Invalid value: "VM Network": could not find vSphere cluster at /Datacenter/host/jima/reliability: cluster '/Datacenter/host/jima/reliability' not found

Dropping my needinfo since someone else submitted a fix to this bug (https://github.com/openshift/installer/pull/6105).
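The FATAL message above reports a cluster path rooted at the bare datacenter name. A minimal Python sketch of why that path cannot resolve when the datacenter sits inside a folder is given here. It is not the installer's code (the installer is written in Go); the helper name `cluster_inventory_path` is hypothetical, and the folder-qualified datacenter path `qedc/sub-qe-dc/Datacenter` is assumed from the QE verification environment reported in this bug.

```python
import posixpath


def cluster_inventory_path(datacenter: str, cluster: str) -> str:
    """Build the inventory path of a compute cluster (hypothetical helper).

    `datacenter` may be a bare name ("Datacenter") or a folder-qualified
    path ("qedc/sub-qe-dc/Datacenter"). vSphere keeps clusters under the
    datacenter's "host" folder.
    """
    dc = datacenter.strip("/")
    return posixpath.join("/", dc, "host", cluster.strip("/"))


# Bare datacenter name: this reproduces the path from the error message.
print(cluster_inventory_path("Datacenter", "jima/reliability"))

# Folder-qualified datacenter (assumed folder names): the path that would
# exist in an inventory where the datacenter is nested inside folders.
print(cluster_inventory_path("qedc/sub-qe-dc/Datacenter", "jima/reliability"))
```

Under these assumptions, the first call yields the path the installer looked for, while the second yields the path where the cluster would actually live, which is consistent with the "not found" error.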
Verified on 4.12.0-0.nightly-2022-07-17-215842 and passed; moving the bug to VERIFIED. The cluster installed successfully in an environment where the datacenter is embedded in a folder.

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-07-17-215842   True        False         70m     Cluster version is 4.12.0-0.nightly-2022-07-17-215842

$ oc get cm cloud-provider-config -n openshift-config -o yaml
apiVersion: v1
data:
  config: |
    [Global]
    secret-name = "vsphere-creds"
    secret-namespace = "kube-system"
    insecure-flag = "1"
    [Workspace]
    server = "xxx"
    datacenter = "qedc/sub-qe-dc/Datacenter"
    default-datastore = "datastore3"
    folder = "/qedc/sub-qe-dc/Datacenter/vm/jima23a-6qv4d"
    [VirtualCenter "dhcp-8-30-198.lab.eng.rdu2.redhat.com"]
    datacenters = "qedc/sub-qe-dc/Datacenter"
kind: ConfigMap
metadata:
  creationTimestamp: "2022-07-19T10:25:10Z"
  name: cloud-provider-config
  namespace: openshift-config
  resourceVersion: "1912"
  uid: 57ee3323-fd7e-401a-ac2b-6e8d1bf7686b

Thank you for fixing this bug. Just a question: when will this fix be backported to 4.10 and 4.11?

I plan on doing the backport to 4.10 as well.

Please let us know if you have a more detailed ETA for the backport.

We are waiting on 4.11.z to open so we can merge the changes.

How does the 4.10.z backport look? It is urgent for the customer now.

--mheppler

(In reply to mheppler from comment #14)
> How does the 4.10.z backport look? It is urgent for the customer now.
>
> --mheppler

The change has to make its way into 4.11 first. At the moment it's pending verification by the QE team. You can follow the current status here: https://bugzilla.redhat.com/show_bug.cgi?id=2110482

FYI, the fix has been merged into the installer 4.10 branch.

Hi,

please, which version of 4.10 will contain the fix?

Thanks...

(In reply to mheppler from comment #17)
> Hi,
>
> please, which version of 4.10 will contain the fix?
>
> Thanks...

From https://bugzilla.redhat.com/show_bug.cgi?id=2111258#c6, it's 4.10.31 onwards.
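As a rough illustration of what the verified cloud-provider-config shows, the following Python sketch checks that the `folder` value is rooted under the folder-qualified `datacenter` path. This is an assumption-laden sanity check, not part of the installer or the cloud provider: the function name `folder_under_datacenter` is hypothetical, and Python's `configparser` merely happens to accept this sample, while the real file is consumed by the vSphere cloud provider with its own parser.

```python
import configparser

# The config body from the verification comment, copied as plain INI text.
CONFIG = """
[Global]
secret-name = "vsphere-creds"
secret-namespace = "kube-system"
insecure-flag = "1"
[Workspace]
server = "xxx"
datacenter = "qedc/sub-qe-dc/Datacenter"
default-datastore = "datastore3"
folder = "/qedc/sub-qe-dc/Datacenter/vm/jima23a-6qv4d"
[VirtualCenter "dhcp-8-30-198.lab.eng.rdu2.redhat.com"]
datacenters = "qedc/sub-qe-dc/Datacenter"
"""


def folder_under_datacenter(ini_text: str) -> bool:
    """Return True if Workspace.folder lies under <datacenter>/vm/ (hypothetical check)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    workspace = parser["Workspace"]
    # Values keep their surrounding quotes, so strip quotes and slashes.
    datacenter = workspace["datacenter"].strip('"').strip("/")
    folder = workspace["folder"].strip('"').strip("/")
    return folder.startswith(datacenter + "/vm/")


print(folder_under_datacenter(CONFIG))
```

With the fix in place, the datacenter is recorded with its enclosing folders, so the VM folder path and the datacenter path agree; a config that recorded only the bare datacenter name would fail this check.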
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399