Bug 2219641
| Summary: | metalsmith does not recognize subnet field under network in overcloud-baremetal-deploy.yaml as per documentation | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Jaison Raju <jraju> |
| Component: | python-metalsmith | Assignee: | Harald Jensås <hjensas> |
| Status: | CLOSED ERRATA | QA Contact: | James E. LaBarre <jlabarre> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 17.1 (Wallaby) | CC: | eshames, hjensas, jraju, mariel, sbaker |
| Target Milestone: | z2 | Keywords: | Triaged |
| Target Release: | 17.1 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | python-metalsmith-1.4.4-17.1.20230815101022.5e7461e.el9ost | Doc Type: | No Doc Update |
| Last Closed: | 2024-01-16 14:32:47 UTC | Type: | Bug |
Description
Jaison Raju
2023-07-04 17:06:45 UTC
Hey, we have a job doing this in downstream CI:
- name: Compute1
  count: 2
  hostname_format: 'c1-compute-%index%'
  defaults:
    profile: compute
    network_config:
      template: /home/stack/virt/network/three-nics-vlans/compute.j2
    networks:
    - network: ctlplane
      vif: true
    - network: internal_api
      subnet: internal_api1_subnet
    - network: storage
      subnet: storage1_subnet
    - network: tenant
      subnet: tenant1_subnet
  instances:
  - name: compute-0
    hostname: compute-0
  - name: compute-1
    hostname: compute-1
Which example are you using? Is it the specific nodes example?
Can you attach your /home/stack/templates/central/overcloud-baremetal-deploy.yaml file, please?
So this is happening on the "ctlplane" network? In that case, the workaround is simply not to specify the subnet. The correct subnet is selected automatically based on the physical_network bridge mappings in neutron. The physical network property on the baremetal ports must be set, but this happens automatically when you introspect the nodes with OSP 17.x director.
Introspection succeeds. I got this example from the documentation.
I am not sure if it is happening for ctlplane. Here is the file I used. Since I am doing this on baremetal, I made a few changes, but this file should be similar:
cat templates/central/overcloud-baremetal-deploy.yaml-bak
- name: Controller0
  count: 3
  defaults:
    resource_class: baremetal.control
    profile: control
    network_config:
      default_route_network:
      - External
      template: /home/stack/templates/central/network/leaf0/controller0.j2
    networks:
    - network: ctlplane
      subnet: leaf0
      vif: true
    - network: storage
      subnet: storage_leaf0_subnet
    - network: storage_mgmt
      subnet: storage_mgmt_leaf0_subnet
    - network: internal_api
      subnet: internalapi_leaf0_subnet
    - network: tenant
      subnet: tenant_leaf0_subnet
    - network: external
      subnet: external_leaf0_subnet
  ansible_playbooks:
  - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-growvols.yaml
    extra_vars:
      growvols_args: >
        /=10GB
        /tmp=1GB
        /var/log=10GB
        /var/log/audit=1GB
        /home=10GB
        /srv=10GB
        /var=100%
- name: ComputeHCI-r640
  count: 4
  defaults:
    resource_class: baremetal.computel0
    profile: compute
    network_config:
      template: /home/stack/templates/central/network/leaf0/computehci-r640.j2
    networks:
    - network: ctlplane
      subnet: leaf0
      vif: true
    - network: storage
      subnet: storage_leaf0_subnet
    - network: internal_api
      subnet: internalapi_leaf0_subnet
    - network: tenant
      subnet: tenant_leaf0_subnet
  ansible_playbooks:
  - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-growvols.yaml
    extra_vars:
      growvols_args: >
        /=10GB
        /tmp=1GB
        /var/log=10GB
        /var/log/audit=1GB
        /home=10GB
        /srv=10GB
        /var=100%
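For reference, a minimal sketch of the workaround suggested earlier in this thread, applied to the Controller0 role from the file above: the ctlplane entry drops its subnet key and everything else stays as it was. Only part of the networks list is shown. Leaving the subnet keys on the other networks untouched is an assumption carried over from the original file, not something confirmed in this report.

```yaml
# Sketch only: ctlplane entry without an explicit subnet (the suggested workaround).
- name: Controller0
  count: 3
  defaults:
    networks:
    - network: ctlplane
      vif: true              # "subnet: leaf0" removed; per the thread, the subnet is
                             # picked from the baremetal port's physical_network
    - network: storage
      subnet: storage_leaf0_subnet       # unchanged from the original file
    - network: internal_api
      subnet: internalapi_leaf0_subnet   # unchanged from the original file
```

The same change would apply to the ctlplane entry of the ComputeHCI-r640 role.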
I have set the physical network property as per the documentation. I tried your recommendation and it lets me proceed, but PXE boot then fails to find the MAC address under pxelinux.cfg/<mac>. Is removing the subnet the solution for this DCN deployment? Should we remove it from the documentation example?
The error I am facing now is probably not caused by removing the subnet, since the ports created for the nodes have physical-network 'ctlplane'.
Any idea what could be the issue?
http://perf1.lab.bos.redhat.com/jaison/edge-l3/osp-edge-backup2.tar.xz
The PXE boot issue was resolved too.
Confirmed edits are in place in python3-metalsmith-1.4.4-17.1.20230815101022.5e7461e.el9ost.noarch.rpm from latest compose RHOS-17.1-RHEL-9-20231122.n.1. This compose ran phases 1, 2 & 3 with no errors in the package.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2024:0209