Bug 2172507
| Summary: | Loss of connectivity with controllers after doing an undercloud restore | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Fernando Díaz <fdiazbra> |
| Component: | tripleo-ansible | Assignee: | Carlos Camacho <ccamacho> |
| Status: | CLOSED WORKSFORME | QA Contact: | Joe H. Rahme <jhakimra> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 17.1 (Wallaby) | CC: | ayefimov, ccamacho, hjensas, jpretori, sbaker |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | Flags: | fdiazbra: needinfo-, fdiazbra: needinfo- |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-03-24 11:19:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Fernando Díaz
2023-02-22 11:50:20 UTC
You are destroying the undercloud VM set up by the infrared virsh plugin and re-creating the undercloud VM without using the virsh plugin, I guess in one of these playbooks?

* https://github.com/redhat-openstack/infrared/blob/master/plugins/tripleo-undercloud/restore.yml#L97-L115
* https://github.com/redhat-openstack/infrared/blob/55ba05ca0d9f5aca6f605816da02dac053537254/plugins/tripleo-undercloud/restore_containerized.yml

The virsh plugin does it in a similar way, but there are many things that can differ based on options:

* https://github.com/redhat-openstack/infrared/blob/master/plugins/virsh/tasks/vms_2_install.yml#L37-L78

For example:

```
{% if provision.bootmode == 'uefi' %}
--boot {{ 'hd' if topology_node.deploy_os|default(True) else 'uefi' }} \
{% else %}

{%- if interface.model is defined and interface.model %},model={{ interface.model }}{% endif %}

{% if topology_node.machine_type is defined and topology_node.machine_type %}
--machine {{ topology_node.machine_type }} \
{% endif %}

--os-variant {{ topology_node.os.variant }} \
```

I think something is different, i.e. the hardware the undercloud sees is different, and based on that the interface names are different. It is also possible the undercloud was initially installed with net.ifnames disabled? I doubt that this is a product bug; this is an issue with the infrastructure used for testing.

---

Thanks Harald for your comment, let me provide a clarification about the procedure: we are not using the tripleo-undercloud IR plugin. We use the backup and restore plugin [1], which executes the backup and restore tripleo role [2] through the openstack backup commands implemented in the CLI [3]. The backup and restore role relies on ReaR [4] to back up and restore the undercloud and controller nodes, so when we restore the undercloud node using ReaR, we expect the restored VM to have exactly the same network interfaces as in the backup image. I wonder why the interface names are changing, since the scripts that bring up the network interfaces on the restored node use the ethX naming.

[1] https://gitlab.cee.redhat.com/osp-dfg-enterprise/infrared-plugin-backup-restore
[2] https://github.com/openstack/tripleo-ansible/tree/master/tripleo_ansible/roles/backup_and_restore
[3] https://github.com/openstack/python-tripleoclient
[4] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_basic_system_settings/assembly_recovering-and-restoring-a-system_configuring-basic-system-settings

---

I can confirm that for some reason the undercloud was initially installed with net.ifnames disabled:

```
[root@undercloud-0 stack]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/vmlinuz-5.14.0-283.el9.x86_64 root=UUID=62b51192-13b0-4838-a267-e410f86ee01e console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M
```

We will do further investigation on the ReaR side.
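Harald's hypothesis above — that the re-created VM presents different virtual hardware, which in turn changes the interface names — can be checked from the hypervisor. A minimal sketch, assuming the domain is named `undercloud-0` and that a copy of the pre-backup domain XML was saved (both file names are illustrative):

```shell
# On the hypervisor host: dump the definition of the restored domain
virsh dumpxml undercloud-0 > undercloud-0.restored.xml

# Diff against the XML captured before the backup; differences in the
# machine type, boot mode, or <interface model=...> elements change the
# PCI devices the guest enumerates, and with them the NIC names/order.
diff -u undercloud-0.original.xml undercloud-0.restored.xml
```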
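For context on the procedure described above: the backup_and_restore role drives ReaR, which conceptually boils down to the steps below. This is a generic ReaR sketch with illustrative values (server name, share path), not the role's actual generated configuration:

```shell
# /etc/rear/local.conf -- minimal illustrative ReaR setup:
#   OUTPUT=ISO    : build a bootable rescue ISO
#   BACKUP=NETFS  : store the file-level backup on a network share
#   BACKUP_URL    : illustrative NFS location
cat >/etc/rear/local.conf <<'EOF'
OUTPUT=ISO
BACKUP=NETFS
BACKUP_URL=nfs://backup-server/rear
EOF

# Create the rescue image and the backup in one pass
rear -d -v mkbackup

# Later, boot the node from the rescue ISO and run:
#   rear recover
# ReaR recreates the disk layout and restores the files; it does not
# control what virtual hardware the re-created VM itself presents,
# which is why identical disk contents can still see renamed NICs.
```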
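Given the confirmation that the node was installed with net.ifnames=0, a quick way to see which names the restored node actually ended up with, and what udev would call a device under predictable naming, is the following (eth0 is just an example device):

```shell
# Is predictable naming disabled on the running kernel?
grep -o 'net\.ifnames=[01]' /proc/cmdline

# Which interface names did the restored node actually end up with?
ip -brief link

# What would udev's "predictable" name for eth0 be? If the virtual
# hardware changed, this (and the eth0/eth1 ordering) changes too.
udevadm test-builtin net_id /sys/class/net/eth0
```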
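If the ReaR-side investigation shows that the restored node's kernel command line lost net.ifnames=0, one possible remediation — a sketch, not something the backup_and_restore role does — is to re-add it with grubby so the ethX-based interface scripts match the device names again:

```shell
# Re-disable predictable interface naming for all installed kernels
grubby --update-kernel=ALL --args="net.ifnames=0"

# Confirm the argument is present, then reboot to apply it
grubby --info=ALL | grep net.ifnames
reboot
```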