Description of problem:
When the undercloud (or any other node used to back up control plane nodes) is on a subnet other than the subnets set in tripleo_backup_and_restore_clients_nets (see [1]), openstack overcloud backup fails to mount the NFS shares for the backup. Specifying BACKUP_MIGRATION_IP/backup_migration_ip is not enough; the corresponding network must also be added to tripleo_backup_and_restore_clients_nets. At the moment the OVN migration script does not allow specifying custom values for tripleo_backup_and_restore_clients_nets.

[1] https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/roles/backup_and_restore/defaults/main.yml#L47

Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20221130.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy undercloud with local_ip = 192.168.25.1/24
2. If your environment is SR-IOV, make sure you renamed the ControllerSriov role to Controller in order to work around BZ2158396 (see the BZ for details)
3. Deploy overcloud with ml2ovs backend, 3 controllers and 2 compute nodes
4. Try to run the OVN migration according to the official documentation [2]. Make sure you enabled backup by specifying environment variables:
export CREATE_BACKUP=True
export BACKUP_MIGRATION_IP=192.168.25.1

[2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.0/html/testing_migration_of_the_networking_service_to_the_ml2ovn_mechanism_driver/migrating-ml2ovs-to-ovn#doc-wrapper

Actual results:
Backup of controller nodes failed because the mount command 'mount -v -t nfs -o rw,noatime 192.168.25.1:/ctl_plane_backups /var/tmp/rear.ooULEDUn2lGsOCY/outputfs' failed. See the full log below in the Additional info section.

Expected results:
Backup of controller nodes succeeded

Additional info:
from ovn_migration_ansible.log

FATAL | Create the node backup | controller-0 | error={\"changed\": true, \"cmd\": [\"rear\", \"-d\", \"-v\", \"mkbackup\"], \"delta\": \"0:00:17.861449\", \"end\": \"2023-01-05 01:44:30.534047\", \"msg\": \"non-zero return code\", \"rc\": -15, \"start\": \"2023-01-05 01:44:12.672598\", \"stderr\": \"ERROR: Mount command 'mount -v -t nfs -o rw,noatime 192.168.25.1:/ctl_plane_backups /var/tmp/rear.ooULEDUn2lGsOCY/outputfs' failed.\\nSome latest log messages since the last called script 060_mount_NETFS_path.sh:\\n mount.nfs: timeout set for Thu Jan 5 01:46:13 2023\\n mount.nfs: trying text-based options 'vers=4.2,addr=192.168.25.1,clientaddr=192.168.25.11'\\n mount.nfs: trying text-based options 'vers=4,minorversion=1,addr=192.168.25.1,clientaddr=192.168.25.11'\\n mount.nfs: trying text-based options 'vers=4,addr=192.168.25.1,clientaddr=192.168.25.11'\\n mount.nfs: trying text-based options 'addr=192.168.25.1'\\n mount.nfs: prog 100003, trying vers=3, prot=6\\n mount.nfs: prog 100005, trying vers=3, prot=17\\n mount.nfs: prog 100005, trying vers=3, prot=6\\nAborting due to an error, check /var/log/rear/rear-controller-0.log for details\", \"stderr_lines\": [\"ERROR: Mount command 'mount -v -t nfs -o rw,noatime 192.168.25.1:/ctl_plane_backups /var/tmp/rear.ooULEDUn2lGsOCY/outputfs' failed.\", \"Some latest log messages since the last called script 060_mount_NETFS_path.sh:\", \" mount.nfs: timeout set for Thu Jan 5 01:46:13 2023\", \" mount.nfs: trying text-based options 'vers=4.2,addr=192.168.25.1,clientaddr=192.168.25.11'\", \" mount.nfs: trying text-based options 'vers=4,minorversion=1,addr=192.168.25.1,clientaddr=192.168.25.11'\", \" mount.nfs: trying text-based options 'vers=4,addr=192.168.25.1,clientaddr=192.168.25.11'\", \" mount.nfs: trying text-based options 'addr=192.168.25.1'\", \" mount.nfs: prog 100003, trying vers=3, prot=6\", \" mount.nfs: prog 100005, trying vers=3, prot=17\", \" mount.nfs: prog 100005, trying vers=3, prot=6\", \"Aborting due to an error, check /var/log/rear/rear-controller-0.log for details\"],

(undercloud) [stack@undercloud-0 ~]$ showmount --exports
Export list for undercloud-0.redhat.local:
/ctl_plane_backups 172.16.0.0/24,10.0.0.0/24,192.168.24.0/24

cat /etc/exports
# BEGIN ANSIBLE MANAGED BLOCK /ctl_plane_backups
/ctl_plane_backups 192.168.24.0/24(rw,sync,no_root_squash,no_subtree_check)
/ctl_plane_backups 10.0.0.0/24(rw,sync,no_root_squash,no_subtree_check)
/ctl_plane_backups 172.16.0.0/24(rw,sync,no_root_squash,no_subtree_check)
# END ANSIBLE MANAGED BLOCK /ctl_plane_backups

After I changed 192.168.24.0 in /etc/exports to 192.168.25.0 and restarted nfs-server by running "sudo systemctl restart nfs-server", I was able to successfully mount 192.168.25.1:/ctl_plane_backups from a controller node by running:
mount -v -t nfs -o rw,noatime 192.168.25.1:/ctl_plane_backups /tmp/my_share_name
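
For anyone triaging a similar mount timeout, the mismatch above can be checked without touching the NFS server. The following is only a minimal sketch, not part of the migration tooling: the helper name covering_networks is made up for illustration, and the addresses are the ones from the showmount output and the mount log in this report (clientaddr=192.168.25.11).

```python
import ipaddress


def covering_networks(client_ip, exported_cidrs):
    """Return the exported CIDRs that contain client_ip (hypothetical helper)."""
    addr = ipaddress.ip_address(client_ip)
    return [cidr for cidr in exported_cidrs if addr in ipaddress.ip_network(cidr)]


# Client address taken from the mount.nfs log lines in this report.
client = "192.168.25.11"

# Networks reported by "showmount --exports" before the manual change.
exported = ["172.16.0.0/24", "10.0.0.0/24", "192.168.24.0/24"]
print(covering_networks(client, exported))  # prints []

# Networks after replacing 192.168.24.0 with 192.168.25.0 in /etc/exports.
fixed = ["172.16.0.0/24", "10.0.0.0/24", "192.168.25.0/24"]
print(covering_networks(client, fixed))  # prints ['192.168.25.0/24']
```

An empty list means no export entry admits the controller, which is consistent with the mount failure seen here.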
Verified on the RHOS-17.1-RHEL-9-20230607.n.2 puddle with openstack-neutron-ovn-migration-tool-18.6.1-1.20230518200966.el9ost.noarch

Verified that the default control plane CIDR(s) for the backup can be overridden by means of the BACKUP_MIGRATION_CTL_PLANE_CIDRS environment variable:
- Deployed overcloud with control plane on a custom CIDR
- Ran ovn_migration.sh backup, specifying BACKUP_MIGRATION_IP and BACKUP_MIGRATION_CTL_PLANE_CIDRS
- Confirmed the NFS server had correct NFS settings in /etc/exports
- Ansible tasks for backup completed successfully
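
The /etc/exports check in the verification steps could be scripted roughly as below. This is a sketch under assumptions: the function names are hypothetical, the sample text is the /etc/exports content quoted in the bug description (not the file from the verified environment), and CIDRs are compared as plain strings, so equivalent but differently written networks would not match.

```python
def exported_clients(exports_text, share):
    """Collect the client specs listed for one share in /etc/exports text."""
    clients = set()
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path, *specs = line.split()
        if path != share:
            continue
        for spec in specs:
            clients.add(spec.split("(", 1)[0])  # strip "(rw,sync,...)" options
    return clients


def missing_cidrs(exports_text, share, wanted_cidrs):
    """Return the wanted CIDRs that have no entry for the share."""
    have = exported_clients(exports_text, share)
    return [cidr for cidr in wanted_cidrs if cidr not in have]


# Sample data: /etc/exports as quoted in the description of this bug.
SAMPLE_EXPORTS = """\
# BEGIN ANSIBLE MANAGED BLOCK /ctl_plane_backups
/ctl_plane_backups 192.168.24.0/24(rw,sync,no_root_squash,no_subtree_check)
/ctl_plane_backups 10.0.0.0/24(rw,sync,no_root_squash,no_subtree_check)
/ctl_plane_backups 172.16.0.0/24(rw,sync,no_root_squash,no_subtree_check)
# END ANSIBLE MANAGED BLOCK /ctl_plane_backups
"""

print(missing_cidrs(SAMPLE_EXPORTS, "/ctl_plane_backups", ["192.168.25.0/24"]))
```

On the verified build the list is expected to be empty when the CIDRs passed are the ones given to the migration script.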
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577