Description of problem:

I was upgrading a functioning containerized OSP 12 environment that was running ceph 2.4 containers. The update deployment failed and ceph broke. The mon containers on the controller were running but not responsive. The deployment times out at WorkflowTasks_Step2_Execution.

I traced the issue down to the following deployment parameters:

    parameter_defaults:
      CephAnsibleExtraConfig:
        mon_use_fqdn: true

With this parameter set, ceph 2.5x mon containers don't seem to function. I tested with container versions 2.5-3 and 2.5-4. ceph 2.4 containers function fine with this parameter; downgrading to the 2.4 containers fixes the deployment.

Version-Release number of selected component (if applicable):

ceph 2.5-3 and 2.5-4 containers

How reproducible:

100%

Steps to Reproduce:

1. Deploy current osp12 with ceph-ansible and the following parameters:

    parameter_defaults:
      CephAnsibleExtraConfig:
        mon_use_fqdn: true

Test deployment to reproduce:

    openstack overcloud deploy \
      --templates /usr/share/openstack-tripleo-heat-templates \
      --ntp-server 192.168.0.10 \
      --timeout 120 \
      -e /home/stack/templates/env.yaml \
      -e /home/stack/templates/overcloud_images.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml \
      --log-file /tmp/deploy.log

======= env.yaml =======
    parameter_defaults:
      OvercloudCephStorageFlavor: ceph
      CephStorageCount: 1
      ControllerCount: 1
      ComputeCount: 1
      CloudDomain: example.com
      DockerInsecureRegistryAddress: 172.16.5.1:8787
      CephPoolDefaultSize: 1
      CephAnsibleDisksConfig:
        devices:
          - /dev/vdb
      CephAnsibleExtraConfig:
        mon_use_fqdn: true
=======

Removing "mon_use_fqdn: true" results in a successful deployment.

I'll attach the ceph container log and the ceph-ansible log from a new test deployment.
Created attachment 1434550: docker logs ceph-mon-overcloud-controller output
Created attachment 1434551: ceph-install-workflow log
I'm trying to reproduce in an env with RHEL VMs. I'll update this BZ accordingly.
*** Bug 1581593 has been marked as a duplicate of this bug. ***
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri