Description of problem:
I was upgrading a functioning containerized OSP 12 environment that was running ceph 2.4 containers. The update deployment failed and ceph broke. The mon containers on the controller were running but not responsive.
The deployment times out at WorkflowTasks_Step2_Execution.
I traced the issue to the following deployment parameters:
parameter_defaults:
  CephAnsibleExtraConfig:
    mon_use_fqdn: true
With this parameter set, the ceph 2.5 mon containers do not seem to function. I tested container versions 2.5-3 and 2.5-4. The ceph 2.4 containers work fine with this parameter; downgrading to the 2.4 containers fixes the deployment.
Version-Release number of selected component (if applicable):
ceph 2.5-3 and 2.5-4 containers
How reproducible:
100%
Steps to Reproduce:
1. Deploy current OSP 12 with ceph-ansible and the following parameters:
parameter_defaults:
  CephAnsibleExtraConfig:
    mon_use_fqdn: true
Test deployment to reproduce:
openstack overcloud deploy \
--templates /usr/share/openstack-tripleo-heat-templates \
--ntp-server 192.168.0.10 \
--timeout 120 \
-e /home/stack/templates/env.yaml \
-e /home/stack/templates/overcloud_images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml \
--log-file /tmp/deploy.log
=======
env.yaml
=======
parameter_defaults:
  OvercloudCephStorageFlavor: ceph
  CephStorageCount: 1
  ControllerCount: 1
  ComputeCount: 1
  CloudDomain: example.com
  DockerInsecureRegistryAddress: 172.16.5.1:8787
  CephPoolDefaultSize: 1
  CephAnsibleDisksConfig:
    devices:
      - /dev/vdb
  CephAnsibleExtraConfig:
    mon_use_fqdn: true
=======
Removing "mon_use_fqdn: true" will result in a successful deployment.
I'll attach the ceph container log and ceph-ansible log from a new test deployment.
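
For anyone checking whether their own environment files still enable the setting before redeploying, here is a minimal sketch. The helper name uses_mon_fqdn is hypothetical (it is not part of TripleO or ceph-ansible), and it only matches the literal, uncommented "mon_use_fqdn: true" form used in this report, not other YAML spellings of true.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Heat environment file
# contains an uncommented "mon_use_fqdn: true" line at any indentation.
uses_mon_fqdn() {
    grep -Eq '^[[:space:]]*mon_use_fqdn:[[:space:]]*true[[:space:]]*$' "$1"
}

# Report every file passed on the command line that enables the setting.
for f in "$@"; do
    if uses_mon_fqdn "$f"; then
        echo "$f: mon_use_fqdn is enabled"
    fi
done
```

Saved under an assumed name such as check_fqdn.sh, it could be run as "sh check_fqdn.sh /home/stack/templates/env.yaml" against the same files that are passed to the deploy command with -e.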