Description of problem:

This appears to fail because of the addition of "/usr/share/openstack-puppet/modules/tripleo/manifests/certmonger/ceph_rgw.pp" in OSP 16. The deployment now tries to create a certificate for the Ceph RGW service, which fails because the host has insufficient permission to add a Kerberos principal for the service in the storage DNS domain:

~~~
Insufficient access: Insufficient 'add' privilege to add the entry 'krbprincipalname=ceph_rgw/controller-1.storage.redhat.local,cn=services,cn=accounts,dc=redhat,dc=local'
~~~

Version-Release number of selected component (if applicable):

How reproducible:
Every time

Steps to Reproduce:
1. Deploy RHOSP 13 integrated with IdM using novajoin: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/integrate_with_identity_service/idm-novajoin
2. Follow the Framework for Upgrades (13 to 16.1) documentation.

Actual results:

~~~
<13>Oct 23 14:46:24 puppet-user: Debug: /Stage[main]/Tripleo::Certmonger::Novnc_proxy/Certmonger_certificate[novnc-proxy]: The container Class[Tripleo::Certmonger::Novnc_proxy] will propagate my refresh event
<13>Oct 23 14:46:24 puppet-user: Debug: Class[Tripleo::Certmonger::Novnc_proxy]: The container Stage[main] will propagate my refresh event
<13>Oct 23 14:46:24 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ceph_rgw/Certmonger_certificate[ceph_rgw]/ensure: created
<13>Oct 23 14:46:24 puppet-user: Debug: Issuing getcert command with args: ["request", "-I", "ceph_rgw", "-f", "/etc/pki/tls/certs/ceph_rgw.crt", "-c", "IPA", "-N", "CN=controller-0.storage.redhat.local", "-K", "ceph_rgw/controller-0.storage.redhat.local", "-D", "controller-0.storage.redhat.local", "-C", "/usr/bin/certmonger-rgw-refresh.sh", "-w", "-k", "/etc/pki/tls/private/ceph_rgw.key"]
<13>Oct 23 14:46:24 puppet-user: Debug: Executing: '/usr/bin/getcert request -I ceph_rgw -f /etc/pki/tls/certs/ceph_rgw.crt -c IPA -N CN=controller-0.storage.redhat.local -K ceph_rgw/controller-0.storage.redhat.local -D controller-0.storage.redhat.local -C /usr/bin/certmonger-rgw-refresh.sh -w -k /etc/pki/tls/private/ceph_rgw.key'
<13>Oct 23 14:46:24 puppet-user: Warning: Could not get certificate: Execution of '/usr/bin/getcert request -I ceph_rgw -f /etc/pki/tls/certs/ceph_rgw.crt -c IPA -N CN=controller-0.storage.redhat.local -K ceph_rgw/controller-0.storage.redhat.local -D controller-0.storage.redhat.local -C /usr/bin/certmonger-rgw-refresh.sh -w -k /etc/pki/tls/private/ceph_rgw.key' returned 2: New signing request "ceph_rgw" added.
<13>Oct 23 14:46:24 puppet-user: Debug: Executing: '/usr/bin/getcert list -i ceph_rgw'
<13>Oct 23 14:46:24 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ceph_rgw/Certmonger_certificate[ceph_rgw]: Could not evaluate: Could not get certificate: Server at https://freeipa-0.redhat.local/ipa/xml denied our request, giving up: 2100 (RPC failed at server. Insufficient access: Insufficient 'add' privilege to add the entry 'krbprincipalname=ceph_rgw/controller-0.storage.redhat.local,cn=services,cn=accounts,dc=redhat,dc=local' .).
<13>Oct 23 14:46:24 puppet-user: Notice: /Stage[main]/Tripleo::Profile::Base::Certmonger_user/Tripleo::Certmonger::Httpd[httpd-ctlplane]/Certmonger_certificate[httpd-ctlplane]/principal: defined 'principal' as 'HTTP/controller-0.ctlplane.redhat.local'
<13>Oct 23 14:46:24 puppet-user: Debug: Executing: '/usr/bin/getcert resubmit -i httpd-ctlplane -f /etc/pki/tls/certs/httpd/httpd-ctlplane.crt -c IPA -N CN=controller-0.ctlplane.redhat.local -K HTTP/controller-0.ctlplane.redhat.local -D controller-0.ctlplane.redhat.local -C pkill -USR1 httpd -w'
~~~

Expected results:
The host has permission to create the certificates it needs.

Additional info:
Worked around this by adding the DNS record and the service in IdM, and allowing controller-n.redhat.local to manage the service.
~~~
[root@controller-0 ~]# kinit admin
Password for admin:
[root@controller-0 ~]# ipa dnsrecord-add
Record name: controller-0
Zone name: storage.redhat.local
Please choose a type of DNS resource record to be added
The most common types for this type of zone are: A, AAAA
DNS resource record type: A
A IP Address: 172.17.3.132
  Record name: controller-0
  A record: 172.17.3.132
[root@controller-0 ~]# ipa service-add ceph_rgw/controller-0.storage.redhat.local
-----------------------------------------------------------------------
Added service "ceph_rgw/controller-0.storage.redhat.local"
-----------------------------------------------------------------------
  Principal name: ceph_rgw/controller-0.storage.redhat.local
  Principal alias: ceph_rgw/controller-0.storage.redhat.local
  Managed by: controller-0.storage.redhat.local
[root@controller-0 ~]# ipa service-add-host --hosts controller-0.redhat.local ceph_rgw/controller-0.storage.redhat.local
  Principal name: ceph_rgw/controller-0.storage.redhat.local
  Principal alias: ceph_rgw/controller-0.storage.redhat.local
  Managed by: controller-0.storage.redhat.local, controller-0.redhat.local
-------------------------
Number of members added 1
-------------------------
[root@controller-0 ~]# /usr/bin/getcert resubmit -i ceph_rgw
Resubmitting "ceph_rgw" to "IPA".
[root@controller-0 ~]# /usr/bin/getcert list -i ceph_rgw -v
Number of certificates and requests being tracked: 17.
Request ID 'ceph_rgw':
    status: MONITORING
    stuck: no
    key pair storage: type=FILE,location='/etc/pki/tls/private/ceph_rgw.key'
    certificate: type=FILE,location='/etc/pki/tls/certs/ceph_rgw.crt'
    CA: IPA
    issuer: CN=Certificate Authority,O=REDHAT.LOCAL
    subject: CN=controller-0.storage.redhat.local,O=REDHAT.LOCAL
    expires: 2022-10-29 01:00:25 UTC
    dns: controller-0.storage.redhat.local
    principal name: ceph_rgw/controller-0.storage.redhat.local
    key usage: digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment
    eku: id-kp-serverAuth,id-kp-clientAuth
    pre-save command:
    post-save command: /usr/bin/certmonger-rgw-refresh.sh
    track: yes
    auto-renew: yes
~~~
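The manual workaround above can be generalized to every controller. A minimal Python sketch that derives the equivalent `ipa` commands per node; the `<ip>` placeholder and the single "storage" network are assumptions taken from this report (real addresses come from each node's interface on that network):

```python
# Hedged sketch of the manual workaround, generalized per controller.
def workaround_commands(host, domain, networks, service="ceph_rgw"):
    cmds = []
    for net in networks:
        principal = f"{service}/{host}.{net}.{domain}"
        # 1. DNS record in the per-network zone
        cmds.append(f"ipa dnsrecord-add {net}.{domain} {host} --a-rec <ip>")
        # 2. the service principal certmonger will request a cert for
        cmds.append(f"ipa service-add {principal}")
        # 3. let the node's primary host principal manage the service
        cmds.append(f"ipa service-add-host --hosts {host}.{domain} {principal}")
    return cmds

for host in ("controller-0", "controller-1", "controller-2"):
    for cmd in workaround_commands(host, "redhat.local", ["storage"]):
        print(cmd)
```

This is only a sketch of the steps shown above, not something novajoin or TripleO ships; the proper fix is for novajoin to create these objects itself.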
It is novajoin's responsibility to add the missing services etc. in IPA before certmonger requests the certificates. The question, then, is why novajoin apparently is not being triggered to do this.

First, we need to confirm that the metadata for the server has been updated. David -- can you provide the metadata for the server once the FFU updates the overcloud?

Assuming that the metadata has been updated, I think I see why novajoin might not be triggered. TLS-everywhere using novajoin is driven by the following template:

https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaclient-baremetal-ansible.yaml

and, in particular, by:

https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaclient-baremetal-ansible.yaml#L185-L194

The line that triggers updates by novajoin is:

https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaclient-baremetal-ansible.yaml#L91

which gets the config data from the config drive, causing a call to novajoin as a dynamic vendordata service. Novajoin would then look at the (updated) metadata and add services/hosts etc. as needed.

However, as you can see at:

https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaclient-baremetal-ansible.yaml#L194

this only takes place when the server is not already an IPA client -- which would be the case, for instance, if we were attempting a brownfield deployment. But in this case, we have a server that is already an IPA client -- all the other services were already enrolled in IPA -- and so any further updates were skipped. We'd need to examine this logic to see if we can be smarter about the ipa-client check.

Note that this is not a problem if you choose to migrate to tripleo-ipa instead, because there the correct services etc. are created as an undercloud task beforehand:

https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaservices-baremetal-ansible.yaml#L98-L122
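The short-circuit described above can be sketched roughly as follows. The check for `/etc/ipa/default.conf` is an assumption standing in for however the template actually detects enrollment (see ipaclient-baremetal-ansible.yaml#L194 for the real condition):

```python
import os

# Rough sketch of the enrollment short-circuit: the IPA client setup only
# runs when the node is not already enrolled. Checking for the ipa-client
# config file is an illustrative stand-in for the template's real test.
def already_ipa_client(root="/"):
    return os.path.exists(os.path.join(root, "etc/ipa/default.conf"))

def should_run_ipa_setup(root="/"):
    # On a node upgraded from OSP 13 this is False: the node is already an
    # IPA client, so the metadata-driven novajoin call that would add new
    # services (like ceph_rgw) never happens.
    return not already_ipa_client(root)
```

The `root` parameter exists only so the sketch can be exercised against a scratch directory instead of the real filesystem.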
It looks like the novajoin notifier was down because of a misconfigured transport_url. The original configuration had the correct RabbitMQ user and password:

~~~
less /etc/novajoin/join.conf.rpmsave
transport_url=rabbit://d4285439706d8dfc62f3cd78ff751b1a599baf51:4f238728a84593674eb967100e4ec06bb89995ed.24.1//
~~~

But the upgrade configured the transport_url with the user "guest":

~~~
less /var/lib/config-data/puppet-generated/novajoin/etc/novajoin/join.conf
transport_url=rabbit://guest:4f238728a84593674eb967100e4ec06bb89995ed.redhat.local:5672/?ssl=0
~~~

After reverting this, the service is back up:

~~~
2020-11-09 21:49:00.134 7 ERROR join     (class_id, method_id), ConnectionError)
2020-11-09 21:49:00.134 7 ERROR join amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.
2020-11-09 21:49:00.134 7 ERROR join
2020-11-09 21:49:04.157 7 INFO novajoin.notifications [-] Starting
2020-11-09 21:49:07.093 7 INFO novajoin.notifications [-] [3dc41ebe-db20-405b-8084-573741ae068b] compute instance update for controller-0.redhat.local
2020-11-09 21:49:07.123 7 INFO novajoin.notifications [-] [18eedb92-727c-458c-86dd-326988be8c59] compute instance update for controller-2.redhat.local
2020-11-09 21:49:08.113 7 INFO novajoin.notifications [-] [79a4d964-7e1c-4a05-9de8-82e71cfd299f] compute instance update for controller-1.redhat.local
2020-11-09 21:49:26.229 7 INFO novajoin.notifications [-] Starting
~~~
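The quickest check for this regression is the user embedded in the transport_url. A small sketch; the URLs below are well-formed stand-ins (the values in the pasted configs are truncated), so the host and password are assumptions:

```python
from urllib.parse import urlsplit

# Extract the user from an oslo.messaging-style transport_url.
def transport_user(transport_url):
    return urlsplit(transport_url).username

# Illustrative stand-ins for the two configs compared above:
upgraded = "rabbit://guest:secret@controller-0.internalapi.redhat.local:5672/?ssl=0"
original = "rabbit://d4285439706d8dfc62f3cd78ff751b1a599baf51:secret@controller-0.internalapi.redhat.local:5672/"

assert transport_user(upgraded) == "guest"   # upgrade fell back to the default user
assert transport_user(original) != "guest"   # original used the generated user
```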
David,

That's good to know, but it doesn't necessarily affect this situation. The notifier is used primarily to clean up IPA when servers are deleted. The question I had was: what is the metadata for the nodes? That is, what is the output of "openstack server show <<uuid of controller-0>>", and maybe for the other controllers too. I want to see what's in the server metadata. The server metadata should contain, for instance, ipa_enroll: True, but also the lists of compact and managed services.
~~~
OS-DCF:diskConfig: MANUAL
OS-EXT-AZ:availability_zone: nova
OS-EXT-SRV-ATTR:host: undercloud-0.redhat.local
OS-EXT-SRV-ATTR:hypervisor_hostname: 2cd08d8f-cf30-499e-89bb-ad88e89d219c
OS-EXT-SRV-ATTR:instance_name: instance-0000006d
OS-EXT-STS:power_state: Running
OS-EXT-STS:task_state: null
OS-EXT-STS:vm_state: active
OS-SRV-USG:launched_at: '2020-10-12T08:26:16.000000'
OS-SRV-USG:terminated_at: null
accessIPv4: ''
accessIPv6: ''
addresses: ctlplane=192.168.24.33
config_drive: 'True'
created: '2020-10-12T08:20:51Z'
flavor: controller (2cb99cee-b0c6-40cf-a105-f0b5aa8babd6)
hostId: e89609a4acca37ece09a0a31f5d2983c56edbd655240dad21c90de9d
id: 29cacbd7-c092-4e5a-875b-de81c66af778
image: overcloud-full_20201003T070010Z (bea5fd78-ad6b-4aa0-914b-395eea196f96)
key_name: default
name: controller-0
progress: 0
project_id: 61d63d0900df4ceaaa9ca08353af64a8
properties: compact_service_HTTP='["ctlplane", "storage", "storagemgmt", "internalapi", "external"]',
  compact_service_ceph_rgw='["storage"]',
  compact_service_haproxy='["ctlplane", "storage", "storagemgmt", "internalapi"]',
  compact_service_libvirt-vnc='["internalapi"]',
  compact_service_mysql='["internalapi"]',
  compact_service_neutron='["internalapi"]',
  compact_service_novnc-proxy='["internalapi"]',
  compact_service_rabbitmq='["internalapi"]',
  compact_service_redis='["internalapi"]',
  ipa_enroll='true',
  managed_service_haproxyctlplane='haproxy/overcloud.ctlplane.redhat.local',
  managed_service_haproxyexternal='haproxy/overcloud.redhat.local',
  managed_service_haproxyinternal_api='haproxy/overcloud.internalapi.redhat.local',
  managed_service_haproxystorage='haproxy/overcloud.storage.redhat.local',
  managed_service_haproxystorage_mgmt='haproxy/overcloud.storagemgmt.redhat.local',
  managed_service_mysqlinternal_api='mysql/overcloud.internalapi.redhat.local',
  managed_service_redisinternal_api='redis/overcloud.internalapi.redhat.local'
security_groups: name='default'
status: ACTIVE
updated: '2020-10-23T03:45:16Z'
user_id: 074c7d66e40243908472df0e417cede6
volumes_attached: ''

OS-DCF:diskConfig: MANUAL
OS-EXT-AZ:availability_zone: nova
OS-EXT-SRV-ATTR:host: undercloud-0.redhat.local
OS-EXT-SRV-ATTR:hypervisor_hostname: 85c785ae-7fdd-4efd-b087-6078263e60f4
OS-EXT-SRV-ATTR:instance_name: instance-0000006a
OS-EXT-STS:power_state: Running
OS-EXT-STS:task_state: null
OS-EXT-STS:vm_state: active
OS-SRV-USG:launched_at: '2020-10-12T08:23:38.000000'
OS-SRV-USG:terminated_at: null
accessIPv4: ''
accessIPv6: ''
addresses: ctlplane=192.168.24.35
config_drive: 'True'
created: '2020-10-12T08:20:50Z'
flavor: controller (2cb99cee-b0c6-40cf-a105-f0b5aa8babd6)
hostId: e89609a4acca37ece09a0a31f5d2983c56edbd655240dad21c90de9d
id: c1d5e113-e8d6-4412-bc8a-b12b0e3cebae
image: overcloud-full_20201003T070010Z (bea5fd78-ad6b-4aa0-914b-395eea196f96)
key_name: default
name: controller-2
progress: 0
project_id: 61d63d0900df4ceaaa9ca08353af64a8
properties: compact_service_HTTP='["ctlplane", "storage", "storagemgmt", "internalapi", "external"]',
  compact_service_ceph_rgw='["storage"]',
  compact_service_haproxy='["ctlplane", "storage", "storagemgmt", "internalapi"]',
  compact_service_libvirt-vnc='["internalapi"]',
  compact_service_mysql='["internalapi"]',
  compact_service_neutron='["internalapi"]',
  compact_service_novnc-proxy='["internalapi"]',
  compact_service_rabbitmq='["internalapi"]',
  compact_service_redis='["internalapi"]',
  ipa_enroll='true',
  managed_service_haproxyctlplane='haproxy/overcloud.ctlplane.redhat.local',
  managed_service_haproxyexternal='haproxy/overcloud.redhat.local',
  managed_service_haproxyinternal_api='haproxy/overcloud.internalapi.redhat.local',
  managed_service_haproxystorage='haproxy/overcloud.storage.redhat.local',
  managed_service_haproxystorage_mgmt='haproxy/overcloud.storagemgmt.redhat.local',
  managed_service_mysqlinternal_api='mysql/overcloud.internalapi.redhat.local',
  managed_service_redisinternal_api='redis/overcloud.internalapi.redhat.local'
security_groups: name='default'
status: ACTIVE
updated: '2020-10-23T03:45:16Z'
user_id: 074c7d66e40243908472df0e417cede6
volumes_attached: ''

OS-DCF:diskConfig: MANUAL
OS-EXT-AZ:availability_zone: nova
OS-EXT-SRV-ATTR:host: undercloud-0.redhat.local
OS-EXT-SRV-ATTR:hypervisor_hostname: e003381f-95b3-455f-a114-56390ebdbd38
OS-EXT-SRV-ATTR:instance_name: instance-0000006c
OS-EXT-STS:power_state: Running
OS-EXT-STS:task_state: null
OS-EXT-STS:vm_state: active
OS-SRV-USG:launched_at: '2020-10-12T08:23:33.000000'
OS-SRV-USG:terminated_at: null
accessIPv4: ''
accessIPv6: ''
addresses: ctlplane=192.168.24.20
config_drive: 'True'
created: '2020-10-12T08:20:50Z'
flavor: controller (2cb99cee-b0c6-40cf-a105-f0b5aa8babd6)
hostId: e89609a4acca37ece09a0a31f5d2983c56edbd655240dad21c90de9d
id: e5d1b72b-c114-4cfd-baa6-b2b7bb4d16de
image: overcloud-full_20201003T070010Z (bea5fd78-ad6b-4aa0-914b-395eea196f96)
key_name: default
name: controller-1
progress: 0
project_id: 61d63d0900df4ceaaa9ca08353af64a8
properties: compact_service_HTTP='["ctlplane", "storage", "storagemgmt", "internalapi", "external"]',
  compact_service_ceph_rgw='["storage"]',
  compact_service_haproxy='["ctlplane", "storage", "storagemgmt", "internalapi"]',
  compact_service_libvirt-vnc='["internalapi"]',
  compact_service_mysql='["internalapi"]',
  compact_service_neutron='["internalapi"]',
  compact_service_novnc-proxy='["internalapi"]',
  compact_service_rabbitmq='["internalapi"]',
  compact_service_redis='["internalapi"]',
  ipa_enroll='true',
  managed_service_haproxyctlplane='haproxy/overcloud.ctlplane.redhat.local',
  managed_service_haproxyexternal='haproxy/overcloud.redhat.local',
  managed_service_haproxyinternal_api='haproxy/overcloud.internalapi.redhat.local',
  managed_service_haproxystorage='haproxy/overcloud.storage.redhat.local',
  managed_service_haproxystorage_mgmt='haproxy/overcloud.storagemgmt.redhat.local',
  managed_service_mysqlinternal_api='mysql/overcloud.internalapi.redhat.local',
  managed_service_redisinternal_api='redis/overcloud.internalapi.redhat.local'
security_groups: name='default'
status: ACTIVE
updated: '2020-10-23T03:45:16Z'
user_id: 074c7d66e40243908472df0e417cede6
volumes_attached: ''
~~~
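The `compact_service_*` and `managed_service_*` properties above encode exactly which Kerberos principals novajoin is expected to create. A hedged sketch of that derivation; the key naming convention is taken from the output above, but the derivation itself is an assumption about novajoin's behavior, not its actual code:

```python
import json

# Derive the expected principals from the server metadata properties.
def expected_principals(properties, host, domain):
    principals = []
    for key, value in properties.items():
        if key.startswith("compact_service_"):
            service = key[len("compact_service_"):]
            # compact services get one principal per listed network
            for network in json.loads(value):
                principals.append(f"{service}/{host}.{network}.{domain}")
        elif key.startswith("managed_service_"):
            # managed services are stored as ready-made principals
            principals.append(value)
    return sorted(principals)

# Abbreviated stand-in for the full properties dict above:
props = {
    "compact_service_ceph_rgw": '["storage"]',
    "managed_service_haproxystorage": "haproxy/overcloud.storage.redhat.local",
}
# The principal IPA refused to add is exactly the one derived from
# compact_service_ceph_rgw:
assert "ceph_rgw/controller-1.storage.redhat.local" in expected_principals(
    props, "controller-1", "redhat.local")
```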
I don't believe this has been changed or rerun since the original OSP 13 deploy.

~~~
{"join": {"hostname": "controller-1.redhat.local",
          "ipaotp": "1d163c1672dc4c32a2474051306bca07",
          "krb_realm": "REDHAT.LOCAL"},
 "static": {"cloud-init": "#cloud-config
packages:
 - python-simplejson
 - ipa-client
 - ipa-admintools
 - openldap-clients
 - hostname
write_files:
 - content: |
     #!/bin/sh
     function get_metadata_config_drive {
       if [ -f /run/cloud-init/status.json ]; then
         # Get metadata from config drive
         data=`cat /run/cloud-init/status.json`
         config_drive=`echo $data | python -c 'import json,re,sys;obj=json.load(sys.stdin);ds=obj.get("v1", {}).get("datasource"); print(re.findall(r"source=(.*)]", ds)[0])'`
         if [[ -b $config_drive ]]; then
           temp_dir=`mktemp -d`
           mount $config_drive $temp_dir
           if [ -f $temp_dir/openstack/latest/vendor_data2.json ]; then
             data=`cat $temp_dir/openstack/latest/vendor_data2.json`
             umount $config_drive
             rmdir $temp_dir
           else
             umount $config_drive
             rmdir $temp_dir
           fi
         else
           echo "Unable to retrieve metadata from config drive."
           return 1
         fi
       else
         echo "Unable to retrieve metadata from config drive."
         return 1
       fi
       return 0
     }
     function get_metadata_network {
       # Get metadata over the network
       data=$(timeout 300 /bin/bash -c 'data=""; while [ -z "$data" ]; do sleep $[ ( $RANDOM % 10 ) + 1 ]s; data=`curl -s http://169.254.169.254/openstack/2016-10-06/vendor_data2.json 2>/dev/null`; done; echo $data')
       if [[ $? != 0 ]] ; then
         echo "Unable to retrieve metadata from metadata service."
         return 1
       fi
     }
     function get_fqdn {
       # Get the instance hostname out of the metadata
       fqdn=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get("join", {}).get("hostname", ""))'`
       if [ -z "$fqdn" ]; then
         echo "Unable to determine hostname"
         return 1
       fi
       return 0
     }
     if ! get_metadata_config_drive || ! get_fqdn; then
       if ! get_metadata_network || ! get_fqdn; then
         echo "FATAL: No metadata available or could not read the hostname from the metadata"
         exit 1
       fi
     fi
     realm=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get("join", {}).get("krb_realm", ""))'`
     otp=`echo $data | python -c 'import json,sys;obj=json.load(sys.stdin);print(obj.get("join", {}).get("ipaotp", ""))'`
     if [ -z "$otp" ]; then
       echo "FATAL: Could not read OTP from the metadata. This means that a host with the same name was already enrolled in IPA."
       exit 1
     fi
     # run ipa-client-install
     OPTS="-U -w $otp --hostname $fqdn --mkhomedir"
     if [ -n "$realm" ]; then
       OPTS="$OPTS --realm=$realm"
     fi
     ipa-client-install $OPTS
   path: /root/setup-ipa-client.sh
   permissions: '0700'
   owner: root:root
runcmd:
 - sh -x /root/setup-ipa-client.sh > /var/log/setup-ipa-client.log 2>&1"}}
~~~
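The script's inline `python -c` one-liners each extract one field from vendor_data2.json. The same logic condensed into a single function for readability (a sketch, not the code that ships in the image):

```python
import json

# Parse the fields ipa-client-install needs out of vendor_data2.json.
def parse_join(vendor_data2_text):
    join = json.loads(vendor_data2_text).get("join", {})
    return (join.get("hostname", ""),
            join.get("krb_realm", ""),
            join.get("ipaotp", ""))

# Sample taken from the metadata pasted in this comment:
sample = ('{"join": {"hostname": "controller-1.redhat.local", '
          '"ipaotp": "1d163c1672dc4c32a2474051306bca07", '
          '"krb_realm": "REDHAT.LOCAL"}}')
fqdn, realm, otp = parse_join(sample)
# The script then runs, per its own logic:
#   ipa-client-install -U -w $otp --hostname $fqdn --mkhomedir --realm=$realm
```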
Created attachment 1728260 [details] setup-ipa-client.log
{"admin_pass": "vbciaFfye5QE" "random_seed": "ZcJ3R+j9wVFFdFJYubnlnc5XIZr5DlIFF7Wt33Mb97bF3lU0RilTwVUaB0Y7dSqrOZjYqyrQZcWXvrUazOAspuoEmMEPMiwCQgtZPU7KOPYgr5Zf7uBuxTCuKbI1QkujxgSb+WPbR1QYsXdUqAijAdOvoMjU2rV7a+2fus0jpPE9qGhWK/sJYNOxWilMKq0hLB4lPl/pQJOoNyIOnLAtJDPbipqx3WmG2HAZaXTDhCBNy8sRRKLkiqC+AJ6g4DDpygqGtTockdMRGiatxSFNBK5i1ANiF4ecjgTWFa52xVk1ZuqMl2CdpLG5aF89kqXX0SD95O0NZ3Rq3DvKGndEOzzBHkwgwPHUFhnDBLwTnTXViRH5z6iKCx1MuC0MBTekrl4RVu3XGi+Q+NQhe/THzIIL6zeV9GqQcwgSS5oT+cWzCS2IUnF6gwP9IE2LDglKeC/eLspu8S8hv1BQtB74PkluTQDWtU270Hid8ek9kfzkgH1qawSL9tqqwaKWvzaVz03+qN4XcOBbXIy3FUkw/s7zGckdydyl8dh/kH5X0/NZa3QhVMJP1k5npYOaNAemVhKEenpWbN3quH27hzZ/W69AvWUptCVFaLZzLZEHCS7/zbctdblcYyyKLbD+n1lvYhh9bReFpPeOaD5a59OxBJOr4qPaXUE6/KvfaeUMivA=" "uuid": "e5d1b72b-c114-4cfd-baa6-b2b7bb4d16de" "availability_zone": "nova" "keys": [{"data": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9yrgjkk2hsgmSU2/1YjJLDFrmEs3fOnDCOCd1Qc0ETq8fQyPAiZvbC8xiYWMUTdG8741irib28+ujP1LoUmvJo5j45+JbikNQiEcgTEVCqJy1eheAKkL8ES8Xq/HQ7pTmxYwoxOQHoqOFEPOwY3ICJ6GQObdD0/7n530eNx5SwfzjfJ8Zs9bxONRXv/b38TgcIkCWpfxucFzEo8ZQFhSbtO253Vd/s7gHTKGpE9xMdx304F+A0yOGTGQIVKtH7D1FPtJj6OMmnLuaqpeA6G6ODnQdNSAmweDZNGpxvQ3/BKuGcU+s9MXJkRIHLVgxyOrK7WjENPBKLzMvEhexSSSz" "type": "ssh" "name": "default"}] "hostname": "controller-1" "launch_index": 0 "devices": [] "meta": {"compact_service_libvirt-vnc": "[\"internalapi\"]" "compact_service_novnc-proxy": "[\"internalapi\"]" "managed_service_mysqlinternal_api": "mysql/overcloud.internalapi.redhat.local" "managed_service_haproxyctlplane": "haproxy/overcloud.ctlplane.redhat.local" "compact_service_redis": "[\"internalapi\"]" "managed_service_haproxystorage": "haproxy/overcloud.storage.redhat.local" "compact_service_mysql": "[\"internalapi\"]" "compact_service_HTTP": "[\"ctlplane\", \"storage\", \"storagemgmt\", \"internalapi\", \"external\"]" "managed_service_haproxyexternal": "haproxy/overcloud.redhat.local" "compact_service_haproxy": "[\"ctlplane\", \"storage\", \"storagemgmt\", \"internalapi\"]" 
"compact_service_rabbitmq": "[\"internalapi\"]" "ipa_enroll": "true" "compact_service_neutron": "[\"internalapi\"]" "managed_service_haproxyinternal_api": "haproxy/overcloud.internalapi.redhat.local" "managed_service_redisinternal_api": "redis/overcloud.internalapi.redhat.local" "managed_service_haproxystorage_mgmt": "haproxy/overcloud.storagemgmt.redhat.local"} "public_keys": {"default": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9yrgjkk2hsgmSU2/1YjJLDFrmEs3fOnDCOCd1Qc0ETq8fQyPAiZvbC8xiYWMUTdG8741irib28+ujP1LoUmvJo5j45+JbikNQiEcgTEVCqJy1eheAKkL8ES8Xq/HQ7pTmxYwoxOQHoqOFEPOwY3ICJ6GQObdD0/7n530eNx5SwfzjfJ8Zs9bxONRXv/b38TgcIkCWpfxucFzEo8ZQFhSbtO253Vd/s7gHTKGpE9xMdx304F+A0yOGTGQIVKtH7D1FPtJj6OMmnLuaqpeA6G6ODnQdNSAmweDZNGpxvQ3/BKuGcU+s9MXJkRIHLVgxyOrK7WjENPBKLzMvEhexSSSz"} "project_id": "61d63d0900df4ceaaa9ca08353af64a8" "name": "controller-1"}
David,

Thanks for the data. From what I see in https://bugzilla.redhat.com/show_bug.cgi?id=1895758#c5, there is metadata indicating that the ceph_rgw service should be set up:

~~~
properties: compact_service_HTTP='["ctlplane", "storage", "storagemgmt", "internalapi", "external"]', compact_service_ceph_rgw='["storage"]', compact_service_haproxy='["ctlplane", ...
~~~

With this data, the principal (krbprincipalname=ceph_rgw/controller-1.storage.redhat.local,cn=services,cn=accounts,dc=redhat,dc=local) should have been added by novajoin, but was not -- probably because of the code issue I just mentioned. I think at this point we have enough information to conclude this is likely a bug. Would you be able to test a small fix to THT to try out the theory I suggested above? Also, what is the data in https://bugzilla.redhat.com/show_bug.cgi?id=1895758#c8 ?
Actually, I'm not sure we could do a simple THT fix...
Sorry, I should have mentioned in the comment that this is the meta_data on controller-1. It was located on the first partition of the disk, so I believe this is the metadata from when the server was deployed on OSP 13. How does novajoin add these services -- is it from the service on the director, or part of the IPA enrollment on the overcloud node?
@dsedgman, see my comment in https://bugzilla.redhat.com/show_bug.cgi?id=1895758#c2 above. Novajoin adds these through the service on the director (https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ipa/ipaclient-baremetal-ansible.yaml). The service invokes the script /root/setup-ipa-client.sh on the node, which retrieves the metadata, which in turn triggers novajoin to add the services. Unfortunately, right now, because of the logic I pointed out above, we do not run this script on upgrade.

So: service in director -> host_prep_tasks -> run setup-ipa-client.sh script -> retrieve metadata -> invoke novajoin metadata service -> add services
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0817