Description of problem: We are unable upload images: var/log/glance/api.log: 2018-04-19 19:31:32.865 139939 DEBUG glance_store.location [req-dfb1bf42-2abe-45b3-9c1f-e4279668119c e52a82a36b8243778df0257afda656bf acd27d64a1a541f1be0fb8619ef2ce52 - default default] Registering scheme rbd with {'location_class': <class 'glance_store._drivers.rbd.StoreLocation'>, 'store': <glance_store._drivers.rbd.Store object at 0x7f5cd587e250>, 'store_entry': 'glance.store.rbd.Store'} register_scheme_map /usr/lib/python2.7/site-packages/glance_store/location.py:88 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd [req-dfb1bf42-2abe-45b3-9c1f-e4279668119c e52a82a36b8243778df0257afda656bf acd27d64a1a541f1be0fb8619ef2ce52 - default default] Error connecting to ceph cluster. 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd Traceback (most recent call last): 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd File "/usr/lib/python2.7/site-packages/glance_store/_drivers/rbd.py", line 273, in get_connection 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd client.connect(timeout=self.connect_timeout) 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd File "rados.pyx", line 785, in rados.Rados.connect (rados.c:8969) 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd ObjectNotFound: error connecting to the cluster 2018-04-19 19:31:32.899 139939 ERROR glance_store._drivers.rbd 2018-04-19 19:31:32.901 139939 DEBUG oslo_messaging._drivers.amqpdriver [req-dfb1bf42-2abe-45b3-9c1f-e4279668119c e52a82a36b8243778df0257afda656bf acd27d64a1a541f1be0fb8619ef2ce52 - default default] CAST unique_id: fbddddb9c31e417aadbc8b531be102e2 NOTIFY exchange 'glance' topic 'notifications.error' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:552 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data [req-dfb1bf42-2abe-45b3-9c1f-e4279668119c e52a82a36b8243778df0257afda656bf acd27d64a1a541f1be0fb8619ef2ce52 - default default] Failed to upload image data due to internal error 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data Traceback (most recent call last): 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/api/v2/image_data.py", line 115, in upload 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data image.set_data(data, size) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/domain/proxy.py", line 195, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data self.base.set_data(data, size) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/notifier.py", line 479, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data _send_notification(notify_error, 'image.upload', msg) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data self.force_reraise() 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data six.reraise(self.type_, self.value, self.tb) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/notifier.py", line 426, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data self.repo.set_data(data, size) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/api/policy.py", line 187, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data return self.image.set_data(*args, **kwargs) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/quota/__init__.py", line 304, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data self.image.set_data(data, size=size) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance/location.py", line 440, in set_data 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data verifier=verifier) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance_store/backend.py", line 452, in add_to_backend 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data verifier) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance_store/backend.py", line 425, in store_add_to_backend 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data verifier=verifier) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance_store/capabilities.py", line 225, in op_checker 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data return store_op_fun(store, *args, **kwargs) 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance_store/_drivers/rbd.py", line 447, in add 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data rados_id=self.user) as conn: 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__ 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data return self.gen.next() 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data File "/usr/lib/python2.7/site-packages/glance_store/_drivers/rbd.py", line 277, in get_connection 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data raise exceptions.BackendException() 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data BackendException 2018-04-19 19:31:32.906 139939 ERROR glance.api.v2.image_data However the ceph healt looks ok. Version-Release number of selected component (if applicable): ceph-base.x86_64 2:10.2.10-17.el7cp puppet-ceph.noarch 2.4.1-2.el7ost python-rados.x86_64 2:10.2.10-17.el7cp puppet-tripleo.noarch 5.6.4-3.el7ost python-glance.noarch 1:13.0.0-2.el7ost How reproducible: always
*** Bug 1570026 has been marked as a duplicate of this bug. ***
The openstack.keyring created by puppet-ceph for use by glance doesn't seem to have adequate privileges. [root@controller-0 ~]# rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring -p images ls [root@controller-0 ~]# rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.openstack.keyring -p images ls rbd: couldn't connect to the cluster! 2018-04-20 18:56:59.486026 7f27c4687d80 0 librados: client.admin authentication error (22) Invalid argument [root@controller-0 ~]#
ceph tell mon.0 injectargs --debug-auth 20/20 rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.openstack.keyring -p images ls ==> ceph-mon.controller-0.log <== 2018-04-20 19:56:58.987116 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-20 19:56:59.910703 7f296d2b8700 10 In get_auth_session_handler for protocol 0 2018-04-20 19:56:59.911271 7f29700c9700 10 cephx server client.admin: start_session server_challenge 63e5a7ace587694d 2018-04-20 19:56:59.911568 7f29700c9700 10 cephx server client.admin: handle_request get_auth_session_key for client.admin 2018-04-20 19:56:59.911613 7f29700c9700 0 mon.controller-0@0(leader).auth v13 caught error when trying to handle auth request, probably malformed request 2018-04-20 19:57:03.987244 7f29708ca700 10 cephx keyserver: _check_rotating_secrets ----------------- rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring -p images ls ==> ceph-mon.controller-0.log <== 2018-04-20 20:00:23.755990 7f296d2b8700 10 In get_auth_session_handler for protocol 0 2018-04-20 20:00:23.756829 7f29700c9700 10 cephx server client.admin: start_session server_challenge d7ebe7ac2388f3e0 2018-04-20 20:00:23.758262 7f29700c9700 10 cephx server client.admin: handle_request get_auth_session_key for client.admin 2018-04-20 20:00:23.758321 7f29700c9700 20 cephx server client.admin: checking key: req.key=3cf507a4ff9086f6 expected_key=3cf507a4ff9086f6 2018-04-20 20:00:23.758360 7f29700c9700 10 cephx: build_service_ticket_reply encoding 1 tickets with secret AQBtydlaAAAAABAA0ddLdyQL84ZrfOcOqSfUCA== 2018-04-20 20:00:23.758375 7f29700c9700 10 cephx: build_service_ticket service auth secret_id 2 ticket_info.ticket.name=client.admin 2018-04-20 20:00:23.758386 7f29700c9700 10 cephx keyserverdata: get_caps: name=client.admin 2018-04-20 20:00:23.758387 7f29700c9700 10 cephx keyserverdata: get_secret: num of caps=3 2018-04-20 20:00:23.758801 7f29700c9700 10 cephx server client.admin: handle_request get_principal_session_key 2018-04-20 20:00:23.758803 7f29700c9700 10 cephx: verify_authorizer decrypted service auth secret_id=2 2018-04-20 20:00:23.758827 7f29700c9700 10 cephx: verify_authorizer global_id=4225 2018-04-20 20:00:23.758836 7f29700c9700 10 cephx: verify_authorizer ok nonce 6b8b4567327b23c6 reply_bl.length()=36 2018-04-20 20:00:23.758838 7f29700c9700 10 cephx server client.admin: ticket_req.keys = 5 2018-04-20 20:00:23.758840 7f29700c9700 10 cephx server client.admin: adding key for service mon 2018-04-20 20:00:23.758862 7f29700c9700 10 cephx server client.admin: adding key for service osd 2018-04-20 20:00:23.758877 7f29700c9700 10 cephx keyserverdata: get_caps: name=client.admin 2018-04-20 20:00:23.758878 7f29700c9700 10 cephx keyserverdata: get_secret: num of caps=3 2018-04-20 20:00:23.758882 7f29700c9700 10 cephx: build_service_ticket_reply encoding 2 tickets with secret AQDXRtpaw1YzLRAA0/+G4CM2Ng8qqtdQ08pHMA== 2018-04-20 20:00:23.758892 7f29700c9700 10 cephx: build_service_ticket service mon secret_id 9 ticket_info.ticket.name=client.admin 2018-04-20 20:00:23.758903 7f29700c9700 10 cephx: build_service_ticket service osd secret_id 9 ticket_info.ticket.name=client.admin 2018-04-20 20:00:23.993846 7f29708ca700 10 cephx keyserver: _check_rotating_secrets
I was wrong in comment #7. My request was malformed. Here is the same key that was created working: [root@controller-0 ceph]# rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.openstack.keyring -p images ls --id openstack foo [root@controller-0 ceph]# [root@controller-0 ceph]# rbd -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.openstack.keyring -p images ls -n client.openstack foo [root@controller-0 ceph]# ==> ceph-mon.controller-0.log <== 2018-04-24 12:48:26.406232 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:31.406369 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:36.406495 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:39.168739 7f29708ca700 0 mon.controller-0@0(leader).data_health(3) update_stats avail 76% total 40947 MB, used 9793 MB, avail 31153 MB 2018-04-24 12:48:41.406701 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:46.406859 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:51.407040 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:48:56.407211 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:49:01.407375 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:49:06.407520 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:49:11.407698 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:49:16.407820 7f29708ca700 10 cephx keyserver: _check_rotating_secrets 2018-04-24 12:49:16.971214 7f296d2b8700 10 In get_auth_session_handler for protocol 0 2018-04-24 12:49:16.971887 7f29700c9700 10 cephx server client.openstack: start_session server_challenge 9562e3f8b3dfdb21 2018-04-24 12:49:16.972241 7f29700c9700 10 cephx server client.openstack: handle_request get_auth_session_key for client.openstack 2018-04-24 12:49:16.972264 7f29700c9700 20 cephx server client.openstack: checking key: req.key=a2bb462b5222e920 expected_key=a2bb462b5222e920 2018-04-24 12:49:16.972325 7f29700c9700 10 cephx: build_service_ticket_reply encoding 1 tickets with secret AQBtydlaAAAAABAAQ6qzhegXg+GXw7WAnaFFDA== 2018-04-24 12:49:16.972346 7f29700c9700 10 cephx: build_service_ticket service auth secret_id 9 ticket_info.ticket.name=client.openstack 2018-04-24 12:49:16.972355 7f29700c9700 10 cephx keyserverdata: get_caps: name=client.openstack 2018-04-24 12:49:16.972356 7f29700c9700 10 cephx keyserverdata: get_secret: num of caps=2 2018-04-24 12:49:16.972919 7f29700c9700 10 cephx server client.openstack: handle_request get_principal_session_key 2018-04-24 12:49:16.972921 7f29700c9700 10 cephx: verify_authorizer decrypted service auth secret_id=9 2018-04-24 12:49:16.972945 7f29700c9700 10 cephx: verify_authorizer global_id=4234 2018-04-24 12:49:16.972953 7f29700c9700 10 cephx: verify_authorizer ok nonce 6b8b4567327b23c6 reply_bl.length()=36 2018-04-24 12:49:16.972954 7f29700c9700 10 cephx server client.openstack: ticket_req.keys = 5 2018-04-24 12:49:16.972956 7f29700c9700 10 cephx server client.openstack: adding key for service mon 2018-04-24 12:49:16.972976 7f29700c9700 10 cephx server client.openstack: adding key for service osd 2018-04-24 12:49:16.972989 7f29700c9700 10 cephx keyserverdata: get_caps: name=client.openstack 2018-04-24 12:49:16.972991 7f29700c9700 10 cephx keyserverdata: get_secret: num of caps=2 2018-04-24 12:49:16.972994 7f29700c9700 10 cephx: build_service_ticket_reply encoding 2 tickets with secret AQDMJ99aOcrzORAAW1HLRP2RYyKTQFUWK8t0yg== 2018-04-24 12:49:16.973004 7f29700c9700 10 cephx: build_service_ticket service mon secret_id 98 ticket_info.ticket.name=client.openstack 2018-04-24 12:49:16.973014 7f29700c9700 10 cephx: build_service_ticket service osd secret_id 98 ticket_info.ticket.name=client.openstack
Root cause: The fix for CVE-2017-12155 contained patches for puppet-tripleo and tripleo-heat-templates (THT). https://review.openstack.org/#/c/528599/ (puppet-tripleo) https://review.openstack.org/#/c/537874/ (puppet-tripleo) https://review.openstack.org/#/c/528595/ (tripleo-heat-templates) The problem is that only the THT patches are in this puddle [1] and both puppet-tripleo patches are missing [2]. So tripleo-heat-templates were setting the key to 640 [1] but puppet-tripleo was not adding the ACLs that glance, cinder, or nova will need to be able to read the key. This seems to be a packaging issue. [1] [stack@undercloud-0 ~]$ grep -B 2 640 /usr/share/openstack-tripleo-heat-templates/puppet/services/ceph-base.yaml client.CLIENT_USER: { secret: 'CLIENT_KEY', mode: '0640', [stack@undercloud-0 ~]$ [2] [root@controller-0 base]# grep -i acl nova/libvirt.pp [root@controller-0 base]# grep -i acl glance/api.pp [root@controller-0 base]# pwd /etc/puppet/modules/tripleo/manifests/profile/base [root@controller-0 base]# rpm -q puppet-tripleo puppet-tripleo-5.6.4-3.el7ost.noarch [root@controller-0 base]#
Marking this CLOSED NOTABUG because phase1 is passing now that we are using overcloud images that contain the right content.