Description of problem:
----------------------
RHEL 7.3 with OSP 10. Director deployed using external ceph cluster. Cinder, nova, glance using RHCS 2.0 backend. Snapshot of running instance fails.

Component Version-Release:
-------------------------
root@overcloud-controller-0:~ # cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)

root@overcloud-controller-0:~ # uname -r
3.10.0-510.el7.x86_64

root@overcloud-controller-0:~ # yum list installed | egrep -i 'openstack|ceph|vswitch|neutron'
ceph-base.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
ceph-common.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
ceph-mon.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
ceph-osd.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-osd-signed
ceph-radosgw.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-tools-signed
ceph-selinux.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
fcgi.x86_64  2.4.0-25.el7cp  @rhos-10.0-ceph-2.0-mon-signed
gperftools-libs.x86_64  2.4-8.el7  @rhos-10.0-ceph-2.0-mon-signed
leveldb.x86_64  1.12.0-5.el7cp  @rhos-10.0-ceph-2.0-mon-signed
libbabeltrace.x86_64  1.2.4-3.el7cp  @rhos-10.0-ceph-2.0-mon-signed
libcephfs1.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
librados2.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
librbd1.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
librgw2.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
lttng-ust.x86_64  2.4.1-1.el7cp  @rhos-10.0-ceph-2.0-mon-signed
openstack-aodh-api.noarch  3.0.0-0.20160921151816.bb5103e.el7ost
openstack-aodh-common.noarch  3.0.0-0.20160921151816.bb5103e.el7ost
openstack-aodh-evaluator.noarch  3.0.0-0.20160921151816.bb5103e.el7ost
openstack-aodh-listener.noarch  3.0.0-0.20160921151816.bb5103e.el7ost
openstack-aodh-notifier.noarch  3.0.0-0.20160921151816.bb5103e.el7ost
openstack-ceilometer-api.noarch  1:7.0.0-0.20160928024313.67bbd3f.el7ost
openstack-ceilometer-central.noarch  1:7.0.0-0.20160928024313.67bbd3f.el7ost
openstack-ceilometer-collector.noarch
openstack-ceilometer-common.noarch  1:7.0.0-0.20160928024313.67bbd3f.el7ost
openstack-ceilometer-compute.noarch  1:7.0.0-0.20160928024313.67bbd3f.el7ost
openstack-ceilometer-notification.noarch
openstack-ceilometer-polling.noarch  1:7.0.0-0.20160928024313.67bbd3f.el7ost
openstack-cinder.noarch  1:9.0.0-0.20160928223334.ab95181.el7ost
openstack-dashboard.noarch  1:10.0.0-0.20161002185148.3252153.1.el7ost
openstack-dashboard-theme.noarch  1:10.0.0-0.20161002185148.3252153.1.el7ost
openstack-glance.noarch  1:13.0.0-0.20160928121721.4404ae6.el7ost
openstack-gnocchi-api.noarch  3.0.1-0.20160923180636.c6b2c51.el7ost
openstack-gnocchi-carbonara.noarch  3.0.1-0.20160923180636.c6b2c51.el7ost
openstack-gnocchi-common.noarch  3.0.1-0.20160923180636.c6b2c51.el7ost
openstack-gnocchi-indexer-sqlalchemy.noarch
openstack-gnocchi-metricd.noarch  3.0.1-0.20160923180636.c6b2c51.el7ost
openstack-gnocchi-statsd.noarch  3.0.1-0.20160923180636.c6b2c51.el7ost
openstack-heat-api.noarch  1:7.0.0-0.20160926200847.dd707bc.el7ost
openstack-heat-api-cfn.noarch  1:7.0.0-0.20160926200847.dd707bc.el7ost
openstack-heat-api-cloudwatch.noarch  1:7.0.0-0.20160926200847.dd707bc.el7ost
openstack-heat-common.noarch  1:7.0.0-0.20160926200847.dd707bc.el7ost
openstack-heat-engine.noarch  1:7.0.0-0.20160926200847.dd707bc.el7ost
openstack-ironic-api.noarch  1:6.2.1-0.20160930163405.3f54fec.el7ost
openstack-ironic-common.noarch  1:6.2.1-0.20160930163405.3f54fec.el7ost
openstack-ironic-conductor.noarch  1:6.2.1-0.20160930163405.3f54fec.el7ost
openstack-keystone.noarch  1:10.0.0-0.20160928144040.6520523.el7ost
openstack-manila.noarch  1:3.0.0-0.20160916162617.8f2fa31.el7ost
openstack-manila-share.noarch  1:3.0.0-0.20160916162617.8f2fa31.el7ost
openstack-manila-ui.noarch  2.5.1-0.20160929180323.81c354a.el7ost
openstack-mistral-api.noarch  3.0.0-0.20160929083341.c0a4501.el7ost
openstack-mistral-common.noarch  3.0.0-0.20160929083341.c0a4501.el7ost
openstack-mistral-engine.noarch  3.0.0-0.20160929083341.c0a4501.el7ost
openstack-mistral-executor.noarch  3.0.0-0.20160929083341.c0a4501.el7ost
openstack-neutron.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
openstack-neutron-bigswitch-agent.noarch
openstack-neutron-bigswitch-lldp.noarch
openstack-neutron-common.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
openstack-neutron-lbaas.noarch  1:9.0.0-0.20160921180958.6528738.el7ost
openstack-neutron-metering-agent.noarch
openstack-neutron-ml2.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
openstack-neutron-openvswitch.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
openstack-neutron-sriov-nic-agent.noarch
openstack-nova-api.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-cert.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-common.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-compute.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-conductor.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-console.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-novncproxy.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-nova-scheduler.noarch  1:14.0.0-0.20160929203854.59653c6.el7ost
openstack-puppet-modules.noarch  1:9.0.0-0.20160915155755.8c758d6.el7ost
openstack-sahara.noarch  1:5.0.0-0.20160926213141.cbd51fa.el7ost
openstack-sahara-api.noarch  1:5.0.0-0.20160926213141.cbd51fa.el7ost
openstack-sahara-common.noarch  1:5.0.0-0.20160926213141.cbd51fa.el7ost
openstack-sahara-engine.noarch  1:5.0.0-0.20160926213141.cbd51fa.el7ost
openstack-sahara-ui.noarch  5.0.0-0.20160927074353.fdd3a75.el7ost
openstack-selinux.noarch  0.7.9-1.el7ost  @rhos-10.0-puddle
openstack-swift-account.noarch  2.10.1-0.20160929005314.3349016.el7ost
openstack-swift-container.noarch  2.10.1-0.20160929005314.3349016.el7ost
openstack-swift-object.noarch  2.10.1-0.20160929005314.3349016.el7ost
openstack-swift-plugin-swift3.noarch  1.11.1-0.20160929001717.e7a2b88.el7ost
openstack-swift-proxy.noarch  2.10.1-0.20160929005314.3349016.el7ost
openstack-utils.noarch  2016.1-1.el7ost  @rhelosp-10.0-devtools-puddle
openstack-zaqar.noarch  1:3.0.0-0.20160921221617.3ef0881.el7ost
openvswitch.x86_64  1:2.5.0-5.git20160628.el7fdb
puppet-ceph.noarch  2.2.0-1.el7ost  @rhos-10.0-puddle
puppet-neutron.noarch  9.4.0-1.el7ost  @rhos-10.0-puddle
puppet-openstack_extras.noarch  9.4.0-1.el7ost  @rhos-10.0-puddle
puppet-openstacklib.noarch  9.4.0-0.20160929212001.0e58c86.el7ost
puppet-vswitch.noarch  5.4.0-1.el7ost  @rhos-10.0-puddle
python-cephfs.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
python-crypto.x86_64  2.6.1-1.1.el7  @rhos-10.0-ceph-2.0-mon-signed
python-django-openstack-auth.noarch  2.4.1-0.20160927170809.55ebf6b.el7ost
python-flask.noarch  1:0.10.1-5.el7  @rhos-10.0-ceph-2.0-mon-signed
python-jinja2.noarch  2.7.2-2.el7cp  @rhos-10.0-ceph-2.0-mon-signed
python-netifaces.x86_64  0.10.4-3.el7ost  @rhos-10.0-ceph-2.0-tools-signed
python-neutron.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
python-neutron-lbaas.noarch  1:9.0.0-0.20160921180958.6528738.el7ost
python-neutron-lib.noarch  0.4.0-0.20160915221324.705fd90.el7ost
python-neutron-tests.noarch  1:9.0.0-0.20160929051647.71f2d2b.el7ost
python-neutronclient.noarch  6.0.0-0.20160916123315.f53d624.el7ost
python-openstack-mistral.noarch  3.0.0-0.20160929083341.c0a4501.el7ost
python-openstackclient.noarch  3.2.0-0.20160914003636.8241f08.el7ost
python-openstacksdk.noarch  0.9.5-0.20160912180601.d7ee3ad.el7ost
python-openvswitch.noarch  1:2.5.0-5.git20160628.el7fdb
python-rados.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
python-rbd.x86_64  1:10.2.2-41.el7cp  @rhos-10.0-ceph-2.0-mon-signed
userspace-rcu.x86_64  0.7.9-2.el7rhgs  @rhos-10.0-ceph-2.0-mon-signed

How reproducible:
----------------
Consistent

Steps to Reproduce:
------------------
1. Observe running and accessible guest
2. Attempt to snapshot running guest
3. Observe guest state show 'image_uploading' and 'image-list' lists target image
4. Observe completion error, guest state return to Running, and 'image-list' no longer listing target image (see Additional Info)

Actual results:
--------------
Attempt throws error, no new image is created.

Expected results:
----------------
Attempt creates new image recognized and listed by glance.

Additional info:
---------------
Similar to OSP5 bz 1239260 but the currently documented steps for deploying the overcloud using an external ceph cluster instruct the user to apply rwx permissions for user client.openstack and the glance-api.conf file specifies rbd_store_user=openstack. Even though client.openstack does have the correct permissions, I still can not snapshot to ceph storage.

(root@c04-h33-6018r) - (20:29) - (~)
-=>>ceph auth get client.openstack
exported keyring for client.openstack
[client.openstack]
        key = AQCI2wRY5K13KRAAecXoDV75tdjrLtXH40cBag==
        caps mon = "allow r"
        caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images"

While glance-api.conf specifies rbd_store_user=openstack, I noticed that client.glance did not have permissions to the vms pool so for the sake of knowing I added it using ceph auth caps.

(root@c04-h33-6018r) - (20:27) - (~)
-=>>ceph auth get client.glance
exported keyring for client.glance
[client.glance]
        key = AQABmwdYH487NRAAqIi+t4uEni4uLz1AVLxpGg==
        caps mon = "allow r"
        caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=images, allow rwx pool=vms"

I've restarted all openstack services but still can not snapshot.

# nova image-create --show --poll a0260de6-6c5a-478d-8d65-92021b5611df rhel7.2_snap1
Server snapshotting... 0%

# glance image-list ; nova list
+--------------------------------------+---------------+
| ID                                   | Name          |
+--------------------------------------+---------------+
| d7e02e24-e71d-4b06-88c6-4239c62a19c3 | rhel7.2_snap1 |
| 106a15e2-7105-4bc2-9eba-caecb771be0a | rhel_7.2      |
+--------------------------------------+---------------+
+--------------------------------------+-------+--------+-----------------+-------------+------------------+
| ID                                   | Name  | Status | Task State      | Power State | Networks         |
+--------------------------------------+-------+--------+-----------------+-------------+------------------+
| a0260de6-6c5a-478d-8d65-92021b5611df | test1 | ACTIVE | image_uploading | Running     | private=10.0.0.6 |
+--------------------------------------+-------+--------+-----------------+-------------+------------------+

Server snapshotting... 0% complete
ERROR (NotFound): Not found (HTTP 404)

# glance image-list ; nova list
+--------------------------------------+----------+
| ID                                   | Name     |
+--------------------------------------+----------+
| 106a15e2-7105-4bc2-9eba-caecb771be0a | rhel_7.2 |
+--------------------------------------+----------+
+--------------------------------------+-------+--------+------------+-------------+------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks         |
+--------------------------------------+-------+--------+------------+-------------+------------------+
| a0260de6-6c5a-478d-8d65-92021b5611df | test1 | ACTIVE | -          | Running     | private=10.0.0.6 |
+--------------------------------------+-------+--------+------------+-------------+------------------+

This is blocking time critical testing of RHCS 2.0 in RHOSP 10 in the scale lab.
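The pool-permission comparison made in the description can be checked mechanically. The following is a minimal sketch, not a Ceph tool: the helper name pools_with_access is hypothetical, the parser only understands the simple "allow <perms> pool=<name>" clauses that appear in the two keyrings quoted above, and the assumption that snapshotting needs rwx on both the images and vms pools comes from the reporter's reasoning rather than from Ceph or Glance documentation.

```python
import re

# Only handles clauses of the form "allow rwx pool=<name>"; anything else
# (for example "allow class-read object_prefix rbd_children") is ignored.
CLAUSE_RE = re.compile(r"\s*allow\s+([rwx*]+)\s+pool=(\S+)\s*")


def pools_with_access(osd_caps, needed="rwx"):
    """Return the set of pools on which osd_caps grants every flag in needed."""
    pools = set()
    for clause in osd_caps.split(","):
        match = CLAUSE_RE.fullmatch(clause)
        if not match:
            continue
        perms, pool = match.groups()
        if perms == "*" or set(needed) <= set(perms):
            pools.add(pool)
    return pools


# "caps osd" strings copied from the keyrings in the description.
OPENSTACK_CAPS = (
    "allow class-read object_prefix rbd_children, allow rwx pool=volumes, "
    "allow rwx pool=vms, allow rwx pool=images"
)
GLANCE_CAPS = (
    "allow class-read object_prefix rbd_children, allow rwx pool=images, "
    "allow rwx pool=vms"
)

# Assumption: an RBD-backed instance snapshot touches these two pools.
REQUIRED_POOLS = {"images", "vms"}

if __name__ == "__main__":
    for name, caps in (("client.openstack", OPENSTACK_CAPS),
                       ("client.glance", GLANCE_CAPS)):
        missing = REQUIRED_POOLS - pools_with_access(caps)
        print(name, "missing pools:", sorted(missing))
```

Under these assumptions neither client is missing a pool, which is consistent with the report that permissions were not the cause of the failure.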
Created attachment 1213491 nova.conf
Created attachment 1213492 cinder.conf
Created attachment 1213493 glance-api.conf
The Glance API conf has "#show_multiple_locations = false", so the option is left at its default of false. We should be hitting this check: https://github.com/openstack/glance/blob/master/glance/api/v2/images.py#L290-L294

I will try to find the Nova code that catches it, or documentation explaining it. This default should already have been fixed upstream on stable/newton, commit: 75adc9f031a43effddf0e16bd74f0c9ead9dbf3a
Sorry for not clarifying sooner, this is a duplicate of a bug caught recently in automation, but we will want to be certain the fix is working. The bug and fix are actually in openstack-tripleo-heat-templates-5.0.0-0.5.0rc3.el7ost.

I'm going to mark it as a duplicate and the fix should be available in recent puddles.

*** This bug has been marked as a duplicate of bug 1382737 ***
(In reply to Paul Grist from comment #6)
> Sorry for not clarifying sooner, this is a duplicate of a bug caught
> recently in automation, but we will want to be certain the fix is working.
> The bug and fix are actually in
> openstack-tripleo-heat-templates-5.0.0-0.5.0rc3.el7ost.
>
> I'm going to mark it as a duplicate and the fix should be available in
> recent puddles.
>
> *** This bug has been marked as a duplicate of bug 1382737 ***

Bug 1382737 is listed as a documentation issue but our issue is not resolved by the permissions fix proposed. The openstack-tripleo-heat-templates RPM exists only on my director node so I assume it requires a redeploy of the overcloud to acquire the fix. Is that correct? Is there any way to diff the package and apply the resolution to my existing overcloud? A redeploy is not preferred given the current scale of our test env.

Thanks.
(In reply to Tim Wilkinson, #c7)

The duplicate bug is a fix to the TripleO Heat template YAML files so that they deploy the correct settings. Was that comment intended for this bug? I may be missing the connection.

I'm not well versed in the logic behind the settings, but it definitely looks like the same issue, so I was expecting the same fix to address the root cause of your issue. We can follow up with the people verifying Bug 1382737 if that doesn't appear to be the case.
The issue I have is the inability to create a nova image via snapshot of an existing instance (nova image-create) in my OSP10 overcloud. Is that the same problem as bug 1382737? That appears to be about the snapshot functionality in the undercloud.
I brought this one up again this morning to confirm the fix should address this issue as well. Folks agreed that this is the same problem. Have you been able to try this again on one of the more recent OSP10 puddles? I have added some more folks to the cc list to get more comments if needed too.
I've updated the openstack-tripleo-heat-templates in my undercloud (osp10d) to the latest version but I'm confused as to how that will resolve the problem on the overcloud-controller-0 where the snapshots fail. The scale of the test env prevents me from redeploying. Plz excuse my ignorance, I'm obviously missing something.
Can someone give us a hint WHICH heat templates would be relevant here? There are approx. 480 files in this RPM. Or give us a pointer to the diff of the git commit specified in comment 4? Even if I do a brute-force diff of the two RPM versions, which I can do, I won't know which changes were relevant. If there is some really small change to be done to a config file on the compute hosts, then we can just copy that to them and be done with this.

The only clue I see in comment #4 is this:

#show_multiple_locations = false in Glance API conf.

Is that the only change that we need to make to the Glance conf file?
So the code indicates that it throws an exception if show_multiple_locations is set to false, which it is by default apparently. It seems harmless to set it to true, is it? If I do this, it won't trigger this exception, right? But will it have any other side effects? I saw https://github.com/openstack/glance-specs/commit/484999de7601117e42c6c0b3ea0db86dce261a72 Unless you object, this is what we'll try, ok?
(In reply to Tim Wilkinson from comment #11)
> I've updated the openstack-tripleo-heat-templates in my undercloud (osp10d)
> to the latest version but I'm confused as to how that will resolve the
> problem on the overcloud-controller-0 where the snapshots fail. The scale of
> the test env prevents me from redeploying. Plz excuse my ignorance, I'm
> obviously missing something.

Tim,

The THT change will indeed need a redeployment to become effective. If that's not an option, you will need to change glance-api.conf on each node manually and restart the service. AFAIK this is no longer affecting OSP10.
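The manual change described above can be sketched in Python. This is only a sketch under stated assumptions: it assumes the fix is to set show_multiple_locations to True (the value proposed earlier in this thread) and that the option belongs to the [DEFAULT] section. The helper name enable_multiple_locations and the sample file contents are hypothetical; the attached glance-api.conf is much longer. The text is edited line by line rather than through configparser so that comments in the file survive.

```python
import re

# Matches the option whether it is commented out (as in
# "#show_multiple_locations = false") or already set to some value.
OPTION_RE = re.compile(
    r"^[ \t]*#?[ \t]*show_multiple_locations[ \t]*=.*$", re.MULTILINE
)
DEFAULT_RE = re.compile(r"^\[DEFAULT\][ \t]*$", re.MULTILINE)
NEW_LINE = "show_multiple_locations = True"


def enable_multiple_locations(conf_text):
    """Return conf_text with show_multiple_locations set to True.

    Hypothetical helper; assumes the option lives in [DEFAULT].
    """
    if OPTION_RE.search(conf_text):
        # Rewrite the first matching line, dropping any leading '#'.
        return OPTION_RE.sub(NEW_LINE, conf_text, count=1)
    if DEFAULT_RE.search(conf_text):
        # Option missing entirely: add it right under the section header.
        return DEFAULT_RE.sub("[DEFAULT]\n" + NEW_LINE, conf_text, count=1)
    raise ValueError("no [DEFAULT] section found")


# Made-up excerpt for illustration, not the attached glance-api.conf.
SAMPLE = """[DEFAULT]
#show_multiple_locations = false

[glance_store]
rbd_store_user = openstack
"""

if __name__ == "__main__":
    print(enable_multiple_locations(SAMPLE))
```

Editing the file is only half of the workaround: as noted above, the change has to be made on each node and the Glance API service restarted before it takes effect.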
(In reply to Erno Kuvaja from comment #14)
>
> The THT change indeed will need redeployment to become effective. If that's
> not an option you will need to change the glance-api.conf on each node
> manually and restart it. AFAIK this is not affecting OSP10 anymore.

Thanks, Erno. That's just what I needed and it did resolve the problem.

root@tim-controller-0:~ # nova image-create --show --poll 1b07c502-6cc7-41a3-a38a-c3fcd80f44d0 rhel72_snap1
Server snapshotting... 100% complete
Finished
+------------------+--------------------------------------------------------------------------------------------------------------------------+
| Property         | Value                                                                                                                    |
+------------------+--------------------------------------------------------------------------------------------------------------------------+
| base_image_ref   | 106a15e2-7105-4bc2-9eba-caecb771be0a                                                                                     |
| checksum         | -                                                                                                                        |
| container_format | bare                                                                                                                     |
| created_at       | 2016-11-16T14:57:39Z                                                                                                     |
| direct_url       | rbd://6942aa60-f4c4-4954-88d7-c67e0d0a0df5/images/a963041e-739f-4266-94f1-e4dd6052bd60/snap                              |
| disk_format      | raw                                                                                                                      |
| file             | /v2/images/a963041e-739f-4266-94f1-e4dd6052bd60/file                                                                     |
| id               | a963041e-739f-4266-94f1-e4dd6052bd60                                                                                     |
| image_location   | snapshot                                                                                                                 |
| image_state      | available                                                                                                                |
| image_type       | snapshot                                                                                                                 |
| instance_uuid    | 1b07c502-6cc7-41a3-a38a-c3fcd80f44d0                                                                                     |
| locations        | [{"url": "rbd://6942aa60-f4c4-4954-88d7-c67e0d0a0df5/images/a963041e-739f-4266-94f1-e4dd6052bd60/snap", "metadata": {}}] |
| min_disk         | 20                                                                                                                       |
| min_ram          | 0                                                                                                                        |
| name             | rhel72_snap1                                                                                                             |
| owner            | b06fce501cb7494b8cdffd6d922c49ff                                                                                         |
| owner_id         | b06fce501cb7494b8cdffd6d922c49ff                                                                                         |
| protected        | False                                                                                                                    |
| schema           | /v2/schemas/image                                                                                                        |
| self             | /v2/images/a963041e-739f-4266-94f1-e4dd6052bd60                                                                          |
| size             | 21474836480                                                                                                              |
| status           | active                                                                                                                   |
| tags             | []                                                                                                                       |
| updated_at       | 2016-11-16T14:59:05Z                                                                                                     |
| user_id          | 28026fd404c44f7387aa0eecdfe34fa7                                                                                         |
| virtual_size     | -                                                                                                                        |
| visibility       | private                                                                                                                  |
+------------------+--------------------------------------------------------------------------------------------------------------------------+