Bug 1309475

Summary: Booting an instance from volume that's created using image fails on director install using ceph backend
Product: Red Hat OpenStack Reporter: Jeremy <jmelvin>
Component: openstack-glanceAssignee: Flavio Percoco <fpercoco>
Status: CLOSED DUPLICATE QA Contact: nlevinki <nlevinki>
Severity: high Docs Contact:
Priority: high    
Version: 7.0 (Kilo)CC: eglynn, fpercoco, lyarwood, mflusche, rkharwar, scorcora, sgotliv, yeylon
Target Milestone: ---Keywords: Unconfirmed, ZStream
Target Release: 7.0 (Kilo)Flags: jmelvin: needinfo-
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-17 23:30:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jeremy 2016-02-17 22:36:06 UTC
Description of problem:Booting an instance from volume that's created using image fails on director install using ceph backend. We expect cinder to convert the qcow2 glance image and create a volume for nova to use to boot the image. However We see: 2016-02-17 15:45:19.722 22249 INFO cinder.api.openstack.wsgi [req-9aec8056-7cd1-46bc-be92-a2022a399c27 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] HTTP exception thrown: Invalid image 
identifier or unable to access requested image.

It looks like cinder can not find the image or access it from glance. Note the glance images are stored on the ceph backend.




Version-Release number of selected component (if applicable):
openstack-glance-2015.1.2-1.el7ost.noarch 
openstack-cinder-2015.1.2-5.el7ost.noarch 
eph-mon-0.94.1-13.el7cp.x86_64

How reproducible:
100%

Steps to Reproduce:
1.Boot instance by selecting the instance boot source in Horizon : Boot from image(creates a new volume)
2. Note block device mapping instance  state

Actual results:
Instance failed block device setup

Expected results:
Should create a volume from the existing image and boot the instance from volume

Additional info:
Booting from volume created using image fails:

image -- 19884153-8d69-468d-96b1-6b770b81de3f | CentOS-7-x86_64-GenericCloud-1510.2016-01-27._sparse      | qcow2       | bare             | 1802305536 | active |

[root@devjcc1-controller-0 heat-admin]# nova list
+--------------------------------------+---------+--------+----------------------+-------------+---------------------+
| ID                                   | Name    | Status | Task State           | Power State | Networks            |
+--------------------------------------+---------+--------+----------------------+-------------+---------------------+
| bf5a83c8-4cdf-4366-ba45-146a466dfa9d | qwerty  | ERROR  | block_device_mapping | NOSTATE     |                     |
| 8556f399-3322-4366-b41f-c82dbe3f7cd3 | simple  | ACTIVE | -                    | Running     | JPLNet=137.78.18.29 |
| 2b5eef1d-d62b-49b3-82b8-29ee80ac7b74 | simple2 | ERROR  | block_device_mapping | NOSTATE     |                     |
| 97348e01-2267-43f5-aa94-6222d5e8f115 | simple3 | ERROR  | block_device_mapping | NOSTATE     |                     |
| 39b50949-6f4f-43e3-a523-5fcc592ffca3 | simple4 | ERROR  | block_device_mapping | NOSTATE     |                     |
| 026cecfd-09b3-4b8e-b8f6-36b0149f99e4 | yat     | ERROR  | block_device_mapping | NOSTATE     |                     |
+--------------------------------------+---------+--------+----------------------+-------------+---------------------+
compute-0 /cinder/api.log
2016-02-17 15:45:18.577 22249 DEBUG cinder.api.v2.volumes [req-9aec8056-7cd1-46bc-be92-a2022a399c27 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] Create volume request body: {u'volume': {u'status': u'creating', u'user_id': u'c14f271b6da84d04ac2c1ff71d3e3eb9', u'description': u'', u'imageRef': u'19884153-8d69-468d-96b1-6b770b81de3f', u'availability_zone': u'nova', 'scheduler_hints': {}, u'attach_status': u'detached', u'source_volid': None, u'name': u'', u'metadata': {}, u'consistencygroup_id': None, u'volume_type': None, u'snapshot_id': None, u'project_id': u'601ef230893a4e48968929a5f8756099', u'source_replica': None, u'size': 40}} create /usr/lib/python2.7/site-packages/cinder/api/v2/volumes.py:307
2016-02-17 15:45:18.578 22249 INFO cinder.api.v2.volumes [req-9aec8056-7cd1-46bc-be92-a2022a399c27 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] Create volume of 40 GB

2016-02-17 15:45:19.722 22249 INFO cinder.api.openstack.wsgi [req-9aec8056-7cd1-46bc-be92-a2022a399c27 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] HTTP exception thrown: Invalid image identifier or unable to access requested image.

compute-1/nova/compute.log

-2267-43f5-aa94-6222d5e8f115] Instance failed block device setup
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] Traceback (most recent call last):
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1987, in _prep_block_device
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     do_check_attach=do_check_attach) +
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 471, in attach_block_devices
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     map(_log_and_attach, block_device_mapping)
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 469, in _log_and_attach
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     bdm.attach(*attach_args, **attach_kwargs)
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 380, in attach
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     availability_zone=av_zone)
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 408, in create
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     raise exception.InvalidInput(reason=e)
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] InvalidInput: Invalid input received: Invalid image identifier or unable to access requested image. (HTTP 400) (Request-ID: req-9aec8056-7cd1-46bc-be92-a2022a399c27)
2016-02-17 15:45:19.725 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]
2016-02-17 15:45:19.726 3632 ERROR nova.compute.manager [req-916897af-63ec-430d-a4e1-0ea3f73c3bca c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] Failure prepping block device
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] Traceback (most recent call last):
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2461, in _build_resources
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     block_device_mapping)
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2015, in _prep_block_device
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]     raise exception.InvalidBDM()
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] InvalidBDM: Block Device Mapping is Invalid.
2016-02-17 15:45:19.726 3632 TRACE nova.compute.manager [instance: 97348e01-2267-43f5-aa94-6222d5e8f115]
2016-02-17 15:45:19.726 3632 DEBUG nova.compute.claims [req-916897af-63ec-430d-a4e1-0ea3f73c3bca c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] [instance: 97348e01-2267-43f5-aa94-6222d5e8f115] Aborting claim: [Claim: 4096 MB memory, 40 GB disk] abort /usr/lib/python2.7/site-packages/nova/compute/claims.py:130

****
ENabled debug logging in glance, this is only relevant looking information:
e48968929a5f8756099 - - -] Registry request GET /images/19884153-8d69-468d-96b1-6b770b81de3f HTTP 200 request id req-fd50f500-976b-4dc2-a15d-6a9c19af19ac do_request /usr/lib/python2.7/site-packages/glance/registry/client/v1/client.py:124
2016-02-17 21:48:37.833 32070 INFO eventlet.wsgi.server [req-4339fa89-4f8a-4648-b9c8-5dacfadcadcc c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] 172.17.1.12 - - [17/Feb/2016 21:48:37] "HEAD /v1/images/19884153-8d69-468d-96b1-6b770b81de3f HTTP/1.1" 200 864 0.522762
2016-02-17 21:48:40.351 32071 DEBUG glance.api.middleware.version_negotiation [req-f8e2d649-53d7-4d03-8b12-020f7103bf60 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] Determining version of request: GET /v2/images/19884153-8d69-468d-96b1-6b770b81de3f Accept: */* process_request /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:47
2016-02-17 21:48:40.351 32071 DEBUG glance.api.middleware.version_negotiation [req-f8e2d649-53d7-4d03-8b12-020f7103bf60 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] Using url versioning process_request /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:60
2016-02-17 21:48:40.352 32071 DEBUG glance.api.middleware.version_negotiation [req-f8e2d649-53d7-4d03-8b12-020f7103bf60 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] Matched version: v2 process_request /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:72
2016-02-17 21:48:40.352 32071 DEBUG glance.api.middleware.version_negotiation [req-f8e2d649-53d7-4d03-8b12-020f7103bf60 c14f271b6da84d04ac2c1ff71d3e3eb9 601ef230893a4e48968929a5f8756099 - - -] new path /v2/images/19884153-8d69-468d-96b1-6b770b81de3f process_request /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:73


Noticed difference in ceph config for rbd pool on controller vs compute. May be related to why we can't seem to access the glance images?

controller
[stack@dev-ospd ~]$ ssh heat-admin.187.126 'sudo grep -r "rbd" /etc/glance/glance-api.conf'
#      glance.store.rbd.Store,
stores=glance.store.http.Store,glance.store.rbd.Store
default_store=rbd
#rbd_store_ceph_conf=/etc/ceph/ceph.conf
rbd_store_ceph_conf=/etc/ceph/ceph.conf
# in rbd_store_ceph_conf
#rbd_store_user=<None>
rbd_store_user=openstack
#rbd_store_pool=images
rbd_store_pool=images
#rbd_store_chunk_size=8
rbd_store_chunk_size=8

compute
[stack@dev-ospd ~]$ ssh heat-admin.187.129 'sudo grep -r "rbd" /etc/nova/nova.conf'
# rbd, default. If default is specified, then use_cow_images
images_type=rbd
# The RADOS pool in which rbd volumes are stored (string
#images_rbd_pool=rbd
images_rbd_pool=vms
#images_rbd_ceph_conf=
images_rbd_ceph_conf=/etc/ceph/ceph.conf
# The RADOS client name for accessing rbd volumes (string
#rbd_user=<None>
rbd_user=openstack
# The libvirt UUID of the secret for the rbd_uservolumes
#rbd_secret_uuid=<None>
rbd_secret_uuid=1ab1700e-d503-11e5-b1c5-0050569455eb

Comment 5 Jeremy 2016-02-18 14:19:26 UTC
Hey Sergey,

According to the sosreport the customer is using  openstack-cinder-2015.1.2-5.el7ost.noarch .  I think the problem here is that cinder 2015.1.2-6 made it into rhosp 7.2 repos, however it was not added into the director 7.2 deploy images in time so we have this issue only on director deploys.

I was able to find this kcs article that talks about this problem:https://access.redhat.com/solutions/2144941

I will try and update the kcs to be more searchable in the future. Also the errros in this kcs seem to be from when the image is booted from volume from the cli. I noticed we did not have the same errors when doing this test via horizon.

Pretty much a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1290548, however completing the task with horizon presents with different symptoms. 

The customer updated python-glanceclient and it fixed the issue. So this BZ can be closed.