Hi, I could use some help with something: we've noticed that booting new instances fails up to 30% of the time. In other words, nova boot ... --min-count=100 can leave somewhere between 0% and 30% of the instances in ERROR state, and the logs show this error:

Invalid: 400 Bad Request: Unknown scheme 'file' found in URI (HTTP 400)

If I try to create, say, 3 nodes, the failure rate is similar: maybe 1 of the 3 will fail. I can't find any other reports of this error.

https://gist.github.com/jeremyeder/4fe51bebec5f37ca8404eefe6aa8d989

[heat-admin@overcloud-controller-0 ~]$ rpm -qa|grep openstack|sort
openstack-ceilometer-alarm-5.0.2-2.el7ost.noarch
openstack-ceilometer-api-5.0.2-2.el7ost.noarch
openstack-ceilometer-central-5.0.2-2.el7ost.noarch
openstack-ceilometer-collector-5.0.2-2.el7ost.noarch
openstack-ceilometer-common-5.0.2-2.el7ost.noarch
openstack-ceilometer-compute-5.0.2-2.el7ost.noarch
openstack-ceilometer-notification-5.0.2-2.el7ost.noarch
openstack-ceilometer-polling-5.0.2-2.el7ost.noarch
openstack-cinder-7.0.1-8.el7ost.noarch
openstack-dashboard-8.0.1-2.el7ost.noarch
openstack-dashboard-theme-8.0.1-2.el7ost.noarch
openstack-glance-11.0.1-4.el7ost.noarch
openstack-heat-api-5.0.1-5.el7ost.noarch
openstack-heat-api-cfn-5.0.1-5.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-5.el7ost.noarch
openstack-heat-common-5.0.1-5.el7ost.noarch
openstack-heat-engine-5.0.1-5.el7ost.noarch
openstack-keystone-8.0.1-1.el7ost.noarch
openstack-manila-1.0.1-3.el7ost.noarch
openstack-manila-share-1.0.1-3.el7ost.noarch
openstack-neutron-7.1.1-3.el7ost.noarch
openstack-neutron-bigswitch-agent-2015.3.8-1.el7ost.noarch
openstack-neutron-bigswitch-lldp-2015.3.8-1.el7ost.noarch
openstack-neutron-common-7.1.1-3.el7ost.noarch
openstack-neutron-lbaas-7.0.0-2.el7ost.noarch
openstack-neutron-metering-agent-7.1.1-3.el7ost.noarch
openstack-neutron-ml2-7.1.1-3.el7ost.noarch
openstack-neutron-openvswitch-7.1.1-3.el7ost.noarch
openstack-nova-api-12.0.3-1.el7ost.noarch
openstack-nova-cert-12.0.3-1.el7ost.noarch
openstack-nova-common-12.0.3-1.el7ost.noarch
openstack-nova-compute-12.0.3-1.el7ost.noarch
openstack-nova-conductor-12.0.3-1.el7ost.noarch
openstack-nova-console-12.0.3-1.el7ost.noarch
openstack-nova-novncproxy-12.0.3-1.el7ost.noarch
openstack-nova-scheduler-12.0.3-1.el7ost.noarch
openstack-puppet-modules-7.0.19-1.el7ost.noarch
openstack-selinux-0.6.58-1.el7ost.noarch
openstack-swift-2.5.0-2.el7ost.noarch
openstack-swift-account-2.5.0-2.el7ost.noarch
openstack-swift-container-2.5.0-2.el7ost.noarch
openstack-swift-object-2.5.0-2.el7ost.noarch
openstack-swift-plugin-swift3-1.9-1.el7ost.noarch
openstack-swift-proxy-2.5.0-2.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
python-django-openstack-auth-2.0.1-1.2.el7ost.noarch
python-openstackclient-1.7.2-1.el7ost.noarch
[heat-admin@overcloud-controller-0 ~]$
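Roughly how we reproduce it and count failures -- the image and flavor names below are just placeholders, not the actual ones in this environment:

nova boot --image my-snapshot --flavor m1.small --min-count 100 repro-test
nova list | grep repro-test | grep -c ERROR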
It sounds like some nodes are misconfigured. As a wild guess, I'd say one of the Glance nodes is configured to use the file store while the rest are using Swift. A snapshot/image was created on the node using the file store, and the failure then happens when you try to boot an instance from that image through one of the other nodes. This is, of course, only a guess. Could you share the config files and some more info about the environment?
This looks similar to https://bugs.launchpad.net/glance/+bug/1581111 Could you make sure "file" is part of the "stores" option of the "glance_store" section in all of your glance-api.conf configuration files?
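For comparison, a minimal sketch of what the [glance_store] section might look like when Swift is the intended backend but the file store is still registered; the values here are illustrative, not taken from this environment:

[glance_store]
stores = file,http,swift
default_store = swift
filesystem_store_datadir = /var/lib/glance/images/

If one node has a different "stores" list (for example, missing "file"), it cannot resolve file:// image locations, which would match the "Unknown scheme 'file'" error above.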
The URI issue also shows up in the glance log: https://gist.github.com/jtaleric/c34d96cbc141df275ba4cbe1723ea2f9

Here is the image in question: https://gist.github.com/jtaleric/2379f1e4512bc163929698a17d805344

Here is a different image in glance, for comparison: https://gist.github.com/jtaleric/9541722e02066c46762142bf5bea3a7a

Could this be because of the snapshot (file) vs the Swift backend? Exporting the image via glance image-download failed 2 out of 3 times, which would be expected since only a single controller had the snapshot. However, re-adding the image to glance so it was backed by Swift seems to have fixed the issue. The question is, why didn't this happen auto-magically?
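For the record, the workaround was roughly the following; the file path, image name, and image ID are placeholders, and the disk/container formats would need to match the original snapshot:

glance image-download --file /tmp/snapshot.qcow2 <snapshot-image-id>
glance image-create --name snapshot-reimported --disk-format qcow2 --container-format bare --file /tmp/snapshot.qcow2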
Re-opening, as we need to understand what we expect a snapshot to do in this case. Please refer to my comment #4.
(In reply to Joe Talerico from comment #6)
> Re-opening, as we need to understand what we expect a snapshot to do in this
> case. Please refer to my comment #4.

I'm sorry, I'm having a bit of a hard time parsing this. What do you mean exactly? What happened with the snapshot? Could you share the configuration files? How did you take the snapshot?
Jeremy, can you share how you took the snapshot?

Flavio, what exactly do you not understand? Let me break this down:

1) Import an image into Glance.
2) Launch a guest, set some things up in the guest, take a snapshot.
3) Create a bunch of guests from the above snapshot -- this results in many failures.

To fix this:

1) Export the snapshot from Glance (this will fail 2 out of the 3 times you attempt it, depending on where HAProxy sends the request).
2) Re-import the image into Glance.

The biggest difference between the snapshot image and the re-imported image is the location of the image: the snapshot has a file:// location, while the re-imported image has a swift:// location.
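One way to confirm which backend an image is on (a sketch; it assumes show_image_direct_url is enabled in glance-api.conf, otherwise the field isn't exposed):

glance image-show <image-id> | grep direct_url
# a locally stored snapshot shows something like file:///var/lib/glance/images/<id>
# a Swift-backed image shows a swift+http:// or swift+config:// location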
I clicked "Create Snapshot" in Horizon.
(In reply to Joe Talerico from comment #8)
> Jeremy, can you share how you took the snapshot?
>
> Flavio, what exactly do you not understand? Let me break this down:
>
> 1) Import an image into Glance.
> 2) Launch a guest, set some things up in the guest, take a snapshot.
> 3) Create a bunch of guests from the above snapshot -- this results in many
> failures.
>
> To fix this:
>
> 1) Export the snapshot from Glance (this will fail 2 out of the 3 times you
> attempt it, depending on where HAProxy sends the request).
> 2) Re-import the image into Glance.
>
> The biggest difference between the snapshot image and the re-imported image
> is the location of the image: the snapshot has a file:// location, while the
> re-imported image has a swift:// location.

As I've mentioned already in the email thread and earlier in this BZ, the above sounds like there's a misconfigured glance-api node. Could you please upload the config files for these glance-api nodes and/or verify that they are all configured the same way? It really sounds like this snapshot was created on a glance-api node that was using the `file` store. On upload, that node saved the image locally, and all the other glance-api nodes fail to download it because they're not configured to use that store. It works now because the image points to Swift, which, apparently, all glance-api nodes have access to.
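A quick way to spot a divergent node would be something like the following; the controller hostnames other than overcloud-controller-0 are assumptions here, so adjust them to your environment:

for h in overcloud-controller-0 overcloud-controller-1 overcloud-controller-2; do
  echo "== $h =="
  ssh heat-admin@$h "sudo grep -E '^(stores|default_store)' /etc/glance/glance-api.conf"
done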
We are going to close this one out given it happened some time ago. If there are still similar or new issues getting something like this configured or tested in OSP11, please open a new bug. Thanks.