Bug 1085005
| Summary: | openstack-nova: several instances can be configured with the same bootable volume | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Vladan Popovic <vpopovic> |
| Component: | openstack-nova | Assignee: | Nikola Dipanov <ndipanov> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.0 | CC: | ajeain, breeler, dallan, dron, eglynn, ndipanov, sclewis, scohen, tshefi, xqueralt, yeylon, yrabl |
| Target Milestone: | z4 | Keywords: | Triaged, ZStream |
| Target Release: | 4.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | storage | ||
| Fixed In Version: | openstack-nova-2013.2.1-1.el6ost | Doc Type: | Bug Fix |
| Doc Text: | Cause: A race condition when checking the status of requested volumes in the API, combined with the practice of rescheduling a failed instance to a different host, could cause a volume to be "stolen" from an instance that had already attached it successfully. Consequence: If several instances were requested close together and asked for the same volume, an instance that got rescheduled after failing volume setup (because the volume was already taken) could disconnect that volume from the instance that already had it attached. Fix: If an instance fails volume setup during boot (because the volume is in use by a different instance), Nova no longer attempts to reschedule the failed instance to a different host, avoiding disconnection of a fully attached volume. Result: Only one of the instances that requested the same volume succeeds; all others go into the ERROR state. | Story Points: | --- |
| Clone Of: | 1020501 | Environment: | |
| Last Closed: | 2014-05-29 20:35:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1020501 | ||
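The Doc Text above can be illustrated with a minimal Python sketch of the post-fix behavior. The names here (`Volume`, `boot_instance`, `VolumeInUse`) are hypothetical stand-ins for illustration, not Nova's or Cinder's actual API: the point is that a volume-setup failure during boot now puts the instance in ERROR instead of triggering a reschedule that could detach a volume another instance already owns.

```python
import threading


class VolumeInUse(Exception):
    """The volume is already attached to another instance."""


class Volume:
    """Toy volume with an atomic attach, standing in for the real volume service."""

    def __init__(self):
        self._lock = threading.Lock()
        self.attached_to = None

    def attach(self, instance_id):
        with self._lock:
            if self.attached_to is not None:
                raise VolumeInUse()
            self.attached_to = instance_id


def boot_instance(instance_id, volume):
    """Boot an instance using the given bootable volume.

    Post-fix behavior: on volume-setup failure, go straight to ERROR.
    Rescheduling here could tear down a volume that a different
    instance has already attached successfully.
    """
    try:
        volume.attach(instance_id)
        return "ACTIVE"
    except VolumeInUse:
        return "ERROR"


# Two instances race for the same bootable volume: the first attach
# wins, the second goes to ERROR instead of being rescheduled.
vol = Volume()
states = [boot_instance(i, vol) for i in ("inst-a", "inst-b")]
print(states)
```

Running this prints `['ACTIVE', 'ERROR']`: the winning instance keeps the volume, matching the "Result" line of the Doc Text.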
Comment 1
Dafna Ron
2014-04-07 14:12:16 UTC

Dafna - Comment 1 sounds like a good plan to test it. Attaching volumes should not really be part of this bug, as the issue was with how we handled rescheduling on failures. But I do urge you to test attaching as well and possibly report a different bug. Moving back to 4.0 as this is indeed fixed in 4.0.

Verified on:
python-novaclient-2.15.0-4.el6ost.noarch
openstack-nova-conductor-2013.2.3-7.el6ost.noarch
openstack-nova-scheduler-2013.2.3-7.el6ost.noarch
openstack-nova-common-2013.2.3-7.el6ost.noarch
openstack-nova-api-2013.2.3-7.el6ost.noarch
openstack-nova-console-2013.2.3-7.el6ost.noarch
openstack-nova-network-2013.2.3-7.el6ost.noarch
openstack-nova-cert-2013.2.3-7.el6ost.noarch
python-nova-2013.2.3-7.el6ost.noarch
openstack-nova-compute-2013.2.3-7.el6ost.noarch
openstack-nova-novncproxy-2013.2.3-7.el6ost.noarch
python-cinderclient-1.0.7-2.el6ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0578.html
Dafna - Comment 1 sounds like a good plan to test it. Attaching volumes should not really be part of this bug, as the issue was with how we handled rescheduling on failures. But I do urge you to test attaching as well and possibly report a different bug. Moving back to 4.0 as this is indeed fixed in 4.0 verified on: python-novaclient-2.15.0-4.el6ost.noarch openstack-nova-conductor-2013.2.3-7.el6ost.noarch openstack-nova-scheduler-2013.2.3-7.el6ost.noarch openstack-nova-common-2013.2.3-7.el6ost.noarch openstack-nova-api-2013.2.3-7.el6ost.noarch openstack-nova-console-2013.2.3-7.el6ost.noarch openstack-nova-network-2013.2.3-7.el6ost.noarch openstack-nova-cert-2013.2.3-7.el6ost.noarch python-nova-2013.2.3-7.el6ost.noarch openstack-nova-compute-2013.2.3-7.el6ost.noarch openstack-nova-novncproxy-2013.2.3-7.el6ost.noarch python-cinderclient-1.0.7-2.el6ost.noarch Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-0578.html |