Bug 1851051 - copy-image can start multiple importing threads due to race condition
Summary: copy-image can start multiple importing threads due to race condition
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-glance
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Abhishek Kekane
QA Contact: Mike Abrams
Docs Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On: 1866592
Blocks: 1852496
Reported: 2020-06-25 13:55 UTC by Abhishek Kekane
Modified: 2022-08-26 15:09 UTC (History)

Fixed In Version: openstack-glance-19.0.4-2.20210216215005.5bbd356.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, RBD performance was degraded when multiple instances were launched simultaneously. This was due to the Image service (glance) starting multiple threads to perform the same copying operation. This update resolves the issue.
Clone Of:
Clones: 1852496
Environment:
Last Closed: 2021-09-15 07:08:44 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1884596 0 None None None 2020-06-25 13:55:31 UTC
OpenStack gerrit 747999 0 None MERGED Add a test to replicate the owner-required behavior of copy-image 2021-02-19 03:48:04 UTC
OpenStack gerrit 748000 0 None MERGED Refactor common auth token code in images test 2021-02-19 03:48:04 UTC
OpenStack gerrit 748001 0 None MERGED Add a functional test for non-owned image copying 2021-02-19 03:48:04 UTC
OpenStack gerrit 748002 0 None MERGED Make import task capable of running as admin on behalf of user 2021-02-19 03:48:05 UTC
OpenStack gerrit 748003 0 None MERGED Make test_copy_image_revert_lifecycle handle 409 on import retry 2021-02-19 03:48:05 UTC
OpenStack gerrit 748004 0 None MERGED Add testing for _CompleteTask in api_image_import 2021-02-19 03:48:05 UTC
OpenStack gerrit 748005 0 None MERGED Add tests for _ImportToStore.execute() 2021-02-19 03:48:05 UTC
OpenStack gerrit 748006 0 None MERGED Flesh out FakeImage for extra_properties 2021-02-19 03:48:06 UTC
OpenStack gerrit 748007 0 None MERGED Add image_set_property_atomic() helper 2021-02-19 03:48:06 UTC
OpenStack gerrit 748008 0 None MERGED Add image_delete_property_atomic() helper 2021-02-19 03:48:06 UTC
OpenStack gerrit 748009 0 None MERGED Heartbeat the actual work of the task 2021-02-19 03:48:07 UTC
OpenStack gerrit 748010 0 None MERGED Update task message during import 2021-02-19 03:48:07 UTC
OpenStack gerrit 748011 0 None MERGED Functional reproducer for bug 1891352 2021-02-19 03:48:08 UTC
OpenStack gerrit 748012 0 None MERGED Fix import failure status reporting when all_stores_must_succeed=True 2021-02-19 03:48:08 UTC
OpenStack gerrit 748013 0 None MERGED Add context.elevated() helper for getting admin privileges 2021-02-19 03:48:08 UTC
OpenStack gerrit 748014 0 None MERGED Implement time-limited import locking 2021-02-19 03:48:09 UTC
OpenStack gerrit 748239 0 None MERGED Fix non-deterministic copy_image_revert_lifecycle test 2021-02-19 03:48:09 UTC
OpenStack gerrit 748240 0 None MERGED Poll for final state on test_copy_image_revert_lifecycle() 2021-02-19 03:48:09 UTC
OpenStack gerrit 749507 0 None MERGED Add FakeData generator test utility 2021-02-19 03:48:10 UTC
OpenStack gerrit 749508 0 None MERGED Add functional test for task status updating 2021-02-19 03:48:10 UTC
OpenStack gerrit 749509 0 None MERGED Move SynchronousAPIBase to a generalized location 2021-02-19 03:48:10 UTC
OpenStack gerrit 749510 0 None MERGED Handle atomic image properties separately 2021-02-19 03:48:11 UTC
OpenStack gerrit 749511 0 None MERGED Functional test enhancement for lock busting 2021-02-19 03:48:11 UTC
OpenStack gerrit 749512 0 None MERGED Add ImageLock to base flow checks 2021-02-19 03:48:11 UTC
OpenStack gerrit 749513 0 None MERGED Cleanup import status information after busting a lock 2021-02-19 03:48:11 UTC
OpenStack gerrit 749514 0 None MERGED Add a release note about import locking 2021-02-19 03:48:11 UTC
Red Hat Issue Tracker OSP-2230 0 None None None 2022-08-26 15:09:02 UTC
Red Hat Product Errata RHEA-2021:3483 0 None None None 2021-09-15 07:09:06 UTC

Description Abhishek Kekane 2020-06-25 13:55:31 UTC
Two consecutive copy-image API calls for the same image, whether to the same set of stores or to a different set, cause a race condition and can produce results other than those expected.

Especially in a situation where glance is running on multiple control plane nodes (i.e. any real-world situation), I believe there is a race condition whereby two closely-timed requests to copy an image to a store will result in two copy operations in glance proceeding in parallel.

Glance will not realize that a thread is already running to complete the initial task and will start another. In a situation where a user spawns a thousand new instances to a thousand compute nodes in a single operation where the image needs copying first, it's highly plausible to have _many_ duplicate glance operations going, impacting write performance on the rbd cluster at the very least.
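
The race described above is a check-then-act problem: each request checks whether a copy is already running and then starts one, with nothing tying the two steps together. Below is a minimal sketch of the general remedy, an atomic "set if absent" claim on the image. Everything in it is an assumption made for illustration: `FakeImageDB` is a hypothetical in-memory stand-in for the database, `set_property_atomic` is only loosely inspired by the `image_set_property_atomic()` helper named in the linked patches, and the 202 status is a placeholder for "accepted". It is not Glance's actual implementation.

```python
import threading


class FakeImageDB:
    """Hypothetical in-memory stand-in for the image property table."""

    def __init__(self):
        self._props = {}
        self._mutex = threading.Lock()

    def set_property_atomic(self, image_id, name, value):
        """Set the property only if it is not already set.

        Returns True if this caller set it and False if it was already
        present. The check and the write happen under one mutex, so two
        callers can never both succeed.
        """
        with self._mutex:
            key = (image_id, name)
            if key in self._props:
                return False
            self._props[key] = value
            return True


def request_copy(db, image_id, task_id, started):
    """Start a copy only if this request wins the claim on the image."""
    if db.set_property_atomic(image_id, "import_task", task_id):
        started.append(task_id)  # stands in for launching the copy thread
        return 202  # placeholder for "accepted"
    return 409  # conflict: another task already holds the image


def simulate(num_requests=50):
    """Fire many concurrent copy requests at one image."""
    db = FakeImageDB()
    started = []
    results = []

    def worker(i):
        results.append(request_copy(db, "img-1", f"task-{i}", started))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return started, results
```

Calling `simulate(50)` starts exactly one copy, and every other request sees a conflict. A real deployment would need the claim to be atomic in the shared database rather than in process memory, because the requests may land on different control plane nodes.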

Comment 1 Cyril Roelandt 2020-06-26 14:17:54 UTC
Is there some kind of workaround available right now? Or should the admin just be careful? Should we release an update to the documentation?

Comment 2 Abhishek Kekane 2020-06-29 05:17:37 UTC
(In reply to Cyril Roelandt from comment #1)
> Is there some kind of workaround available right now? Or should the admin
> just be careful? Should we release an update to the documentation?

At the moment, the admin should just be careful to avoid this scenario. We should mention this in the documentation as well.

Comment 3 Cyril Roelandt 2020-06-29 18:32:55 UTC
@Laura: can this be added as a warning in our documentation?

Comment 4 Laura Marsh 2020-06-30 14:32:45 UTC
yes, working on the wording now

Comment 8 Cyril Roelandt 2020-09-02 17:16:32 UTC
And also https://review.opendev.org/744024

Comment 9 Cyril Roelandt 2020-09-02 17:18:40 UTC
Disregard #8

Comment 11 Abhishek Kekane 2020-09-04 05:24:00 UTC
(In reply to Cyril Roelandt from comment #10)
> @Abhishek: the patches in the "Links" box are still required, right? These
> ones:
> 
> https://review.opendev.org/737868/
> https://review.opendev.org/743426/
> https://review.opendev.org/743427/
> https://review.opendev.org/743593/
> https://review.opendev.org/743595/
> https://review.opendev.org/743596/
> https://review.opendev.org/743597/
> https://review.opendev.org/743839/

Hi Cyril,

The patches mentioned in comments #6 and #7 include these patches.
The patches in the Links box are from the master branch, whereas the patches in comments #6 and #7 are the upstream backports to ussuri.

We need to backport the patches from #6 and #7 because those backports omit some master-branch changes that were related to a new feature.
So it will be quite easy for us if we consider only the patches from #6 and #7.

Thank you,

Abhishek

Comment 20 Mike Abrams 2021-06-01 10:25:40 UTC
to verify:

=== use copy-image method to copy to a store while a copy-image op is underway (should result in a 409 error)

(central) [stack@site-undercloud-0 ~]$ glance image-import 3a0ddfd6-c4e2-41d5-915e-4d8a39aadb5e --import-method copy-image --stores dcn1; glance image-import 3a0ddfd6-c4e2-41d5-915e-4d8a39aadb5e --import-method copy-image --stores dcn2 
(central) [stack@site-undercloud-0 ~]$ glance image-create-via-import --container-format ami --disk-format ami --name import_scenario --import-method web-download --uri https://cloud.centos.org/centos/6/images/CentOS-6-x86_64-GenericCloud-1510.qcow2
+-----------------------+--------------------------------------+
| Property              | Value                                |
+-----------------------+--------------------------------------+
| checksum              | None                                 |
| container_format      | ami                                  |
| created_at            | 2021-06-01T08:48:41Z                 |
| disk_format           | ami                                  |
| id                    | cb8d05cd-a19f-42f6-b2c9-7c16f14c4482 |
| locations             | []                                   |
| min_disk              | 0                                    |
| min_ram               | 0                                    |
| name                  | import_scenario                      |
| os_glance_import_task | 9b4fe0c9-b8d6-4b5b-9c09-60238c5173b8 |
| os_hash_algo          | None                                 |
| os_hash_value         | None                                 |
| os_hidden             | False                                |
| owner                 | fbf93b9227c548938a91ed1b43507a69     |
| protected             | False                                |
| size                  | None                                 |
| status                | queued                               |
| tags                  | []                                   |
| updated_at            | 2021-06-01T08:48:41Z                 |
| virtual_size          | Not available                        |
| visibility            | shared                               |
+-----------------------+--------------------------------------+
(central) [stack@site-undercloud-0 ~]$
(central) [stack@site-undercloud-0 ~]$ glance image-import cb8d05cd-a19f-42f6-b2c9-7c16f14c4482 --import-method copy-image --stores dcn1
+-------------------------------+----------------------------------------------------------------------------------+
| Property                      | Value                                                                            |
+-------------------------------+----------------------------------------------------------------------------------+
| checksum                      | dcef39a9f88ef73cf782d9633d7e3451                                                 |
| container_format              | bare                                                                             |
| created_at                    | 2021-06-01T08:48:41Z                                                             |
| direct_url                    | rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/cb8d05cd-a19f-42f6-b2c9-7c16f1 |
|                               | 4c4482/snap                                                                      |
| disk_format                   | raw                                                                              |
| id                            | cb8d05cd-a19f-42f6-b2c9-7c16f14c4482                                             |
| locations                     | [{"url": "rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/cb8d05cd-a19f-42f6-b |
|                               | 2c9-7c16f14c4482/snap", "metadata": {"store": "central"}}]                       |
| min_disk                      | 0                                                                                |
| min_ram                       | 0                                                                                |
| name                          | import_scenario                                                                  |
| os_glance_failed_import       |                                                                                  |
| os_glance_import_task         | 7995f4b8-8342-49bf-932a-c45a74a044a7                                             |
| os_glance_importing_to_stores |                                                                                  |
| os_hash_algo                  | sha512                                                                           |
| os_hash_value                 | 2ae24da25883b04af54f2e0edc65627772b2c4150108613e3d399718efab5b5353f5271e84ba5578 |
|                               | 9fcdd19ab5ff4bac1ef4934f5c0f204400c9f8bb541dbe44                                 |
| os_hidden                     | False                                                                            |
| owner                         | fbf93b9227c548938a91ed1b43507a69                                                 |
| protected                     | False                                                                            |
| size                          | 8589934592                                                                       |
| status                        | active                                                                           |
| stores                        | central                                                                          |
| tags                          | []                                                                               |
| updated_at                    | 2021-06-01T08:55:24Z                                                             |
| virtual_size                  | 8589934592                                                                       |
| visibility                    | shared                                                                           |
+-------------------------------+----------------------------------------------------------------------------------+
(central) [stack@site-undercloud-0 ~]$ 

=== try to copy the image to a third store while the copy to the second store is still in progress:

(central) [stack@site-undercloud-0 ~]$ glance image-import cb8d05cd-a19f-42f6-b2c9-7c16f14c4482 --import-method copy-image --stores dcn2
HTTP 502 Proxy Error: Proxy Error: The proxy server received an invalid: response from an upstream server.: The proxy server could not handle the requestReason: Error reading from remote server
(central) [stack@site-undercloud-0 ~]$ glance image-import cb8d05cd-a19f-42f6-b2c9-7c16f14c4482 --import-method copy-image --stores dcn2
HTTP 409 Conflict: Image has active task
(central) [stack@site-undercloud-0 ~]$ 

=== ...409 expected
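
A client that wants the second copy to go through eventually can treat the 409 as "try again later". Below is a minimal sketch of that retry loop, assuming a hypothetical `send` callable that issues the copy-image request and returns the HTTP status code. The attempt limit and the delay are arbitrary illustration values, not anything Glance prescribes.

```python
import time


def copy_with_retry(send, max_attempts=10, delay=5.0, sleep=time.sleep):
    """Call send() until it returns something other than 409.

    `send` is a hypothetical callable that performs the copy-image
    request and returns the HTTP status code. Returns a tuple of
    (last_status, attempts_made).
    """
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status != 409:
            return status, attempt
        if attempt < max_attempts:
            sleep(delay)  # image still has an active task; wait and retry
    return 409, max_attempts
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting. Other errors, such as the 502 seen above, are returned to the caller rather than retried.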

=== wait until the copy op to the second store is done:
(central) [stack@site-undercloud-0 ~]$ while :; do glance image-show cb8d05cd-a19f-42f6-b2c9-7c16f14c4482 | grep stores; sleep 5; done
| os_glance_importing_to_stores | dcn1                                                                             |
| stores                        | central         
...
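
The shell loop above runs until it is interrupted by hand. Below is a sketch of the same check with a stopping condition: it parses the two-column table printed by `glance image-show` and returns once `os_glance_importing_to_stores` is empty. The `show` callable is hypothetical and stands for whatever runs the CLI and returns its text output. Joining wrapped rows without a separator is an assumption about how the CLI wraps long values.

```python
import time


def parse_table(text):
    """Parse '| key | value |' rows into a dict.

    Rows with an empty key are treated as continuations of the previous
    value, as with the wrapped URLs in the CLI output.
    """
    props = {}
    last_key = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders, prompts, and blank lines
        key, _, value = line.strip("|").partition("|")
        key, value = key.strip(), value.strip()
        if key == "Property":
            continue  # header row
        if key:
            props[key] = value
            last_key = key
        elif last_key is not None:
            props[last_key] += value
    return props


def wait_for_copy(show, image_id, interval=5, max_polls=120, sleep=time.sleep):
    """Poll until no store is listed as being imported to.

    Returns the list of stores that hold the image. Raises TimeoutError
    if the copy has not finished after max_polls checks.
    """
    for _ in range(max_polls):
        props = parse_table(show(image_id))
        if not props.get("os_glance_importing_to_stores"):
            return [s for s in props.get("stores", "").split(",") if s]
        sleep(interval)
    raise TimeoutError(f"copy of {image_id} still in progress")
```

An empty or missing `os_glance_importing_to_stores` value is read as "no copy in progress", so this is only meaningful when the output contains that row.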

=== once the original copy to store task is complete, should be able to copy to additional store without the 409 error; the whole process can be seen here:

(central) [stack@site-undercloud-0 ~]$ glance image-create-via-import --container-format ami --disk-format ami --name import_scenario --import-method web-download --uri https://download.cirros-cloud.net/0.5.2/cirros-0.5.2-x86_64-disk.img
+-----------------------+--------------------------------------+
| Property              | Value                                |
+-----------------------+--------------------------------------+
| checksum              | None                                 |
| container_format      | ami                                  |
| created_at            | 2021-06-01T10:22:44Z                 |
| disk_format           | ami                                  |
| id                    | 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 |
| locations             | []                                   |
| min_disk              | 0                                    |
| min_ram               | 0                                    |
| name                  | import_scenario                      |
| os_glance_import_task | 93399714-52c2-4fc8-b37a-45383eb1c51f |
| os_hash_algo          | None                                 |
| os_hash_value         | None                                 |
| os_hidden             | False                                |
| owner                 | fbf93b9227c548938a91ed1b43507a69     |
| protected             | False                                |
| size                  | None                                 |
| status                | queued                               |
| tags                  | []                                   |
| updated_at            | 2021-06-01T10:22:44Z                 |
| virtual_size          | Not available                        |
| visibility            | shared                               |
+-----------------------+--------------------------------------+
(central) [stack@site-undercloud-0 ~]$ glance image-show 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 | grep status
| status                        | active                                                                           |
(central) [stack@site-undercloud-0 ~]$ glance image-import 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 --import-method copy-image --stores dcn1                                    
+-------------------------------+----------------------------------------------------------------------------------+
| Property                      | Value                                                                            |
+-------------------------------+----------------------------------------------------------------------------------+
| checksum                      | 318f4eb47a165b55db0be707e0ecbf63                                                 |
| container_format              | bare                                                                             |
| created_at                    | 2021-06-01T10:22:44Z                                                             |
| direct_url                    | rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/678ebcbd-0c23-4239-8b64-82ae7f |
|                               | 0e2bd2/snap                                                                      |
| disk_format                   | raw                                                                              |
| id                            | 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2                                             |
| locations                     | [{"url": "rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/678ebcbd-0c23-4239-8 |
|                               | b64-82ae7f0e2bd2/snap", "metadata": {"store": "central"}}]                       |
| min_disk                      | 0                                                                                |
| min_ram                       | 0                                                                                |
| name                          | import_scenario                                                                  |
| os_glance_failed_import       |                                                                                  |
| os_glance_import_task         | e4bff141-1633-4d86-b118-43511813b74d                                             |
| os_glance_importing_to_stores |                                                                                  |
| os_hash_algo                  | sha512                                                                           |
| os_hash_value                 | a3fd523ada34f4e4a8d2cd1eeb94a6c1b05aabd63b3e0ff2c6522f3fd2f221d79569e8c2fffa92c2 |
|                               | 7d4a6e8e832771cb035c6d162f7a0192ae3896b168ecba51                                 |
| os_hidden                     | False                                                                            |
| owner                         | fbf93b9227c548938a91ed1b43507a69                                                 |
| protected                     | False                                                                            |
| size                          | 117440512                                                                        |
| status                        | active                                                                           |
| stores                        | central                                                                          |
| tags                          | []                                                                               |
| updated_at                    | 2021-06-01T10:22:52Z                                                             |
| virtual_size                  | 117440512                                                                        |
| visibility                    | shared                                                                           |
+-------------------------------+----------------------------------------------------------------------------------+
(central) [stack@site-undercloud-0 ~]$ glance image-import 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 --import-method copy-image --stores dcn2
HTTP 409 Conflict: Image has active task
(central) [stack@site-undercloud-0 ~]$ glance image-show 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 | grep stores
| os_glance_importing_to_stores |                                                                                  |
| stores                        | central,dcn1                                                                     |
(central) [stack@site-undercloud-0 ~]$ glance image-import 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 --import-method copy-image --stores dcn2
+-------------------------------+----------------------------------------------------------------------------------+
| Property                      | Value                                                                            |
+-------------------------------+----------------------------------------------------------------------------------+
| checksum                      | 318f4eb47a165b55db0be707e0ecbf63                                                 |
| container_format              | bare                                                                             |
| created_at                    | 2021-06-01T10:22:44Z                                                             |
| direct_url                    | rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/678ebcbd-0c23-4239-8b64-82ae7f |
|                               | 0e2bd2/snap                                                                      |
| disk_format                   | raw                                                                              |
| id                            | 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2                                             |
| locations                     | [{"url": "rbd://62cc482d-756a-407c-868a-4700d73a0cba/images/678ebcbd-0c23-4239-8 |
|                               | b64-82ae7f0e2bd2/snap", "metadata": {"store": "central"}}, {"url": "rbd://e0eff2 |
|                               | f5-f00b-4888-b558-33b99b9a48b2/images/678ebcbd-0c23-4239-8b64-82ae7f0e2bd2/snap" |
|                               | , "metadata": {"store": "dcn1"}}]                                                |
| min_disk                      | 0                                                                                |
| min_ram                       | 0                                                                                |
| name                          | import_scenario                                                                  |
| os_glance_failed_import       |                                                                                  |
| os_glance_import_task         | 8a915808-e6fc-4b7b-a830-eacf99a960f7                                             |
| os_glance_importing_to_stores |                                                                                  |
| os_hash_algo                  | sha512                                                                           |
| os_hash_value                 | a3fd523ada34f4e4a8d2cd1eeb94a6c1b05aabd63b3e0ff2c6522f3fd2f221d79569e8c2fffa92c2 |
|                               | 7d4a6e8e832771cb035c6d162f7a0192ae3896b168ecba51                                 |
| os_hidden                     | False                                                                            |
| owner                         | fbf93b9227c548938a91ed1b43507a69                                                 |
| protected                     | False                                                                            |
| size                          | 117440512                                                                        |
| status                        | active                                                                           |
| stores                        | central,dcn1                                                                     |
| tags                          | []                                                                               |
| updated_at                    | 2021-06-01T10:23:42Z                                                             |
| virtual_size                  | 117440512                                                                        |
| visibility                    | shared                                                                           |
+-------------------------------+----------------------------------------------------------------------------------+
(central) [stack@site-undercloud-0 ~]$ glance image-show 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 | grep stores
| os_glance_importing_to_stores | dcn2                                                                             |
| stores                        | central,dcn1                                                                     |
(central) [stack@site-undercloud-0 ~]$ glance image-show 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 | grep stores
| os_glance_importing_to_stores | dcn2                                                                             |
| stores                        | central,dcn1                                                                     |
(central) [stack@site-undercloud-0 ~]$ glance image-show 678ebcbd-0c23-4239-8b64-82ae7f0e2bd2 | grep stores
| os_glance_importing_to_stores |                                                                                  |
| stores                        | central,dcn1,dcn2                                                                |
(central) [stack@site-undercloud-0 ~]$ 


VERIFIED

Comment 23 errata-xmlrpc 2021-09-15 07:08:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483

