Bug 1535593 - Concurrent report_state from multiple agents: segment_host_mapping fails - StaleDataError
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 13.0 (Queens)
Assigned To: Assaf Muller
QA Contact: Roee Agiman
Keywords: Triaged
Depends On:
Blocks: 1214284
Reported: 2018-01-17 12:13 EST by Harald Jensås
Modified: 2018-06-27 09:43 EDT
Fixed In Version: openstack-neutron-12.0.1-0.20180327195360.68b8980.el7ost
Last Closed: 2018-06-27 09:42:25 EDT
Type: Bug

Attachments
Reproducer script and logs (73.30 KB, text/plain)
2018-01-17 12:13 EST, Harald Jensås


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Launchpad | 1743579 | None | None | None | 2018-01-17 12:13 EST
OpenStack gerrit | 534449 | None | master: MERGED | neutron: Add retry decorator update_segment_host_mapping() (I616457f094d000a4016c610b454be8269d9b4948) | 2018-03-29 12:09 EDT
Red Hat Product Errata | RHEA-2018:2086 | None | None | None | 2018-06-27 09:43 EDT

Description Harald Jensås 2018-01-17 12:13:11 EST
Created attachment 1382568
Reproducer Script and logs.

Description of problem:
When multiple host agents rapidly call report_state for the first time, we get StaleDataError and _update_segment_host_mapping_for_agent does not complete for all hosts.

Attached is a file with logs, the reproducer script, and instructions on how to set up a devstack environment similar to the one I am using.



Version-Release number of selected component (if applicable):


How reproducible:
Every time, using the reproducer script, which calls report_state for 3 hosts.


Steps to Reproduce:
Run the reproducer script with the delay, time.sleep(10), commented out.

__NOTE__: The reproducer script is included in the attachment,
          together with logs and devstack instructions for testing
          against upstream OpenStack.


Actual results:
 Results:
  * 2x StaleDataError
  * Only 1 attempt to add host to placement/host-aggregate.

MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| a974ae4c-1389-4e41-9ab9-820165c26acd | host2                           |
| a974ae4c-1389-4e41-9ab9-820165c26acd | routed-devstack.lab.example.com |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | host2                           |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | routed-devstack.lab.example.com |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | host2                           |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+

Conclusions:
  * 2x StaleDataError
  * 1x successful _update_segment_host_mapping after_create.


Expected results:
We should see 3 attempts to add a host to placement/host-aggregate, one for each host agent, and all 3 hosts should have entries in the segmenthostmappings table in the database.
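
The expected result can be checked against the rows returned by the SELECT. The following is only a minimal sketch, not part of the reproducer script: the helper name missing_host_mappings is hypothetical, and it assumes the three simulated agents are named host0, host1 and host2 (the names that appear in the table from the successful run). The rows are copied from the failing run above.

```python
from collections import defaultdict


def missing_host_mappings(rows, expected_hosts):
    """Return {segment_id: [hosts that have no mapping row]}.

    rows is an iterable of (segment_id, host) tuples, as returned by
    SELECT * FROM segmenthostmappings. Hypothetical helper for illustration.
    """
    present = defaultdict(set)
    for segment_id, host in rows:
        present[segment_id].add(host)

    missing = {}
    for segment_id, hosts in present.items():
        absent = sorted(set(expected_hosts) - hosts)
        if absent:
            missing[segment_id] = absent
    return missing


# Rows from the failing run (no delay between the agents' report_state calls).
rows = [
    ("a974ae4c-1389-4e41-9ab9-820165c26acd", "host2"),
    ("a974ae4c-1389-4e41-9ab9-820165c26acd", "routed-devstack.lab.example.com"),
    ("bc626d3d-5503-4875-9db8-e1bcfad35979", "host2"),
    ("bc626d3d-5503-4875-9db8-e1bcfad35979", "routed-devstack.lab.example.com"),
    ("ec7717dd-8533-464f-a3c8-4ecc7dc08d10", "host2"),
    ("ec7717dd-8533-464f-a3c8-4ecc7dc08d10", "routed-devstack.lab.example.com"),
]

# Assumption: the three simulated agents are host0, host1 and host2.
expected_hosts = ["host0", "host1", "host2"]

missing = missing_host_mappings(rows, expected_hosts)
for segment_id, hosts in sorted(missing.items()):
    print(segment_id, "is missing", ", ".join(hosts))
```

Under that assumption, every segment in the failing run lacks the rows for host0 and host1, which matches the two StaleDataErrors.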


Additional info:


When running the reproducer script with the delay of 10 seconds between each agent update there is no issue.
------------------------------------------------------------------------------------------------------------

Run script with the delay, time.sleep(10), enabled.
Results:
  * No StaleDataError
  * 3 attempts to add the host to placement/host-aggregate.

MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host0                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host1                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host2                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | routed-devstack.lab.example.com |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host0                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host1                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host2                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | routed-devstack.lab.example.com |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host0                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host1                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host2                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+

Conclusion:
  * 3x successful _update_segment_host_mapping after_create.

** NOTE: **
The RESP BODY: {"itemNotFound": {"message": "Compute host host1 could not be found.", "code": 404}} errors in the logs are expected, because the fake hosts are not in Nova.
Comment 2 Nir Yechiel 2018-01-18 07:07:59 EST
Fix proposed here: https://review.openstack.org/#/c/534449/
Comment 3 Nir Yechiel 2018-01-21 02:42:42 EST
Merged upstream.
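
For readers who do not follow the gerrit link: the change title says a retry decorator was added to update_segment_host_mapping(). The sketch below only illustrates the general retry-on-StaleDataError idea under stated assumptions and is not the merged neutron code. The StaleDataError class here is a local stand-in for sqlalchemy.orm.exc.StaleDataError, the decorator name retry_on_stale_data is hypothetical, and the decorated function is a fake that fails twice before succeeding.

```python
import functools
import time


class StaleDataError(Exception):
    """Local stand-in for sqlalchemy.orm.exc.StaleDataError (illustration only)."""


def retry_on_stale_data(max_retries=3, delay=0.0):
    """Hypothetical decorator: re-run the function if it raises StaleDataError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except StaleDataError:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the error
                    time.sleep(delay)
        return wrapper
    return decorator


# Fake mapping update: the first two calls fail the way the concurrent
# report_state calls do in this bug, the third call succeeds.
calls = {"count": 0}


@retry_on_stale_data(max_retries=3)
def update_segment_host_mapping(host, segment_ids):
    calls["count"] += 1
    if calls["count"] <= 2:
        raise StaleDataError("row changed by a concurrent transaction")
    return {(segment_id, host) for segment_id in segment_ids}


result = update_segment_host_mapping("host0", ["seg-a", "seg-b"])
print(sorted(result), "after", calls["count"], "attempts")
```

With a retry in place, a host whose first update loses the race gets another attempt instead of being left out of segmenthostmappings.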
Comment 16 errata-xmlrpc 2018-06-27 09:42:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
