Bug 1578901
| Summary: | [UPGRADES] TempestFailure: One of cinder-scheduler services is too old to accept create_snapshot request | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Alan Bishop <abishop> |
| Status: | CLOSED ERRATA | QA Contact: | Tzach Shefi <tshefi> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 13.0 (Queens) | CC: | abishop, augol, ccamacho, cschwede, jschluet, knylande, lbezdick, mbultel, mburns, mcornea, scohen, srevivo, tshefi, yprokule |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 13.0 (Queens) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-8.0.2-29.el7ost | Doc Type: | Bug Fix |
| Doc Text: | After upgrading to a new release, Block Storage services (cinder) were stuck using the old RPC versions from the prior release. Because of this, all cinder API requests requiring the latest RPC versions failed. With this fix, when upgrading to a new release, all cinder RPC versions are updated to match the latest release. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-06-27 13:56:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This seems to be an upgrade issue similar to bug #1554122. That BZ contains a reference to a patch [1] that relates to sequencing the cinder-volume service restarts under pacemaker. This BZ describes a similar problem with mixed versions of the cinder-scheduler service, except that cinder-scheduler does not run under pacemaker.

Hey Alan, in this case we specifically have an upgrade_tasks section in THT where you can restart any service you want. Let's sync up for a proper fix.

Yurii, can you try a local patch to verify it works before I propose it upstream? After upgrading the undercloud but before you upgrade the overcloud, patch the cinder-manage command at [1] to add the "--bump-versions" option, like this:

"su cinder -s /bin/bash -c 'cinder-manage db sync --bump-versions'"

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/docker/services/cinder-api.yaml#L139

Tzach, maybe you could also try this?

FYI Alan, Alex, Yurii: I cherry-picked (manually added) --bump-versions on an upgraded undercloud before the overcloud upgrade started. The suggested fix worked; I can run cinder create and cinder snapshot-create without the version-conflict error Yurii and I got before. Before the fix, on an upgraded system, I got 19 cinder-related failures due to the version issue; now only 3 failed (for a known reason). This should be OK to verify once the fix lands in an RPM build/deployment.

Thanks, Tzach! I will propose a patch upstream and backport it to OSP-13 ASAP.

The patch has been approved upstream.

Verified on: openstack-tripleo-heat-templates-8.0.2-29.el7ost.noarch
Upgraded a system from OSP12 to OSP13. Post upgrade, ran some cinder commands without errors:
cinder create
cinder snapshot-create
No sign of the original issue -> "One of cinder-scheduler services is too old to accept create_snapshot"
OK to verify.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
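The local patch described in the comments amounts to editing the db-sync command in docker/services/cinder-api.yaml on the upgraded undercloud. A minimal sketch of the change (the surrounding template keys are omitted; only the quoted commands come from the comments above):

```shell
# Before (stable/queens docker/services/cinder-api.yaml):
su cinder -s /bin/bash -c 'cinder-manage db sync'

# After: --bump-versions updates the stored cinder RPC and object
# versions to the latest release once the schema migration completes,
# instead of leaving them pinned at the prior release's versions:
su cinder -s /bin/bash -c 'cinder-manage db sync --bump-versions'
```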
Description of problem:
-----------------------
A few tests from tempest's scenario suite fail after a major upgrade:

<testcase classname="tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern" name="test_create_ebs_image_and_check_boot[compute,id-36c34c67-7b54-4b59-b188-02a2f458a63b,image,volume]"
 classname="tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern" name="test_create_server_from_volume_snapshot[compute,id-05795fb2-b2a7-4c9f-8fac-ff25aedb1489,image,slow,volume]"
 classname="tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern" name="test_volume_boot_pattern[compute,id-557cd2c2-4eb8-4dce-98be-f86765ff311b,image,volume]"
...

traceback-1: {{{
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/utils/test_utils.py", line 84, in call_and_ignore_notfound_exc
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/lib/services/volume/v2/volumes_client.py", line 103, in delete_volume
    resp, body = self.delete(url)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 310, in delete
    return self.request('DELETE', url, extra_headers, headers, body)
  File "/usr/lib/python2.7/site-packages/tempest/lib/services/volume/base_client.py", line 38, in request
    method, url, extra_headers, headers, body, chunked)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 668, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 779, in _error_checker
    raise exceptions.BadRequest(resp_body, resp=resp)
tempest.lib.exceptions.BadRequest: Bad request
Details: {u'message': u'Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer.', u'code': 400}
}}}

traceback-2: {{{
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 880, in wait_for_resource_deletion
    raise exceptions.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: (TestVolumeBootPattern:_run_cleanups) Failed to delete volume 60dd4644-df86-4590-a885-faa9dd711b20 within the required time (300 s).
}}}

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/common/utils/__init__.py", line 88, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_volume_boot_pattern.py", line 135, in test_volume_boot_pattern
    snapshot = self.create_volume_snapshot(volume_origin['id'], force=True)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 251, in create_volume_snapshot
    metadata=metadata)['snapshot']
  File "/usr/lib/python2.7/site-packages/tempest/lib/services/volume/v2/snapshots_client.py", line 65, in create_snapshot
    resp, body = self.post('snapshots', post_body)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 279, in post
    return self.request('POST', url, extra_headers, headers, body, chunked)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 668, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 779, in _error_checker
    raise exceptions.BadRequest(resp_body, resp=resp)
tempest.lib.exceptions.BadRequest: Bad request
Details: {u'message': u'One of cinder-scheduler services is too old to accept create_snapshot request. Required RPC API version is 3.9. Are you running mixed versions of cinder-schedulers?', u'code': 400}

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
puppet-cinder-12.4.1-0.20180329071637.4011a82.el7ost.noarch
python-cinder-12.0.1-0.20180418194613.c476898.el7ost.noarch
python2-cinderclient-3.5.0-1.el7ost.noarch
openstack-cinder-12.0.1-0.20180418194613.c476898.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-19.el7ost.noarch

Steps to Reproduce:
-------------------
1. Run a major upgrade of RHOS-12 to RHOS-13
2. Launch the tempest scenario suite after the upgrade

Additional info:
----------------
Virtual setup: 3 controllers + 3 messaging + 3 database + 3 ceph + 2 network + 2 compute
IPv6, custom overcloud name - 'qe-Cloud-0'

Related BZs for ffwd:
---------------------
https://bugzilla.redhat.com/show_bug.cgi?id=1554122
https://bugzilla.redhat.com/show_bug.cgi?id=1557331
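For context, the "too old to accept create_snapshot request" rejection in the traceback above comes from cinder's RPC version pinning: the API caps outgoing RPC calls at the lowest version reported by the live services, and after the upgrade the stored versions were never bumped. A simplified, hypothetical sketch of that idea (not cinder's actual code; function names are illustrative):

```python
# Simplified sketch of RPC version pinning. Each service reports the
# RPC version it supports; callers are capped at the minimum reported
# version, so one stale entry blocks calls that need a newer version.

def _as_tuple(version):
    """Turn '3.16' into (3, 16) for numeric comparison."""
    return tuple(int(part) for part in version.split('.'))

def min_rpc_version(service_versions):
    """Return the lowest RPC version among the reported services."""
    return min(service_versions, key=_as_tuple)

def can_call(required, service_versions):
    """A call requiring `required` succeeds only if the pinned
    minimum version is at least `required`."""
    return _as_tuple(min_rpc_version(service_versions)) >= _as_tuple(required)

# After the upgrade the DB still records an old scheduler version,
# so create_snapshot (requires RPC API 3.9) is rejected:
print(can_call('3.9', ['3.16', '3.0']))   # -> False (pinned at 3.0)

# After `cinder-manage db sync --bump-versions`, every service reports
# the latest version and the call is accepted:
print(can_call('3.9', ['3.16', '3.16']))  # -> True
```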