Bug 1791602 - [RHOSP-13] Stuck updates of deleted accounts crash account server in a loop
Summary: [RHOSP-13] Stuck updates of deleted accounts crash account server in a loop
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-swift
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Assignee: Pete Zaitcev
QA Contact: Andy Stillman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-01-16 09:06 UTC by Nilesh
Modified: 2023-07-11 20:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-07-11 20:37:11 UTC
Target Upstream Version:
Embargoed:
zaitcev: needinfo-


Attachments
test script 1 (2.28 KB, text/plain), 2021-02-10 03:56 UTC, Pete Zaitcev


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1861233 0 None None None 2020-01-28 22:57:33 UTC
OpenStack gerrit 704435 0 None NEW Mark a container reported if account was reclaimed 2021-02-15 06:06:54 UTC
OpenStack gerrit 743797 0 None NEW Band-aid and test the crash of the account server 2021-02-15 06:06:54 UTC
Red Hat Issue Tracker OSP-2175 0 None None None 2021-12-01 22:30:32 UTC

Comment 35 Gal Amado 2020-09-30 14:13:45 UTC
Won't reproduce.
It seems the debugging was done remotely on the customer's platform.

Comment 36 Pete Zaitcev 2020-10-01 04:11:14 UTC
(In reply to Gal Amado from comment #35)
> Won't reproduce.

It happens at the drop of a hat at customer sites. As far as I can
gather, the following happens:
1. They deploy a fairly vanilla cloud, with telemetry of course.
2. They back Gnocchi with Swift.
3. They run some test workloads.
4. Gnocchi is hitting Swift hard and keeps it busy. This includes
always having updates for containers, because every incoming metric
adds and removes objects, so containers always have outstanding
updates queued up.
5. For whatever reason, they suddenly re-deploy the overcloud.
Doing so creates an all-new swift.conf with a new hash, and a new
tenant ID for the account "gnocchi" in Keystone. However, they do
not wipe the Swift volumes, so old containers with outstanding
updates remain.
6. Now the scene is all set. Updaters find containers by scanning
the filesystem in /srv/node. Therefore, they find the old
containers with outstanding updates. However, when they try to
update the account, it's no longer there (because the hash prefix
has changed), and the account server crashes.
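
To illustrate why the old account can no longer be found in step 6, here is a minimal sketch. It assumes that Swift derives the on-disk location of an account from an MD5 over the hash prefix, the account path, and the hash suffix from swift.conf, which is a simplified imitation of swift.common.utils.hash_path and not the actual Swift code. The prefix and suffix values and the tenant IDs are made up for the example.

```python
import hashlib


def hash_path(prefix, suffix, account, container=None):
    """Simplified stand-in for Swift's hash_path (an assumption, not the real code).

    The hash that selects the on-disk location depends on the
    cluster-wide prefix/suffix from swift.conf as well as on the name.
    """
    parts = [account]
    if container:
        parts.append(container)
    data = prefix + b'/' + '/'.join(parts).encode('utf-8') + suffix
    return hashlib.md5(data).hexdigest()


# Hypothetical values: a redeploy generates a new hash and a new tenant ID.
old_location = hash_path(b'old-prefix', b'old-suffix', 'AUTH_oldtenant')
new_location = hash_path(b'new-prefix', b'new-suffix', 'AUTH_oldtenant')

# The same account name now maps to a different location, so an update
# sent for the old account lands where no such account exists.
print(old_location != new_location)
```

Even if the tenant ID had stayed the same, the changed hash alone would be enough to make the account appear to be missing.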

Comment 44 Pete Zaitcev 2021-02-10 03:56:25 UTC
Created attachment 1756086 [details]
test script 1

Finds containers with outstanding updates, #1.
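
The attached script is not reproduced here. As a rough idea of what such a scan could look like, the following is a hypothetical sketch, not the attachment itself. It assumes Python 3, that container databases are SQLite files ending in .db under a "containers" directory on each device in /srv/node, and that each one exposes a container_stat table or view with the reported_* columns named below. The function names are invented for the example.

```python
import os
import sqlite3

# Columns assumed to exist in container_stat (an assumption about the schema).
FIELDS = ('account', 'container',
          'put_timestamp', 'reported_put_timestamp',
          'delete_timestamp', 'reported_delete_timestamp',
          'object_count', 'reported_object_count',
          'bytes_used', 'reported_bytes_used')


def _ts(value):
    # Timestamps are stored as text; keep only the numeric part.
    return float(str(value or 0).split('_')[0])


def needs_update(info):
    """True if the container has state not yet reported to its account."""
    return (_ts(info['put_timestamp']) > _ts(info['reported_put_timestamp'])
            or _ts(info['delete_timestamp']) > _ts(info['reported_delete_timestamp'])
            or info['object_count'] != info['reported_object_count']
            or info['bytes_used'] != info['reported_bytes_used'])


def find_pending(root='/srv/node'):
    """Return (account, container, db_path) for containers with pending updates."""
    pending = []
    query = 'SELECT %s FROM container_stat' % ', '.join(FIELDS)
    for dirpath, _dirs, files in os.walk(root):
        if 'containers' not in dirpath.split(os.sep):
            continue
        for name in files:
            if not name.endswith('.db'):
                continue
            path = os.path.join(dirpath, name)
            try:
                # Open read-only so the scan does not modify a live database.
                conn = sqlite3.connect('file:%s?mode=ro' % path, uri=True)
                try:
                    row = conn.execute(query).fetchone()
                finally:
                    conn.close()
            except sqlite3.Error:
                continue  # unreadable, locked, or not a container database
            if row:
                info = dict(zip(FIELDS, row))
                if needs_update(info):
                    pending.append((info['account'], info['container'], path))
    return pending


if __name__ == '__main__':
    for account, container, path in find_pending():
        print(account, container, path)
```

The account names in the output can then be compared against the tenant IDs that currently exist in Keystone to spot containers left over from a previous deployment.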

Comment 60 Lon Hohberger 2023-07-11 20:37:11 UTC
OSP 13 was retired on June 27, 2023. No further work is expected to occur on this issue.

