Bug 1791602

Summary: [RHOSP-13] Stuck updates of deleted accounts crash account server in a loop
Product: Red Hat OpenStack Reporter: Nilesh <nchandek>
Component: openstack-swift Assignee: Pete Zaitcev <zaitcev>
Status: CLOSED EOL QA Contact:
Severity: high Docs Contact: Andy Stillman <astillma>
Priority: high    
Version: 13.0 (Queens) CC: cschwede, derekh, gamado, lmarsh, pgrist, satmakur, schhabdi, yocha, zaitcev
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: --- Flags: zaitcev: needinfo-
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-07-11 20:37:11 UTC Type: Bug
Attachments:
Description Flags
test script 1 none

Comment 35 Gal Amado 2020-09-30 14:13:45 UTC
Won't reproduce.
Seems like debugging was done remotely on the customer's platform.

Comment 36 Pete Zaitcev 2020-10-01 04:11:14 UTC
(In reply to Gal Amado from comment #35)
> Won't reproduce.

It happens at the drop of a hat at customer sites. As far as I can
gather, the sequence is:
1. They deploy a fairly vanilla cloud, with telemetry of course.
2. They back Gnocchi with Swift.
3. They run some test workloads.
4. Gnocchi is hitting Swift hard and keeps it busy. This includes
always having updates for containers, because every incoming metric
adds and removes objects, so containers always have outstanding
updates queued up.
5. For whatever reason, they suddenly re-deploy the overcloud.
Doing so creates an all-new swift.conf with its own hash prefix, and
a new tenant ID for the account "gnocchi" in Keystone. However, they
do not wipe the Swift volumes, so old containers with outstanding
updates remain.
6. Now the scene is all set. Updaters find containers by scanning
the filesystem in /srv/node. Therefore, they find the old containers
with outstanding updates. However, when they try to update the
account, it's no longer there (because the hash prefix has changed)
and the account server crashes.
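
To illustrate steps 5 and 6: the sketch below shows why a changed hash prefix makes the old account unreachable. It assumes Swift derives on-disk locations from an MD5 over prefix + "/account[/container]" + suffix, which is how I understand swift.common.utils.hash_path to work. The function here is a standalone stand-in, not Swift's own code, and the prefix, suffix and account values are made up.

```python
import hashlib
from typing import Optional


def hash_path(prefix: bytes, suffix: bytes, account: str,
              container: Optional[str] = None) -> str:
    """Standalone stand-in for Swift's path hashing (assumed scheme:
    md5(prefix + '/account[/container]' + suffix))."""
    parts = [account]
    if container:
        parts.append(container)
    path = "/" + "/".join(parts)
    return hashlib.md5(prefix + path.encode("utf-8") + suffix).hexdigest()


# Hypothetical values: the same account name hashed under the old and
# the new swift.conf settings.
old_hash = hash_path(b"oldprefix", b"oldsuffix", "AUTH_oldtenant")
new_hash = hash_path(b"newprefix", b"newsuffix", "AUTH_oldtenant")

# The two digests differ, so a lookup made with the new settings does
# not land on the data written under the old settings.
print(old_hash != new_hash)
```

The old container DBs still carry the old account name, but every lookup after the re-deploy is computed with the new prefix, so the update is directed at an account that does not exist at that location.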

Comment 44 Pete Zaitcev 2021-02-10 03:56:25 UTC
Created attachment 1756086 [details]
test script 1

Finds containers with outstanding updates, #1.
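
The attachment itself is not reproduced here. As a rough sketch of what such a scan might look like (not the attached script), the code below walks /srv/node-style directories for container DB files and reports those whose current stats differ from the stats last reported to the account. It assumes each container DB is an SQLite file ending in .db under <device>/containers/, exposing a container_stat table or view with the columns named in the query, and that timestamps are plain decimal strings. The function names are made up for illustration.

```python
import os
import sqlite3
from contextlib import closing

# Assumed column names of the container_stat table/view.
QUERY = """
    SELECT account, container,
           put_timestamp, reported_put_timestamp,
           delete_timestamp, reported_delete_timestamp,
           object_count, reported_object_count,
           bytes_used, reported_bytes_used
    FROM container_stat
"""


def find_container_dbs(root):
    """Yield paths of *.db files under root/<device>/containers/."""
    for device in sorted(os.listdir(root)):
        containers_dir = os.path.join(root, device, "containers")
        if not os.path.isdir(containers_dir):
            continue
        for dirpath, _dirs, files in os.walk(containers_dir):
            for name in files:
                if name.endswith(".db"):
                    yield os.path.join(dirpath, name)


def needs_update(info):
    """True if current stats differ from what was last reported."""
    return (
        float(info["put_timestamp"]) > float(info["reported_put_timestamp"])
        or float(info["delete_timestamp"])
        > float(info["reported_delete_timestamp"])
        or info["object_count"] != info["reported_object_count"]
        or info["bytes_used"] != info["reported_bytes_used"]
    )


def find_outstanding(root):
    """Yield (db_path, account, container) for DBs with pending updates."""
    for db_path in find_container_dbs(root):
        with closing(sqlite3.connect(db_path)) as conn:
            conn.row_factory = sqlite3.Row
            row = conn.execute(QUERY).fetchone()
        if row is not None and needs_update(row):
            yield db_path, row["account"], row["container"]


if __name__ == "__main__":
    for path, account, container in find_outstanding("/srv/node"):
        print(account, container, path)
```

On a live node the DB files may be locked or changing while this runs, so it is safer to point it at a copy of the volume.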

Comment 60 Lon Hohberger 2023-07-11 20:37:11 UTC
OSP 13 was retired on June 27, 2023. No further work is expected to occur on this issue.