Bug 1371212

Summary: radosgw-admin orphan find command doesn't complete
Product: Red Hat Ceph Storage Reporter: jquinn <jquinn>
Component: RGW Assignee: Matt Benjamin (redhat) <mbenjamin>
Status: CLOSED ERRATA QA Contact: Ramakrishnan Periyasamy <rperiyas>
Severity: medium Docs Contact: Bara Ancincova <bancinco>
Priority: unspecified    
Version: 1.3.2 CC: cbodley, ceph-eng-bugs, flucifre, hnallurv, jquinn, kbader, kdreyer, mbenjamin, owasserm, sweil, uboppana, vumrao
Target Milestone: rc   
Target Release: 2.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-10.2.5-23.el7cp Ubuntu: ceph_10.2.5-16redhat1xenial Doc Type: Bug Fix
Doc Text:
.The radosgw-admin orphan find command works as expected
When listing objects, a segment marker caused incorrect listing of a subset of the Ceph Object Gateway internal objects. This behavior caused the `radosgw-admin orphan find` command to enter an infinite loop. This bug has been fixed, and the `radosgw-admin orphan find` command now works correctly.
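The failure mode described in the Doc Text (a listing marker that does not advance, so the same entries are returned forever) can be illustrated with a minimal Python sketch. This is not the Ceph Object Gateway code and does not reflect the actual fix; `list_all`, `healthy_fetch`, `stuck_fetch`, and the sample keys are hypothetical names made up for illustration. It only shows how a marker-driven listing loop stalls, and how a guard can turn the stall into an error.

```python
from typing import Callable, List, Optional, Tuple

# A page is (entries, next_marker); next_marker is None when the listing is done.
Page = Tuple[List[str], Optional[str]]


def list_all(fetch_page: Callable[[Optional[str]], Page],
             max_pages: int = 10_000) -> List[str]:
    """Collect every entry from a marker-paginated listing (hypothetical helper)."""
    entries: List[str] = []
    marker: Optional[str] = None
    for _ in range(max_pages):
        page, next_marker = fetch_page(marker)
        entries.extend(page)
        if next_marker is None:
            return entries
        # Guard: without this check, a marker that never advances
        # would make the loop fetch the same page forever.
        if next_marker == marker:
            raise RuntimeError(f"listing marker did not advance past {marker!r}")
        marker = next_marker
    raise RuntimeError("page limit reached before the listing finished")


KEYS = ["a", "b", "c", "d", "e"]  # illustrative object names


def healthy_fetch(marker: Optional[str], size: int = 2) -> Page:
    """Well-behaved listing: each page starts after the marker."""
    start = 0 if marker is None else KEYS.index(marker) + 1
    page = KEYS[start:start + size]
    done = start + len(page) >= len(KEYS)
    return page, (None if done else page[-1])


def stuck_fetch(marker: Optional[str]) -> Page:
    """Faulty listing: always returns the same entry and the same marker."""
    return ["c"], "c"


print(list_all(healthy_fetch))
try:
    list_all(stuck_fetch)
except RuntimeError as err:
    print("detected:", err)
```

A loop without the guard would keep storing the same entry on every iteration, which matches the symptom of the same lines repeating in the output.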
Story Points: ---
Clone Of:
: 1497583 (view as bug list) Environment:
Last Closed: 2017-03-14 15:45:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1412948, 1497583    
Attachments:
Description Flags
debug logs none

Description jquinn 2016-08-29 15:06:11 UTC
Created attachment 1195410 [details]
debug logs

Description of problem:
The customer is running the radosgw-admin orphan find command to gather a list of orphans so that they can be cleaned up. The tool appears to run normally at first, but then enters a loop, repeating the same 2 lines over and over again (see below), and never progresses or completes.

We enabled debug logging and can see that the tool appears to hang on 1 object, whose name appears in the log file over 600K times.

Matt Benjamin has already looked through the logs and reproduced the issue. I am opening this ticket for official tracking purposes.

** repeated lines in log file** 
storing 1 entries at orphan.scan.nexttest27.linked.53
storing 1 entries at orphan.scan.nexttest27.linked.6

** debug logs** 

2016-08-12 09:18:29.826567 7fb7f9d28820 20 adding obj: default.1995992.63__shadow_30GB.bin.2~xjyegqAg1akHx1UzceywmVjKgz2qWLK

[jquinn@jquinn Aug 12]$ grep -c  default.1995992.63__multipart_30GB.bin.2~xjyegqAg1akHx1UzceywmVjKgz2qWLK orphan-find-2016-08-12-0846-part3
620604
[jquinn@jquinn Aug 12]$ 
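
The grep count above checks one known object name. When the stuck object is not known in advance, the same check can be run across the whole debug log. The following is a minimal Python sketch under stated assumptions: only the `adding obj:` prefix is taken from the log excerpt above, while the function name `most_repeated_objects` and the sample lines are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

PREFIX = "adding obj: "  # prefix as seen in the debug log excerpt


def most_repeated_objects(lines: Iterable[str],
                          top: int = 5) -> List[Tuple[str, int]]:
    """Count how often each object name follows 'adding obj: ' (hypothetical helper)."""
    counts: Counter = Counter()
    for line in lines:
        idx = line.find(PREFIX)
        if idx != -1:
            # Everything after the prefix is treated as the object name.
            counts[line[idx + len(PREFIX):].strip()] += 1
    return counts.most_common(top)


# Illustrative sample lines, not taken from the attached logs.
sample = [
    "2016-08-12 09:18:29.826567 7fb7f9d28820 20 adding obj: objA",
    "2016-08-12 09:18:29.826600 7fb7f9d28820 20 adding obj: objB",
    "2016-08-12 09:18:29.826650 7fb7f9d28820 20 adding obj: objA",
    "2016-08-12 09:18:29.826700 7fb7f9d28820 20 some other message",
    "2016-08-12 09:18:29.826750 7fb7f9d28820 20 adding obj: objA",
]

for name, count in most_repeated_objects(sample):
    print(count, name)
```

For a real log, the function can be passed an open file object instead of the sample list, since it only iterates over lines. An object whose count is far above the rest is a candidate for the entry the tool is looping on.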


Version-Release number of selected component (if applicable):

On the gateway node the following version is installed:
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

on the OSD and monitor nodes we have:
ceph version 0.94.3.3 (7a9c85cd09c58d4a03bf44d444c0b2af0938226a)
which is RHCS 1.3.1. For your information: our CD10000 Ceph appliance is using RHCS 1.3.1 in the latest released version.

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 15 Ramakrishnan Periyasamy 2017-02-15 07:34:43 UTC
Tried to verify this BZ in build 10.2.5-26.el7cp; not seeing any issues.
The command executes without any hang or failure.

The command execution stdout is available at this pastebin location: http://pastebin.test.redhat.com/455612

Moving this bug to verified state.

Comment 20 errata-xmlrpc 2017-03-14 15:45:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html