Bug 1388647 - [1.3.3/0.94.9-5.el7cp] radosgw-admin fails during purgedata
Summary: [1.3.3/0.94.9-5.el7cp] radosgw-admin fails during purgedata
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW
Version: 1.3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 1.3.4
Assignee: Orit Wasserman
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-25 20:00 UTC by Vasu Kulkarni
Modified: 2022-02-21 18:17 UTC (History)
8 users

Fixed In Version: RHEL: ceph-0.94.9-8.el7cp Ubuntu: ceph_0.94.9-9redhat1trusty
Doc Type: Bug Fix
Doc Text:
After backporting certain upstream changes, an attempt to delete objects with underscore characters failed. The change causing this bug has been removed, and objects can now be deleted as expected.
Clone Of:
Environment:
Last Closed: 2016-12-01 21:39:09 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 18063 0 None None None 2016-11-29 10:07:27 UTC
Red Hat Product Errata RHSA-2016:2847 0 normal SHIPPED_LIVE Moderate: Red Hat Ceph Storage 1.3 security, bug fix, and enhancement update 2016-12-02 02:38:33 UTC

Description Vasu Kulkarni 2016-10-25 20:00:10 UTC
Description of problem:


At the end of the s3 tests, the radosgw-admin CLI fails during purge-data:

2016-10-24T19:59:02.228 INFO:teuthology.orchestra.run.magna055.stderr:s3tests.fuzz.test.test_fuzzer.test_expand_headers ... ok
2016-10-24T19:59:02.229 INFO:teuthology.orchestra.run.magna055.stderr:test_realistic.TestFileValidator.test_new_file_is_valid ... ok
2016-10-24T19:59:02.269 INFO:teuthology.orchestra.run.magna055.stderr:test_realistic.TestFileValidator.test_new_file_is_valid_on_several_calls ... ok
2016-10-24T19:59:02.270 INFO:teuthology.orchestra.run.magna055.stderr:test_realistic.TestFileValidator.test_new_file_is_valid_when_size_is_1 ... ok
2016-10-24T19:59:02.310 INFO:teuthology.orchestra.run.magna055.stderr:test_realistic.TestFiles.test_random_file_valid ... ok
2016-10-24T19:59:02.311 INFO:teuthology.orchestra.run.magna055.stderr:
2016-10-24T19:59:02.351 INFO:teuthology.orchestra.run.magna055.stderr:----------------------------------------------------------------------
2016-10-24T19:59:02.352 INFO:teuthology.orchestra.run.magna055.stderr:Ran 323 tests in 273.024s
2016-10-24T19:59:02.392 INFO:teuthology.orchestra.run.magna055.stderr:
2016-10-24T19:59:02.393 INFO:teuthology.orchestra.run.magna055.stderr:OK (SKIP=4)
2016-10-24T19:59:02.435 INFO:tasks.s3tests:Cleaning up boto...
2016-10-24T19:59:02.436 INFO:teuthology.orchestra.run.magna055:Running: 'rm /home/ubuntu/cephtest/boto.cfg'
2016-10-24T19:59:02.525 INFO:teuthology.orchestra.run.magna055:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid bar.client.0 --purge-data'
2016-10-24T19:59:02.944 INFO:teuthology.orchestra.run.magna055:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage radosgw-admin -n client.0 user rm --uid foo.client.0 --purge-data'
2016-10-24T19:59:16.235 INFO:teuthology.orchestra.run.magna055.stderr:2016-10-24 23:59:04.578046 7f28df6db880  0 RGWObjManifest::operator++(): result: ofs=3145728 stripe_ofs=3145728 part_ofs=0 rule->part_size=0
2016-10-24T19:59:16.260 INFO:teuthology.orchestra.run.magna055.stderr:could not remove user: unable to remove user, unable to delete user data
2016-10-24T19:59:16.261 INFO:tasks.s3tests:Removing s3-tests...
2016-10-24T19:59:16.302 INFO:teuthology.orchestra.run.magna055:Running: 'rm -rf /home/ubuntu/cephtest/s3-tests'
2016-10-24T19:59:16.398 ERROR:teuthology.run_tasks:Saw exception from tasks.


Full logs:
http://magna002.ceph.redhat.com/vasu-2016-10-24_18:43:28-smoke-hammer---basic-magna/257458/teuthology.log

Comment 2 Orit Wasserman 2016-10-26 15:35:50 UTC
It seems the test called the "user rm" command twice, and the second call returned ENOENT because the user had already been removed.

Comment 3 Vasu Kulkarni 2016-10-26 16:18:21 UTC
Orit,

the code hasn't changed and it worked before, but I do see an issue in the code, since the config should never have had a second client:

https://github.com/ceph/ceph-qa-suite/blob/hammer/tasks/s3tests.py#L216-L230

Do you think ENOENT is now expected, perhaps depending on hardware speed?

Comment 4 Orit Wasserman 2016-10-26 17:23:52 UTC
This is the first time I have seen this problem; so far I just see the double logging.
I am not sure what happened, as the configuration seems fine.

We don't have any radosgw logs (the log level is too low), so we cannot be sure there is no other error.
Can you try running again with a higher log level for radosgw?
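For reference, one way to raise the verbosity is through a ceph.conf fragment on the gateway host (a sketch only: the section name client.radosgw.gateway is an example and must match the actual rgw client name; 20 is the usual maximum debug level, and debug objclass covers the cls side mentioned later in this bug):

```ini
# Example ceph.conf fragment (section name is hypothetical - match your rgw client)
[client.radosgw.gateway]
    debug rgw = 20
    debug objclass = 20
    debug ms = 1
```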

Comment 5 Vasu Kulkarni 2016-10-26 22:36:30 UTC
I looked at it more closely: in the second command the user is 'foo' and in the first one it's 'bar', so it is failing for 'foo'. I will try to hand over the system in the same state.

Comment 6 Vasu Kulkarni 2016-10-27 18:44:47 UTC
This is what I see in the logs; one of the bucket delete operations returns -2:

2016-10-27 18:29:47.929434 7ffa7a7fc700  1 ====== req done req=0x7ffa6400d490 http_status=200 ======
2016-10-27 18:29:47.929467 7ffa7a7fc700  1 civetweb: 0x7ffa640008c0: 127.0.0.1 - - [27/Oct/2016:18:29:47 +0000] "GET /test-client.0-imqaibnu3v3kufj-/ HTTP/1.1" 200 0 - Boto/2.43.0 Python/2.7.5 Linux/3.10.0-327.36.3.el7.x86_64
2016-10-27 18:29:47.932001 7ff9d7fff700  1 ====== starting new request req=0x7ff9c8031bb0 =====
2016-10-27 18:29:48.006921 7ff9d7fff700  1 ====== req done req=0x7ff9c8031bb0 http_status=204 ======
2016-10-27 18:29:48.006963 7ff9d7fff700  1 civetweb: 0x7ff9c80008c0: 127.0.0.1 - - [27/Oct/2016:18:29:47 +0000] "DELETE /test-client.0-imqaibnu3v3kufj-/foo HTTP/1.1" 204 0 - Boto/2.43.0 Python/2.7.5 Linux/3.10.0-327.36.3.el7.x86_64
2016-10-27 18:29:48.008890 7ff9f4ff9700  1 ====== starting new request req=0x7ff9d000d980 =====
2016-10-27 18:29:48.491281 7ff9f4ff9700  1 ====== req done req=0x7ff9d000d980 http_status=204 ======
2016-10-27 18:29:48.491317 7ff9f4ff9700  1 civetweb: 0x7ff9d00008c0: 127.0.0.1 - - [27/Oct/2016:18:29:48 +0000] "DELETE /test-client.0-imqaibnu3v3kufj-/ HTTP/1.1" 204 0 - Boto/2.43.0 Python/2.7.5 Linux/3.10.0-327.36.3.el7.x86_64
2016-10-27 18:32:35.567047 7ffafabf9700  0 ERROR: rgw_bucket_sync_user_stats() for user=foo.client.0, bucket=test-client.0-imqaibnu3v3kufj-(@{i=.rgw.buckets.index,e=.rgw.buckets.extra}.rgw.buckets[default.4128.196]) returned -2
2016-10-27 18:32:35.567063 7ffafabf9700  0 WARNING: sync_bucket() returned r=-2
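The r=-2 above is a negated errno value (assuming the usual Ceph convention of returning -errno); a quick check with Python's errno module shows it maps to ENOENT, consistent with the "already removed" theory in comment 2:

```python
import errno
import os

ret = -2  # return value seen from sync_bucket() / rgw_bucket_sync_user_stats()

# Ceph internals return negated errno values; flip the sign to look it up.
name = errno.errorcode[-ret]
message = os.strerror(-ret)

print(name, "-", message)  # ENOENT - No such file or directory
```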

Comment 8 Orit Wasserman 2016-10-31 18:22:59 UTC
This is the error:

2016-10-28 19:02:32.534216 7f1750ff9700 20 cls_bucket_header() returned -2
2016-10-28 19:02:32.534219 7f1750ff9700  0 ERROR: rgw_bucket_sync_user_stats() for user=foo.client.0, bucket=test-client.0-84e5wlt0fkx0mg4-172(@{i=.rgw.buckets.index,e=.rgw.buckets.extra}.rgw.buckets[default.4121.175]) returned -2
2016-10-28 19:02:32.534225 7f1750ff9700  0 WARNING: sync_bucket() returned r=-2

We don't have cls logs.
I will try to reproduce it locally.

Comment 19 Vasu Kulkarni 2016-11-30 00:43:26 UTC
Verified in v0.94.9-8

Comment 24 errata-xmlrpc 2016-12-01 21:39:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2847.html

