Bug 1358275
| Summary: | Rados df gives wrong degraded object count | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | anmol babu <anbabu> |
| Component: | RADOS | Assignee: | Josh Durgin <jdurgin> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 2.0 | CC: | anbabu, ceph-eng-bugs, dzafman, kchai, kdreyer, sjust |
| Target Milestone: | rc | | |
| Target Release: | 2.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-09-21 18:44:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1349913 | | |
The builds listed above are pretty old. Please confirm that this is still happening with the latest builds (10.2.2-24.el7cp).

It's actually OK for there to be more degraded objects than objects. If the pool is configured for 4 replicas but you only have 2, each object is degraded twice. This appears to have happened with 2 OSDs and pool size=3, however, so it seems like each object should have been degraded only once. Possibly a bug with constructing the stats. I do not think this should be a 2.0 blocker.

I haven't been able to reproduce this myself. One possible cause would be PGs getting mapped to a smaller acting set than expected, as can happen with older CRUSH tunables. Can you reproduce with OSD debugging enabled (debug osd = 20, debug ms = 1) and post the OSD logs and the output of 'ceph pg dump'?

Closing on the assumption that it's just the normal behavior, absent any other information.
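The replica arithmetic in the comment above can be sketched as follows. This is a simplified illustrative model, not Ceph's actual stats code; the function name and signature are invented for the example:

```python
def expected_degraded(num_objects, pool_size, replicas_available):
    """Rough model of the degraded-object count (illustrative only).

    Each object counts as degraded once per missing replica, so the
    total can legitimately exceed num_objects whenever more than one
    replica is missing.
    """
    missing_replicas = max(pool_size - replicas_available, 0)
    return num_objects * missing_replicas

# Pool configured for 4 replicas but only 2 present: degraded twice per object.
print(expected_degraded(4, 4, 2))  # 8

# Pool size 3 with 2 OSDs up: expect one degraded copy per object.
print(expected_degraded(5, 3, 2))  # 5
```

For the cluster in this report (5 objects, pool size 3, 2 OSDs up) this model predicts 5 degraded objects, while `ceph -s` below reports 10/15, which is the discrepancy the reporter observed.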
Description of problem:

The `rados df` CLI command shows a number of degraded objects greater than the total number of objects when some of the PGs are degraded.

Version-Release number of selected component (if applicable):

Mon rpms:
```
rpm -qa | grep ceph
ceph-selinux-10.2.2-5.el7cp.x86_64
python-cephfs-10.2.2-5.el7cp.x86_64
ceph-common-10.2.2-5.el7cp.x86_64
ceph-base-10.2.2-5.el7cp.x86_64
libcephfs1-10.2.2-5.el7cp.x86_64
ceph-mon-10.2.2-5.el7cp.x86_64
```

OSD rpms:
```
rpm -qa | grep ceph
ceph-selinux-10.2.2-9.el7cp.x86_64
ceph-common-10.2.2-9.el7cp.x86_64
ceph-base-10.2.2-9.el7cp.x86_64
libcephfs1-10.2.2-9.el7cp.x86_64
python-cephfs-10.2.2-9.el7cp.x86_64
ceph-osd-10.2.2-9.el7cp.x86_64
```

How reproducible: Frequently

Steps to Reproduce:
1. Prepare a pool and add several objects to it, e.g. 4 objects.
2. Remove some OSDs so that fewer OSDs remain than the pool's replica count requires.
3. Create another object.

Actual results: degraded object count > total object count

Expected results: The degraded object count should not be more than the total object count.
Additional info:

```
rados df --cluster c1 --format json
{"pools":[{"name":"p1","id":"1","size_bytes":"114","size_kb":"1","num_objects":"4","num_object_clones":"0","num_object_copies":"12","num_objects_missing_on_primary":"0","num_objects_unfound":"0","num_objects_degraded":"8","read_ops":"5483","read_bytes":"4009984","write_ops":"8","write_bytes":"2048"},{"name":"p2","id":"2","size_bytes":"0","size_kb":"0","num_objects":"1","num_object_clones":"0","num_object_copies":"3","num_objects_missing_on_primary":"0","num_objects_unfound":"0","num_objects_degraded":"2","read_ops":"0","read_bytes":"0","write_ops":"2","write_bytes":"0"}],"total_objects":"5","total_used":"74284","total_avail":"31360428","total_space":"31434712"}
```

```
ceph -s --cluster c1
    cluster ef7329fe-01e5-4b60-8427-71112db95c9d
     health HEALTH_WARN
            256 pgs degraded
            256 pgs stuck unclean
            256 pgs undersized
            recovery 10/15 objects degraded (66.667%)
     monmap e1: 1 mons at {dhcp41-235=10.70.41.235:6789/0}
            election epoch 3, quorum 0 dhcp41-235
     osdmap e37: 2 osds: 2 up, 2 in
            flags sortbitwise
      pgmap v700: 256 pgs, 2 pools, 114 bytes data, 5 objects
            74284 kB used, 30625 MB / 30697 MB avail
            10/15 objects degraded (66.667%)
                 256 active+undersized+degraded
```
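As a quick sanity check, the JSON above can be scanned for pools whose degraded count exceeds their object count. This is an illustrative helper, not a Ceph tool; the embedded JSON is trimmed from the `rados df` output above to the fields the check uses:

```python
import json

# Trimmed from the `rados df --cluster c1 --format json` output in this report.
RADOS_DF = '''
{"pools": [
  {"name": "p1", "num_objects": "4", "num_object_copies": "12",
   "num_objects_degraded": "8"},
  {"name": "p2", "num_objects": "1", "num_object_copies": "3",
   "num_objects_degraded": "2"}
]}
'''

def suspicious_pools(rados_df_json):
    """Return names of pools where degraded objects outnumber objects.

    Note: rados df reports these counters as strings, so they must be
    converted before comparing.
    """
    doc = json.loads(rados_df_json)
    return [p["name"] for p in doc["pools"]
            if int(p["num_objects_degraded"]) > int(p["num_objects"])]

print(suspicious_pools(RADOS_DF))  # ['p1', 'p2']
```

As the triage comment above notes, degraded > objects is not always a bug (each missing replica counts separately), so pools flagged this way only warrant a closer look when fewer replicas are missing than the ratio implies.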