1739233 – [Ceph] ceph -s json output is inconsistent for osdmap section

Bug 1739233 - [Ceph] ceph -s json output is inconsistent for osdmap section

Summary: [Ceph] ceph -s json output is inconsistent for osdmap section

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	RADOS
Sub Component:
Version:	3.2
Hardware:	All
OS:	All
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	4.0
Assignee:	David Zafman
QA Contact:	Manohar Murthy
Docs Contact:	Amrita
URL:
Whiteboard:
Depends On:
Blocks:	1730176
TreeView+	depends on / blocked

Reported:	2019-08-08 19:48 UTC by jquinn
Modified:	2020-01-31 12:47 UTC (History)
CC List:	13 users (show)
Fixed In Version:	ceph-14.2.4-59.el8cp, ceph-14.2.4-9.el7cp
Doc Type:	Bug Fix
Doc Text:	.Deprecated JSON fields have been removed This update removes deprecated fields from the JSON output of the `ceph status` command.
Clone Of:
Environment:
Last Closed:	2020-01-31 12:46:52 UTC
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Links
System	ID	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	42015	None	None	None	2019-09-23 21:29:16 UTC
Github	ceph ceph pull 30900	'None'	closed	nautilus: osd: Remove unused osdmap flags full, nearfull from output	2020-10-19 20:19:15 UTC
Red Hat Product Errata	RHBA-2020:0312	None	None	None	2020-01-31 12:47:23 UTC

Description jquinn 2019-08-08 19:48:08 UTC

Description of problem: When an OSD becomes full/nearfull it's being flagged in the ceph -s output.  When we check the ceph -s -f json-pretty output though it's flagged under the "health" section as nearfull, but under the osdmap section the "full" and "nearfull" values remain set to false.   


Version-Release number of selected component (if applicable):12.2.8-89.el7cp


How reproducible: Can reproduce by filling OSD 


Steps to Reproduce:
1. Create files in /var/lib/ceph/osd/ceph-<osd id> to fill to full or almost full threshold
2. ceph -s -f json-pretty
3.

Actual results:

[root@site1-2 ceph-2]# ceph -s -f json-pretty

{
    "fsid": "e0b84dbb-63c6-4227-bf4d-439c28b540c2",
    "health": {
        "checks": {
            "OSD_NEARFULL": {
                "severity": "HEALTH_WARN",
                "summary": {
                    "message": "1 nearfull osd(s)"
                }
            },
            "POOL_NEARFULL": {
                "severity": "HEALTH_WARN",
                "summary": {
                    "message": "30 pool(s) nearfull"
                }
            },
            "MANY_OBJECTS_PER_PG": {
                "severity": "HEALTH_WARN",
                "summary": {
                    "message": "1 pools have many more objects per pg than average"
                }
            },
            "POOL_APP_NOT_ENABLED": {
                "severity": "HEALTH_WARN",
                "summary": {
                    "message": "application not enabled on 1 pool(s)"
                }
            }
        },
        "status": "HEALTH_WARN",
        "summary": [
            {
                "severity": "HEALTH_WARN",
                "summary": "'ceph health' JSON format has changed in luminous. If you see this your monitoring system is scraping the wrong fields. Disable this with 'mon health preluminous compat warning = false'"
            }
        ],
        "overall_status": "HEALTH_WARN"
    },
    "election_epoch": 51,
    "quorum": [
        0
    ],
    "quorum_names": [
        "site1-1"
    ],
    "monmap": {
        "epoch": 2,
        "fsid": "e0b84dbb-63c6-4227-bf4d-439c28b540c2",
        "modified": "2019-01-15 12:05:42.550312",
        "created": "2019-01-15 12:05:42.550312",
        "features": {
            "persistent": [
                "kraken",
                "luminous"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "site1-1",
                "addr": "10.74.131.9:6789/0",
                "public_addr": "10.74.131.9:6789/0"
            }
        ]
    },
    "osdmap": {
        "osdmap": {
            "epoch": 1647,
            "num_osds": 9,
            "num_up_osds": 8,
            "num_in_osds": 8,
            "full": false,
            "nearfull": false,
            "num_remapped_pgs": 0
        }
    },
    "pgmap": {
        "pgs_by_state": [
            {
                "state_name": "active+clean",
                "count": 368
            }
        ],
        "num_pgs": 368,
        "num_pools": 30,
        "num_objects": 406438,
        "data_bytes": 415559695,
        "bytes_used": 19873714176,
        "bytes_avail": 64045756416,
        "bytes_total": 83919470592,
        "read_bytes_sec": 17560,
        "read_op_per_sec": 17,
        "write_op_per_sec": 0
    },
    "fsmap": {
        "epoch": 1,
        "by_rank": []
    },
    "mgrmap": {
        "epoch": 335,
        "active_gid": 425143,
        "active_name": "site1-1",
        "active_addr": "10.74.131.9:6800/11457",
        "available": true,
        "standbys": [],
        "modules": [
            "prometheus",
            "status"
        ],
        "available_modules": [
            "balancer",
            "dashboard",
            "influx",
            "localpool",
            "prometheus",
            "restful",
            "selftest",
            "status",
            "zabbix"
        ],
        "services": {
            "prometheus": "http://site1-1.home:9283/"
        }
    },
    "servicemap": {
        "epoch": 71,
        "modified": "2019-07-24 15:04:27.562961",
        "services": {
            "rgw": {
                "daemons": {
                    "summary": "",
                    "site1-5": {
                        "start_epoch": 71,
                        "start_stamp": "2019-07-24 15:04:27.028689",
                        "gid": 425078,
                        "addr": "10.74.130.180:0/3735715916",
                        "metadata": {
                            "arch": "x86_64",
                            "ceph_version": "ceph version 12.2.8-52.el7cp (3af3ca15b68572a357593c261f95038d02f46201) luminous (stable)",
                            "cpu": "Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz",
                            "distro": "rhel",
                            "distro_description": "Employee SKU",
                            "distro_version": "7.5",
                            "frontend_config#0": "civetweb port=10.74.130.180:8080 num_threads=100",
                            "frontend_type#0": "civetweb",
                            "hostname": "site1-5",
                            "kernel_description": "#1 SMP Wed Mar 21 18:14:51 EDT 2018",
                            "kernel_version": "3.10.0-862.el7.x86_64",
                            "mem_swap_kb": "421884",
                            "mem_total_kb": "1015508",
                            "num_handles": "1",
                            "os": "Linux",
                            "pid": "24993",
                            "zone_id": "2a4ab455-30b5-42f6-bb65-54bab4cd9de4",
                            "zone_name": "merritt",
                            "zonegroup_id": "61fb9103-18d3-4e77-a3e4-7b6bf9ad1b0f",
                            "zonegroup_name": "us-zonegroup"
                        }
                    }
                }
            }
        }
    }
}




Expected results: The nearfull and full lines in the osdmap section should have the same status as the health section above. 


   "osdmap": {
        "osdmap": {
            "epoch": 1647,
            "num_osds": 9,
            "num_up_osds": 8,
            "num_in_osds": 8,
            "full": False,    <<---- these lines should update. 
            "nearfull": True,  <<---- these lines should update. 
            "num_remapped_pgs": 0




Additional info:  I tested marking an OSD down to see if other sections of the OSDMap are being updated, and they are working correctly.  

** If I stop an OSD it recognized that the OSD is now 'down' **  

  "osdmap": {
        "osdmap": {
            "epoch": 1644,
            "num_osds": 9,
            "num_up_osds": 7,
            "num_in_osds": 8,
            "full": false,
            "nearfull": false,
            "num_remapped_pgs": 0

** I bring the osd back 'up' and it's reflected ** 
    "osdmap": {
        "osdmap": {
            "epoch": 1647,
            "num_osds": 9,
            "num_up_osds": 8,
            "num_in_osds": 8,
            "full": false,
            "nearfull": false,
            "num_remapped_pgs": 0
        }
    },

Comment 16 errata-xmlrpc 2020-01-31 12:46:52 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0312

Note You need to log in before you can comment on or make changes to this bug.