Bug 1473780 - check_volume_status.py from gluster-nagios-addons crashes when requesting -t self-heal with a missing brick
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: unclassified
Version: mainline
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Sahina Bose
Reported: 2017-07-21 16:00 UTC by Ted Miller
Modified: 2018-09-18 09:08 UTC
Last Closed: 2018-09-18 09:08:14 UTC

Description Ted Miller 2017-07-21 16:00:41 UTC
Description of problem: check_volume_status.py from gluster-nagios-addons crashes when run with -t self-heal on a volume with a missing brick, e.g.:
./check_volume_status.py -v <volume_name> -t self-heal

Version-Release number of selected component (if applicable):
gluster-nagios-addons.rpm 1.1.0
gluster-nagios-common 1.1.0
glusterfs 3.7.20

How reproducible: 100%

Steps to Reproduce:
1. Use a replica 5 volume with one brick offline
2. ./check_volume_status.py -v <volume_name> -t self-heal

Actual results:
Traceback (most recent call last):
  File "./check_volume_status.py", line 176, in <module>
    exitstatus, message = getVolumeSelfHealSplitBrainStatus(args)
  File "./check_volume_status.py", line 88, in getVolumeSelfHealSplitBrainStatus
    volume = glustercli.volumeHealSplitBrainStatus(args.volume)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 639, in volumeHealSplitBrainStatus
    return _volumeHealCommandOutput(volumeName, command, remoteServer)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 657, in _volumeHealCommandOutput
    value = _parseVolumeSelfHealInfo(out)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 508, in _parseVolumeSelfHealInfo
    entries = int(line.split(':')[1])
ValueError: invalid literal for int() with base 10: '-'
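
The failing conversion can be reproduced in isolation. The snippet below is only a sketch: it reuses the split-brain line reported for the disconnected brick and applies the same split-and-convert step that the traceback points at, outside of glustercli.py.

```python
# Heal-info line reported for the brick that is not connected.
line = 'Number of entries in split-brain: -'

# Same extraction as in the traceback: take the text after the first colon.
field = line.split(':')[1]

try:
    entries = int(field)
    failed = False
except ValueError as err:
    # A lone '-' is not a valid integer literal, so int() raises here.
    failed = True
    print('int() rejected %r: %s' % (field, err))
```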

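Expected results: the script should not crash. The lines below (formatted for easy reading) are the heal info output being parsed, followed by the message the plugin should report: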
['Brick 10.130.12.121:/bricks/brick_songs1/songs1', 
'Status: Connected', 
'Number of entries in split-brain: 0', 
'', 
'Brick 10.130.12.131:/bricks/brick_songs1/songs1', 
'Status: Connected', 
'Number of entries in split-brain: 0', 
'', 
'Brick 10.130.12.111:/bricks/brick_songs1/songs1', 
'Status: Connected', 
'Number of entries in split-brain: 0', 
'', 
'Brick 10.130.12.109:/bricks/brick_songs1/songs1', 
'Status: Transport endpoint is not connected', 
'Number of entries in split-brain: -', 
'', 
'Brick 10.130.12.105:/bricks/brick_songs1/songs1', 
'Status: Connected', 
'Number of entries in split-brain: 0', 
'']
No split brain state entries found. (or maybe an error message?)

Additional info:
The problem is in _parseVolumeSelfHealInfo() in glustercli.py from gluster-nagios-common (the following excerpt is from the current GitHub source):

def _parseVolumeSelfHealInfo(out):
    value = {}
    splitbrainentries = 0
    for line in out:
        if line.startswith('Number of entries'):
            entries = int(line.split(':')[1])

As can be seen in the "Expected results", the fourth "Number of entries in split-brain" line makes the last code line above crash, because it has a '-' instead of an integer after the colon.
(Expected result above was obtained by inserting a print statement near the beginning of the code snippet above.)
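
One possible way to tolerate the '-' value is sketched below. This is not the actual gluster-nagios-common code: count_split_brain_entries() is a hypothetical standalone helper, and its return shape (a total plus a list of bricks that reported no count) is an assumption, since the rest of _parseVolumeSelfHealInfo() is not shown in this report.

```python
def count_split_brain_entries(lines):
    """Sum split-brain entry counts, tolerating bricks that report '-'.

    Hypothetical helper, not part of glustercli.py.
    Returns (total_entries, bricks_without_count).
    """
    total = 0
    unknown_bricks = []
    current_brick = None
    for line in lines:
        if line.startswith('Brick '):
            # Remember which brick the following lines belong to.
            current_brick = line[len('Brick '):].strip()
        elif line.startswith('Number of entries'):
            # Take the text after the last colon and drop whitespace.
            field = line.rsplit(':', 1)[1].strip()
            if field.isdigit():
                total += int(field)
            else:
                # e.g. '-' for a brick that is not connected
                unknown_bricks.append(current_brick)
    return total, unknown_bricks


sample = [
    'Brick 10.130.12.111:/bricks/brick_songs1/songs1',
    'Status: Connected',
    'Number of entries in split-brain: 0',
    '',
    'Brick 10.130.12.109:/bricks/brick_songs1/songs1',
    'Status: Transport endpoint is not connected',
    'Number of entries in split-brain: -',
    '',
]
total, unknown = count_split_brain_entries(sample)
print('%d entries, no count for: %s' % (total, unknown))
```

Returning the bricks without a count separately would let the plugin decide whether to report "No split brain state entries found." or a warning about the unreachable brick, which is the open question in the expected results.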

Comment 1 Amar Tumballi 2018-09-18 09:08:14 UTC
The gluster-nagios plugin is no longer planned to be maintained. Please post any concerns or questions about alternative setups on our mailing lists.

