Description of problem:
check_volume_status.py from gluster-nagios-addons crashes when run with -t self-heal on a volume that has a missing brick, e.g.

    ./check_volume_status.py -v <volume_name> -t self-heal

Version-Release number of selected component (if applicable):
gluster-nagios-addons.rpm 1.1.0
gluster-nagios-common 1.1.0
glusterfs 3.7.20

How reproducible:
100%

Steps to Reproduce:
1. Use a replica 5 volume with one brick offline
2. ./check_volume_status.py -v <volume_name> -t self-heal

Actual results:
Traceback (most recent call last):
  File "./check_volume_status.py", line 176, in <module>
    exitstatus, message = getVolumeSelfHealSplitBrainStatus(args)
  File "./check_volume_status.py", line 88, in getVolumeSelfHealSplitBrainStatus
    volume = glustercli.volumeHealSplitBrainStatus(args.volume)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 639, in volumeHealSplitBrainStatus
    return _volumeHealCommandOutput(volumeName, command, remoteServer)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 657, in _volumeHealCommandOutput
    value = _parseVolumeSelfHealInfo(out)
  File "/usr/lib64/python2.7/site-packages/glusternagios/glustercli.py", line 508, in _parseVolumeSelfHealInfo
    entries = int(line.split(':')[1])
ValueError: invalid literal for int() with base 10: '-'

Expected results:
(formatted for easy reading)
['Brick 10.130.12.121:/bricks/brick_songs1/songs1',
 'Status: Connected',
 'Number of entries in split-brain: 0',
 '',
 'Brick 10.130.12.131:/bricks/brick_songs1/songs1',
 'Status: Connected',
 'Number of entries in split-brain: 0',
 '',
 'Brick 10.130.12.111:/bricks/brick_songs1/songs1',
 'Status: Connected',
 'Number of entries in split-brain: 0',
 '',
 'Brick 10.130.12.109:/bricks/brick_songs1/songs1',
 'Status: Transport endpoint is not connected',
 'Number of entries in split-brain: -',
 '',
 'Brick 10.130.12.105:/bricks/brick_songs1/songs1',
 'Status: Connected',
 'Number of entries in split-brain: 0',
 '']
No split brain state entries found.
(or maybe an error message?)

Additional info:
The problem is in _parseVolumeSelfHealInfo() from glustercli.py in gluster-nagios-common (the following is from current GitHub):

    def _parseVolumeSelfHealInfo(out):
        value = {}
        splitbrainentries = 0
        for line in out:
            if line.startswith('Number of entries'):
                entries = int(line.split(':')[1])

As can be seen in the list under "Expected results", the 4th "Number of entries in split-brain" line makes the last line of the snippet crash, because it has a '-' instead of an integer after the colon. (The list above was obtained by inserting a print statement near the beginning of the code snippet.)
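A possible way to make the parser tolerate the '-' placeholder is sketched below. This is only an illustration, not the upstream fix: the function name parse_split_brain_entries and the keys of the returned dict are hypothetical, and it assumes the heal output has already been split into a list of lines like the one shown under "Expected results".

```python
def parse_split_brain_entries(out):
    """Sum split-brain entry counts from heal-info output lines.

    Hypothetical helper: the name and the returned keys are assumptions,
    not part of gluster-nagios-common.
    """
    total = 0
    unreachable = 0
    for line in out:
        if not line.startswith('Number of entries'):
            continue
        # Take the text after the last ':' and drop surrounding whitespace.
        count = line.rpartition(':')[2].strip()
        try:
            entries = int(count)
        except ValueError:
            # A disconnected brick reports '-' instead of a number;
            # count it separately instead of crashing.
            unreachable += 1
            continue
        if entries > 0:
            total += entries
    return {'split_brain_entries': total, 'unreachable_bricks': unreachable}
```

With a helper like this, the caller could decide whether an unreachable brick should produce a warning or be ignored, which covers both outcomes suggested under "Expected results".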
The gluster-nagios plugin is no longer planned to be maintained. Please post any concerns or questions about alternative setups to our mailing lists.