180414 – Error: Cannot open resource file '/etc/nagios/private/resource.cfg' for reading!

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	nagios
Sub Component:
Version:	rawhide
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Mike McGrath
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2006-02-08 00:07 UTC by R. Michael Richer
Modified:	2007-11-30 22:11 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2006-08-03 14:44:30 UTC
Type:	---
Embargoed:
Dependent Products:

Description R. Michael Richer 2006-02-08 00:07:57 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1

Description of problem:
Nagios runs fine for a couple of minutes, then every plugin check starts failing with the plugin error "-127 out of bounds". The first error reported in the log is "Error: Cannot open resource file '/etc/nagios/private/resource.cfg' for reading!". (Nagios also received a SIGHUP from somewhere, or at least it thinks it did.)

Default permissions on the file are 0440 with root.root as owner/group.
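
To confirm the permissions described above, one option is to print the mode and ownership of the file and its parent directory. This is only a sketch: it assumes GNU `stat` (the `-c` format option), and `show_perms` is a hypothetical helper name, not something shipped with Nagios.

```shell
#!/bin/sh
# Hypothetical helper: print "<octal mode> <owner>:<group> <path>" for each argument.
# Assumes GNU coreutils stat; BSD stat does not support -c.
show_perms() {
    for path in "$@"; do
        stat -c '%a %U:%G %n' "$path"
    done
}

# Example with the paths from this report (likely needs root, since the
# directory is not searchable by other users):
# show_perms /etc/nagios/private /etc/nagios/private/resource.cfg
```

If the mode has no read bit for the account Nagios runs as, a re-read of the configuration after a SIGHUP would be expected to fail the way the log shows.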


Version-Release number of selected component (if applicable):
nagios-2.0-0.2.rc2.fc4

How reproducible:
Always

Steps to Reproduce:
1. Clean install and rename *sample to *cfg (with minor edits)
2. Run Nagios for a little while... ( 2 minutes on my system)
3. Experience the resource failure..
  

Actual Results:
[1139103353] Caught SIGHUP, restarting...
[1139103353] Error: Cannot open resource file '/etc/nagios/private/resource.cfg' for reading!
[1139103353] Nagios 2.0rc2 starting... (PID=17066)
[1139103353] LOG VERSION: 2.0


Expected Results:  Continue running as normal...

Additional info:

Comment 1 Mike McGrath 2006-02-08 00:23:51 UTC

Hmm. Curious: what command do you use to start Nagios? Also, what version of
the plugins do you have installed?

One more thing, when you say it runs fine for a little while, what does the
status look like on the nagios status page?  Do the services check as OK for a
while and then fail?  Or do they start as undetermined and then fail?

Comment 2 R. Michael Richer 2006-02-08 00:48:02 UTC

/etc/init.d/nagios start

Manual Plugins ... some perl .. some pieced together.. most from
nagiosplug.sourceforge.net

Services check as OK for a while and then after 2 minutes they all switch to
(-127 out of bounds, etc. etc.)... 

It looks like Nagios receives a SIGHUP after executing one of my scripts (an
event handler that runs /etc/init.d/sendmail restart when the failure is
HARD).

Anyway, changing the permissions on resource.cfg to 0444 fixes it, although
the resource file shouldn't be readable by others. In my case it's an internal
secured server anyway.

Comment 3 Mike McGrath 2006-02-08 03:43:16 UTC

That's interesting, I can't seem to recreate this issue.  If you wouldn't mind,
could you upload the plugin that you think is sending Nagios the SIGHUP?

Comment 4 R. Michael Richer 2006-02-08 18:08:58 UTC

There isn't a plugin that SIGHUPs the nagios process.
When a service check fails, it goes through the regular process of
switching to a SOFT FAIL and then eventually to a HARD FAIL.  These events are
handed to the event handler (a simple shell script that is basically a case
statement; the HARD case executes /etc/init.d/sendmail restart).  Once the
event handler fires with the failure, that is when I see the SIGHUP.

Anyway, here is a snippet of the event handler:
#!/bin/sh
case "$1" in
OK)
   ;;
WARNING)
   ;;
UNKNOWN)
   ;;
CRITICAL)
   case "$2" in
      SOFT)
         ;;
      HARD)
         sudo /etc/init.d/sendmail restart
         ;;
   esac
   ;;
esac
exit 0

Comment 5 R. Michael Richer 2006-02-08 18:10:54 UTC

Going back through the logs... it does occur at other times... it's just more
frequent with event handlers..., and for no apparent common reason...  Is there
a way to increase logging to see where it's getting this SIGHUP signal from?
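
This does not identify the sender of the signal, but one low-tech way to check whether the SIGHUPs line up with event handler runs is to have the handler log its own invocations in the same `[epoch]` format that the Nagios log uses. The sketch below is an assumption-laden illustration: `log_event` and the log path in the example are hypothetical, not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: append "[epoch] event handler: <state> <type>" to a log file.
# The bracketed epoch timestamp mirrors the format seen in the Nagios log above.
log_event() {
    state=$1
    state_type=$2
    logfile=$3
    printf '[%s] event handler: %s %s\n' "$(date +%s)" "$state" "$state_type" >> "$logfile"
}

# Example, as the first command of the event handler (log path is an assumption):
# log_event "$1" "$2" /tmp/nagios-eventhandler.log
```

Comparing those timestamps with the "Caught SIGHUP" lines would show whether the restarts follow handler runs or also occur independently, as the logs here suggest.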

Comment 6 Mike McGrath 2006-02-08 19:09:33 UTC

Hmm, not sure if there's a way that would be helpful for this.  Now that you
have the perms on the file/directory set to world readable, are you still
seeing the SIGHUPs?  Do any of your scripts send SIGHUPs as part of restarting
some failed service?  I'm curious if this is just something that has always
happened on your system but was only noticeable when the private directory was
locked down.

Comment 7 Mike McGrath 2006-02-08 19:18:30 UTC

Actually it doesn't matter at this point, I'm going to make private and
resource.cfg-sample nagios group readable.  I don't know what's sending the
SIGHUPs.  It's not happening on my box, but when I send a SIGHUP it does fail
to open private/resource.cfg.  And that shouldn't happen.
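
Until a fixed package is available, the same change can be approximated by hand instead of making the file world readable. The following is a minimal sketch, not the packaged fix: `grant_group_read` is a hypothetical helper, the group name `nagios` in the example is an assumption about the local setup, and the modes 750/640 are chosen to keep "other" locked out.

```shell
#!/bin/sh
# Hypothetical helper: give a group read access to a directory and one file
# inside it, without granting anything to "other".
grant_group_read() {
    group=$1
    dir=$2
    file=$3
    chgrp "$group" "$dir" "$file" || return 1
    chmod 750 "$dir" || return 1   # owner rwx, group r-x, other none
    chmod 640 "$file"              # owner rw-, group r--, other none
}

# Example (run as root; assumes the Nagios daemon runs with group "nagios"):
# grant_group_read nagios /etc/nagios/private /etc/nagios/private/resource.cfg
```

Compared with the 0444 workaround from Comment 2, this keeps the resource file unreadable for accounts outside the group.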

Comment 8 Jose Pedro Oliveira 2006-08-03 03:55:45 UTC

Mike,

Re-opening this ticket.
The solution mentioned in the previous comment still hasn't been 
committed (the resource.cfg-sample file still belongs to the root group).

$ rpm -q nagios
nagios-2.5-1.fc5

$ rpm -qlv nagios | grep private
drwxr-x--- 2 root root     0 Jul 14 17:49 /etc/nagios/private
-rw-r----- 1 root root  1331 Jul 14 17:49 /etc/nagios/private/resource.cfg-sample

jpo

Comment 9 Mike McGrath 2006-08-03 14:44:30 UTC

I don't know how this slipped my mind but it is now saved and built.  Thanks for
re-opening it.