Bug 507836

Summary: Problem with host monitoring - ssh command error
Product: [Community] Spacewalk
Reporter: Jessica Jones <fedora>
Component: Server
Assignee: Miroslav Suchý <msuchy>
Status: CLOSED CURRENTRELEASE
QA Contact: Red Hat Satellite QA List <satqe-list>
Severity: high
Priority: medium
Version: 0.5
CC: cperry, fedora, jhutar, msuchy, roysjosh
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-01-04 08:45:04 UTC
Bug Blocks: 543511    

Description Jessica Jones 2009-06-24 13:27:47 UTC
Description of problem:
I'm trying to set up monitoring of a particular host, and not really getting anywhere.

Version-Release number of selected component (if applicable):
Spacewalk 0.5 (just installed from the main spacewalk yum repositories).

How reproducible:

It is the same for all hosts we have tried to add monitoring to.  None of them exhibit any other behaviour, including one which is also CentOS 5 x86_64 (identical to the server).

Steps to Reproduce:
1. Add client machine to the Spacewalk server
2. Make sure that rhnmd is running on the client
3. Add default load monitor with no params
4. Wait.
  
Actual results:

This is the error output from the Spacewalk server monitoring daemon. It's the default load monitor for Spacewalk 0.5, with a test client being added.  The server is running an up-to-date, recently installed CentOS 5 (x86_64). Nothing else is installed besides Spacewalk; the database is on our Oracle server.

SELinux is in permissive mode on the server, but disabled entirely on the client, which is running a recently installed and updated Fedora 10 (i386).

The RHN Monitoring Daemon (RHNMD) is not responding: ssh: : Name or service not known. Please make sure the daemon is running and the host is accessible from the monitoring scout. Command was: /usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes /bin/sh -s

If I run this command myself from the Spacewalk server, I notice that the argument order is wrong and that the hostname appears to be missing (it should come before the command being run):

[root@gyne repo]# /usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes /bin/sh -s mapc-test.maths.bath.ac.uk
ssh: /bin/sh: Name or service not known

The man page indicates that if you switch the ordering so that /bin/sh comes last, then ssh executes (although there is still a problem):

[root@gyne repo]# /usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes -s mapc-test.maths.bath.ac.uk /bin/sh
Warning: Permanently added 'mapc-test.maths.bath.ac.uk,138.38.96.47' (DSA) to the list of known hosts.
Request for subsystem '/bin/sh' failed on channel 0
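
Editorial note: the two attempts above differ only in where the hostname and "-s" sit relative to /bin/sh. ssh expects "ssh [options] destination [command]", and the original error ("ssh: : Name or service not known") suggests the destination was an empty string. The following is a minimal sketch of a guard that refuses to build the command without a host. The function name build_probe_cmd is hypothetical and is not taken from the Spacewalk or NOCpulse code; only the ssh options are copied from the report.

```shell
#!/bin/sh
# Hypothetical helper (not Spacewalk code): print the ssh command line
# for a given host, refusing to build one when the host is empty.
build_probe_cmd() {
    host=$1
    if [ -z "$host" ]; then
        echo "error: empty hostname, not building ssh command" >&2
        return 1
    fi
    # ssh syntax is: ssh [options] destination [command]
    # Placing "-s" after /bin/sh makes it an argument to the remote
    # shell; placing it before the host makes ssh request a subsystem.
    printf '%s\n' "/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes $host /bin/sh -s"
}

# Prints the command with the host in the destination position.
build_probe_cmd mapc-test.maths.bath.ac.uk

# An empty host is rejected instead of producing "ssh: : Name or service not known".
build_probe_cmd "" || echo "refused to build a command without a host"
```

The sketch only prints the command, so it can be run without network access.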

If I go to the host we are trying to connect to and tail /var/log/messages I see this:

Jun 24 10:00:55 mapc-test rhnmd[16022]: subsystem request for /bin/sh failed, subsystem not found
Jun 24 10:01:05 mapc-test rhnmd[16028]: WARNING: /etc/ssh/moduli does not exist, using fixed modulus
Jun 24 10:01:05 mapc-test rhnmd[16028]: Accepted publickey for nocpulse from 138.38.33.22 port 46790 ssh2
Jun 24 10:01:19 mapc-test rhnmd[16032]: WARNING: /etc/ssh/moduli does not exist, using fixed modulus
Jun 24 10:01:19 mapc-test rhnmd[16032]: Accepted publickey for nocpulse from 138.38.33.22 port 46792 ssh2
Jun 24 10:01:19 mapc-test rhnmd[16032]: fatal: chown(/dev/pts/0, 489, 5) failed: Operation not permitted
Jun 24 10:01:19 mapc-test rhnmd[16032]: Attempt to write login records by non-root user (aborting)
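
Editorial note: when /var/log/messages is busy, it can help to reduce it to the rhnmd lines that report a failure. The following is a rough sketch, assuming the failure wordings are the ones quoted above; rhnmd_failures is a hypothetical helper name, and other failure messages would need their own patterns.

```shell
#!/bin/sh
# Hypothetical helper: keep only rhnmd lines that report a failure.
# The patterns come from the messages quoted in this report.
rhnmd_failures() {
    grep -E 'rhnmd\[[0-9]+\]: (fatal:|subsystem request .* failed|Attempt to write login records)'
}

# Demonstration on three of the lines from the report; the
# "Accepted publickey" line is filtered out.
rhnmd_failures <<'EOF'
Jun 24 10:00:55 mapc-test rhnmd[16022]: subsystem request for /bin/sh failed, subsystem not found
Jun 24 10:01:05 mapc-test rhnmd[16028]: Accepted publickey for nocpulse from 138.38.33.22 port 46790 ssh2
Jun 24 10:01:19 mapc-test rhnmd[16032]: fatal: chown(/dev/pts/0, 489, 5) failed: Operation not permitted
EOF
```

On the client this would be run as "rhnmd_failures < /var/log/messages".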

Expected results:
Presumably it should tell me what the load is, or whether it is too high.

Additional info:

Comment 1 Joshua Roys 2009-07-15 17:53:48 UTC
Just in case you hadn't come across this yet, the "-s" should go after the /bin/sh.  In the above it's being treated as an argument to ssh rather than sh.
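
Editorial note: as an argument to sh, "-s" tells the shell to read its commands from standard input and to treat any remaining arguments as positional parameters. A small local illustration of that behaviour, with no ssh involved (the script text and the argument are made up for the example):

```shell
#!/bin/sh
# Feed a one-line script to /bin/sh on stdin; "-s" makes the shell read
# commands from stdin, and "mapc-test" becomes $1 inside that script.
run_on_stdin() {
    printf 'echo "load check for $1"\n' | /bin/sh -s "$1"
}

run_on_stdin mapc-test
# prints: load check for mapc-test
```

This is the mode a monitoring probe would rely on when it pipes a check script to the remote shell, which is why "-s" has to follow /bin/sh rather than precede the hostname.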

Comment 2 Miroslav Suchý 2009-11-17 12:53:54 UTC
Please try the following on your Spacewalk server:
su - nocpulse
/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity mapc-test.maths.bath.ac.uk

If this does not work, then the error is most probably not in rhnmd itself.

Since rhnmd is just a simple wrapper around sshd that forces it to start on port 4545, I doubt there is an error in rhnmd itself. My guess is that some PAM or other policy is in play on the client.

Comment 3 Jessica Jones 2009-12-22 09:41:58 UTC
We've upgraded since then, and it seems to be working now.  I have no idea what the fix was, but would assume that it was something that changed between versions as we have made no changes.

Comment 4 Miroslav Suchý 2010-01-04 08:45:04 UTC
OK. Closing. If you find the problem again, feel free to reopen this bug.

Comment 5 Michael Mráka 2010-02-16 12:58:53 UTC
Spacewalk 0.8 has been released