Bug 1181296 - rhcert listener doesn't always start
Summary: rhcert listener doesn't always start
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Certification Program
Classification: Red Hat
Component: redhat-certification
Version: 1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: brose
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 1228240 1260866
Depends On:
Blocks:
Reported: 2015-01-12 20:16 UTC by Brian Brock
Modified: 2023-09-14 02:53 UTC

Fixed In Version: redhat-certification-1.0-20150812
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-19 16:35:00 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2479 0 normal SHIPPED_LIVE redhat-certification bug fix and enhancement update 2015-11-19 21:34:15 UTC

Description Brian Brock 2015-01-12 20:16:55 UTC
Description of problem:
rhcert listener daemon does not always start

Version-Release number of selected component (if applicable):
redhat-certification-1.0-20150109.1.el7.noarch and earlier

How reproducible:
No reliable reproducer yet; the failure appears nearly random. The error often occurs several times in a row and then seems to clear up by itself.

Steps to Reproduce:
1. install redhat-certification
2. run `rhcert-backend server start`
3. check logs

Actual results (first results from a fresh install):
# rhcert-backend server start
registered Test from rhcert.test
Starting rhcert daemon
Starting rhcert listener

followed immediately by checking status

# rhcert-backend server status
registered Test from rhcert.test
The rhcert daemon is running
The rhcert listener is NOT running

There is also a traceback in the logs (see Additional info).

Expected results:
The listener starts without producing a traceback.

Additional info:
[ /var/log/rhcert/RedHatCertificationListener.log ]
registered Test from rhcert.test
The rhcert daemon is already started
The rhcert listener is running
Stopping rhcert listener
Starting listener
Traceback (most recent call last):
  File "/usr/bin/rhcert-backend", line 37, in <module>
    success = rhcertBackend.do(args)
  File "/usr/lib/python2.7/site-packages/rhcert/client/backend.py", line 162, in do
    return self.doServer(args)
  File "/usr/lib/python2.7/site-packages/rhcert/client/backend.py", line 205, in doServer
    return listener.run()
  File "/usr/lib/python2.7/site-packages/rhcert/listener/listen.py", line 89, in run
    allow_none=True)
  File "/usr/lib64/python2.7/SimpleXMLRPCServer.py", line 593, in __init__
    SocketServer.TCPServer.__init__(self, addr, requestHandler, bind_and_activate)
  File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
    self.server_bind()
  File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
    self.socket.bind(self.server_address)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use

[ /var/log/rhcert/RedHatCertDaemon.log ]
registered Test from rhcert.test
The rhcert listener is already started
The rhcert daemon is running
Stopping rhcert daemon
Starting daemon
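
For reference, the `socket.error: [Errno 98] Address already in use` in the listener log is what `bind()` raises when another socket is already listening on the same address and port. A minimal sketch that reproduces the same error outside rhcert, assuming Python 3 (where `socket.error` is an alias of `OSError`) and a kernel-chosen port rather than the real listener port; `second_bind_errno` is a hypothetical helper name, not part of rhcert:

```python
import errno
import socket


def second_bind_errno(host="127.0.0.1"):
    """Bind a listening socket, then try to bind a second one to the same port.

    Returns the errno of the failure, or None if the second bind succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same address while `first` is listening
        except OSError as exc:  # spelled socket.error in Python 2.7
            return exc.errno
        return None
    finally:
        second.close()
        first.close()
```

Comparing the return value with `errno.EADDRINUSE` (98 on Linux) confirms it is the same condition as in the traceback. This only illustrates the error itself; it does not show which process was holding the listener's port.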

Comment 1 Brian Brock 2015-01-12 20:53:19 UTC
Several minutes later, without running any other commands on the system:

# rhcert-backend server start
registered Test from rhcert.test
The rhcert daemon is already started
Starting rhcert listener

# rhcert-backend server status
registered Test from rhcert.test
The rhcert daemon is running
The rhcert listener is running

Comment 2 Brian Brock 2015-01-12 21:11:33 UTC
The first time the server is started, it always gives the error. If `rhcert-backend server start` is run again immediately afterwards (a second time overall), the listener daemon launches.

This recurs each time the server is stopped: after `rhcert-backend server stop`, the next start fails to launch the listener daemon and gives the error above.

Comment 3 Gary Case 2015-05-19 20:11:54 UTC
I'm seeing an almost identical traceback on another system. I was about to dismiss the issue because this is a notoriously unstable box, but maybe there's more to it. I have to give this machine back (it's an Intel IoT test system), so I will likely not be able to provide it for further testing.

Installed versions
------------------
redhat-certification-hardware-1.7.1-20150304.el7.noarch
redhat-certification-1.0-20150505.el7.noarch
redhat-certification-information-1.7.1-20150304.el7.noarch

Contents of /var/log/rhcert/RedHatCertificationListener.log
-----------------------------------------------------------
The rhcert daemon is already started
The rhcert listener is running
Stopping rhcert listener
Starting listener
Traceback (most recent call last):
  File "/bin/rhcert-backend", line 37, in <module>
    success = rhcertBackend.do(args)
  File "/usr/lib/python2.7/site-packages/rhcert/client/backend.py", line 165, in do
    return self.doServer(args)
  File "/usr/lib/python2.7/site-packages/rhcert/client/backend.py", line 209, in doServer
    return listener.run()
  File "/usr/lib/python2.7/site-packages/rhcert/listener/listen.py", line 102, in run
    allow_none=True)
  File "/usr/lib64/python2.7/SimpleXMLRPCServer.py", line 593, in __init__
    SocketServer.TCPServer.__init__(self, addr, requestHandler, bind_and_activate)
  File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
    self.server_bind()
  File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
    self.socket.bind(self.server_address)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use

Comment 4 brose 2015-07-27 19:00:20 UTC
Observations:

The error comes from trying to start the listener on port 8009, which has not been freed properly. You can see that the port is in use with `lsof -i | grep 8009`.

Stopping the rhcert server (`rhcert-backend server stop`) AND httpd (`service httpd stop`) frees port 8009.

Running `rhcert-backend server start` again appears to work and shows no issues. However, `rhcert-backend server status` then reports that the listener is started but the daemon is not, even though the start command reported that the daemon was already started.
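
The `lsof` check above can also be scripted. A minimal sketch, assuming Python 3 and that it is enough to ask whether anything accepts TCP connections on the port; `port_in_use` is a hypothetical helper name, and the loopback default is an assumption:

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        # connect_ex() returns 0 on success and an errno value otherwise
        return probe.connect_ex((host, port)) == 0
    finally:
        probe.close()
```

For example, `port_in_use(8009)` before and after `rhcert-backend server stop` would show whether a listener is still accepting connections. Unlike `lsof`, this does not say which process owns the port, and it does not detect sockets that are only in TIME_WAIT.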

Comment 5 brose 2015-07-27 20:44:53 UTC
The state of port 8009 reported by `netstat -vatn` is TIME_WAIT.

Comment 6 brose 2015-07-27 20:45:18 UTC
That is, after calling `rhcert-backend server stop`.
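
For context, the usual way to let a server bind while old connections on its port are still in TIME_WAIT is to set `SO_REUSEADDR` before `bind()`. A minimal sketch with a plain socket, assuming Python 3; `open_reusable_listener` is a hypothetical helper name. In the standard library, `TCPServer` exposes the same option through its `allow_reuse_address` class attribute. Whether the rhcert listener sets it is not shown in this report.

```python
import socket


def open_reusable_listener(port, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(); it lets bind() succeed while old
        # connections on this port are still in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock
```

Note that `SO_REUSEADDR` does not help on Linux when another process is still actively listening on the port; in that case `bind()` still fails with "Address already in use".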

Comment 7 brose 2015-07-27 20:53:49 UTC
I still get the "Address already in use" error mentioned above, even though I:

1) ran `rhcert-backend server stop`
2) ran `service httpd stop`
3) confirmed that `lsof -t -i:8009` has no results
4) waited until `netstat -vatn | grep 8009` has no results

However, after the first start fails and writes the error to the log file, running `rhcert-backend server start` again right away seems to start the listener consistently.
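
Since a second start right after the failure seems to succeed, one possible mitigation is to retry the bind a few times before giving up. A minimal sketch of that idea, assuming Python 3 (`xmlrpc.server` is the Python 3 home of the `SimpleXMLRPCServer` class seen in the traceback); `start_listener_with_retry` and its `attempts` and `delay` parameters are hypothetical, and this is not necessarily what the fix in the errata does:

```python
import errno
import time
from xmlrpc.server import SimpleXMLRPCServer


def start_listener_with_retry(host, port, attempts=5, delay=1.0):
    """Create the XML-RPC server, retrying while the port is still busy."""
    for attempt in range(1, attempts + 1):
        try:
            # allow_none=True mirrors the call shown in the traceback
            return SimpleXMLRPCServer((host, port), allow_none=True,
                                      logRequests=False)
        except OSError as exc:
            # Re-raise anything other than "Address already in use",
            # and give up once the last attempt has failed.
            if exc.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)
```

A caller would then use `server = start_listener_with_retry("", 8009)` followed by `server.serve_forever()`. Retrying hides the symptom rather than explaining why the port is still held, so it is a workaround at best.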

Comment 8 Greg Nichols 2015-08-10 21:28:30 UTC
This seems to happen more on el6.

Comment 11 Greg Nichols 2015-08-13 14:25:17 UTC
*** Bug 1228240 has been marked as a duplicate of this bug. ***

Comment 12 Lenny Verkhovsky 2015-08-13 14:28:10 UTC
In my case there were no open sockets.
The following workaround worked for me:

# /usr/bin/rhcert-backend server stop
# nohup /usr/bin/python /usr/bin/rhcert-backend server listener &

Comment 16 Lenny Verkhovsky 2015-09-16 06:33:52 UTC
I do consider this issue a bug, so if you need logs or a remote session to debug it, I will be glad to cooperate.
My workaround works for me, though, so this is not urgent.

Comment 17 Brian Brock 2015-09-16 21:35:07 UTC
Verified in:

redhat-certification-2.0-20150916.el7.noarch
redhat-certification-backend-2.0-20150916.el7.noarch

Comment 20 Greg Nichols 2015-10-28 13:17:52 UTC
*** Bug 1260866 has been marked as a duplicate of this bug. ***

Comment 22 errata-xmlrpc 2015-11-19 16:35:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2479.html

Comment 23 Red Hat Bugzilla 2023-09-14 02:53:11 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

