Bug 617703 - the rhsmcertd is logging "update failed (1), retry in 10 minutes"
the rhsmcertd is logging "update failed (1), retry in 10 minutes"
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: subscription-manager (Show other bugs)
6.1
All Linux
low Severity high
: rc
: ---
Assigned To: Bryan Kearney
wes hayutin
:
Depends On:
Blocks: 568421
  Show dependency treegraph
 
Reported: 2010-07-23 15:01 EDT by John Sefler
Modified: 2011-05-19 09:41 EDT (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 09:41:54 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2011:0611 normal SHIPPED_LIVE new package: subscription-manager 2011-05-18 13:56:21 EDT

  None (edit)
Description John Sefler 2010-07-23 15:01:01 EDT
Description of problem:
I do not know what is causing this, but the rhsmcertd is logging failures and I've seen this in the past few builds of subscription-manager-cli on three different clients.


Version-Release number of selected component (if applicable):
[root@client01-rhel6-beta2 src]# rpm -q subscription-manager
subscription-manager-0.68-1.el6.i686
[root@jsefler-rhel6-clientpin tmp]# rpm -q subscription-manager
subscription-manager-0.72-1.el6.i686


How reproducible:


Steps to Reproduce:
install subscription-manager and configure it to connect to a candlepin
# mv /var/log/rhsm/rhsmcertd.log /var/log/rhsm/rhsmcertd.log.bak
# /etc/init.d/rhsmcertd restart
Stopping rhsmcertd                                         [  OK  ]
Starting rhsmcertd 240                                     [  OK  ]
# tail /var/log/rhsm/rhsmcertd.log
Fri Jul 23 14:48:26 2010: started: interval = 240 minutes
Fri Jul 23 14:48:26 2010: update failed (1), retry in 10 minutes
  
Actual results:
update failed (1)

Expected results:
certificates updated
Comment 1 John Sefler 2010-07-23 15:09:16 EDT
I suspect this has something to do with a needed CA cert from the candlepin
server.
Comment 2 John Sefler 2010-07-26 14:26:29 EDT
I believe this issue is a consequence of the fix/disablement done for https://bugzilla.redhat.com/show_bug.cgi?id=617303
Comment 3 Ajay Kumar Nadathur Sreenivasan 2010-08-17 11:34:38 EDT
Developer comments:
------------------
I am able to reproduce the above scenario by running subscription-manager against host not running candlepin(server shutdown) which is expected. 

From the code I see that the daemon prints out these failure messages when the certmgr.py blows up; I have added extra try/except to log any failures to /var/log/rhsm/rhsm.log so that, if this happens in the future, we can look at the logs and agree whether its a bug or not.

git sha: d54ab44d2acc62e97eff351a4a1dfa5ea148aee7
Comment 4 John Sefler 2010-09-08 11:04:08 EDT
While attempting to verify this bugzilla, I seem to have run into the opposite problem.  Here's what I did:

[root@jsefler-rhel6-client01 ~]# cat /etc/rhsm/rhsm.conf
# Red Hat Subscription Manager Configuration File
# Contents of this file will be overwritten on registration. 

# Remote server Hostname
hostname=jsefler-f12-candlepin.usersys.redhat.com

# Remote Server port
port=8443

# Remote base URL
baseurl= http://redhat.com/content

# Directory containing consumer key & certificate 
consumerid = /etc/pki/consumer

# The frequency used to refresh certificates (mins)
certFrequency=1

# Flag to enable Unsupported entitlement pools in GUI
# change this value to 1 to enable this option
showIncompatiblePools = 0

# Flip this flag to 1 to Enable insecure mode.
insecure=1

# server CA certificate location.
candlepin_ca_file = None


[root@jsefler-rhel6-client01 ~]# tail -f /var/log/rhsm/rhsmcertd.log
Wed Sep  8 10:17:18 2010: started: interval = 1 minutes
Wed Sep  8 10:17:18 2010: certificates updated
Wed Sep  8 10:18:18 2010: certificates updated
Wed Sep  8 10:19:19 2010: certificates updated


while this was tailing on my client machine, I stopped the tomcat service on my on-premises candlepin server:

[root@jsefler-f12-candlepin ruby]# /etc/init.d/tomcat6 stop
Stopping tomcat6:                                          [  OK  ]


despite stopping the server, the rhsmcertd.log continues to show the certificates are updated:   How is this possible?

Wed Sep  8 10:20:19 2010: certificates updated
Wed Sep  8 10:21:19 2010: certificates updated
Wed Sep  8 10:22:19 2010: certificates updated
Wed Sep  8 10:23:19 2010: certificates updated
Wed Sep  8 10:24:19 2010: certificates updated
Wed Sep  8 10:25:20 2010: certificates updated
Wed Sep  8 10:26:20 2010: certificates updated
Wed Sep  8 10:27:20 2010: certificates updated


[jsefler@jsefler ~]$ telnet jsefler-f12-candlepin.usersys.redhat.com 8443
Trying 10.16.120.146...
telnet: connect to address 10.16.120.146: Connection refused

This indicates to me that my candlepin server is definitely down.


[root@jsefler-rhel6-client01 ~]# tail -f /var/log/rhsm/rhsm.log

2010-09-08 11:01:25,704 [INFO] __init__() @connection.py:136 - Connection Established: host: jsefler-f12-candlepin.usersys.redhat.com, port: 8443, handler: /candlepin
2010-09-08 11:01:25,707 [INFO] __init__() @connection.py:137 - Connection using cert_file: /etc/pki/consumer/cert.pem, key_file: /etc/pki/consumer/key.pem, ca_file: None insecure_mode: True
2010-09-08 11:01:25,707 [INFO] perform() @certlib.py:120 - Disconnected, not updated
2010-09-08 11:01:25,709 [INFO] __init__() @connection.py:136 - Connection Established: host: jsefler-f12-candlepin.usersys.redhat.com, port: 8443, handler: /candlepin
2010-09-08 11:01:25,709 [INFO] __init__() @connection.py:137 - Connection using cert_file: /etc/pki/consumer/cert.pem, key_file: /etc/pki/consumer/key.pem, ca_file: None insecure_mode: True
2010-09-08 11:01:25,709 [INFO] perform() @factlib.py:53 - Disconnected, facts not updated


This shows that a Connection Established.  Again.  How is this possible?

Moving back to ON_DEV.  I suspect an exception is getting swallowed.
Comment 5 Ajay Kumar Nadathur Sreenivasan 2010-09-09 10:54:50 EDT
I tried doing the same steps as mentioned above and could not reproduce it.
Using: http://download.devel.redhat.com/brewroot/packages/subscription-manager/0.76/1.el6/i686/subscription-manager-0.76-1.el6.i686.rpm
1. Stop the server:
  -----------------
[ajay@garuda-rh src{master}]$ ~/softwares/apache-tomcat-6.0.26/bin/shutdown.sh 
Using CATALINA_BASE:   /home/ajay/softwares/apache-tomcat-6.0.26
Using CATALINA_HOME:   /home/ajay/softwares/apache-tomcat-6.0.26
Using CATALINA_TMPDIR: /home/ajay/softwares/apache-tomcat-6.0.26/temp
Using JRE_HOME:        /usr/lib/jvm/java-1.6.0
Using CLASSPATH:       /home/ajay/softwares/apache-tomcat-6.0.26/bin/bootstrap.jar
Sep 9, 2010 10:26:52 AM org.apache.catalina.startup.Catalina stopServer

2. Start the daemon:
 -------------------
[ajay@garuda-rh src{master}]$ sudo sh rhsmcertd.init.d start
Starting rhsmcertd 1                                       [  OK  ]

3. rhsmcertd log:
  --------------
[ajay@garuda-rh ~]$ tail -f /var/log/rhsm/rhsmcertd.log 
Thu Sep  9 10:26:58 2010: started: interval = 1 minutes
Thu Sep  9 10:26:58 2010: update failed (255), retry in 1 minutes
Thu Sep  9 10:27:58 2010: update failed (255), retry in 1 minutes
Thu Sep  9 10:28:58 2010: update failed (255), retry in 1 minutes
Thu Sep  9 10:29:58 2010: update failed (255), retry in 1 minutes
Thu Sep  9 10:30:58 2010: update failed (255), retry in 1 minutes

4. rhsm log:
  -----------
Error: [Errno 111] Connection refused
2010-09-09 10:52:01,432 [INFO] __init__() @connection.py:136 - Connection Established: host: garuda-rh, port: 8443, handler: /candlepin
2010-09-09 10:52:01,434 [INFO] __init__() @connection.py:137 - Connection using cert_file: /etc/pki/consumer/cert.pem, key_file: /etc/pki/consumer/key.pem, ca_file: /etc/candlepin/certs/candlepin-ca.crt1 insecure_mode: True
2010-09-09 10:52:01,436 [INFO] _request() @connection.py:64 - loading ca_file located at: /etc/candlepin/certs/candlepin-ca.crt1
2010-09-09 10:52:01,436 [INFO] _request() @connection.py:66 - work in insecure mode ?:True
2010-09-09 10:52:01,438 [ERROR] <module>() @certmgr.py:85 - Error while updating certificates using daemon
2010-09-09 10:52:01,438 [ERROR] <module>() @certmgr.py:87 - [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/share/rhsm/certmgr.py", line 83, in <module>
    main()
  File "/usr/share/rhsm/certmgr.py", line 75, in main
    updates = mgr.update()
  File "/usr/share/rhsm/certmgr.py", line 66, in update
    updates += lib.update()
  File "/usr/share/rhsm/certlib.py", line 50, in update
    return action.perform()
  File "/usr/share/rhsm/certlib.py", line 124, in perform
    expected = self.getExpected(uep, report)
  File "/usr/share/rhsm/certlib.py", line 144, in getExpected
    exp = uep.getCertificateSerials()
  File "/usr/share/rhsm/certlib.py", line 241, in getCertificateSerials
    reply = UEPConnection.getCertificateSerials(self, self.uuid)
  File "/usr/share/rhsm/connection.py", line 231, in getCertificateSerials
    return self.conn.request_get(method)
  File "/usr/share/rhsm/connection.py", line 96, in request_get
    return self._request("GET", method)
  File "/usr/share/rhsm/connection.py", line 75, in _request
    headers=self.headers)
  File "/usr/lib/python2.6/httplib.py", line 874, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.6/httplib.py", line 911, in _send_request
    self.endheaders()
  File "/usr/lib/python2.6/httplib.py", line 868, in endheaders
    self._send_output()
  File "/usr/lib/python2.6/httplib.py", line 740, in _send_output
    self.send(msg)
  File "/usr/lib/python2.6/httplib.py", line 699, in send
    self.connect()
  File "/usr/lib/python2.6/site-packages/M2Crypto/httpslib.py", line 50, in connect
    self.sock.connect((self.host, self.port))
  File "/usr/lib/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 177, in connect
    self.socket.connect(addr)
  File "<string>", line 1, in connect
error: [Errno 111] Connection refused

5. conf contents:
   -------------
hostname=garuda-rh
port=8443
baseurl= http://redhat.com/content
consumerid = /etc/pki/consumer
certFrequency=1
candlepin_ca_file=/etc/candlepin/certs/candlepin-ca.crt1
insecure = 1
Comment 6 John Sefler 2010-09-09 15:32:58 EDT
Try again without being registered.
The scenario in comment #4 is occurring without being registered (ie. there are no certificates on the client).

Moving back to ON_DEV
Comment 7 Ajay Kumar Nadathur Sreenivasan 2010-09-09 16:45:39 EDT
Yes, I am able to reproduce the above scenario.

Behavior after the fix:

rhsm.log:
--------
2010-09-09 16:45:37,817 [ERROR] main() @certmgr.py:75 - Either the consumer is not registered with candlepin or the certificates are corrupted. Certificate updation using daemon failed.

rhsmcertd.log:
-------------
Thu Sep  9 16:42:38 2010: update failed (255), retry in 1 minutes
Thu Sep  9 16:43:39 2010: update failed (255), retry in 1 minutes
Thu Sep  9 16:44:39 2010: update failed (255), retry in 1 minutes
Thu Sep  9 16:44:39 2010: update failed (255), retry in 1 minutes
Thu Sep  9 16:45:39 2010: update failed (255), retry in 1 minutes


fix in sha: 3f61ee61e596506b66faa57588420a1db09672a1
Comment 8 John Sefler 2010-09-09 17:54:33 EDT
VERSION:
[root@jsefler-rhel6-client01 ~]# rpm -q subscription-manager
subscription-manager-0.77-1.git.1.3f61ee6.fc12.i386


Here is a tail of the /var/log/rhsm/rhsmcertd.log while interjecting "> events" by hand.

> restarted rhsmcertd with a frequency of 1 minute with nobody registered
Thu Sep  9 17:37:28 2010: started: interval = 1 minutes
Thu Sep  9 17:37:28 2010: update failed (255), retry in 1 minutes

> then registered a consumer
Thu Sep  9 17:38:29 2010: certificates updated
Thu Sep  9 17:39:29 2010: certificates updated
Thu Sep  9 17:40:30 2010: certificates updated

> then stopped my tomcat6 candlepin server
Thu Sep  9 17:41:30 2010: update failed (255), retry in 1 minutes
Thu Sep  9 17:42:30 2010: update failed (255), retry in 1 minutes

> the started my tomcat6 candlepin server
Thu Sep  9 17:43:31 2010: certificates updated

> then unregistered my consumer
Thu Sep  9 17:44:32 2010: update failed (255), retry in 1 minutes
Thu Sep  9 17:45:32 2010: update failed (255), retry in 1 minutes


I also verified the /var/log/rhsm/rhsm.log  It looked good and contained the following after unregisterring:
2010-09-09 17:44:32,031 [ERROR] main() @certmgr.py:75 - Either the consumer is not registered with candlepin or the certificates are corrupted. Certificate updation using daemon failed.
2010-09-09 17:45:32,171 [ERROR] main() @certmgr.py:75 - Either the consumer is not registered with candlepin or the certificates are corrupted. Certificate updation using daemon failed.

MOVING TO VERIFIED

PS:  Is "updation" a word?
Comment 10 errata-xmlrpc 2011-05-19 09:41:54 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0611.html

Note You need to log in before you can comment on or make changes to this bug.