654456 – subscription manager caches failed connection

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Candlepin
Classification:	Community
Component:	candlepin
Sub Component:
Version:	0.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Chris Duryee
QA Contact:	spandey
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	Entitlement-Beta

Reported:	2010-11-17 21:52 UTC by Eric Blake
Modified:	2015-05-14 15:31 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2011-02-02 19:04:27 UTC
Embargoed:

Attachments
error rhsm.log (141.73 KB, text/x-log) 2011-01-21 09:47 UTC, spandey
strace.out (8.57 KB, text/plain) 2011-02-02 13:02 UTC, spandey

Description Eric Blake 2010-11-17 21:52:53 UTC

Description of problem:
The subscription manager remembers if it failed to connect to the entitlement server, and even if steps are taken to allow that connection, it requires restarting the subscription manager before a new connection attempt can take place.

Version-Release number of selected component (if applicable):
subscription-manager-gnome-0.92.9-1.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. take down the vpn connection
2. start subscription-manager-gui
3. try to register a system
4. restore the vpn connection
5. try to register a system
  
Actual results:
At step 3, the attempt fails with 'network error, unable to connect to server', which is expected.  But at step 5, even though 'ping candlepin.redhat.com' succeeds, the manager still gives the same network error failure.  In other words, it is caching the failed connection attempt and not retrying.

Expected results:
Each attempt to connect should start from scratch, in case I've brought up the VPN in the meantime.

Workaround:
Exit the subscription manager GUI, bring up the network, then restart the subscription manager; at that point, the connection attempt will succeed.

Additional info:
http://post-office.corp.redhat.com/archives/entitlement-feedback/2010-November/msg00010.html

Comment 1 Devan Goodwin 2010-11-18 18:40:17 UTC

I think this is ok in the new UI.

Tested by shutting down the tomcat6 server hosting candlepin; attempts to register and the like then pop up a network error dialog. I can then bring the server back up and retry (without restarting the RHSM GUI), and the operation succeeds. This is likely because of how we manage connections now versus the globals used before.
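
A minimal sketch of the difference between a remembered connection result and a per-attempt connection, to illustrate the "globals versus managed connections" remark. All names here (`CachedConnector`, `FreshConnector`, `make_flaky_connect`) are hypothetical and are not taken from the subscription-manager code base; the fake `connect` callable merely stands in for a real network call.

```python
class CachedConnector:
    """Anti-pattern (hypothetical): remembers the first outcome, even a failure."""

    def __init__(self, connect):
        self._connect = connect
        self._result = None
        self._error = None

    def get(self):
        if self._error is not None:
            # A stale failure is replayed without touching the network again.
            raise self._error
        if self._result is None:
            try:
                self._result = self._connect()
            except OSError as exc:
                self._error = exc
                raise
        return self._result


class FreshConnector:
    """Builds a new connection on every attempt, so nothing stale survives."""

    def __init__(self, connect):
        self._connect = connect

    def get(self):
        return self._connect()


def make_flaky_connect():
    """Return (state, connect); connect fails until state['up'] is True."""
    state = {"up": False}

    def connect():
        if not state["up"]:
            raise OSError("network error, unable to connect to server")
        return "connected"

    return state, connect
```

With the fake network down, both connectors fail on the first attempt. Once `state["up"]` is set to `True`, only `FreshConnector.get()` succeeds, while `CachedConnector.get()` keeps raising the original error until the object is recreated, which mirrors the restart workaround in the description.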

Comment 2 spandey 2011-01-21 09:47:14 UTC

I am able to reproduce the issue again.

Prerequisites:

subscription-manager-0.93.13-1.el6.x86_64
subscription-manager-gnome-0.93.13-1.el6.x86_64
subscription-manager-firstboot-0.93.13-1.el6.x86_64

Steps to verify:

1) Disable the client network
2) Invoke the subscription manager GUI
3) Try to register the client to candlepin
4) Enable the client network
5) Try to register the client to candlepin

Expected result:
Client registration should fail at step 3.
Client should register to candlepin at step 5.

Actual result:
Client registration fails at both step 3 and step 5.
The client can be registered to candlepin only after the subscription manager is restarted.

Comment 3 spandey 2011-01-21 09:47:46 UTC

Created attachment 474605
error rhsm.log

Comment 4 Chris Duryee 2011-01-27 19:30:54 UTC

I'm not able to reproduce this. Can you do me a favor, and run the following during step #2?

strace -f -s 128 -e network -o /tmp/strace.out subscription-manager-gui

After that, attach strace.out to this bug and I'll take a look at what's going on.

Comment 5 spandey 2011-02-02 13:00:46 UTC

I am able to reproduce this again.

Attached strace.out.

Comment 6 spandey 2011-02-02 13:02:10 UTC

Created attachment 476563
strace.out

Comment 7 Chris Duryee 2011-02-02 19:04:27 UTC

Here is what's going on:

The issue at hand is related to dnsmasq, a dns caching utility that runs on most systems.

When you start up sm-gui, glibc will determine the best network interface to make DNS calls on. In this case, it will be the loopback interface, since that's the only one with a running DNS "server" (dnsmasq). All calls to this will return "host not found", which is correct behavior since you're not connected to anything and the cache is empty.

Once you restart your network, dnsmasq will still return "host not found" for subscriptions.webqa.redhat.com, since it uses a fancy algorithm that doesn't prefer company-specific DNS servers over public servers (see http://www.thekelleys.org.uk/dnsmasq/docs/FAQ for more detail). This usually isn't an issue, since glibc will try eth0 for the DNS query and it will work. However, if the process is still only using the loopback interface, this is an issue and you get the "host not found" error.

I'm going to mark this as NOTABUG, since it's a design decision with glibc and dnsmasq. Let me know if you disagree and we can look into it more.
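
For anyone who wants a long-running process to recover without a restart, here is a hedged sketch of one possible client-side workaround. It assumes the stale state lives in the C library's resolver configuration, which older glibc versions read once per process, and it asks the library to re-read that configuration through `res_init` before retrying the lookup. The helper names (`reload_resolver_config`, `resolve_with_reload`) are hypothetical, the symbol lookup via `ctypes` is an assumption that may not hold on every platform, and none of this is taken from subscription-manager itself.

```python
import ctypes
import socket


def reload_resolver_config():
    """Ask the C library to re-read its resolver config; True on success.

    Looks for glibc's exported `__res_init` first, then plain `res_init`.
    Returns False when neither symbol can be found (assumption: the
    platform simply does not expose it this way).
    """
    try:
        libc = ctypes.CDLL(None)  # symbols already loaded into this process
    except (OSError, TypeError):
        return False
    for name in ("__res_init", "res_init"):
        func = getattr(libc, name, None)
        if func is not None:
            func.restype = ctypes.c_int
            return func() == 0
    return False


def resolve_with_reload(host, port,
                        resolver=socket.getaddrinfo,
                        reload=reload_resolver_config):
    """Resolve host; on a DNS failure, reload resolver config and retry once."""
    try:
        return resolver(host, port)
    except socket.gaierror:
        reload()
        return resolver(host, port)
```

The `resolver` and `reload` parameters exist only so the retry logic can be exercised without a network. A caller would normally use the defaults, for example `resolve_with_reload("candlepin.redhat.com", 443)`. Whether this actually clears the failure seen in this bug depends on the glibc and dnsmasq setup and has not been verified here.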
