Description of problem:
When >1 nova-consoleauth services are running on the same cloud (i.e., multiple controller systems for HA), nova-novncproxy fails to establish a connection to the VNC console. Only after right-clicking and selecting "Reload Frame" in the VNC iframe does a connection succeed.

Version-Release number of selected component (if applicable):
Folsom and Grizzly

How reproducible:
Every time

Steps to Reproduce:
1. Install a pair of controller nodes, each with the nova-consoleauth and nova-novncproxy services running.
2. Navigate to the dashboard, click on a running VM, and select the VNC (Folsom) or Console (Grizzly) tab.

Actual results:
Red bar with text: "Failed to connect to server (code: 1006)"

Expected results:
Grey bar with text: "Connected (encrypted) to: QEMU (instance-00000390)"

Additional info:
Known bug, apparently. Lame solution (e.g., don't run >1 consoleauth service): https://bugs.launchpad.net/horizon/+bug/1068602
Actually, the solution provided upstream is wrong. nova-consoleauth has supported storing the tokens in memcached since Folsom (see https://bugs.launchpad.net/nova/+bug/989337), which allows having multiple services in the same cluster. When this is not configured, the tokens are stored in memory, which causes the problem you and the linked bug describe.

To make consoleauth use memcached you'll need to:

1. Install and start memcached on one of the hosts:
   - yum install memcached
   - chkconfig memcached on && service memcached start

2. Set the key 'memcached_servers' under the DEFAULT section of nova.conf to point to the IP and port where memcached is listening (do this on all the hosts running consoleauth):
   - memcached_servers = <memcached ip>:11211

3. Restart all of the consoleauth services.

A consolidated example is sketched below. With this configuration, all the consoleauth services will know about all the tokens and it will be possible to have multiple services running in the same cluster.
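Putting those steps together, a minimal example (the memcached address 192.0.2.10 is only a placeholder, and the service name is assumed to be the RHEL/RDO one, so adjust for your packaging):

  # On one controller host:
  yum install memcached
  chkconfig memcached on && service memcached start

  # In /etc/nova/nova.conf on every host running nova-consoleauth:
  [DEFAULT]
  memcached_servers = 192.0.2.10:11211

  # Then restart consoleauth everywhere:
  service openstack-nova-consoleauth restart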
The goal is to make all services - including memcached - clustered and shared-nothing for high availability. Installing *a single* memcached server, again, does not solve the problem. Why is it that all the other services (glance, nova-api, etc.) appear not to have the problem that nova-consoleauth has? I think nova-consoleauth has a problem that needs to be fixed.
Ouch, I read the description too fast and missed the HA part, sorry. I agree with you that consoleauth has a problem, and it should be solved if we want to have several instances in the same cluster. The main problem here is that consoleauth is wrongly using memcached to store the tokens and the connection info, instead of using it only as a cache for data in the database (which is what memcached is meant for). I've just opened a blueprint upstream (see the BZ's URL) for this issue and I'll propose a patch soon.
But, let's back up a minute - I'm not using memcached anywhere for anything (I probably should be for the dashboard, but... that's a different topic), so you can leave memcached out of the equation.
(In reply to Dan Yocum from comment #5)
> But, let's back up a minute - I'm not using memcached anywhere for anything
> (I probably should be for the dashboard, but... that's a different topic),
> so you can leave memcached out of the equation.

Sure, I get what you mean. I was just explaining what consoleauth is doing wrong with memcached right now. What I am suggesting (see the linked blueprint) is to make consoleauth store the tokens and the connection info in the database when registering them, so other services of the same type can access them. This way we won't have to rely on only one consoleauth or memcached service. The caching part would stay the same: you could still optionally configure it to use memcached. I hope it makes sense now; a rough sketch of the idea follows below.
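To make the proposal a bit more concrete, a purely hypothetical sketch of what a shared, DB-backed token store could look like; the table and column names are made up for illustration and are not the actual blueprint code:

# Hypothetical illustration only -- not nova's actual schema or API.
import sqlalchemy as sa

metadata = sa.MetaData()

# Every consoleauth worker reads and writes the same table, so a token
# issued by one worker can be validated by any other.
console_tokens = sa.Table(
    'console_tokens', metadata,
    sa.Column('token', sa.String(64), primary_key=True),
    sa.Column('instance_uuid', sa.String(36), nullable=False),
    sa.Column('host', sa.String(255), nullable=False),   # compute host
    sa.Column('port', sa.Integer, nullable=False),       # VNC port
    sa.Column('expires_at', sa.DateTime, nullable=False),
)

def create_schema(db_url):
    """Create the table on the shared database all controllers point at."""
    engine = sa.create_engine(db_url)
    metadata.create_all(engine)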
Does that mean that the issue is resolved in Icehouse, then?
(In reply to Dan Yocum from comment #8)
> Does that mean that the issue is resolved in Icehouse, then?

TBD; code has been submitted but it has not yet been merged (then of course there is the question of whether it passes testing ;)).
Stephen, can you verify if this was merged into Icehouse?
Can't imagine it was, based on the upstream state. It made it to POST because a patch was submitted, but it never progressed to MODIFIED because it was not merged.
Pardon my ignorance on process, but where does that leave this BZ? Does it need to be moved to/approved for Target Release 6 at this point?
To summarize the points that were mentioned above before closing the bug:

1) The behavior originally described in the bug was due to misconfiguration (we need to be running memcached and configure all the consoleauth services to use it), as described in comment #2.

2) The HA side of things raised in comment #3 has not been fully addressed, so I will address it here. When running multiple memcached servers, the python memcached client we ship (python-memcached-1.48-4.el7.noarch), which nova-consoleauth uses, is smart enough to treat them as a simple consistent hashing ring. Basically, if one of the servers goes down, its tokens will be lost and all the sessions that were stored on it will be invalidated, but further writes, and thus authentication, will still work as long as there is at least one server running (see the illustration below). We have agreed that this is sufficient for us to consider this resilient.

Based on this, closing as NOTABUG; however, feel free to revisit in case you disagree with the above.
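To illustrate point 2, a small example with python-memcached (the two server addresses are placeholders):

# Illustration of the hashing-ring behaviour described above.
import memcache

# The client maps each key to one server in the list based on the key's hash.
mc = memcache.Client(['192.0.2.10:11211', '192.0.2.11:11211'])

# A console token lands on whichever server its key hashes to.
mc.set('console-token-abc123', '{"host": "compute-1", "port": 5900}', time=600)

# Any consoleauth service configured with the same server list will look the
# token up on the same server. If that server dies, this token is lost and the
# session has to be re-authenticated, but new tokens keep working because they
# are written to the surviving servers.
print(mc.get('console-token-abc123'))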
This bug is still an issue in Icehouse. No, consoleauth must not REQUIRE memcached in an HA environment - nothing else does. The blueprint referred to in comment #4 was unapproved. What happens now?
So we would really recommend not pursuing the direction of the patch (storing tokens in the DB). The fanout topic seems like a much better choice.
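Roughly, the fanout idea would mean broadcasting each authorized token to every consoleauth worker over RPC, along the lines of the following hypothetical sketch (topic, method and argument names are made up for illustration and are not nova's actual RPC API):

# Hypothetical sketch of a fanout cast with oslo.messaging.
import oslo_messaging
from oslo_config import cfg

transport = oslo_messaging.get_transport(cfg.CONF)
target = oslo_messaging.Target(topic='consoleauth')
client = oslo_messaging.RPCClient(transport, target)

# fanout=True delivers the cast to every service listening on the topic,
# so each consoleauth worker learns about the new token.
client.prepare(fanout=True).cast(
    {}, 'authorize_console',
    token='abc123', host='compute-1', port=5900)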
This however will require work upstream, so targeting this bug for RHOS 7 (although it is unlikely that it will merge in Kilo at this point). Pablo - is the customer fine with this being worked on for the next version?
This bug was closed as part of a backlog clean up. If you see value in tracking this bug please re-open it.