Bug 1238336 - 2/3 requests for Nova instance VNC console fail in HA overcloud
Summary: 2/3 requests for Nova instance VNC console fail in HA overcloud
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: Director
Assignee: Giulio Fidente
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-01 15:46 UTC by Marius Cornea
Modified: 2015-08-05 13:58 UTC
CC List: 10 users

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-43.el7ost
Doc Type: Bug Fix
Doc Text:
Controller nodes did not share consoleauth tokens, which caused a portion of console authentication requests to fail. This fix uses memcached to share consoleauth tokens between the controllers. Authentication requests are now successful.
Clone Of:
Environment:
Last Closed: 2015-08-05 13:58:09 UTC
Target Upstream Version:



Links
System | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 202744 | None | None | None | Never
Red Hat Product Errata | RHEA-2015:1549 | normal | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform director Release | 2015-08-05 17:49:10 UTC

Description Marius Cornea 2015-07-01 15:46:15 UTC
Description of problem:
I'm running an HA setup with 3 controllers and 1 compute node on baremetal with network isolation. When trying to access an overcloud instance console via Horizon, 2 out of 3 requests fail. The console appears to load only when requests hit one of the controllers (overcloud-controller-1 in my tests).

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with 3 controllers and network isolation
2. Run instance on the overcloud
3. Load VNC console of that instance from Horizon
4. If you get a "Failed to connect to server (code: 1006)" message, hit refresh.

Actual results:
Only 1 of 3 requests loads the console.

Expected results:
Console loads all the time.

Additional info:
Addresses in the 10.35.169.0 network belong to the internalAPI network, and 10.35.173.10 is the public IP on the external network.

[heat-admin@overcloud-controller-0 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf  | grep -v ^#
novncproxy_base_url=http://10.35.169.14:6080/vnc_auto.html
[heat-admin@overcloud-controller-0 ~]$ sudo lsof -i :6080 -n -P
COMMAND     PID    USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
haproxy   22719 haproxy    3u  IPv4 1656925      0t0  TCP 10.35.173.10:6080->10.34.131.165:38384 (ESTABLISHED)
haproxy   22719 haproxy   31u  IPv4   69189      0t0  TCP 10.35.169.10:6080 (LISTEN)
haproxy   22719 haproxy   32u  IPv4   69190      0t0  TCP 10.35.173.10:6080 (LISTEN)
haproxy   22719 haproxy   59u  IPv4 1656927      0t0  TCP 10.35.169.14:34645->10.35.169.11:6080 (ESTABLISHED)
nova-novn 51784    nova    4u  IPv4  152844      0t0  TCP 10.35.169.14:6080 (LISTEN)


[heat-admin@overcloud-controller-1 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf  | grep -v ^#
novncproxy_base_url=http://10.35.169.11:6080/vnc_auto.html
[heat-admin@overcloud-controller-1 ~]$ sudo lsof -i :6080 -n -P
COMMAND     PID    USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
haproxy   24322 haproxy   31u  IPv4   42687      0t0  TCP 10.35.169.10:6080 (LISTEN)
haproxy   24322 haproxy   32u  IPv4   42688      0t0  TCP 10.35.173.10:6080 (LISTEN)
nova-novn 49539    nova    4u  IPv4  101299      0t0  TCP 10.35.169.11:6080 (LISTEN)
nova-novn 83318    nova    4u  IPv4  101299      0t0  TCP 10.35.169.11:6080 (LISTEN)
nova-novn 83318    nova    6u  IPv4 1672483      0t0  TCP 10.35.169.11:6080->10.35.169.14:34645 (ESTABLISHED)


[heat-admin@overcloud-controller-2 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf  | grep -v ^#
novncproxy_base_url=http://10.35.169.12:6080/vnc_auto.html
[heat-admin@overcloud-controller-2 ~]$ sudo lsof -i :6080 -n -P
COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
haproxy   21934 haproxy   31u  IPv4  46511      0t0  TCP 10.35.169.10:6080 (LISTEN)
haproxy   21934 haproxy   32u  IPv4  46512      0t0  TCP 10.35.173.10:6080 (LISTEN)
nova-novn 49523    nova    4u  IPv4 145539      0t0  TCP 10.35.169.12:6080 (LISTEN)


[heat-admin@overcloud-compute-0 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf  | grep -v ^#
novncproxy_base_url=http://10.35.173.10:6080/vnc_auto.html

[heat-admin@overcloud-controller-0 ~]$ grep -A6 nova_novncproxy /etc/haproxy/haproxy.cfg 
listen nova_novncproxy
  bind 10.35.169.10:6080 
  bind 10.35.173.10:6080 
  option httpchk GET /
  server overcloud-controller-0 10.35.169.14:6080 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 10.35.169.11:6080 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 10.35.169.12:6080 check fall 5 inter 2000 rise 2

Comment 3 chris alfonso 2015-07-01 17:42:52 UTC
Is there a way we can work around this for GA if the one controller with the vncproxy fails?

Comment 4 Jiri Stransky 2015-07-03 16:16:43 UTC
Still not sure about the root cause, but judging by the observed behavior it might be related to consoleauth tokens being valid for only one backend server. I'm guessing this will need to be fixed rather than worked around. More investigation tbd.
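
A minimal sketch of that hypothesis, assuming a deliberately simplified model: each controller keeps console tokens in its own local store, the token is issued on a single controller, and validation requests are spread round-robin across the three controllers. The token value, the function name, and the round-robin ordering are illustrative assumptions, not taken from Nova or from this deployment. Under those assumptions the success rate matches the reported 1 in 3, and a store shared by all controllers makes every validation succeed.

```python
from itertools import cycle

# Controller names as they appear in the report.
CONTROLLERS = [
    "overcloud-controller-0",
    "overcloud-controller-1",
    "overcloud-controller-2",
]


def count_successes(shared_store, requests=9, issuer="overcloud-controller-1"):
    """Return how many round-robin validations find the token.

    Hypothetical model: a "store" is just a set of valid tokens.
    """
    if shared_store:
        # All controllers look at the same token store.
        common = set()
        stores = {name: common for name in CONTROLLERS}
    else:
        # Each controller only knows the tokens it stored itself.
        stores = {name: set() for name in CONTROLLERS}

    token = "example-token"  # placeholder value
    stores[issuer].add(token)

    # Validate the token against the controllers in round-robin order.
    backends = cycle(CONTROLLERS)
    return sum(token in stores[next(backends)] for _ in range(requests))


local_ok = count_successes(shared_store=False)
shared_ok = count_successes(shared_store=True)
print(f"local stores: {local_ok}/9, shared store: {shared_ok}/9")
```

With per-controller stores only the requests that land on the issuing controller succeed; with a shared store the backend chosen no longer matters.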

Comment 7 Ryan O'Hara 2015-07-16 19:02:11 UTC
Could you try adding 'balance source' right below the bind lines in haproxy.cfg for this particular proxy?
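
For reference, a sketch of what that change would look like when applied to the nova_novncproxy section quoted in the description. The addresses and server lines are copied from that output; only the 'balance source' line is new. Whether this change alone resolves the failures is not confirmed anywhere in this report.

```
listen nova_novncproxy
  bind 10.35.169.10:6080
  bind 10.35.173.10:6080
  # Suggested addition: pick the backend by hashing the client source address
  balance source
  option httpchk GET /
  server overcloud-controller-0 10.35.169.14:6080 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 10.35.169.11:6080 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 10.35.169.12:6080 check fall 5 inter 2000 rise 2
```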

Comment 10 errata-xmlrpc 2015-08-05 13:58:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

