Description of problem: I'm running a HA setup with 3 controllers and 1 compute on baremetal with network isolation. When trying to access an overcloud instance console via Horizon 2/3 requests fail. It appears that it loads only when requests are hitting one of the controllers (overcloud-controller-1 in my tests) How reproducible: 100% Steps to Reproduce: 1. Deploy overcloud with 3 controllers and network isolation 2. Run instance on the overcloud 3. Load VNC console of that instance from Horizon 4. If you get a Failed to connect to server (code: 1006) message hit refresh. Actual results: Only 1 of 3 requests get the console loaded. Expected results: Console loads all the time. Additional info: Addresses in 10.35.169.0 network are in the internalAPI network and 10.35.173.10 is the public IP in the external network. [heat-admin@overcloud-controller-0 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf | grep -v ^# novncproxy_base_url=http://10.35.169.14:6080/vnc_auto.html [heat-admin@overcloud-controller-0 ~]$ sudo lsof -i :6080 -n -P COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME haproxy 22719 haproxy 3u IPv4 1656925 0t0 TCP 10.35.173.10:6080->10.34.131.165:38384 (ESTABLISHED) haproxy 22719 haproxy 31u IPv4 69189 0t0 TCP 10.35.169.10:6080 (LISTEN) haproxy 22719 haproxy 32u IPv4 69190 0t0 TCP 10.35.173.10:6080 (LISTEN) haproxy 22719 haproxy 59u IPv4 1656927 0t0 TCP 10.35.169.14:34645->10.35.169.11:6080 (ESTABLISHED) nova-novn 51784 nova 4u IPv4 152844 0t0 TCP 10.35.169.14:6080 (LISTEN) [heat-admin@overcloud-controller-1 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf | grep -v ^# novncproxy_base_url=http://10.35.169.11:6080/vnc_auto.html [heat-admin@overcloud-controller-1 ~]$ sudo lsof -i :6080 -n -P COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME haproxy 24322 haproxy 31u IPv4 42687 0t0 TCP 10.35.169.10:6080 (LISTEN) haproxy 24322 haproxy 32u IPv4 42688 0t0 TCP 10.35.173.10:6080 (LISTEN) nova-novn 49539 nova 4u IPv4 101299 0t0 TCP 10.35.169.11:6080 (LISTEN) nova-novn 83318 nova 4u IPv4 101299 0t0 TCP 10.35.169.11:6080 (LISTEN) nova-novn 83318 nova 6u IPv4 1672483 0t0 TCP 10.35.169.11:6080->10.35.169.14:34645 (ESTABLISHED) [heat-admin@overcloud-controller-2 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf | grep -v ^# novncproxy_base_url=http://10.35.169.12:6080/vnc_auto.html [heat-admin@overcloud-controller-2 ~]$ sudo lsof -i :6080 -n -P COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME haproxy 21934 haproxy 31u IPv4 46511 0t0 TCP 10.35.169.10:6080 (LISTEN) haproxy 21934 haproxy 32u IPv4 46512 0t0 TCP 10.35.173.10:6080 (LISTEN) nova-novn 49523 nova 4u IPv4 145539 0t0 TCP 10.35.169.12:6080 (LISTEN) [heat-admin@overcloud-compute-0 ~]$ sudo grep novncproxy_base_url /etc/nova/nova.conf | grep -v ^# novncproxy_base_url=http://10.35.173.10:6080/vnc_auto.html [heat-admin@overcloud-controller-0 ~]$ grep -A6 nova_novncproxy /etc/haproxy/haproxy.cfg listen nova_novncproxy bind 10.35.169.10:6080 bind 10.35.173.10:6080 option httpchk GET / server overcloud-controller-0 10.35.169.14:6080 check fall 5 inter 2000 rise 2 server overcloud-controller-1 10.35.169.11:6080 check fall 5 inter 2000 rise 2 server overcloud-controller-2 10.35.169.12:6080 check fall 5 inter 2000 rise 2
Is there a way we can work around this for GA if the once controller with the vncproxy fails?
Still not sure about the root cause but by observed behavior it might be related to consoleauth tokens being valid for only one backend server. I'm guessing this will need to be fixed rather than worked around. More investigation tbd.
Could you try adding 'balance source' right below the bind lines in haproxy.cfg for this particular proxy?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1549