Description of problem:

In IPv6 deployments, Redis replication reports the internal_api VIP and the redis VIP addresses as slaves.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.14-1.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Deploy an HA IPv6 environment
2. Check the Redis replication status on the master node

Actual results:

[root@overcloud-controller-0 heat-admin]# nc fd00:fd00:fd00:2000::15 6379
AUTH yazJ2ppHyj8gE2HBRfv6cMADA
+OK
info replication
$345
# Replication
role:master
connected_slaves:2
slave0:ip=fd00:fd00:fd00:2000::11,port=6379,state=online,offset=914214,lag=0
slave1:ip=fd00:fd00:fd00:2000::10,port=6379,state=online,offset=907896,lag=31
master_repl_offset:914282
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:914281

fd00:fd00:fd00:2000::11 and fd00:fd00:fd00:2000::10 are, respectively, the internal_api and redis VIPs:

neutron port-list | egrep 'internal_api|redis' | awk {'print $4 $12 $11'}
redis_virtual_ip|"fd00:fd00:fd00:2000::10"}
internal_api_virtual_ip|"fd00:fd00:fd00:2000::11"}

Expected results:

The slaves' local IP addresses should be reported as the slave addresses, not the VIPs.

Additional info:

The VIPs appear first when listing the addresses, so this may be why they also show up as the slave addresses:

[root@overcloud-controller-1 heat-admin]# ip a | grep fd00:fd00:fd00:2000
    inet6 fd00:fd00:fd00:2000::10/64 scope global
    inet6 fd00:fd00:fd00:2000::13/64 scope global

[root@overcloud-controller-2 heat-admin]# ip a | grep fd00:fd00:fd00:2000
    inet6 fd00:fd00:fd00:2000::11/64 scope global
    inet6 fd00:fd00:fd00:2000::14/64 scope global

I'm not sure whether this can cause any harm during failover, but I think it would be best if the slaves reported their local IPs instead of the VIPs.
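For anyone reproducing this, the slave addresses can be extracted from the 'info replication' output and checked against the VIP list mechanically. A rough sketch (the sample output and VIP addresses are copied verbatim from this report; the parsing helper is hypothetical, not part of any deployed tooling):

```python
import re

# Excerpt of the 'info replication' output shown in the report above.
INFO = """\
role:master
connected_slaves:2
slave0:ip=fd00:fd00:fd00:2000::11,port=6379,state=online,offset=914214,lag=0
slave1:ip=fd00:fd00:fd00:2000::10,port=6379,state=online,offset=907896,lag=31
"""

# VIPs as reported by 'neutron port-list' in this deployment.
VIPS = {"fd00:fd00:fd00:2000::10", "fd00:fd00:fd00:2000::11"}

def slave_ips(info):
    """Extract the ip= field from every slaveN: line."""
    return [m.group(1)
            for m in re.finditer(r"^slave\d+:ip=([^,]+),", info, re.M)]

ips = slave_ips(INFO)
print(ips)               # the two reported slave addresses
print(set(ips) <= VIPS)  # True: every reported "slave" address is a VIP
```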
After troubleshooting this with Marius, it looks like the IP addresses seen in the Redis 'info replication' output are the addresses the slaves use to connect to the master (the slaves' source addresses). When a slave has multiple IP addresses on the same network, which is often the case for us given we don't know where Pacemaker will be hosting the VIPs, the source address Redis uses to connect to the master might not be the same address it is binding on; I don't think this is a misbehaviour on the Redis side.

When a relocation is initiated, Pacemaker always provides the master address to the slaves by its hostname, not its IP address, so the slaves always connect to the IP address that the name resolves to. The network against which we resolve hostnames is user-customizable via the ControllerHostnameResolveNetwork parameter and defaults to internal_api, so the slaves will never be given the Redis VIP as their master.

All things considered, I am inclined not to consider this a bug. Marius, what do you think?
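The source-address behaviour described above is ordinary TCP routing: unless the client explicitly binds a local address before connecting, the kernel chooses one, and on a node holding a VIP that choice may well be the VIP rather than the node's own address. A minimal local sketch of the distinction, using loopback addresses to stand in for the controller IPs (this is an illustration of the mechanism, not of Redis's actual replication code):

```python
import socket
import threading

# A throwaway TCP listener standing in for the Redis master.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

peers = []
def accept_one():
    conn, addr = srv.accept()
    peers.append(addr[0])  # the source address the "master" sees
    conn.close()

t = threading.Thread(target=accept_one)
t.start()

# A client that explicitly binds its local address before connecting;
# without source_address, the kernel would pick the source IP itself,
# which is how a slave can end up presenting the VIP to the master.
cli = socket.create_connection(("127.0.0.1", port),
                               source_address=("127.0.0.1", 0))
local_ip = cli.getsockname()[0]
t.join()
cli.close()
srv.close()

print(peers[0] == local_ip)  # True: the master sees the bound address
```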
Thanks, Giulio. I agree; as mentioned in the initial report, I don't think this is causing trouble. Nevertheless, I would like to run a complete set of failover tests so I can be sure nothing breaks during failover. I'll get back with my results.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
I'm closing this as not a bug, since the behaviour appears to be purely cosmetic. During failover, synchronization completed OK for the node that holds the Redis VIP:

redis_virtual_ip { "ip_address": "fd00:fd00:fd00:2000::11"}

On the master node we can see:

Slave fd00:fd00:fd00:2000::11:6379 asks for synchronization
Full resync requested by slave fd00:fd00:fd00:2000::11:6379
Starting BGSAVE for SYNC with target: disk
Background saving started by pid 25930
DB saved on disk
RDB: 2 MB of memory used by copy-on-write
Background saving terminated with success
Synchronization with slave fd00:fd00:fd00:2000::11:6379 succeeded