Doc Text:
|
Previously, OpenStack used the NeutronScale puppet resource, which was enabled on controller nodes and rewrote the "host" entries of the neutron agents to look like "neutron-n-0" on controller 0 or "neutron-n-1" on controller 1. This renaming was done toward the end of the deployment, when the corresponding neutron-scale resource was started by pacemaker. Consequently, in some cases (mostly reported in VM environments), Neutron reported that there were not enough L3 agents to provide L3 HA (the default minimum is 2), and the "neutron agent-list" command on the overcloud showed inconsistent agents; for example, duplicate entries for each agent, with both the original agent on host "overcloud-controller-1.localdomain" (typically shown with alive status "XXX") and the newer agent on host "neutron-n-1" (alive status ":-)", at least eventually). In other cases, when there was only one controller, agent renaming caused one of the neutron agents, openvswitch, to fail; because the remaining agents were chained to it, they also failed to start, resulting in no L3, metadata, or DHCP agents.
This problem has been fixed by ensuring that native neutron L3 High Availability is used and that enough DHCP agents per network are enabled for native neutron HA. The latter was a necessary addition because the number of DHCP agents per network was previously statically set to two in all cases. It is now a configurable parameter in the tripleo heat templates with a default value of '3', and it is also wired up to the deploy command in the oscplugin. The NeutronScale resource itself has been removed from the tripleo heat templates, where the overcloud controller puppet manifest is kept. As a result, deployments made after this fix do not have the neutron-scale resource on controller nodes, which can be verified with the following commands:
1. On a controller node:
# pcs status | grep -n neutron -A 1
You should not see any "neutron-scale" clone set or resource definition.
2. On the undercloud:
$ source overcloudrc
$ neutron agent-list
All the neutron agents should be reported as being on a host with a name such as "overcloud-controller-0.localdomain" or "overcloud-controller-2.localdomain", and not "neutron-n-0" or "neutron-n-2".
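The two checks above could also be scripted. The following is only a minimal sketch, assuming the command output is piped in on standard input; the function names check_no_neutron_scale and check_agent_hosts are hypothetical helpers, not part of pcs or the neutron client.

```shell
# Hypothetical helper: reads `pcs status` output on stdin and fails
# if a neutron-scale resource or clone set is still defined.
check_no_neutron_scale() {
    if grep -q 'neutron-scale'; then
        echo "FAIL: neutron-scale resource is still present"
        return 1
    fi
    echo "OK: no neutron-scale resource found"
}

# Hypothetical helper: reads `neutron agent-list` output on stdin and
# fails if any agent is still registered on a renamed "neutron-n-<N>" host.
check_agent_hosts() {
    if grep -Eq 'neutron-n-[0-9]+'; then
        echo "FAIL: agents still registered on neutron-n-* hosts"
        return 1
    fi
    echo "OK: no neutron-n-* hosts found"
}
```

With these functions defined, run "pcs status | check_no_neutron_scale" on a controller node, and "neutron agent-list | check_agent_hosts" on the undercloud after sourcing overcloudrc. Both helpers only match text patterns, so an "OK" result does not by itself confirm that every agent is alive.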
|