Bug 1241793

Summary: 3-controller cluster is not properly formed
Product: Red Hat OpenStack
Reporter: bigswitch <rhosp-bugs-internal>
Component: rhosp-director
Assignee: Dan Sneddon <dsneddon>
Status: CLOSED NEXTRELEASE
QA Contact: yeylon <yeylon>
Severity: urgent
Docs Contact:
Priority: high
Version: 7.0 (Kilo)
CC: brad, mburns, rhel-osp-director-maint, rhosp-bugs-internal, riontel, srevivo
Target Milestone: ga
Keywords: Triaged
Target Release: Director
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-07-17 17:35:39 UTC
Type: Bug
Attachments:
Description              Flags
network-environment.yam  none
undercloud.conf          none

Description bigswitch 2015-07-10 06:47:32 UTC
Description of problem:
We are using RHOSP7 beta2 to install a 3-controller cluster, using VLANs to isolate tenant traffic. The specific command we are using is as follows:

[stack@manager ~]$ openstack overcloud deploy -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml --bm-network-gateway 193.168.2.1 --floating-ip-cidr 192.168.2.0/24 --floating-ip-start 192.168.2.60 --floating-ip-end 192.168.2.90 --neutron-bridge-mappings datacentre:br-ex,vlan:br-bond --neutron-network-type vlan --neutron-network-vlan-ranges vlan:50:90 --neutron-disable-tunneling --compute-scale 1 --control-scale 3 --ceph-storage-scale 0 --plan-uuid 7fa9507f-8e9b-4567-b1fc-e883efe89c3e --debug

The installation went fine and "heat resource-list overcloud" shows that all resources are in CREATE_COMPLETE status, except that at the end of the overcloud installation we got the following:

DEBUG: neutronclient.v2_0.client Error message: {"NeutronError": {"message": "Invalid input for operation: Gateway is not valid on subnet.", "type": "InvalidInput", "detail": ""}}
ERROR: openstack Invalid input for operation: Gateway is not valid on subnet.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 678, in take_action
    self._deploy_postconfig(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 557, in _deploy_postconfig
    keystone_client=keystone_client,
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 52, in initialize_neutron
    _create_subnet(neutron_client, ext_net, network_desc, 'external', None)
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 137, in _create_subnet
    return neutron.create_subnet({'subnet': subnet})
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 597, in create_subnet
    return self.post(self.subnets_path, body=body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in post
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 70, in exception_handler_v20
    status_code=status_code)
BadRequest: Invalid input for operation: Gateway is not valid on subnet.
DEBUG: openstackclient.shell clean_up DeployOvercloud
DEBUG: openstackclient.shell got an error: Invalid input for operation: Gateway is not valid on subnet.
ERROR: openstackclient.shell Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 176, in run
    return super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 230, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 678, in take_action
    self._deploy_postconfig(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 557, in _deploy_postconfig
    keystone_client=keystone_client,
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 52, in initialize_neutron
    _create_subnet(neutron_client, ext_net, network_desc, 'external', None)
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 137, in _create_subnet
    return neutron.create_subnet({'subnet': subnet})
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 597, in create_subnet
    return self.post(self.subnets_path, body=body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in post
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 70, in exception_handler_v20
    status_code=status_code)
BadRequest: Invalid input for operation: Gateway is not valid on subnet.
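
The error can be sanity-checked against the deploy command. The sketch below assumes the external subnet is created from --floating-ip-cidr with --bm-network-gateway as its gateway; the traceback only shows _create_subnet being called for the 'external' network, not the request body, so that mapping is an assumption. The helper name gateway_in_subnet is illustrative and not part of any OpenStack tool.

```python
import ipaddress


def gateway_in_subnet(gateway: str, cidr: str) -> bool:
    """Return True if the gateway address lies inside the subnet CIDR."""
    return ipaddress.ip_address(gateway) in ipaddress.ip_network(cidr)


# Values copied from the deploy command in this report.
print(gateway_in_subnet("193.168.2.1", "192.168.2.0/24"))  # False

# The external gateway address mentioned later in the report (comment 10).
print(gateway_in_subnet("192.168.2.1", "192.168.2.0/24"))  # True
```

Under that assumption, 193.168.2.1 falls outside 192.168.2.0/24, which would be consistent with neutron rejecting the subnet.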

After installation, we found that the cluster is not running on any of the 3 controller nodes:
#pcs status
#Error: cluster is not currently running on this node

Also, the neutron services are not running properly:
[stack@manager ~]$ source overcloudrc
[stack@manager ~]$ neutron agent-list
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                               | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| 26caf01b-75db-49c6-bd7f-9fba512a3eb8 | Metadata agent     | overcloud-controller-2.localdomain | xxx   | True           | neutron-metadata-agent    |
| 5a1a9d85-5e2b-4c7c-b96e-198c597ab18a | Metadata agent     | overcloud-controller-0.localdomain | xxx   | True           | neutron-metadata-agent    |
| 5c7b948c-3901-4eb4-a5c6-4c32a2748c25 | Open vSwitch agent | neutron-n-1                        | xxx   | True           | neutron-openvswitch-agent |
| 6f9433b4-de37-430b-b0eb-93bfdac8ba48 | Open vSwitch agent | neutron-n-2                        | xxx   | True           | neutron-openvswitch-agent |
| 740f6601-3776-4c18-944d-1534aeb22533 | L3 agent           | overcloud-controller-1.localdomain | xxx   | True           | neutron-l3-agent          |
| ad926177-bd8d-4b56-a53a-976700534d10 | Metadata agent     | overcloud-controller-1.localdomain | xxx   | True           | neutron-metadata-agent    |
| c20507ca-60fd-4ad5-93e9-86ff3c96e0e4 | DHCP agent         | neutron-n-0                        | xxx   | True           | neutron-dhcp-agent        |
| cd54366a-e672-4961-8328-4ad8d951753c | Open vSwitch agent | neutron-n-0                        | xxx   | True           | neutron-openvswitch-agent |
| cefd81b7-8988-4494-9294-6b5515edab30 | DHCP agent         | neutron-n-1                        | xxx   | True           | neutron-dhcp-agent        |
| d399d5be-d368-441e-9e7b-9b8ce6077815 | L3 agent           | overcloud-controller-0.localdomain | xxx   | True           | neutron-l3-agent          |
| db9685dc-0dbe-4362-83bc-5af3d3d50de2 | L3 agent           | overcloud-controller-2.localdomain | xxx   | True           | neutron-l3-agent          |
| eb9375d6-d98b-43ed-bd99-0b9270001de8 | DHCP agent         | neutron-n-2                        | xxx   | True           | neutron-dhcp-agent        |
| eee301e8-694e-48d6-9160-e976ebb5000a | Open vSwitch agent | overcloud-compute-0.localdomain    | :-)   | True           | neutron-openvswitch-agent |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+



Comment 3 Mike Burns 2015-07-10 11:32:17 UTC
*** Bug 1241801 has been marked as a duplicate of this bug. ***

Comment 4 chris alfonso 2015-07-10 17:12:23 UTC
Was pcs status run as non-root?

Comment 5 Mike Burns 2015-07-10 17:25:42 UTC
Also, this error makes it seem like you have a misconfiguration:

ERROR: openstack Invalid input for operation: Gateway is not valid on subnet.

Comment 6 bigswitch 2015-07-10 17:36:08 UTC
We are not clear what this error message means. Which gateway does it refer to? Is it --bm-network-gateway 193.168.2.1, which is the external network gateway, or the PXE network gateway? Where should we fix the configuration? I attached our network-environment.yaml and undercloud.conf. Could you please point out where our mistake is?

Even if the gateway configuration is wrong, it has nothing to do with the HA cluster not being formed. The 3 controller nodes talk to each other without using any gateway. They are in the same L2 network.

Comment 7 bigswitch 2015-07-10 17:40:02 UTC
Created attachment 1050764 [details]
network-environment.yam

Comment 8 bigswitch 2015-07-10 17:40:25 UTC
Created attachment 1050765 [details]
undercloud.conf

Comment 9 Mike Burns 2015-07-10 17:42:03 UTC
Can you confirm that you ran pcs status as root or with sudo?  If you run as non-root, it won't work.

Comment 10 bigswitch 2015-07-10 20:38:55 UTC
We just did a fresh installation again, using the exact same configuration as attached above. The only difference this time is that we set up the external gateway 192.168.2.1, and it is working properly.

However, at the end of the overcloud installation, we ran into the same error:

DEBUG: neutronclient.v2_0.client Error message: {"NeutronError": {"message": "Invalid input for operation: Gateway is not valid on subnet.", "type": "InvalidInput", "detail": ""}}
ERROR: openstack Invalid input for operation: Gateway is not valid on subnet.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 678, in take_action
    self._deploy_postconfig(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 557, in _deploy_postconfig
    keystone_client=keystone_client,
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 52, in initialize_neutron
    _create_subnet(neutron_client, ext_net, network_desc, 'external', None)
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 137, in _create_subnet
    return neutron.create_subnet({'subnet': subnet})
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 597, in create_subnet
    return self.post(self.subnets_path, body=body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in post
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 70, in exception_handler_v20
    status_code=status_code)
BadRequest: Invalid input for operation: Gateway is not valid on subnet.
DEBUG: openstackclient.shell clean_up DeployOvercloud
DEBUG: openstackclient.shell got an error: Invalid input for operation: Gateway is not valid on subnet.
ERROR: openstackclient.shell Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/openstackclient/shell.py", line 176, in run
    return super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 230, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 678, in take_action
    self._deploy_postconfig(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 557, in _deploy_postconfig
    keystone_client=keystone_client,
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 52, in initialize_neutron
    _create_subnet(neutron_client, ext_net, network_desc, 'external', None)
  File "/usr/lib/python2.7/site-packages/os_cloud_config/neutron.py", line 137, in _create_subnet
    return neutron.create_subnet({'subnet': subnet})
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 597, in create_subnet
    return self.post(self.subnets_path, body=body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 298, in post
    headers=headers, params=params)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 70, in exception_handler_v20
    status_code=status_code)
BadRequest: Invalid input for operation: Gateway is not valid on subnet.


The good news is that this time pcs status shows the following:
[heat-admin@overcloud-controller-0 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Last updated: Fri Jul 10 16:18:34 2015
Last change: Thu Jul  9 20:40:37 2015
Stack: corosync
Current DC: overcloud-controller-0 (1) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
112 Resources configured


Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-192.168.1.51	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0 
 ip-192.168.1.52	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-1 
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-172.18.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-2 
 ip-192.168.2.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0 
 ip-172.17.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-1 
 Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-172.19.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-2 
 Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-2 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-1 ]
 Clone Set: mongod-clone [mongod]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-notifier-clone [openstack-ceilometer-alarm-notifier]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-keystone-clone [openstack-keystone]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-scale-clone [neutron-scale] (unique)
     neutron-scale:0	(ocf::neutron:NeutronScale):	Started overcloud-controller-0 
     neutron-scale:1	(ocf::neutron:NeutronScale):	Started overcloud-controller-1 
     neutron-scale:2	(ocf::neutron:NeutronScale):	Started overcloud-controller-2 
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-evaluator-clone [openstack-ceilometer-alarm-evaluator]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 openstack-cinder-volume	(systemd:openstack-cinder-volume):	Started overcloud-controller-0 
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Failed actions:
    neutron-ovs-cleanup_start_0 on overcloud-controller-2 'unknown error' (1): call=146, status=complete, exit-reason='none', last-rc-change='Thu Jul  9 15:32:01 2015', queued=0ms, exec=888ms
    neutron-ovs-cleanup_start_0 on overcloud-controller-0 'unknown error' (1): call=148, status=complete, exit-reason='none', last-rc-change='Thu Jul  9 20:33:23 2015', queued=0ms, exec=2140ms
    neutron-ovs-cleanup_start_0 on overcloud-controller-1 'unknown error' (1): call=147, status=complete, exit-reason='none', last-rc-change='Thu Jul  9 15:33:00 2015', queued=0ms, exec=1100ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
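
In the output above, the sets reporting "Stopped:" are all neutron agent or cleanup resources. A minimal sketch for pulling those names out of saved `pcs status` text follows. It assumes the text layout shown in this report (a "Clone Set:" or "Master/Slave Set:" header followed by indented "Started:" / "Stopped:" lines); the function name stopped_sets and the sample excerpt are illustrative only.

```python
import re

# Header lines look like " Clone Set: haproxy-clone [haproxy]".
SET_HEADER = re.compile(r"\s*(?:Clone Set|Master/Slave Set): (\S+)")


def stopped_sets(pcs_status_text):
    """Return names of sets that report a 'Stopped:' node list."""
    stopped = []
    current = None
    for line in pcs_status_text.splitlines():
        header = SET_HEADER.match(line)
        if header:
            current = header.group(1)
        elif current and line.strip().startswith("Stopped:") and current not in stopped:
            stopped.append(current)
    return stopped


# Short excerpt taken from the pcs status output in this report.
sample = """\
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
"""

print(stopped_sets(sample))
```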


The rabbitmq cluster is running:
[root@overcloud-controller-0 heat-admin]# rabbitmqctl cluster_status
Cluster status of node 'rabbit@overcloud-controller-0' ...
[{nodes,[{disc,['rabbit@overcloud-controller-0',
                'rabbit@overcloud-controller-1',
                'rabbit@overcloud-controller-2']}]},
 {running_nodes,['rabbit@overcloud-controller-2',
                 'rabbit@overcloud-controller-1',
                 'rabbit@overcloud-controller-0']},
 {cluster_name,<<"rabbit">>},
 {partitions,[]}]
...done.


However, the neutron agents are still not in the :-) state:
[stack@manager ~]$ source overcloudrc 
[stack@manager ~]$ neutron agent-list
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                               | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| 3bd22205-9af9-4472-a117-03a709acb1bf | Metadata agent     | overcloud-controller-0.localdomain | xxx   | True           | neutron-metadata-agent    |
| 5445ec02-0ded-4e45-b9b0-8c88f165b65e | Open vSwitch agent | neutron-n-0                        | xxx   | True           | neutron-openvswitch-agent |
| 62086b64-ac74-4bbc-a423-3b9287264cc2 | Metadata agent     | overcloud-controller-1.localdomain | xxx   | True           | neutron-metadata-agent    |
| 86f7a4dc-390c-4758-b301-e2d61391d1b7 | DHCP agent         | neutron-n-1                        | xxx   | True           | neutron-dhcp-agent        |
| c5815f49-e6ef-4e8d-89ab-4d023b86fdd3 | L3 agent           | overcloud-controller-0.localdomain | xxx   | True           | neutron-l3-agent          |
| d9181b30-6434-4ee8-8522-c1b5e4055c23 | Open vSwitch agent | overcloud-compute-0.localdomain    | :-)   | True           | neutron-openvswitch-agent |
| f8fd0bb6-d7d5-40a3-a00d-7fe025d9af15 | DHCP agent         | neutron-n-0                        | xxx   | True           | neutron-dhcp-agent        |
| fd2de258-b8a4-45eb-a61c-0352cb098853 | L3 agent           | overcloud-controller-1.localdomain | xxx   | True           | neutron-l3-agent          |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
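
A small sketch for listing the agents that are not alive, assuming the ASCII table layout printed by `neutron agent-list` above. The helper names parse_agent_table and dead_agents are hypothetical, and the sample rows are copied from this report.

```python
def parse_agent_table(text):
    """Parse the ASCII table printed by `neutron agent-list` into dicts."""
    rows = [line.strip() for line in text.splitlines() if line.strip().startswith("|")]
    if not rows:
        return []
    cells = [[c.strip() for c in row.strip("|").split("|")] for row in rows]
    header, body = cells[0], cells[1:]
    return [dict(zip(header, row)) for row in body]


def dead_agents(agents):
    """Return (agent_type, host) for every agent whose alive column is not ':-)'."""
    return [(a["agent_type"], a["host"]) for a in agents if a["alive"] != ":-)"]


sample = """\
| id                                   | agent_type         | host                               | alive | admin_state_up | binary                    |
| c5815f49-e6ef-4e8d-89ab-4d023b86fdd3 | L3 agent           | overcloud-controller-0.localdomain | xxx   | True           | neutron-l3-agent          |
| d9181b30-6434-4ee8-8522-c1b5e4055c23 | Open vSwitch agent | overcloud-compute-0.localdomain    | :-)   | True           | neutron-openvswitch-agent |
| f8fd0bb6-d7d5-40a3-a00d-7fe025d9af15 | DHCP agent         | neutron-n-0                        | xxx   | True           | neutron-dhcp-agent        |
"""

for agent_type, host in dead_agents(parse_agent_table(sample)):
    print(agent_type, host)
```

Printing the host next to each dead agent also makes it easy to see that some agents are registered under names such as neutron-n-0 rather than the controller's .localdomain name.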



After digging into the neutron agent logs, I found the following:
[root@overcloud-controller-0 heat-admin]# cat /var/log/neutron/l3-agent.log 
2015-07-09 20:33:14.888 17433 WARNING oslo_config.cfg [-] Option "use_namespaces" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2015-07-09 20:34:14.975 17433 WARNING neutron.agent.l3.agent [req-c1a72d51-0bb9-49b9-93e7-1069601062e9 ] l3-agent cannot check service plugins enabled on the neutron server. Retrying. Detail message: Timed out waiting for a reply to message ID 2224ccbd863847c0a134f3f4252cd802
2015-07-09 20:34:15.106 17433 ERROR neutron.agent.l3.agent [-] Failed synchronizing routers due to RPC error
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 507, in fetch_and_sync_all_routers
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     routers = self.plugin_rpc.get_routers(context)
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 91, in get_routers
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     router_ids=router_ids)
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     retry=self.retry)
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     timeout=timeout, retry=retry)
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     retry=retry)
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 341, in _send
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent     raise result
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent RemoteError: Remote error: AgentNotFoundByTypeHost Agent with agent_type=L3 agent and host=overcloud-controller-0.localdomain could not be found
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n    executor_callback))\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n    executor_callback)\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch\n    result = func(ctxt, **new_args)\n', u'  File "/usr/lib/python2.7/site-packages/neutron/api/rpc/handlers/l3_rpc.py", line 81, in sync_routers\n    context, host, router_ids))\n', u'  File "/usr/lib/python2.7/site-packages/neutron/db/l3_agentschedulers_db.py", line 290, in list_active_sync_routers_on_active_l3_agent\n    context, constants.AGENT_TYPE_L3, host)\n', u'  File "/usr/lib/python2.7/site-packages/neutron/db/agents_db.py", line 197, in _get_agent_by_type_and_host\n    host=host)\n', u'AgentNotFoundByTypeHost: Agent with agent_type=L3 agent and host=overcloud-controller-0.localdomain could not be found\n'].
2015-07-09 20:34:15.106 17433 TRACE neutron.agent.l3.agent 
2015-07-09 20:34:33.110 17433 ERROR oslo_messaging._drivers.impl_rabbit [-] Failed to consume message from queue: 


[root@overcloud-controller-0 heat-admin]# cat /var/log/neutron/metadata-agent.log 
2015-07-09 20:34:25.159 18346 ERROR neutron.agent.metadata.agent [-] Failed reporting state!
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent Traceback (most recent call last):
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/metadata/agent.py", line 303, in _report_state
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     use_call=self.agent_state.get('start_flag'))
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 80, in report_state
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     return method(context, 'report_state', **kwargs)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     retry=self.retry)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     timeout=timeout, retry=retry)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     retry=retry)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     result = self._waiter.wait(msg_id, timeout)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     message = self.waiters.get(msg_id, timeout=timeout)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     'to message ID %s' % msg_id)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent MessagingTimeout: Timed out waiting for a reply to message ID 1d043cca2f7f4f7b90bae0fd18f603f9
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent 
2015-07-09 20:34:25.182 18346 WARNING neutron.openstack.common.loopingcall [-] task <bound method UnixDomainMetadataProxy._report_state of <neutron.agent.metadata.agent.UnixDomainMetadataProxy object at 0x168ad90>> run outlasted interval by 30.06 sec
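
To condense tracebacks like the ones above to their final exception lines, a rough sketch follows. It assumes the log layout seen in this report ("<date> <time> <pid> TRACE <logger> <message>") and treats any TRACE message of the form "Name: text" as an exception line; the function name trace_exceptions is illustrative.

```python
import re

# Captures the logger name and the message that follows it on a TRACE line.
TRACE_LINE = re.compile(r" TRACE (\S+)\s?(.*)$")
# Exception lines start with a bare identifier followed by ": ".
EXCEPTION = re.compile(r"([A-Za-z_][\w.]*): (.*)")


def trace_exceptions(log_text):
    """Return (logger, exception_name, message) for each exception line found."""
    found = []
    for line in log_text.splitlines():
        trace = TRACE_LINE.search(line)
        if not trace:
            continue
        exc = EXCEPTION.match(trace.group(2))
        if exc:
            found.append((trace.group(1), exc.group(1), exc.group(2)))
    return found


# Lines copied from the metadata-agent log in this report.
sample = """\
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent Traceback (most recent call last):
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent     'to message ID %s' % msg_id)
2015-07-09 20:34:25.159 18346 TRACE neutron.agent.metadata.agent MessagingTimeout: Timed out waiting for a reply to message ID 1d043cca2f7f4f7b90bae0fd18f603f9
"""

for logger, name, message in trace_exceptions(sample):
    print(logger, name, message)
```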


To summarize:
1. What was the reason we got "ERROR: openstack Invalid input for operation: Gateway is not valid on subnet." at the end of the overcloud installation? If it is a configuration mistake, where did we go wrong?
2. Do the failed actions in the pcs status output matter? If they do, what is the workaround?
3. Please let us know if you need any particular log to debug the neutron agent problem. We'll attach the required logs.

Comment 12 Brad P. Crochet 2015-07-17 17:22:18 UTC
This should no longer be an issue for GA, since we pulled out the neutron network setup in the post deploy config.

Comment 13 Mike Burns 2015-07-17 17:35:39 UTC
This will be fixed with the GA release in the 2015-07-17.1 puddle