Description of problem:
Tempest scenario tests fail with: 'No valid host was found. There are not enough hosts available.' In addition, 'nova service-list' on the overcloud shows no compute service in the list.

Version-Release number of selected component (if applicable):
7 (using puddle)

How reproducible:
99%

Steps to Reproduce:
1. Deploy OSP-d 7 on existing openstack (OVB)
2. Run tempest tests or 'nova service-list' on overcloud

Actual results:
'No valid host was found. There are not enough hosts available.'
Compute service is not available

Expected results:
Successful tests

Additional info:
It seems that the compute host didn't come up properly because rabbit wasn't up yet. I'd fix that before looking for other causes.
Looks like the credentials in the compute node are incorrect:

2016-03-01 12:06:51.784 14588 ERROR oslo_messaging._drivers.impl_rabbit [-] AMQP server 172.25.0.9:5672 closed the connection. Check login credentials: Socket closed
2016-03-01 12:07:51.864 14588 ERROR oslo_messaging._drivers.impl_rabbit [-] AMQP server 172.25.0.9:5672 closed the connection. Check login credentials: Socket closed
This seems to be related to the installer and not nova. Switching the component from 'openstack-nova' to 'rhel-osp-director'.
Reassigned to Ben Nemec (invented OVB). Ben, could you have a look here? Thanks.
I think the compute node is connecting fine once rabbit is up, and it does eventually contact conductor (the "waiting for conductor" messages stop), but I'm seeing a lot of

=ERROR REPORT==== 1-Mar-2016::12:21:17 ===
closing AMQP connection <0.4617.1> (172.25.0.8:60761 -> 172.25.0.9:5672):
{inet_error,etimedout}

in the rabbit logs on the controller (note that it appears 172.25.0.8 is the compute node, .9 is the controller). I think we need to find the cause of these connection timeouts. Is there any way I could get access to the environment where this is being seen?
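To see which peers are timing out and how often, the rabbit log can be summarised with a short script. This is only a sketch: the regular expression is based solely on the single error report quoted above, so other RabbitMQ versions or message variants may need a different pattern, and the function name count_inet_errors is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one error report quoted in this bug (an assumption;
# other RabbitMQ versions may format the message differently).
CLOSE_RE = re.compile(
    r"closing AMQP connection <[^>]+> "
    r"\((?P<src>[\d.]+):\d+ -> (?P<dst>[\d.]+):\d+\):\s*"
    r"\{inet_error,(?P<reason>\w+)\}"
)


def count_inet_errors(log_text):
    """Count closed AMQP connections per (client address, inet_error reason)."""
    counts = Counter()
    for match in CLOSE_RE.finditer(log_text):
        counts[(match.group("src"), match.group("reason"))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "=ERROR REPORT==== 1-Mar-2016::12:21:17 ===\n"
        "closing AMQP connection <0.4617.1> "
        "(172.25.0.8:60761 -> 172.25.0.9:5672):\n"
        "{inet_error,etimedout}\n"
    )
    for (src, reason), n in count_inet_errors(sample).most_common():
        print(f"{src}\t{reason}\t{n}")
```

If nearly all entries come from the compute node's address with reason etimedout, that points at the network path between the nodes rather than at credentials.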
Issue resolved. The MTU must be set to 1450 on the interfaces of all nodes. This should be part of the OVB templates.
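For anyone checking an existing deployment, a rough way to spot interfaces that still exceed the 1450 limit is to read the MTU values Linux exposes under /sys/class/net. This is a sketch, not part of OVB or the director: the helper names read_mtus and oversized are hypothetical, and skipping the loopback interface is an assumption. The bug does not say why 1450 is the right value; a limit 50 bytes below the usual 1500 is often explained by tunnel overhead on the host cloud's networks, but that is not confirmed here.

```python
from pathlib import Path

EXPECTED_MTU = 1450  # value taken from the resolution comment in this bug


def read_mtus(sysfs_net="/sys/class/net"):
    """Return {interface name: MTU} as reported by the kernel via sysfs."""
    mtus = {}
    for iface in sorted(Path(sysfs_net).iterdir()):
        mtu_file = iface / "mtu"
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def oversized(mtus, limit=EXPECTED_MTU, skip=("lo",)):
    """Return the interfaces whose MTU is above the limit (loopback skipped)."""
    return {
        name: mtu
        for name, mtu in mtus.items()
        if name not in skip and mtu > limit
    }


if __name__ == "__main__":
    for name, mtu in oversized(read_mtus()).items():
        print(f"{name}: MTU {mtu} exceeds {EXPECTED_MTU}")
```

Run on each overcloud node; any interface it prints is a candidate for the MTU change described above.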