Bug 1573973
| Summary: | VMs sometimes fail to start (no compute host available) when one controller gets removed from the cluster. | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Tomas Jamrisko <tjamrisk> |
| Component: | opendaylight | Assignee: | Stephen Kitt <skitt> |
| Status: | CLOSED DUPLICATE | QA Contact: | Itzik Brown <itbrown> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 13.0 (Queens) | CC: | aadam, jluhrsen, mkolesni, nyechiel, skitt, tjamrisk |
| Target Milestone: | beta | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | odl_netvirt, odl_ha | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | N/A |
| Last Closed: | 2018-05-15 08:01:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Tomas Jamrisko
2018-05-02 15:43:17 UTC
Another job that saw this is here: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-opendaylight-odl-netvirt-13_director-rhel-virthost-3cont_2comp-ipv4-vxlan-ha-csit/26/robot/report/log.html#s1-s5-t11-k4-k1-k3-k4-k2

The "fault" in "server show" is:

{"message": "Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 464b5640-8ef3-4237-9a3b-22fb05f29787.", "code": 500, "details": " File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 580, in build_instances raise exception.MaxRetriesExceeded(reason=msg)

We need to dig into the other OpenStack logs, I think, such as nova and maybe neutron.

Please attach logs from neutron & ODL.

Created attachment 1432698 [details]
controller-0 karaf.log
Created attachment 1432699 [details]
odl and neutron logs for all three controllers
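
For triaging several failed jobs at once, a fault like the one quoted from "server show" above can be checked mechanically. The following is only a minimal sketch: the helper name `summarize_fault` is hypothetical, the dict shape is assumed from the excerpt in this report, and the sample "details" value is shortened for illustration.

```python
import re

# Matches the instance UUID embedded in the fault message.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def summarize_fault(fault):
    """Return (retries_exhausted, instance_id) for a fault dict.

    Hypothetical helper: the keys "message" and "details" are assumed
    from the "server show" excerpt quoted in this report.
    """
    message = fault.get("message", "")
    details = fault.get("details", "")
    retries_exhausted = (
        "Exceeded maximum number of retries" in message
        or "MaxRetriesExceeded" in details
    )
    match = UUID_RE.search(message)
    return retries_exhausted, match.group(0) if match else None


# Sample built from the fault quoted above; "details" is shortened here.
sample_fault = {
    "message": (
        "Exceeded maximum number of retries. Exhausted all hosts available "
        "for retrying build failures for instance "
        "464b5640-8ef3-4237-9a3b-22fb05f29787."
    ),
    "code": 500,
    "details": "raise exception.MaxRetriesExceeded(reason=msg)",
}

print(summarize_fault(sample_fault))
```

A positive result only says that the scheduler ran out of hosts to retry on; it does not identify the root cause, so the neutron and ODL logs still need to be checked.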
Judging by the logs, this seems like a duplicate of bug 1575150: both show the same timeouts while trying to contact ODL. The difference is that here the "agents" were detected as "dead" before the VM creation, so it looks like a different error, but the root cause is the same. If the fix for the other bug doesn't solve this one, please reopen.

*** This bug has been marked as a duplicate of bug 1575150 ***