Bug 1678868
Summary: | OSP14 - networking-ansible - Error contacting Ironic server: Node <uuid> can not be updated while a state transition is in progress. (HTTP 409) | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Chris Janiszewski <cjanisze> |
Component: | python-networking-ansible | Assignee: | Dan Radez <dradez> |
Status: | CLOSED DUPLICATE | QA Contact: | Arkady Shtempler <ashtempl> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 14.0 (Rocky) | CC: | cjanisze, jlibosva, michapma |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-02-21 08:59:56 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Chris Janiszewski
2019-02-19 18:50:04 UTC
We've encountered an issue with similar symptoms on a QE environment. The cause was that the Dell machines took a long time to boot after a restart: about 5 minutes from the point the node was powered on until it reached the boot loader phase. The workaround was to bump the api_max_retries setting for nova's ironic client. The original issue is described in bug 1647005. Chris, any chance you can change this value and re-test whether you still hit the issue?

I'm pasting the doc text from bug 1647005 here for your convenience:

> The nova-compute ironic driver tries to update a BM node while the node is being cleaned. Cleaning takes approximately five minutes, but nova-compute only attempts to update the node for approximately two minutes. After the timeout, nova-compute gives up and puts the nova instance into ERROR state. As a workaround, set the following configuration option for the nova-compute service:
>
> [ironic]
> api_max_retries = 180
>
> As a result, nova-compute keeps attempting to update the BM node for longer and eventually succeeds.

Adjusting the value of api_max_retries has worked around this issue:

[ironic]
api_max_retries = 180

The extra time that networking-ansible needs to set up the port in my environment must push things past the default retry budget. I wonder if we could inject this value into the templates as an ironic default. BM nodes take longer than VMs to delete, due to the cleaning steps and whatever needs to be done on the switch for multi-tenancy.
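For reference, a minimal sketch of where the workaround quoted above lands, assuming the usual nova.conf location on the compute host (path and comment wording are mine, not from the bug; restarting the nova-compute service after the change is the usual requirement):

```ini
# /etc/nova/nova.conf on the nova-compute (ironic) host -- workaround from bug 1647005
[ironic]
# Number of times nova-compute retries Ironic API calls that return
# 409 Conflict (e.g. "state transition is in progress") before giving up
# and marking the instance ERROR.
api_max_retries = 180
```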
(In reply to Chris Janiszewski from comment #2)

We should, and I'm working on it here: https://review.openstack.org/#/c/638119/

I'm closing this bug as a duplicate of bug 1647005, but thanks for your findings. I raised the priority of the original bug and started working on it right away.

*** This bug has been marked as a duplicate of bug 1647005 ***
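The timing interaction described in the comments can be sketched as simple arithmetic. Assuming the nova ironic driver defaults of api_max_retries = 60 and api_retry_interval = 2 seconds (an assumption based on documented defaults; verify against your release), the default retry budget is about two minutes, which matches the doc text and loses the race against a roughly five-minute cleaning cycle, while 180 retries outlasts it:

```python
def retry_budget_seconds(api_max_retries: int, api_retry_interval: int = 2) -> int:
    """Total time nova-compute keeps retrying a conflicting node update.

    api_retry_interval defaults to 2 seconds here, mirroring the assumed
    nova [ironic] driver default.
    """
    return api_max_retries * api_retry_interval

# Observed bare-metal cleaning time from the bug report: ~5 minutes.
CLEANING_SECONDS = 5 * 60

default_budget = retry_budget_seconds(60)   # 120 s, i.e. "approximately two minutes"
bumped_budget = retry_budget_seconds(180)   # 360 s with the workaround applied

print(default_budget >= CLEANING_SECONDS)   # False: default gives up before cleaning ends
print(bumped_budget >= CLEANING_SECONDS)    # True: 180 retries outlasts the cleaning cycle
```

This is only a back-of-the-envelope model; the real driver's retry loop also spends time in the API calls themselves, so the actual window is somewhat longer than retries times interval.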