Description of problem: I'm trying to deploy RHOS 13 with various topologies, but having more than 10 nodes of same type. I've tried 3 Controllers + 11 computes and 1 Controller + 12 Computes and fail with a same issue. Version-Release number of selected component (if applicable): 2018-04-19.2 How reproducible: 100% Steps to Reproduce: 1. Try to deploy 1 Controller + 12 Compute nodes with Infrared, partially (only virsh + undercloud part) or fully 2. 3. Actual results: Stack overcloud CREATE_FAILED with overcloud.Compute.7. Nova list reports following line: | 24df54fa-2374-431e-a2c9-cddfb6e8f709 | compute-7 | BUILD | scheduling | NOSTATE | | and ironic log contains record like: ironic.conductor.manager [req-ce1043d9-0179-467b-88c5-c230edd5c46f b715c5845eb34c06b950712c5e012bfa fd53286d965846608ba8f3b15674ef5e - default default] Error in deploy of node 4104c21e-fcc4-4250-9f19-70f3156cb1f6: IPMI call failed: power on.: IPMIFailure: IPMI call failed: power on. Expected results: All nodes are powered on and deployed successfully Additional info: Controller in this topology was set to 4 vCPU, 20 Gb RAM, each Compute node was set to 2 vCPU, 4 Gb of RAM to be able to fit 32 thread CPU/128 Gb server. Powering on missed node manually works (with openstack baremetal node power on), but it fails to boot and participate in the deployment.
I've encountered this elsewhere. Te underlying cause appears to be coming up from libvirt when the domain is being powered on, which cascades causing other failures elsewhere in the test environment. I think someone who has a great deal of libvirt experience should likely take a look at this 2018-04-24 15:23:56,314.314 15263 ERROR VirtualBMC [-] Error powering on the domain controller-0. Error: Network not found: no network with matching name 'data' 2018-04-24 15:23:56,314.314 15379 ERROR VirtualBMC [-] Error powering on the domain ceph-0. Error: Network not found: no network with matching name 'data' 2018-04-24 15:23:57,053.053 15552 ERROR VirtualBMC [-] Error powering on the domain compute-0. Error: Network not found: no network with matching name 'management' 2018-04-24 15:23:57,193.193 15495 ERROR VirtualBMC [-] Error powering on the domain compute-1. Error: Network not found: no network with matching name 'management' Although libvirt does not log any errors. Compute-0's log: 2018-04-24 15:23:44.364+0000: shutting down, reason=shutdown 2018-04-24 15:23:56.454+0000: shutting down, reason=failed Journalctl: Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:31.742+0000: 28171: info : libvirt version: 3.9.0, package: 14.el7_5.2 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2018-03-20-14:43:16, x86-041.build.eng.bos.redhat.com) Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:31.742+0000: 28171: info : hostname: sealusa10.mobius.lab.eng.rdu2.redhat.com Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:31.742+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:37 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:37.284+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:42 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:42.278+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:42 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:42.876+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:43 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:43.265+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:43 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:43.307+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:48 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:48.144+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:48 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:48.166+0000: 28171: error : qemuMonitorIO:704 : internal error: End of file from qemu monitor Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.454+0000: 34996: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.607+0000: 28172: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.767+0000: 34996: error : virHashSearch:727 : Hash operation not allowed during iteration Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: 2018-04-24 15:23:56.904+0000: 28172: error : virHashSearch:727 : Hash operation not allowed during iteration
(In reply to Julia Kreger from comment #1) > I've encountered this elsewhere. Te underlying cause appears to be coming up > from libvirt when the domain is being powered on, which cascades causing > other failures elsewhere in the test environment. I think someone who has a > great deal of libvirt experience should likely take a look at this > > 2018-04-24 15:23:56,314.314 15263 ERROR VirtualBMC [-] Error powering on the > domain controller-0. Error: Network not found: no network with matching name > 'data' > 2018-04-24 15:23:56,314.314 15379 ERROR VirtualBMC [-] Error powering on the > domain ceph-0. Error: Network not found: no network with matching name 'data' > 2018-04-24 15:23:57,053.053 15552 ERROR VirtualBMC [-] Error powering on the > domain compute-0. Error: Network not found: no network with matching name > 'management' > 2018-04-24 15:23:57,193.193 15495 ERROR VirtualBMC [-] Error powering on the > domain compute-1. Error: Network not found: no network with matching name > 'management' > > Although libvirt does not log any errors. Compute-0's log: > > 2018-04-24 15:23:44.364+0000: shutting down, reason=shutdown > 2018-04-24 15:23:56.454+0000: shutting down, reason=failed > > Journalctl: > > Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:31.742+0000: 28171: info : libvirt version: 3.9.0, package: > 14.el7_5.2 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, > 2018-03-20-14:43:16, x86-041.build.eng.bos.redhat.com) > Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:31.742+0000: 28171: info : hostname: > sealusa10.mobius.lab.eng.rdu2.redhat.com > Apr 24 15:23:31 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:31.742+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:37 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:37.284+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:42 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:42.278+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:42 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:42.876+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:43 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:43.265+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:43 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:43.307+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:48 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:48.144+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:48 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:48.166+0000: 28171: error : qemuMonitorIO:704 : internal > error: End of file from qemu monitor > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.309+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.310+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28174: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.313+0000: 28175: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.454+0000: 34996: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.607+0000: 28172: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.767+0000: 34996: error : virHashSearch:727 : Hash > operation not allowed during iteration > Apr 24 15:23:56 sealusa10.mobius.lab.eng.rdu2.redhat.com libvirtd[28171]: > 2018-04-24 15:23:56.904+0000: 28172: error : virHashSearch:727 : Hash > operation not allowed during iteration What's the hypervisor's RHEL version in your case? Can it be related to RHEL7.5?
Seems others have encountered this: https://storyboard.openstack.org/#!/story/2001798
This is the same libvirt issue that is showing up elsewhere. A fix to add a retry to virtualbmc is in progress (https://review.openstack.org/#/c/564878/). The signature is this problem turning on power to virtual nodes. ironic.conductor.manager [req-ce1043d9-0179-467b-88c5-c230edd5c46f b715c5845eb34c06b950712c5e012bfa fd53286d965846608ba8f3b15674ef5e - default default] Error in deploy of node 4104c21e-fcc4-4250-9f19-70f3156cb1f6: IPMI call failed: power on.: IPMIFailure: IPMI call failed: power on. I'm marking this as a dup of the bug we are using to track this issue which is flagged as a blocker. *** This bug has been marked as a duplicate of bug 1571384 ***