Bug 1244906 - Introspection failing for all nodes in a virtualized environment
Summary: Introspection failing for all nodes in a virtualized environment
Keywords:
Status: CLOSED DUPLICATE of bug 1243109
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: Director
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ga
Target Release: Director
Assignee: chris alfonso
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-20 18:11 UTC by Udi Kalifon
Modified: 2016-04-18 07:01 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-21 11:58:41 UTC
Target Upstream Version:
Embargoed:



Description Udi Kalifon 2015-07-20 18:11:30 UTC
Description of problem:
After installing puddle 2015-07-17 in a virtualized environment and running introspection, I get an error (after a long timeout) saying that discovery failed for all nodes:


[stack@instack ~]$ openstack baremetal introspection bulk start
Setting available nodes to manageable...
Starting introspection of node: dddcccc8-4586-471c-a9f0-cad0ccddf339
Starting introspection of node: 3535c1b0-687d-4895-860d-a641e7536aef
Starting introspection of node: 9f4c43da-be9a-47cd-8d1b-b5ab97473b9c
Starting introspection of node: 293864c6-1dfb-455f-b2d3-b8fd10c119dc
Starting introspection of node: 9c33b848-e972-418c-b19e-b073df8f828a
Starting introspection of node: ca292f2d-1299-44e3-b8f8-7619ffc5aed5
Starting introspection of node: fd8b2460-4392-424a-9f2b-22f8c996ae99
Starting introspection of node: fd3d5426-f069-4de3-bcf6-d69445d717b4
Waiting for discovery to finish...
ERROR: rdomanager_oscplugin.utils.wait_for_node_discovery Discovery didn't finish for nodes dddcccc8-4586-471c-a9f0-cad0ccddf339,3535c1b0-687d-4895-860d-a641e7536aef,9f4c43da-be9a-47cd-8d1b-b5ab97473b9c,293864c6-1dfb-455f-b2d3-b8fd10c119dc,9c33b848-e972-418c-b19e-b073df8f828a,ca292f2d-1299-44e3-b8f8-7619ffc5aed5,fd8b2460-4392-424a-9f2b-22f8c996ae99,fd3d5426-f069-4de3-bcf6-d69445d717b4
Setting manageable nodes to available...
Node dddcccc8-4586-471c-a9f0-cad0ccddf339 has been set to available.
Node 3535c1b0-687d-4895-860d-a641e7536aef has been set to available.
Node 9f4c43da-be9a-47cd-8d1b-b5ab97473b9c has been set to available.
Node 293864c6-1dfb-455f-b2d3-b8fd10c119dc has been set to available.
Node 9c33b848-e972-418c-b19e-b073df8f828a has been set to available.
Node ca292f2d-1299-44e3-b8f8-7619ffc5aed5 has been set to available.
Node fd8b2460-4392-424a-9f2b-22f8c996ae99 has been set to available.
Node fd3d5426-f069-4de3-bcf6-d69445d717b4 has been set to available.
Discovery completed.


Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.8-41.el7ost.noarch


How reproducible:
100%


Steps to Reproduce:
1. Install according to the guide until you get to the introspection phase
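For reference, the introspection phase in the guide boils down to roughly these commands on the undercloud (instackenv.json being the node definition file from the guide):

[stack@instack ~]$ source stackrc
[stack@instack ~]$ openstack baremetal import --json instackenv.json
[stack@instack ~]$ openstack baremetal configure boot
[stack@instack ~]$ openstack baremetal introspection bulk start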


Actual results:
I tried it twice in a row and it fails consistently. After the failure, the nodes are left in the "power on" state:
[stack@instack ~]$ ironic node-list
+-------+------+---------------+-------------+-----------------+-------------+
| UUID  | Name | Instance UUID | Power State | Provision State | Maintenance |
+-------+------+---------------+-------------+-----------------+-------------+
| dddcc | None | None          | power on    | available       | False       |
| 3535c | None | None          | power on    | available       | False       |
| 9f4c4 | None | None          | power on    | available       | False       |
| 29386 | None | None          | power on    | available       | False       |
| 9c33b | None | None          | power on    | available       | False       |
| ca292 | None | None          | power on    | available       | False       |
| fd8b2 | None | None          | power on    | available       | False       |
| fd3d5 | None | None          | power on    | available       | False       |
+-------+------+---------------+-------------+-----------------+-------------+

I made sure to turn all nodes off before retrying.
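One way to power them all off from the undercloud is through Ironic itself, for example (UUID taken from the introspection output above; repeat for each node):

[stack@instack ~]$ ironic node-set-power-state dddcccc8-4586-471c-a9f0-cad0ccddf339 off
[stack@instack ~]$ ironic node-show dddcccc8-4586-471c-a9f0-cad0ccddf339 | grep power_state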

Comment 3 John Trowbridge 2015-07-20 18:36:22 UTC
It looks like Ironic is able to manage power for the virtual machines, but the ramdisk hung at some point. We need to look at the consoles of the virtual machines to see what happened.

Does this setup have multiple NICs on the provisioning network?

If so, this could be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1243109
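On the virt host, both of those checks can be done with virsh, for example (the domain name below is only illustrative; use whatever names instack-virt-setup created, see virsh list --all):

virsh console baremetal_0     # watch what the discovery ramdisk is doing
virsh domiflist baremetal_0   # list the VM's NICs and the bridges they attach to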

Comment 4 Udi Kalifon 2015-07-20 18:45:48 UTC
How do I check if the setup has multiple NICs on the provisioning network?

Comment 5 Dan Sneddon 2015-07-20 19:26:18 UTC
Udi,

Were you deploying in a virt environment with multiple bridges? If so, I think this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1243109

Comment 6 Dan Sneddon 2015-07-20 19:29:45 UTC
(In reply to Udi from comment #4)
> How do I check if the setup has multiple NICs on the provisioning network?

You would have had to use this before running instack-virt-setup:
export TESTENV_ARGS="--baremetal-bridge-names 'brbm brbm1 brbm2'"

I know that's documented, but it has known issues, hence the BZ above.

Comment 7 Udi Kalifon 2015-07-20 19:56:47 UTC
Yes, I ran export TESTENV_ARGS="--baremetal-bridge-names 'brbm brbm1 brbm2'". Should we skip this step for now?

Comment 8 Dan Sneddon 2015-07-20 20:35:39 UTC
(In reply to Udi from comment #7)
> Yes I ran export TESTENV_ARGS="--baremetal-bridge-names 'brbm brbm1 brbm2'".
> Should we skip this step for now ?

Yes, you should skip this for now. There is a known issue which hasn't been properly triaged yet. The purpose of multiple bridges is to test bonding in virt. If you just use the default settings for instack-virt-setup, you can use the single-nic-vlans templates to set up network isolation over a single NIC.
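Roughly, the workaround looks like this (file names and paths are only illustrative):

# on the virt host: rebuild the test environment with the default single bridge
unset TESTENV_ARGS
instack-virt-setup

# on the undercloud: do network isolation over the single NIC by pointing a
# site-specific network-environment.yaml at the stock single-nic-vlans
# templates (/usr/share/openstack-tripleo-heat-templates/network/config/single-nic-vlans/)
# and passing it to the deploy:
openstack overcloud deploy --templates -e ~/network-environment.yaml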

Comment 9 Mike Burns 2015-07-21 11:58:41 UTC

*** This bug has been marked as a duplicate of bug 1243109 ***

