Rubygem-Staypuft: Nova deployment with VLAN network type fails installing the compute node: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.100.0/21 6 --vlan 10' returned 1: Command failed, please check log for more info

Environment:
rhel-osp-installer-0.1.6-5.el6ost.noarch
openstack-foreman-installer-2.0.16-1.el6ost.noarch
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el6ost.noarch
openstack-puppet-modules-2014.1-19.9.el6ost.noarch

Steps to reproduce:
1. Install rhel-osp-installer
2. Create/run a Nova deployment with VLAN network type:
   a. VLAN range field: 10:15
   b. Floating IP range for external network: 10.8.30.100/31
   c. Private IP range for tenant networks: 192.168.100.0/21

Result:
The deployment pauses on an error while deploying the compute nodes. The puppet error I get is:

Info: Applying configuration version '1406908167'
Error: Execution of '/usr/bin/nova-manage floating create 10.8.30.100/31' returned 1: Command failed, please check log for more info
Error: /Stage[main]/Nova::Network/Nova::Manage::Floating[nova-vm-floating]/Nova_floating[nova-vm-floating]/ensure: change from absent to present failed: Execution of '/usr/bin/nova-manage floating create 10.8.30.100/31' returned 1: Command failed, please check log for more info
Error: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.100.0/21 6 --vlan 10' returned 1: Command failed, please check log for more info
Error: /Stage[main]/Nova::Network/Nova::Manage::Network[nova-vm-net]/Nova_network[nova-vm-net]/ensure: change from absent to present failed: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.100.0/21 6 --vlan 10' returned 1: Command failed, please check log for more info
Notice: Finished catalog run in 5.72 seconds
This is likely either in puppet (quickstack) or in nova. Moving to OFI for further investigation.
I've verified that the command "nova-manage network create novanetwork 192.168.100.0/21 6 --vlan 10" works fine when run from the command line against a packstack install.
Sasha, can you check what is in the nova-manage log and attach it? Also, do you get the same failure if you run that command by hand? I am also wondering whether it would be useful to compare your nova.conf with Brent's.
Also, the controller's YAML would make it easier for me to attempt to reproduce this.
So, I am not sure if I have the same settings or not, but with the floating IP range for the external network set to 10.0.1.0/31, I got:

Error: Execution of '/usr/bin/nova-manage floating create 10.0.1.0/31' returned 1: Command failed, please check log for more info
Error: /Stage[main]/Nova::Network/Nova::Manage::Floating[nova-vm-floating]/Nova_floating[nova-vm-floating]/ensure: change from absent to present failed: Execution of '/usr/bin/nova-manage floating create 10.0.1.0/31' returned 1: Command failed, please check log for more info

The error in nova-manage was:

2014-08-04 10:28:05.038 4807 CRITICAL nova [req-26de1d67-aad7-431f-8df9-949bd1bc2bd7 None None] InvalidInput: Invalid input received: /31 should be specified as single address(es) not in cidr format

The other command (the one Brent tried) succeeded without issue:

Debug: Executing '/usr/bin/nova-manage network create novanetwork 10.0.0.0/21 1 6 --vlan 10'
Notice: /Stage[main]/Nova::Network/Nova::Manage::Network[nova-vm-net]/Nova_network[nova-vm-net]/ensure: created
Debug: /Stage[main]/Nova::Network/Nova::Manage::Network[nova-vm-net]/Nova_network[nova-vm-net]: The container Nova::Manage::Network[nova-vm-net] will propagate my refresh event
Debug: Nova::Manage::Network[nova-vm-net]: The container Class[Nova::Network] will propagate my refresh event

So, I changed the 10.0.1.0/31 to 10.0.1.0/24, and everything succeeded. I am not sure if /31 is generally invalid, or some special case for nova in this context, but I can show the success output:

Debug: Executing '/usr/bin/nova-manage floating list'
Debug: Executing '/usr/bin/nova-manage floating create 10.0.1.0/24'
Notice: /Stage[main]/Nova::Network/Nova::Manage::Floating[nova-vm-floating]/Nova_floating[nova-vm-floating]/ensure: created

Brent, any thoughts on this?
Created attachment 923969 [details] Nova compute with working settings
(In reply to Jason Guiditta from comment #7)
> So, I changed the 10.0.1.0/31 to 10.0.1.0/24, and everything succeeded. I am
> not sure if /31 is generally invalid, or some special case for nova in this
> context, but I can show the success output [...]
>
> Brent, any thoughts on this?

The error you found is indeed the issue. Don't use /31. Nova will reject a network specification in CIDR format if the number of addresses is less than 4, i.e. a /31 or /32. So this doesn't appear to be a bug. The UI could potentially use some validation to ensure a network of a proper size is specified.
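For reference, a minimal sketch of that check, reconstructed from the error message and the "fewer than 4 addresses" rule above (the real logic lives in address_to_hosts in nova/cmd/manage.py, per the traceback in the log below; this is an illustration, not the actual nova source):

  import netaddr

  def address_to_hosts(addresses):
      # Ranges holding fewer than 4 addresses (/31 has 2, /32 has 1)
      # must be given as single addresses, not in CIDR format.
      net = netaddr.IPNetwork(addresses)
      if net.size < 4:
          raise ValueError("/%s should be specified as single address(es) "
                           "not in cidr format" % net.prefixlen)
      return net.iter_hosts()

  list(address_to_hosts("10.0.1.0/24"))  # works: 254 usable hosts
  list(address_to_hosts("10.0.1.0/31"))  # raises ValueError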
In QE we should use /24 (and not /31). Example:
external network: 10.8.30.100/24
private IP range for tenant networks: 192.168.100.0/24

Adding nova-manage.log:
----------------------
2014-08-04 17:58:47.659 12098 CRITICAL nova [req-b1f47aa0-32c5-4233-ba95-5f6d122f2f13 None None] InvalidInput: Invalid input received: /31 should be specified as single address(es) not in cidr format
2014-08-04 17:58:47.659 12098 TRACE nova Traceback (most recent call last):
2014-08-04 17:58:47.659 12098 TRACE nova   File "/usr/bin/nova-manage", line 10, in <module>
2014-08-04 17:58:47.659 12098 TRACE nova     sys.exit(main())
2014-08-04 17:58:47.659 12098 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1374, in main
2014-08-04 17:58:47.659 12098 TRACE nova     ret = fn(*fn_args, **fn_kwargs)
2014-08-04 17:58:47.659 12098 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 439, in create
2014-08-04 17:58:47.659 12098 TRACE nova     for address in self.address_to_hosts(ip_range))
2014-08-04 17:58:47.659 12098 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 415, in address_to_hosts
2014-08-04 17:58:47.659 12098 TRACE nova     raise exception.InvalidInput(reason=reason)
2014-08-04 17:58:47.659 12098 TRACE nova InvalidInput: Invalid input received: /31 should be specified as single address(es) not in cidr format
2014-08-04 17:58:47.659 12098 TRACE nova
2014-08-04 17:58:50.061 12160 INFO nova.network.driver [-] Loading network driver 'nova.network.linux_net'
2014-08-04 17:58:50.174 12160 CRITICAL nova [req-b98b34c9-681a-48f1-8584-e272498d726b None None] ValueError: The network range is not big enough to fit 6 networks. Network size is 256
2014-08-04 17:58:50.174 12160 TRACE nova Traceback (most recent call last):
2014-08-04 17:58:50.174 12160 TRACE nova   File "/usr/bin/nova-manage", line 10, in <module>
2014-08-04 17:58:50.174 12160 TRACE nova     sys.exit(main())
2014-08-04 17:58:50.174 12160 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1374, in main
2014-08-04 17:58:50.174 12160 TRACE nova     ret = fn(*fn_args, **fn_kwargs)
2014-08-04 17:58:50.174 12160 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 528, in create
2014-08-04 17:58:50.174 12160 TRACE nova     net_manager.create_networks(context.get_admin_context(), **kwargs)
2014-08-04 17:58:50.174 12160 TRACE nova   File "/usr/lib/python2.7/site-packages/nova/network/manager.py", line 1860, in create_networks
2014-08-04 17:58:50.174 12160 TRACE nova     'size is %(network_size)s') % kwargs)
2014-08-04 17:58:50.174 12160 TRACE nova ValueError: The network range is not big enough to fit 6 networks. Network size is 256
2014-08-04 17:58:50.174 12160 TRACE nova

/var/log/nova/nova-manage.log
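Note that the second traceback is a sizing failure rather than a formatting one: the 10:15 VLAN range yields 6 networks, and with a network size of 256 (as reported in the log) the fixed range has to hold at least 6 * 256 = 1536 addresses. A /24 only holds 256, so it cannot fit 6 networks; the smallest prefix that fits is a /21 (2048 addresses). A quick back-of-the-envelope check:

  import math

  num_networks, network_size = 6, 256
  needed = num_networks * network_size          # 1536 addresses
  prefix = 32 - math.ceil(math.log2(needed))    # smallest prefix that fits
  print(needed, "-> /%d" % prefix)              # 1536 -> /21

So the original 192.168.100.0/21 tenant range was big enough for the 6 VLANs, while a /24 is not, which matches the failure reported in the next comment.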
Failed again with: external network: 10.8.30.100/24, private IP range for tenant networks: 192.168.100.0/24
Omri, did you install it with 2 compute nodes?
Sasha, it was one compute node in my environment.
This cannot be fixed in puppet; it is a race condition that can occur when both compute nodes try to create the network at almost exactly the same time. My opinion is that the compute nodes should be orchestrated by staypuft so that they do not run concurrently. Note that this is only the case for the first two computes (see the timeline sketched below): after one node is configured, you could configure as many more as you wanted simultaneously.
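A hypothetical timeline of the race, assuming each node's puppet run amounts to a check-then-create sequence:

  Node A: nova-manage network list    -> no novanetwork, so plans a create
  Node B: nova-manage network list    -> still no novanetwork, also plans a create
  Node A: nova-manage network create novanetwork ...  -> succeeds
  Node B: nova-manage network create novanetwork ...  -> fails, network now exists

This would also explain why simply resuming the deployment later succeeds: by then the network already exists, the check is satisfied, and no second create is attempted.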
It failed for me with: Floating IP range for external network: 10.35.117.100/21 and Fixed IP range for tenant networks: 192.168.100.0/22
Reproduced with rhelosp-installer-live-6.5-20140818.3.iso
Resuming the deployment after a subsequent puppet run turns the paused-with-errors deployment into a successful one.
I wish it would pause, but it goes all the way to a successful deployment, not allowing me to resume a failed run. I am basically stuck.
*** Bug 1127766 has been marked as a duplicate of this bug. ***
PR is here: https://github.com/theforeman/staypuft/pull/273

The agreed-upon solution for now is that the first compute node will deploy, then the remainder will deploy in parallel. Once we're using PuppetSSH, this can be refactored so that provisioning happens entirely in parallel and only the puppet runs are staged this way: one node completes, then the remaining compute nodes run puppet in parallel.
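A minimal sketch of that ordering (illustrative Python, not the actual staypuft orchestration; deploy_node is a hypothetical helper that provisions and runs puppet on one host):

  from concurrent.futures import ThreadPoolExecutor

  def deploy_computes(computes, deploy_node):
      # The first compute runs alone, so it is the only one creating
      # the nova network; the rest can then safely run in parallel.
      deploy_node(computes[0])
      with ThreadPoolExecutor() as pool:
          list(pool.map(deploy_node, computes[1:]))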
Verified:
rhel-osp-installer-0.1.10-2.el6ost.noarch
openstack-foreman-installer-2.0.22-1.el6ost.noarch
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el6ost.noarch
openstack-puppet-modules-2014.1-21.7.el6ost.noarch

The issue didn't reproduce; the deployment completed successfully.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1138.html