Bug 1230438 - Neutron cannot run root commands
Summary: Neutron cannot run root commands
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ga
Target Release: 7.0 (Kilo)
Assignee: Jakub Libosvar
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks: 1228096 1230900
 
Reported: 2015-06-10 22:24 UTC by Ronelle Landy
Modified: 2023-02-22 23:02 UTC (History)

Fixed In Version: openstack-neutron-2015.1.0-7.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1230900 (view as bug list)
Environment:
Last Closed: 2015-08-05 13:25:57 UTC
Target Upstream Version:
Embargoed:


Attachments
some logs from the failed virt setup (47.33 KB, text/plain)
2015-06-11 09:51 UTC, Marios Andreou


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2015:1548 0 normal SHIPPED_LIVE Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory 2015-08-05 17:07:06 UTC

Description Ronelle Landy 2015-06-10 22:24:32 UTC
Description of problem:

Deploying the overcloud from a virt install of the bits from the latest poodle fails: the overcloud nodes don't PXE boot and show "No bootable device".

As a side note, I was trying an HA deployment with one compute node and three controller nodes.


Version-Release number of selected component (if applicable):

[stack@instack ~]$ rpm -qa | grep openstack
openstack-nova-console-2015.1.0-7.el7ost.noarch
openstack-neutron-2015.1.0-6.el7ost.noarch
openstack-ironic-conductor-2015.1.0-4.el7ost.noarch
openstack-ceilometer-alarm-2015.1.0-2.el7ost.noarch
openstack-swift-account-2.3.0-1.el7ost.noarch
openstack-tuskar-ui-0.3.0-2.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-3.el7ost.noarch
openstack-ceilometer-notification-2015.1.0-2.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-6.el7ost.noarch
openstack-nova-api-2015.1.0-7.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-1.el7ost.noarch
python-openstackclient-1.0.3-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-3.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-2.el7ost.noarch
openstack-swift-object-2.3.0-1.el7ost.noarch
openstack-tripleo-0.0.6-0.1.git812abe0.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-nova-common-2015.1.0-7.el7ost.noarch
openstack-heat-common-2015.1.0-3.el7ost.noarch
openstack-tuskar-0.4.18-2.el7ost.noarch
python-django-openstack-auth-1.2.0-2.el7ost.noarch
openstack-dashboard-theme-2015.1.0-9.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-2.el7ost.noarch
openstack-tuskar-ui-extras-0.0.3-3.el7ost.noarch
openstack-tempest-kilo-20150507.2.el7ost.noarch
openstack-swift-2.3.0-1.el7ost.noarch
openstack-neutron-ml2-2015.1.0-6.el7ost.noarch
openstack-nova-novncproxy-2015.1.0-7.el7ost.noarch
openstack-keystone-2015.1.0-1.el7ost.noarch
openstack-swift-plugin-swift3-1.7-3.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-0.git49b57eb.el7ost.noarch
openstack-neutron-common-2015.1.0-6.el7ost.noarch
openstack-heat-engine-2015.1.0-3.el7ost.noarch
openstack-ceilometer-common-2015.1.0-2.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-3.el7ost.noarch
openstack-ceilometer-api-2015.1.0-2.el7ost.noarch
openstack-ironic-api-2015.1.0-4.el7ost.noarch
openstack-swift-proxy-2.3.0-1.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-ceilometer-collector-2015.1.0-2.el7ost.noarch
openstack-ironic-common-2015.1.0-4.el7ost.noarch
openstack-selinux-0.6.31-1.el7ost.noarch
openstack-nova-compute-2015.1.0-7.el7ost.noarch
openstack-nova-conductor-2015.1.0-7.el7ost.noarch
openstack-swift-container-2.3.0-1.el7ost.noarch
redhat-access-plugin-openstack-7.0.0-0.el7ost.noarch
openstack-glance-2015.1.0-6.el7ost.noarch
openstack-heat-api-2015.1.0-3.el7ost.noarch
openstack-ceilometer-central-2015.1.0-2.el7ost.noarch
openstack-puppet-modules-2015.1.3-3.el7ost.noarch
openstack-nova-scheduler-2015.1.0-7.el7ost.noarch
openstack-nova-cert-2015.1.0-7.el7ost.noarch
openstack-dashboard-2015.1.0-9.el7ost.noarch


How reproducible:
Always - confirmed by dsneddon on his install

Steps to Reproduce:
1. Install the virt host and undercloud with bits from the latest poodle (06/10/2015)
2. Register nodes and deploy the overcloud: instack-deploy-overcloud --tuskar
3. See error on overcloud nodes consoles

Actual results:
Overcloud nodes don't PXE boot; the deploy is expected to time out and fail.

Expected results:
Successful overcloud deployment

Additional info:

Comment 3 Ronelle Landy 2015-06-10 22:27:24 UTC
Added the changes from https://review.gerrithub.io/#/c/235148/ to the install.

Comment 4 Dan Sneddon 2015-06-10 22:29:33 UTC
I tried the same thing with a very basic configuration. RHOS on virt with Delorean trunk, all default settings. The overcloud instances aren't getting PXE booted during deployment (but discovery works).

Comment 5 Dan Sneddon 2015-06-10 23:27:20 UTC
I took a closer look at the instack host when deploying; here is what I found:

Discovery works without a hitch.

During deployment, DHCP requests are seen on the instack host, but no DHCP offer or PXE boot info is sent by the host.

==========

[stack@instack ~]$ find /tftpboot/
/tftpboot/
/tftpboot/token-1ecb36a1-9cad-4690-8aaa-e1e1fcb6e864
/tftpboot/token-bff29ba1-a116-4061-a360-285191122f8d
/tftpboot/pxelinux.0
/tftpboot/master_images
/tftpboot/master_images/3a3a7468-a2eb-4666-b44f-2eb99609295c
/tftpboot/master_images/aa7a69a1-b59b-4b79-b316-ea2c854e1414
/tftpboot/master_images/00d07ccc-ee89-45d8-b00e-8426e9925c7a
/tftpboot/master_images/8a6244bf-1956-44c8-bef4-ca7a6105234a
/tftpboot/undionly.kpxe
/tftpboot/map-file
/tftpboot/pxelinux.cfg

(the pxelinux.cfg is an empty directory).

==========

[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| bff29ba1-a116-4061-a360-285191122f8d | None | 2bde80ea-5cf3-4101-9f47-79c75f050246 | power on    | wait call-back  | False       |
| 1ecb36a1-9cad-4690-8aaa-e1e1fcb6e864 | None | fc378bb1-3c50-405f-a78b-beea2f4580c1 | power on    | wait call-back  | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+

==========

[stack@instack ~]$ cat /etc/ironic-discoverd/dnsmasq.conf 
port=0
interface=br-ctlplane
bind-interfaces
dhcp-range=192.0.2.100,192.0.2.120,29
enable-tftp
tftp-root=/tftpboot
dhcp-match=ipxe,175
dhcp-boot=tag:!ipxe,undionly.kpxe,localhost.localdomain,192.0.2.1
dhcp-boot=tag:ipxe,http://192.0.2.1:8088/discoverd.ipxe

==========

Comment 6 Dan Sneddon 2015-06-10 23:57:04 UTC
In a possibly related twist, I just tried deploying the latest upstream TripleO devtest, and I got the same behavior when I tried to deploy. The instances did not get DHCP or PXE boot.

Comment 7 Dan Sneddon 2015-06-11 09:11:34 UTC
Ironic error message:
Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6

Comment 8 Marios Andreou 2015-06-11 09:15:08 UTC
Yeah, came here to say that ^^^. Poking around on Dan's setup, this is the only thing I could quickly see. The nova computes aren't even spawned.


Jun 10 19:04:43 instack.localdomain ironic-discoverd[30934]: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6
Jun 10 19:04:54 instack.localdomain ironic-discoverd[30934]: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:1a:89:60:d1:b5 already exists. (HTTP 409). Attempt 6 of 6


So in any case, unlike other recent issues, this isn't to do with overcloud config/heat/puppet.

Comment 9 Marios Andreou 2015-06-11 09:16:49 UTC
so actually this isn't an error:

09:12 < dsneddon_zzz> lucasagomes, dtantsur: Error from my deployment: Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:Error contacting Ironic 
                      server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6
09:13 < lucasagomes> dsneddon_zzz, this looks like discoverd is trying to create a port resource in ironic with a macaddress that is already registered
09:13 < dtantsur> damn ironicclient, how to silence your faults? >_<
09:13 < dtantsur> dsneddon_zzz, that's not an error. ignore it.
09:14 < dtantsur> the problem is that I don't know the way to tell ironicclient "error is ok, don't report it as ERROR in logs"...
09:14 < dsneddon_zzz> marios, ^
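
Since the "already exists (HTTP 409)" message is logged at ERROR level but is harmless per dtantsur, it can help to separate it from the errors that matter when reading discoverd logs. A minimal Python sketch, assuming the log format quoted in this bug; the function name and the regular expression are illustrative and not part of ironic-discoverd or ironicclient:

```python
import re

# Pattern derived from the log lines quoted in this bug (an assumption,
# not an official ironicclient message format).
_PORT_EXISTS_409 = re.compile(
    r"A port with MAC address (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"already exists\. \(HTTP 409\)",
    re.IGNORECASE,
)


def split_discoverd_errors(lines):
    """Split ERROR lines into (MACs from benign 409s, all other ERROR lines)."""
    benign_macs = []
    other_errors = []
    for line in lines:
        if "ERROR" not in line:
            continue  # only ERROR-level lines are of interest here
        match = _PORT_EXISTS_409.search(line)
        if match:
            benign_macs.append(match.group("mac").lower())
        else:
            other_errors.append(line)
    return benign_macs, other_errors


if __name__ == "__main__":
    sample = [
        "Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:"
        "Error contacting Ironic server: A port with MAC address "
        "00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6",
    ]
    print(split_discoverd_errors(sample))
```

Anything left in the second list is what still needs a look.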

Comment 10 Marios Andreou 2015-06-11 09:51:31 UTC
Created attachment 1037610
some logs from the failed virt setup

Some assorted logs from the failing setup. There are a few different things going on, especially an auth issue (nova-api.log: keystonemiddleware.auth_token [-] Authorization failed for token). The neutron logs are towards the end; it seems the DHCP agent is having a permissions issue (see ./dhcp-agent.log:2015-06-11 05:39:54.472 9237 ERROR neutron.agent.dhcp.agent [-] Unable to enable dhcp for 693ed227-df2c-423e-a1d3-0374db845c48). Hopefully this helps someone.

Comment 11 Marios Andreou 2015-06-11 10:59:10 UTC
Some more context: I could only get the neutron DHCP agent to come up clean after running setenforce 0. I then tried the deploy, but eventually the neutron DHCP agent had auth issues again (still with setenforce 0):

[stack@instack ~]$ sudo service neutron-dhcp-agent status -l
Redirecting to /bin/systemctl status  -l neutron-dhcp-agent.service
neutron-dhcp-agent.service - OpenStack Neutron DHCP Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-dhcp-agent.service; enabled)
   Active: active (running) since Thu 2015-06-11 06:06:30 EDT; 28min ago
 Main PID: 4923 (neutron-dhcp-ag)
   CGroup: /system.slice/neutron-dhcp-agent.service
           └─4923 /usr/bin/python2 /usr/bin/neutron-dhcp-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-dhcp-agent --log-file /var/log/neutron/dhcp-agent.log

Jun 11 06:06:30 instack.localdomain systemd[1]: Starting OpenStack Neutron DHCP Agent...
Jun 11 06:06:30 instack.localdomain systemd[1]: Started OpenStack Neutron DHCP Agent.
Jun 11 06:34:19 instack.localdomain sudo[9147]: pam_unix(sudo:auth): conversation failed
Jun 11 06:34:19 instack.localdomain sudo[9147]: pam_unix(sudo:auth): auth could not identify password for [neutron]
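
The two pam_unix lines above are the symptom to search for: sudo ended up asking for a password for the neutron user, which suggests that no NOPASSWD rule matched the command being run. A minimal Python sketch for pulling the affected users out of journal text follows; the function name and the regular expression are illustrative assumptions based only on the log format shown here:

```python
import re

# Matches lines such as:
#   ... sudo[9147]: pam_unix(sudo:auth): auth could not identify password for [neutron]
# The pattern is an assumption derived from the excerpt above.
_SUDO_AUTH_FAILURE = re.compile(
    r"sudo\[\d+\]: pam_unix\(sudo:auth\): "
    r"auth could not identify password for \[(?P<user>[^\]]+)\]"
)


def users_failing_sudo_auth(log_text):
    """Return a sorted list of users for which sudo could not obtain a password."""
    users = set()
    for line in log_text.splitlines():
        match = _SUDO_AUTH_FAILURE.search(line)
        if match:
            users.add(match.group("user"))
    return sorted(users)


if __name__ == "__main__":
    sample = (
        "Jun 11 06:34:19 instack.localdomain sudo[9147]: "
        "pam_unix(sudo:auth): conversation failed\n"
        "Jun 11 06:34:19 instack.localdomain sudo[9147]: "
        "pam_unix(sudo:auth): auth could not identify password for [neutron]\n"
    )
    print(users_failing_sudo_auth(sample))
```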

Comment 12 Lucas Alvares Gomes 2015-06-11 15:44:53 UTC
Hi, 

So the problem is permission related. I noticed that the neutron-openvswitch-agent failed to start with [1], and neutron-dhcp-agent was started but the dnsmasq process couldn't be spawned due to similar problems [2].

Talking to ajo on IRC, he pointed out that the rootwrap-daemon was recently introduced and is probably not working [3]. Also in [3], he suggested adding an open rule in /etc/sudoers for the neutron user. That worked for me: I could start neutron-openvswitch-agent, and neutron-dhcp-agent could start the dnsmasq process.

So, *as a workaround*:

1) Edit /etc/sudoers and add an entry at the end like:

neutron		ALL=(ALL)	NOPASSWD: ALL

2) Restart the neutron services:

$ sudo systemctl restart neutron-dhcp-agent neutron-server neutron-openvswitch-agent.service

[1] http://paste.openstack.org/show/284064/
[2] http://paste.openstack.org/show/283919/
[3] http://paste.openstack.org/show/284125/

Cheers,
Lucas

Comment 13 Lucas Alvares Gomes 2015-06-11 15:56:39 UTC
Hi,

Sorry, one more thing: SELinux is in permissive mode as well. So the workaround is:

1) Edit /etc/sudoers and add an entry at the end like:

neutron		ALL=(ALL)	NOPASSWD: ALL

2) Put selinux in permissive mode:

$ sudo setenforce 0

3) Restart the neutron services:

$ sudo systemctl restart neutron-dhcp-agent neutron-server neutron-openvswitch-agent.service

...

Creating an SELinux rule from the audit logs gives me:

cat neutron.te 

module neutron 1.0;

require {
	type neutron_t;
	type chkpwd_exec_t;
	type sudo_db_t;
	type shadow_t;
	type sendmail_exec_t;
	class dir { getattr create add_name };
	class file { execute read execute_no_trans getattr open };
}

#============= neutron_t ==============
allow neutron_t chkpwd_exec_t:file { read execute open execute_no_trans };
allow neutron_t sendmail_exec_t:file execute;
allow neutron_t shadow_t:file { read getattr open };
allow neutron_t sudo_db_t:dir { getattr create add_name };

Comment 14 Jakub Libosvar 2015-06-11 17:03:46 UTC
As Terry Wilson found out, there is a wrong rule in /etc/sudoers.d/neutron:
neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

It should be without an asterisk at the end.

Another bug is the SELinux issue: we need to add rules for accessing sudodb for rootwrap-daemon.
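
For anyone checking their own install, here is a minimal Python sketch that flags sudoers rules ending in an asterisk, which is the condition this comment calls out. The function name is hypothetical, the check is purely textual (it does not parse sudoers grammar or backslash line continuations), and the sample rule with a trailing asterisk is constructed for illustration rather than quoted from the package:

```python
def rules_ending_with_asterisk(sudoers_text):
    """Return non-comment, non-empty sudoers lines whose last character is '*'."""
    flagged = []
    for raw_line in sudoers_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks, comments, and '#include'-style directives
        if line.endswith("*"):
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    expected_rule = (
        "neutron ALL = (root) NOPASSWD: "
        "/usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"
    )
    # Hypothetical variant with a trailing asterisk, for illustration only.
    suspicious_rule = expected_rule + " *"
    print(rules_ending_with_asterisk(expected_rule + "\n" + suspicious_rule))
```

On a real host the text would come from /etc/sudoers.d/neutron, which needs root to read.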

Comment 16 Eran Kuris 2015-07-01 07:39:33 UTC
Verified - installation and deployment successful 
Connection to 192.0.2.7 closed.
Overcloud Endpoint: http://192.0.2.7:5000/v2.0/
Overcloud Deployed
[stack@instack ~]$ rpm -qa |grep openstack-neutron
openstack-neutron-common-2015.1.0-10.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-10.el7ost.noarch
openstack-neutron-2015.1.0-10.el7ost.noarch
openstack-neutron-ml2-2015.1.0-10.el7ost.noarch

Tested on: RHEL-OSP director puddle 7.0 RC - 2015-06-29.1

Comment 18 errata-xmlrpc 2015-08-05 13:25:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1548

