Bug 1230438
Field | Value
---|---
Summary | Neutron cannot run root commands
Product | Red Hat OpenStack
Component | openstack-neutron
Status | CLOSED ERRATA
Severity | urgent
Priority | urgent
Version | 7.0 (Kilo)
Target Milestone | ga
Target Release | 7.0 (Kilo)
Hardware | Unspecified
OS | Unspecified
Reporter | Ronelle Landy <rlandy>
Assignee | Jakub Libosvar <jlibosva>
QA Contact | Eran Kuris <ekuris>
CC | amuller, chrisw, dsneddon, jlibosva, lmartins, mandreou, mburns, mcornea, nyechiel, rhel-osp-director-maint, sasha, yeylon
Keywords | Automation, Regression
Fixed In Version | openstack-neutron-2015.1.0-7.el7ost
Doc Type | Bug Fix
Clones | 1230900
Last Closed | 2015-08-05 13:25:57 UTC
Type | Bug
Bug Blocks | 1228096, 1230900

Description
Ronelle Landy
2015-06-10 22:24:32 UTC
Added the changes from https://review.gerrithub.io/#/c/235148/ to the install.

I tried the same thing with a very basic configuration: RHOS on virt with Delorean trunk, all default settings. The overcloud instances aren't getting PXE booted during deployment (but discovery works).

I took a closer look at the instack host when deploying. Here is what I found:

Discovery works without a hitch. During deployment, DHCP requests are seen on the instack host, but no DHCP offer or PXE boot info is sent by the host.

==========
[stack@instack ~]$ find /tftpboot/
/tftpboot/
/tftpboot/token-1ecb36a1-9cad-4690-8aaa-e1e1fcb6e864
/tftpboot/token-bff29ba1-a116-4061-a360-285191122f8d
/tftpboot/pxelinux.0
/tftpboot/master_images
/tftpboot/master_images/3a3a7468-a2eb-4666-b44f-2eb99609295c
/tftpboot/master_images/aa7a69a1-b59b-4b79-b316-ea2c854e1414
/tftpboot/master_images/00d07ccc-ee89-45d8-b00e-8426e9925c7a
/tftpboot/master_images/8a6244bf-1956-44c8-bef4-ca7a6105234a
/tftpboot/undionly.kpxe
/tftpboot/map-file
/tftpboot/pxelinux.cfg

(the pxelinux.cfg is an empty directory).
==========
[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| bff29ba1-a116-4061-a360-285191122f8d | None | 2bde80ea-5cf3-4101-9f47-79c75f050246 | power on    | wait call-back  | False       |
| 1ecb36a1-9cad-4690-8aaa-e1e1fcb6e864 | None | fc378bb1-3c50-405f-a78b-beea2f4580c1 | power on    | wait call-back  | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
==========
[stack@instack ~]$ cat /etc/ironic-discoverd/dnsmasq.conf
port=0
interface=br-ctlplane
bind-interfaces
dhcp-range=192.0.2.100,192.0.2.120,29
enable-tftp
tftp-root=/tftpboot
dhcp-match=ipxe,175
dhcp-boot=tag:!ipxe,undionly.kpxe,localhost.localdomain,192.0.2.1
dhcp-boot=tag:ipxe,http://192.0.2.1:8088/discoverd.ipxe
==========

In a possibly related twist, I just tried deploying the latest upstream TripleO devtest, and I got the same behavior when I tried to deploy. The instances did not get DHCP or PXE boot.

Ironic error message:

Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6

Yeah, came here to say that ^^^. Poking around on Dan's setup, this is the only thing I could quickly see. The nova computes aren't even spawned.

Jun 10 19:04:43 instack.localdomain ironic-discoverd[30934]: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6
Jun 10 19:04:54 instack.localdomain ironic-discoverd[30934]: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:1a:89:60:d1:b5 already exists. (HTTP 409). Attempt 6 of 6

So in any case it isn't (like other recent issues) to do with overcloud config/heat/puppet.

So actually this isn't an error:

09:12 < dsneddon_zzz> lucasagomes, dtantsur: Error from my deployment: Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 already exists. (HTTP 409). Attempt 6 of 6
09:13 < lucasagomes> dsneddon_zzz, this looks like discoverd is trying to create a port resource in ironic with a macaddress that is already registered
09:13 < dtantsur> damn ironicclient, how to silence your faults? >_<
09:13 < dtantsur> dsneddon_zzz, that's not an error. ignore it.
09:14 < dtantsur> the problem is that I don't know the way to tell ironicclient "error is ok, don't report it as ERROR in logs"...
09:14 < dsneddon_zzz> marios, ^

Created attachment 1037610
some logs from the failed virt setup
Some assorted logs from the failing setup. There are a few different things going on, especially an auth issue (nova-api.log: keystonemiddleware.auth_token [-] Authorization failed for token). The neutron logs are towards the end. It seems the DHCP agent is having a permissions issue (see ./dhcp-agent.log: 2015-06-11 05:39:54.472 9237 ERROR neutron.agent.dhcp.agent [-] Unable to enable dhcp for 693ed227-df2c-423e-a1d3-0374db845c48). Hopefully this helps someone.
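When reading logs like the ones quoted in this bug, it can help to set aside the ironicclient "port already exists (HTTP 409)" lines, which dtantsur described as not a real error, so that the neutron DHCP agent failure stands out. The following is a minimal sketch of that triage in Python. The function name `triage` and the regular expression are illustrative assumptions, not part of any OpenStack tool, and the sample lines are copied from the comments above.

```python
import re

# Assumption: this pattern only covers the benign message quoted in this bug.
BENIGN_409 = re.compile(
    r"A port with MAC address [0-9a-f:]{17} already exists\. \(HTTP 409\)"
)


def triage(lines):
    """Split ERROR lines into (benign, suspect) lists.

    Lines without the word ERROR are ignored. A line is treated as benign
    only if it matches the ironicclient 409 message discussed on IRC.
    """
    benign, suspect = [], []
    for line in lines:
        if "ERROR" not in line:
            continue
        if BENIGN_409.search(line):
            benign.append(line)
        else:
            suspect.append(line)
    return benign, suspect


sample = [
    "Jun 10 19:04:43 instack ironic-discoverd: ERROR:ironicclient.common.http:"
    "Error contacting Ironic server: A port with MAC address 00:5f:08:71:32:13 "
    "already exists. (HTTP 409). Attempt 6 of 6",
    "2015-06-11 05:39:54.472 9237 ERROR neutron.agent.dhcp.agent [-] "
    "Unable to enable dhcp for 693ed227-df2c-423e-a1d3-0374db845c48",
    "Jun 11 06:06:30 instack.localdomain systemd[1]: Started OpenStack Neutron DHCP Agent.",
]

benign, suspect = triage(sample)
for line in suspect:
    print(line)
```

With the sample above, only the neutron.agent.dhcp.agent line is left in `suspect`, which is the one that points at the actual problem in this report.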
Some more context. I could only get the neutron DHCP agent to come up clean after I ran setenforce 0. I then tried the deploy, but eventually the neutron DHCP agent has auth issues again (still with setenforce 0):

[stack@instack ~]$ sudo service neutron-dhcp-agent status -l
Redirecting to /bin/systemctl status -l neutron-dhcp-agent.service
neutron-dhcp-agent.service - OpenStack Neutron DHCP Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-dhcp-agent.service; enabled)
   Active: active (running) since Thu 2015-06-11 06:06:30 EDT; 28min ago
 Main PID: 4923 (neutron-dhcp-ag)
   CGroup: /system.slice/neutron-dhcp-agent.service
           └─4923 /usr/bin/python2 /usr/bin/neutron-dhcp-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-dhcp-agent --log-file /var/log/neutron/dhcp-agent.log

Jun 11 06:06:30 instack.localdomain systemd[1]: Starting OpenStack Neutron DHCP Agent...
Jun 11 06:06:30 instack.localdomain systemd[1]: Started OpenStack Neutron DHCP Agent.
Jun 11 06:34:19 instack.localdomain sudo[9147]: pam_unix(sudo:auth): conversation failed
Jun 11 06:34:19 instack.localdomain sudo[9147]: pam_unix(sudo:auth): auth could not identify password for [neutron]

Hi,

So the problem is permission related. I noticed that the neutron-openvswitch-agent failed to start with [1], and neutron-dhcp-agent was started but the dnsmasq process couldn't be spawned due to similar problems [2].

Talking to ajo on IRC, he pointed out that the rootwrap-daemon was recently introduced and is probably not working [3]. Also in [3], he suggested having an open rule in /etc/sudoers for the neutron user. That worked for me: I could start neutron-openvswitch-agent, and neutron-dhcp-agent could start the dnsmasq process.

So, *as a workaround*:

1) Edit /etc/sudoers and add an entry at the end like:

neutron ALL=(ALL) NOPASSWD: ALL

2) Restart the neutron services:

$ sudo systemctl restart neutron-dhcp-agent neutron-server neutron-openvswitch-agent.service

[1] http://paste.openstack.org/show/284064/
[2] http://paste.openstack.org/show/283919/
[3] http://paste.openstack.org/show/284125/

Cheers,
Lucas

Hi,

Sorry, one more thing. SELinux is in permissive mode. So the workaround is:

1) Edit /etc/sudoers and add an entry at the end like:

neutron ALL=(ALL) NOPASSWD: ALL

2) Put SELinux in permissive mode:

$ sudo setenforce 0

3) Restart the neutron services:

$ sudo systemctl restart neutron-dhcp-agent neutron-server neutron-openvswitch-agent.service

...

Creating an SELinux rule from the audit logs shows me:

cat neutron.te

module neutron 1.0;

require {
	type neutron_t;
	type chkpwd_exec_t;
	type sudo_db_t;
	type shadow_t;
	type sendmail_exec_t;
	class dir { getattr create add_name };
	class file { execute read execute_no_trans getattr open };
}

#============= neutron_t ==============
allow neutron_t chkpwd_exec_t:file { read execute open execute_no_trans };
allow neutron_t sendmail_exec_t:file execute;
allow neutron_t shadow_t:file { read getattr open };
allow neutron_t sudo_db_t:dir { getattr create add_name };

So, as Terry Wilson found out, there is a wrong rule in /etc/sudoers.d/neutron. The rule

neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

should be without an asterisk at the end.

Another bug is the SELinux issue: we need to add rules for accessing sudodb for rootwrap-daemon.

Verified - installation and deployment successful

Connection to 192.0.2.7 closed.
Overcloud Endpoint: http://192.0.2.7:5000/v2.0/
Overcloud Deployed

[stack@instack ~]$ rpm -qa |grep openstack-neutron
openstack-neutron-common-2015.1.0-10.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-10.el7ost.noarch
openstack-neutron-2015.1.0-10.el7ost.noarch
openstack-neutron-ml2-2015.1.0-10.el7ost.noarch

Tested on: RHEL-OSP director puddle 7.0 RC - 2015-06-29.1

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1548
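For readers wondering why a trailing asterisk in the rootwrap-daemon sudoers rule would matter, here is a simplified sketch in Python. It assumes that sudo compares the space-joined command-line arguments against the rule's argument pattern using shell-style wildcard matching, and it uses Python's `fnmatch` as a stand-in for that comparison. The helper name `args_allowed` is hypothetical, and real sudo matching has more cases than this model covers.

```python
from fnmatch import fnmatchcase

CONF = "/etc/neutron/rootwrap.conf"


def args_allowed(rule_args: str, actual_args: str) -> bool:
    """Simplified model of sudoers argument matching (an assumption).

    rule_args is the argument part of the sudoers rule, actual_args is the
    space-joined argument string of the command being run.
    """
    return fnmatchcase(actual_args, rule_args)


# The daemon is started with only the config file as its argument.
daemon_args = CONF

# Rule with a trailing " *": the pattern requires a space and further text
# after the config path, so the bare daemon invocation does not match.
with_star = args_allowed(CONF + " *", daemon_args)

# Rule without the asterisk: the argument string matches exactly.
without_star = args_allowed(CONF, daemon_args)

# For comparison, a classic rootwrap call passes a command after the config
# path, which is why a trailing " *" works there.
classic = args_allowed(CONF + " *", CONF + " ip netns list")

print(with_star, without_star, classic)
```

Under this model, a rule ending in an asterisk would not cover the daemon invocation, so sudo would fall back to asking the neutron user for a password, which is consistent with the pam_unix "conversation failed" messages quoted earlier in this report.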