Bug 1642588 - Director does not start openvswitch on overcloud nodes
Summary: Director does not start openvswitch on overcloud nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: high
Target Milestone: z8
Target Release: 13.0 (Queens)
Assignee: Brent Eagles
QA Contact: Candido Campos
URL:
Whiteboard:
Duplicates: 1645554
Depends On:
Blocks:
 
Reported: 2018-10-24 17:57 UTC by Lars Kellogg-Stedman
Modified: 2023-09-07 19:28 UTC (History)
CC List: 12 users

Fixed In Version: openstack-tripleo-heat-templates-8.3.1-76.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 16:55:32 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Status  Summary                                      Last Updated
Launchpad               1804264         None    None                                         2019-06-04 16:29:52 UTC
OpenStack gerrit        663989          MERGED  Start/enable OVS on neutron ovs agent nodes  2020-10-15 23:08:42 UTC
Red Hat Product Errata  RHBA-2019:2624  None    None                                         2019-09-03 16:55:51 UTC

Description Lars Kellogg-Stedman 2018-10-24 17:57:19 UTC
Description of problem:

When performing a split stack install, Director does not start the openvswitch service on overcloud nodes.  This bug is masked on Director-provisioned install because the service is enabled by default on our overcloud images.

Version-Release number of selected component (if applicable):

$ rpm -qa | grep tripleo
openstack-tripleo-image-elements-8.0.1-1.el7ost.noarch
puppet-tripleo-8.3.4-5.el7ost.noarch
openstack-tripleo-common-containers-8.6.3-13.el7ost.noarch
openstack-tripleo-ui-8.3.2-1.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.1-1.el7ost.noarch
openstack-tripleo-common-8.6.3-13.el7ost.noarch
openstack-tripleo-heat-templates-8.0.4-20.el7ost.noarch
ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch
openstack-tripleo-validations-8.4.2-1.el7ost.noarch
python-tripleoclient-9.2.3-4.el7ost.noarch


Additional info:

See also #1642587, in which Director reports a successful deployment even though it failed to start openvswitch.

Comment 1 Lars Kellogg-Stedman 2018-10-24 23:06:17 UTC
Manually installing openvswitch and starting the service prior to the deploy allowed a deploy to complete successfully and configure openvswitch on the overcloud hosts.
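
A minimal sketch of that pre-deploy workaround, run on each pre-provisioned overcloud node before launching the deploy (assumes a repository providing the openvswitch package is already enabled on the node; this is the reporter's workaround, not the supported fix):

```shell
# Install and start openvswitch on the overcloud node before deploying.
yum install -y openvswitch
systemctl enable --now openvswitch

# Sanity check: the service should now report "active".
systemctl is-active openvswitch
```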

Comment 4 Brian Haley 2018-12-13 16:34:45 UTC
*** Bug 1645554 has been marked as a duplicate of this bug. ***

Comment 6 Aviv Guetta 2019-06-02 08:02:40 UTC
Can we have a status update on this bug?

Comment 9 Brent Eagles 2019-06-07 16:13:50 UTC
With https://review.opendev.org/#/c/663989/ the deployment will take care of enabling and starting openvswitch on nodes where the neutron ovs agent is deployed.

Comment 23 errata-xmlrpc 2019-09-03 16:55:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2624

Comment 24 Lars Kellogg-Stedman 2020-02-21 19:43:11 UTC
It looks like we're still hitting this with openstack-tripleo-heat-templates-8.4.1-16.el7ost.noarch. Our OSP 13 deploy just completed, but on newly added compute nodes we see:

[root@neu-17-2-stackcomp ~]# docker ps -f name=neutron
CONTAINER ID        IMAGE                                                                  COMMAND                  CREATED             STATUS                         PORTS               NAMES
82f4612e35b1        172.16.0.5:8787/rhosp13/openstack-neutron-openvswitch-agent:13.0-105   "dumb-init --singl..."   31 minutes ago      Restarting (1) 3 minutes ago                       neutron_ovs_agent

And looking at the logs:

[root@neu-17-2-stackcomp ~]# docker logs neutron_ovs_agent 2>&1 | tail -10
    raise self.last_attempt.result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/tenacity/__init__.py", line 298, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/native/connection.py", line 67, in do_get_schema_helper
    return idlutils.get_schema_helper(conn, schema_name)
  File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 128, in get_schema_helper
    'err': os.strerror(err)})
Exception: Could not retrieve schema from tcp:127.0.0.1:6640: Connection refused

And on the host:

[root@neu-17-2-stackcomp ~]# systemctl is-active openvswitch
inactive
[root@neu-17-2-stackcomp ~]# systemctl is-enabled openvswitch
disabled
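
Given the output above, a recovery sketch for an affected node (service and container names as shown in the transcript; the agent is crash-looping because ovsdb-server is not listening on tcp:127.0.0.1:6640 while openvswitch is down):

```shell
# Bring up OVS, then restart the stuck agent container.
systemctl enable --now openvswitch
systemctl is-active openvswitch          # expect "active"
docker restart neutron_ovs_agent

# The agent should stop restarting once it can reach ovsdb-server.
docker logs neutron_ovs_agent 2>&1 | tail -5
```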

It looks like this hasn't been resolved (or there has been some sort of regression).

Comment 25 Bernard Cafarelli 2020-02-25 16:20:04 UTC
openstack-tripleo-heat-templates-8.4.1 is based on the final upstream Queens release and includes the fix mentioned here:
https://review.opendev.org/#/c/663989/

You can confirm by checking that docker/services/neutron-ovs-agent.yaml in THT has this section:
      host_prep_tasks:
        list_concat:
          - {get_attr: [NeutronLogging, host_prep_tasks]}
          -
            - name: ensure openvswitch service is enabled
              service:
                name: openvswitch
                state: started
                enabled: yes

If I read this correctly, it should cover all the cases here. So this is probably a different issue, but I will defer to Brent's opinion.
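
One way to confirm the installed templates actually carry that host_prep_tasks block (the path below is the default THT install location on an OSP 13 undercloud; adjust if you deploy from a local copy):

```shell
# On the undercloud: print the host_prep_tasks section of the OVS agent
# service template and verify it enables/starts openvswitch.
grep -A 8 'host_prep_tasks:' \
  /usr/share/openstack-tripleo-heat-templates/docker/services/neutron-ovs-agent.yaml
```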

Comment 26 Brent Eagles 2020-10-09 14:00:19 UTC
I think the patches should have this covered. Is this still happening Lars?

Comment 27 Lars Kellogg-Stedman 2020-10-09 16:13:39 UTC
Brent,

Thanks for checking in. Since this only crops up during initial installs, it's hard to tell. I'm not going to have the chance to try this out myself for a while, so if folks want to go ahead and close this ticket that's fine with me.

Comment 28 Lars Kellogg-Stedman 2020-10-09 16:15:22 UTC
Well, I guess it's already closed :)

