Bug 1433671 - function does not return correct create router timestamp
Summary: function does not return correct create router timestamp
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: Brian Haley
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks: 1433672
 
Reported: 2017-03-19 07:57 UTC by Eran Kuris
Modified: 2017-05-17 20:09 UTC (History)
8 users

Fixed In Version: openstack-neutron-1:10.0.0-17
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1433672 (view as bug list)
Environment:
Last Closed: 2017-05-17 20:09:03 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1245 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC
Launchpad 1666864 None None None 2017-03-19 07:57:04 UTC
Launchpad 1679794 None None None 2017-04-04 18:16:40 UTC

Description Eran Kuris 2017-03-19 07:57:04 UTC
Description of problem:
Test failed : neutron.tests.tempest.api.test_timestamp.TestTimeStampWithL3.test_show_router_attribute_with_timestamp

At creation the timestamp is correct: u'created_at': u'2017-02-22T10:47:24Z', u'updated_at': u'2017-02-22T10:47:24Z'

When the "show" function displays the timestamps, 'updated_at' differs by ~2 seconds:
u'created_at': u'2017-02-22T10:47:24Z', u'updated_at': u'2017-02-22T10:47:26Z'

The "show" function displays an incorrect timestamp.

    218     def test_show_router_attribute_with_timestamp(self):
    219         router = self.create_router(router_name='test')
    220         import ipdb; ipdb.set_trace()
--> 221         body = self.client.show_router(router['id'])
    222         show_router = body['router']
    223         # verify the timestamp from creation and showed is same
    224         import ipdb; ipdb.set_trace()
    225         self.assertEqual(router['created_at'],
    226                          show_router['created_at'])

ipdb> router
{u'status': u'ACTIVE', u'external_gateway_info': None, u'availability_zone_hints': [], u'availability_zones': [], u'description': u'', u'admin_state_up': False, u'tenant_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'created_at': u'2017-02-22T10:47:24Z', u'updated_at': u'2017-02-22T10:47:24Z', u'flavor_id': None, u'revision_number': 3, u'routes': [], u'project_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'id': u'545e74b0-2f3d-43b8-8678-93bbb3f1f6f3', u'name': u'test'}
ipdb> n
2017-02-22 10:47:55.084 6919 INFO tempest.lib.common.rest_client [req-eef7ded4-bb01-4401-96cd-325c01b2230b ] Request (TestTimeStampWithL3:test_show_router_attribute_with_timestamp): 200 GET http://10.0.0.104:9696/v2.0/routers/545e74b0-2f3d-43b8-8678-93bbb3f1f6f3 0.224s
2017-02-22 10:47:55.086 6919 DEBUG tempest.lib.common.rest_client [req-eef7ded4-bb01-4401-96cd-325c01b2230b ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '462', 'content-location': 'http://10.0.0.104:9696/v2.0/routers/545e74b0-2f3d-43b8-8678-93bbb3f1f6f3', u'date': 'Wed, 22 Feb 2017 10:47:56 GMT', u'content-type': 'application/json', u'connection': 'close', u'x-openstack-request-id': 'req-eef7ded4-bb01-4401-96cd-325c01b2230b'}
        Body: {"router": {"status": "ACTIVE", "external_gateway_info": null, "availability_zone_hints": [], "availability_zones": ["nova"], "description": "", "admin_state_up": false, "tenant_id": "8b9cd1ebd13f4172a0c63789ee9c0de2", "created_at": "2017-02-22T10:47:24Z", "updated_at": "2017-02-22T10:47:26Z", "flavor_id": null, "revision_number": 8, "routes": [], "project_id": "8b9cd1ebd13f4172a0c63789ee9c0de2", "id": "545e74b0-2f3d-43b8-8678-93bbb3f1f6f3", "name": "test"}} _log_request_full tempest/lib/common/rest_client.py:431
> /home/centos/tempest-upstream/neutron/neutron/tests/tempest/api/test_timestamp.py(222)test_show_router_attribute_with_timestamp()

ipdb> router
{u'status': u'ACTIVE', u'external_gateway_info': None, u'availability_zone_hints': [], u'availability_zones': [], u'description': u'', u'admin_state_up': False, u'tenant_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'created_at': u'2017-02-22T10:47:24Z', u'updated_at': u'2017-02-22T10:47:24Z', u'flavor_id': None, u'revision_number': 3, u'routes': [], u'project_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'id': u'545e74b0-2f3d-43b8-8678-93bbb3f1f6f3', u'name': u'test'}
ipdb> show_router
{u'status': u'ACTIVE', u'external_gateway_info': None, u'availability_zone_hints': [], u'availability_zones': [u'nova'], u'description': u'', u'admin_state_up': False, u'tenant_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'created_at': u'2017-02-22T10:47:24Z', u'updated_at': u'2017-02-22T10:47:26Z', u'flavor_id': None, u'revision_number': 8, u'routes': [], u'project_id': u'8b9cd1ebd13f4172a0c63789ee9c0de2', u'id': u'545e74b0-2f3d-43b8-8678-93bbb3f1f6f3', u'name': u'test'}

Version-Release number of selected component (if applicable):
python-neutron-lbaas-10.0.1-0.20170222151526.c6011fb.el7ost.noarch
puppet-neutron-10.3.0-1.el7ost.noarch
openstack-neutron-common-10.0.0-5.el7ost.noarch
openstack-neutron-10.0.0-5.el7ost.noarch
openstack-neutron-openvswitch-10.0.0-5.el7ost.noarch
python-neutronclient-6.1.0-0.20170208193918.1a2820d.el7ost.noarch
python-neutron-10.0.0-5.el7ost.noarch
python-neutron-lib-1.1.0-0.20170213120052.9b3ea8f.el7ost.noarch
openstack-neutron-ml2-10.0.0-5.el7ost.noarch

setup : 
Ocata - 11_director-rhel-7.3-3controllers_2computes-ipv4-gre-lvm-lbaas

How reproducible:
always

Steps to Reproduce:
1. Run the test neutron.tests.tempest.api.test_timestamp.TestTimeStampWithL3.test_show_router_attribute_with_timestamp with tempest.

Actual results:


Expected results:


Additional info:

Comment 2 Eran Kuris 2017-03-28 08:59:56 UTC
Hi Assaf, Would you mind taking a look at my bug?

Comment 3 Assaf Muller 2017-03-30 14:58:03 UTC
(In reply to Eran Kuris from comment #2)
> Hi Assaf, Would you mind taking a look at my bug?

I added this to our OSP 11 list to be triaged on the next team call.

Comment 4 Brian Haley 2017-03-30 18:25:14 UTC
Hi Eran,

You mean the 'updated_at' timestamp is different, correct?  That is +2s on the show from the original.

My initial thought is that this is due to the notification mechanism changes in neutron.  For example, when a router is created, just the base resource is created, then notifiers registered for the "AFTER_CREATE" event are notified.  In some cases, those subscribers can update the resource, e.g. when it's a distributed router, which could update the timestamp.  A number of code paths changed in this regard in the Ocata timeframe to fix some race conditions.

A quick look doesn't reveal any notifications for a "base" router, but I might not have found it yet, and 2 seconds is a long time when the test issues back-to-back calls.  I'll keep looking.
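To illustrate the mechanism described above, here is a minimal sketch of the general AFTER_CREATE pattern.  This is NOT neutron's real callback API (the registry, subscriber, and function names here are all made up for illustration); it only shows how a post-create subscriber can mutate the DB row so that the dict returned by the POST is already stale:

```python
import copy
import datetime

# Toy callback registry -- illustrative only, not neutron's API.
_subscribers = {}

def subscribe(event, fn):
    _subscribers.setdefault(event, []).append(fn)

def notify(event, resource):
    for fn in _subscribers.get(event, []):
        fn(resource)

def create_router(db, name):
    now = datetime.datetime(2017, 2, 22, 10, 47, 24)
    db[name] = {'name': name, 'created_at': now, 'updated_at': now,
                'revision_number': 3}
    snapshot = copy.deepcopy(db[name])   # what the POST response carries
    notify('AFTER_CREATE', db[name])     # subscribers may mutate the DB row
    return snapshot

# Hypothetical subscriber standing in for e.g. HA/DVR scheduling work.
def bump_on_schedule(router):
    router['updated_at'] += datetime.timedelta(seconds=2)
    router['revision_number'] += 1

subscribe('AFTER_CREATE', bump_on_schedule)
db = {}
posted = create_router(db, 'test')
```

After this runs, `posted['updated_at']` lags `db['test']['updated_at']` by 2 seconds even though `created_at` matches, which is exactly the shape of the mismatch seen in the ipdb session above.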

Comment 5 Ihar Hrachyshka 2017-03-30 18:38:05 UTC
An alternative idea is: maybe that's some other component updating the router object in the meantime. Worth a check of neutron-server logs to see if we get any other HTTP/RPC update requests for the resource in those 3 seconds.

Comment 7 Brian Haley 2017-03-31 15:49:46 UTC
I'm still looking into this, I didn't find the neutron-server logs in that CI run, but I did figure out one thing.

The POST to create the router was issued at 23:44:25 but returned at 23:44:28, so it took a couple of seconds.  The GET was issued at 23:44:29 and returned almost immediately.  That timeline seems OK.

I am still thinking there was simply a post-creation operation that caused a change to 'updated_at', which was the failure, not a diff of 'created_at'.  So I'm starting to think the second check in the test is invalid, since this field is most likely going to change right after creation.

Comment 8 Brian Haley 2017-03-31 15:51:56 UTC
Ah, the server log is in the controller tar ball, I'm taking a look at that now.

Comment 9 Eran Kuris 2017-04-03 14:09:18 UTC
(In reply to Brian Haley from comment #8)
> Ah, the server log is in the controller tar ball, I'm taking a look at that
> now.

https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS11/job/qe-DFG-neutron-11_director-rhel-7.3-virthost-3cont_2comp-ipv4-gre-lvm-lbaas/lastCompletedBuild/artifact/

this is the latest run that I saw this issue. 
If you need more details, ping me and I will try to help.

Comment 10 Brian Haley 2017-04-03 16:37:55 UTC
So I was able to match up the logs now, and I was also able to recreate this in a test environment.

The problem is that this test environment has HA enabled, so the create_router() code in l3_hamode_db.py is overriding the base code.  Here's a snippet of that code:

        router_dict = super(L3_HA_NAT_db_mixin,
                            self).create_router(context, router)
        if is_ha:
            try:
                router_db = self._get_router(context, router_dict['id'])

                self.schedule_router(context, router_dict['id'])

                router_dict['ha_vr_id'] = router_db.extra_attributes.ha_vr_id
                router_dict['status'] = self._update_router_db(
                    context, router_dict['id'],
                    {'status': n_const.ROUTER_STATUS_ACTIVE})['status']
                self._notify_router_updated(context, router_db.id,
                                            schedule_routers=False)


Since a vr_id is allocated, the router DB entry is updated with that info right after it was created.  The original POST call is returning what is essentially "stale" data.
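As a hedged toy model of that code path (function names here are illustrative, not neutron's actual methods): the base create returns a dict, then the scheduling/status-update work touches the DB row, so the dict the POST hands back is stale while a later GET reads the fresh row:

```python
import copy
import datetime

def base_create(db, name):
    # base router creation: created_at == updated_at
    now = datetime.datetime(2017, 4, 3, 16, 1, 43)
    db[name] = {'status': 'PENDING', 'created_at': now, 'updated_at': now,
                'revision_number': 3}
    return copy.deepcopy(db[name])

def ha_create(db, name):
    router_dict = base_create(db, name)
    # post-creation work: vr_id allocation, scheduling, status update
    db[name]['status'] = 'ACTIVE'
    db[name]['updated_at'] += datetime.timedelta(seconds=1)
    db[name]['revision_number'] = 5
    router_dict['status'] = db[name]['status']  # only status is refreshed
    return router_dict                          # 'updated_at' is stale

def show(db, name):
    return copy.deepcopy(db[name])              # fresh read, like the GET

db = {}
posted = ha_create(db, 'router3')
shown = show(db, 'router3')
```

The POST result (`posted`) and the GET result (`shown`) agree on 'created_at' but disagree on 'updated_at' and 'revision_number', matching the router3 output below.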

You can see this if you create a router with --ha:

(admin) $ openstack router create --ha --project ae5c9703507546f5801a14ddb124d47e router3

$ openstack router show router3
+-------------------------+--------------------------------------+
| Field                   | Value                                |
+-------------------------+--------------------------------------+
| admin_state_up          | UP                                   |
| availability_zone_hints |                                      |
| availability_zones      | nova                                 |
| created_at              | 2017-04-03T16:01:43Z                 |
| description             |                                      |
| distributed             | False                                |
| external_gateway_info   | None                                 |
| flavor_id               | None                                 |
| ha                      | False                                |
| id                      | 08ec51fb-188e-460a-bb01-61ddf160b77c |
| name                    | router3                              |
| project_id              | ae5c9703507546f5801a14ddb124d47e     |
| revision_number         | 5                                    |
| routes                  |                                      |
| status                  | ACTIVE                               |
| updated_at              | 2017-04-03T16:01:44Z                 |
+-------------------------+--------------------------------------+

Notice that not only is 'updated_at' bumped, but so is 'revision_number'.  A non-HA router doesn't go through that code path, so it has matching values, and its revision_number is lower:

$ openstack router create router4
$ openstack router show router4
+-------------------------+--------------------------------------+
| Field                   | Value                                |
+-------------------------+--------------------------------------+
| admin_state_up          | UP                                   |
| availability_zone_hints |                                      |
| availability_zones      |                                      |
| created_at              | 2017-04-03T16:29:40Z                 |
| description             |                                      |
| distributed             | False                                |
| external_gateway_info   | None                                 |
| flavor_id               | None                                 |
| ha                      | False                                |
| id                      | f55709cc-9972-4819-92cc-641957440576 |
| name                    | router4                              |
| project_id              | ae5c9703507546f5801a14ddb124d47e     |
| revision_number         | 3                                    |
| routes                  |                                      |
| status                  | ACTIVE                               |
| updated_at              | 2017-04-03T16:29:40Z                 |
+-------------------------+--------------------------------------+

So the test was only intended to be run in a "base" environment, without HA or DVR (which will do similar updates).

The easy solution would seem to be tweaking the tempest/api/base.py create_router() code to set ha=False, but since it's an admin-only option it's a little more complicated than that.  I think it can be done, though, and there might be other ways to work around it as well.  For now we can call this a test bug.
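For illustration only (this is a hypothetical sketch of the idea, not the actual upstream patch): a timestamp check that stays valid in HA/DVR environments would pin only the immutable 'created_at' and merely require that 'updated_at' not move backwards, since post-creation scheduling legitimately bumps it:

```python
# Hypothetical helper -- not the real fix, just the assertion shape
# that tolerates legitimate post-creation updates.
def check_show_timestamps(created, shown):
    # 'created_at' must survive the create/show round trip unchanged
    assert shown['created_at'] == created['created_at']
    # 'updated_at' may only move forward (ISO-8601 strings compare
    # lexicographically in chronological order)
    assert shown['updated_at'] >= created['updated_at']
    return True

# The values from the failing run above pass this relaxed check:
created = {'created_at': '2017-02-22T10:47:24Z',
           'updated_at': '2017-02-22T10:47:24Z'}
shown = {'created_at': '2017-02-22T10:47:24Z',
         'updated_at': '2017-02-22T10:47:26Z'}
ok = check_show_timestamps(created, shown)
```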

Comment 11 Brian Haley 2017-04-05 14:46:40 UTC
Change merged upstream, https://review.openstack.org/453285

stable/ocata and stable/newton cherry-picks started.
https://review.openstack.org/#/c/453641/
https://review.openstack.org/#/c/453677/

Comment 12 Brian Haley 2017-04-17 20:59:22 UTC
Not sure why all those flags got set

Comment 13 Brian Haley 2017-04-17 21:00:21 UTC
Change merged to RHOS-11:

https://code.engineering.redhat.com/gerrit/#/c/103330/

Change merged to RHOS-10:

https://code.engineering.redhat.com/gerrit/#/c/103332/

Comment 15 Eran Kuris 2017-04-30 05:47:08 UTC
(In reply to Brian Haley from comment #13)
> Change merged to RHOS-11:
> 
> https://code.engineering.redhat.com/gerrit/#/c/103330/
> 
> Change merged to RHOS-10:
> 
> https://code.engineering.redhat.com/gerrit/#/c/103332/

Please fill "Fixed In Version" field.

Comment 16 Eran Kuris 2017-05-03 08:48:02 UTC
verified on openstack-neutron-10.0.1-1.el7ost.noarch
fixed

Comment 17 errata-xmlrpc 2017-05-17 20:09:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

