Bug 1382361 - Auto-tagging from same label in 2 providers breaks refresh
Summary: Auto-tagging from same label in 2 providers breaks refresh
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.6.0
Hardware: All
OS: All
unspecified
high
Target Milestone: GA
: 5.8.0
Assignee: Beni Paskin-Cherniavsky
QA Contact: Tony
URL:
Whiteboard: container
Depends On:
Blocks: 1390695 1390697
TreeView+ depends on / blocked
 
Reported: 2016-10-06 13:17 UTC by Beni Paskin-Cherniavsky
Modified: 2017-06-12 17:52 UTC (History)
11 users (show)

Fixed In Version: 5.8.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1390695 1390697 (view as bug list)
Environment:
Last Closed: 2017-06-12 17:52:56 UTC
Category: Bug
Cloudforms Team: Container Management
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
cfme-log (347.96 KB, text/plain)
2017-03-22 16:26 UTC, Tony
no flags Details

Description Beni Paskin-Cherniavsky 2016-10-06 13:17:33 UTC
Description of problem:

ContainerLabelTagMapping model caches the full table in memory, currently forever (will change to for reload once per refresh), and assumes nobody else writes to it.
But with an any-value mapping rule, when encountering a new label value the mapping code itself modifies the mapping table adding a specific value.
Thus if the code runs in 2 processes — e.g. if you have 2 Openshift/Kubernetes providers, each gets its own refresh worker — the mapping code will crash, aborting refresh.

Note: auto-tagging is still dormant undocumented feature, but we'll soon add UI to activate it.

A similar crash also happens even with one worker, given 2 any-value rules mapping different label names to same category.
We don't intend to allow this in UI, so it's irrelevant to QE, but I'm covering this in a unit test.


Version-Release number of selected component (if applicable):
All the way from 5.6.0 which first included auto-tagging code to current master.


How reproducible:
Always

Steps to Reproduce:
1. Create ContainerLabelTagMapping rule from label_name: 'name', label_value: nil (means map all values) to some category.
   TODO: By the time this bug gets to QE, document how to do this via UI we'll add.
2. Have 2 Openshift/Kubernetes providers.
3. Do `oc label namespace default name=value` in both clusters
   (doesn't matter what you label, happens with most kinds of entities, e.g. `oc label pod <some-pod-name> name=value` would do it too).
4. Refresh both providers.

Actual results:
One refresh succeeds and tags the entity (Project in steps above), one aborts mid-way with exception
`ActiveRecord::RecordInvalid: Validation failed: Description has already been taken, Name has already been taken`
(visible in log and in provider's "last refresh status")

Expected results:
Both refreshes should succeed.
Both entities should get tagged, with the same tag.


Pending unit tests for this scenario:
https://github.com/ManageIQ/manageiq/compare/master...cben:auto-tagging-parallel-test#diff-33fc34a1f5793927105791e3d268fad0
(includes a refactoring necessary to formulate the tests)
https://travis-ci.org/cben/manageiq/jobs/165499652#L1233-L1266

Comment 3 CFME Bot 2016-10-28 13:11:22 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/90342958cdd2ffbb459fe3f7151b3607e6e8a0aa

commit 90342958cdd2ffbb459fe3f7151b3607e6e8a0aa
Author:     Beni Cherniavsky-Paskin <cben>
AuthorDate: Wed Oct 26 18:57:48 2016 +0300
Commit:     Beni Cherniavsky-Paskin <cben>
CommitDate: Wed Oct 26 19:57:14 2016 +0300

    ContainerLabelTagMapping: Don't insert mappings while mapping, find existing tags.
    
    Finding existing tag if already inserted hopefully fixes:
    https://bugzilla.redhat.com/show_bug.cgi?id=1382347
    https://bugzilla.redhat.com/show_bug.cgi?id=1382361
    
    This will map name=value and name=VALUE labels to same tag,
    which does not reflect Kubernetes case-sensitive semantics,
    but is probably OK (or even desirable?).
    
    Also not polluting the table - having only user-created mappings there -
    will be helpful for UI #11591.
    
    TODO: tests for retag_entity

 app/models/container_label_tag_mapping.rb          | 57 +++++++++---------
 app/models/ems_refresh/save_inventory_container.rb |  4 +-
 spec/models/container_label_tag_mapping_spec.rb    | 67 +++++++++++-----------
 3 files changed, 63 insertions(+), 65 deletions(-)

Comment 4 CFME Bot 2016-10-28 13:11:34 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/21c1ef66c3243128880cc8b146e67debdd9b7170

commit 21c1ef66c3243128880cc8b146e67debdd9b7170
Author:     Beni Cherniavsky-Paskin <cben>
AuthorDate: Fri Oct 14 10:24:31 2016 +0300
Commit:     Beni Cherniavsky-Paskin <cben>
CommitDate: Wed Oct 26 20:04:51 2016 +0300

    Tests for auto-tagging cache consistency problems
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1382361
    
    - Test concurrent mapping with independent state
      (as happens in 2 refresh workers).
    - Test for 2 labels -> 1 category mapping.
    
    Both used to fail (in a different pre-rewrite formulation),
    since code was assuming if it doesn't have a tag & mapping in cache,
    the DB doesn't either and it should create them.
    Now fixed by always looking up tag before creating.

 spec/models/container_label_tag_mapping_spec.rb | 35 +++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Comment 5 CFME Bot 2016-11-01 16:26:22 UTC
New commit detected on ManageIQ/manageiq/euwe:
https://github.com/ManageIQ/manageiq/commit/bbcda770928a952d649f8f3df48169294127e93b

commit bbcda770928a952d649f8f3df48169294127e93b
Author:     Gregg Tanzillo <gtanzill>
AuthorDate: Fri Oct 28 09:09:21 2016 -0400
Commit:     Oleg Barenboim <chessbyte>
CommitDate: Tue Nov 1 12:21:35 2016 -0400

    Merge pull request #11806 from cben/auto-tagging-noinsert
    
    Rewrite of auto-tagging, fixing multiple bugs
    (cherry picked from commit 9800efa91dfffd0f34314a27216ec8fe0c233b38)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1382347
    https://bugzilla.redhat.com/show_bug.cgi?id=1382361

 app/models/container_label_tag_mapping.rb          | 132 +++++------
 app/models/ems_refresh/save_inventory_container.rb |  35 ++-
 .../kubernetes/container_manager/refresh_parser.rb |  32 ++-
 .../openshift/container_manager/refresh_parser.rb  |  12 +-
 spec/models/container_label_tag_mapping_spec.rb    | 253 ++++++++++++++-------
 .../container_manager/refresh_parser_spec.rb       |  14 +-
 .../kubernetes/container_manager/refresher_spec.rb |  46 +++-
 .../container_manager/refresh_parser_spec.rb       |   3 +-
 8 files changed, 341 insertions(+), 186 deletions(-)

Comment 6 Beni Paskin-Cherniavsky 2016-11-01 16:31:13 UTC
Backported to 5.7.

Comment 9 Tony 2017-03-22 16:26:08 UTC
Created attachment 1265435 [details]
cfme-log

Comment 10 Tony 2017-03-22 16:27:20 UTC
Both refresh processes succeeded, I attached the log for reference. Verified on CFME 5.8.0.7


Note You need to log in before you can comment on or make changes to this bug.