Bug 1507323 - Region was offline - after a restart region has lost all data
Summary: Region was offline - after a restart region has lost all data
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.8.0
Hardware: All
OS: All
Priority: high
Severity: urgent
Target Milestone: GA
Target Release: 5.10.0
Assignee: Gregg Tanzillo
QA Contact: Dave Johnson
URL:
Whiteboard:
Depends On:
Blocks: 1513508 1513509
Reported: 2017-10-29 17:16 UTC by Ryan Spagnola
Modified: 2020-12-14 10:41 UTC

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1513508 1513509
Environment:
Last Closed: 2018-06-21 20:56:14 UTC
Category: Bug
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:



Comment 7 Gregg Tanzillo 2017-10-30 16:13:14 UTC
Looking at the supplied postgres log, it appears that a replication subscription was created to the same region, that is, from region 7 to region 7. As part of creating a subscription, all data belonging to the remote region is deleted from the current region. Because the remote region here is the current region, all of region 7's data was removed.

We can see that this happened twice. The timestamp of the first occurrence also coincides with the exact time at which the EVM log first reports that the server was unable to find its id in the db. (Note that the pg log is in UTC and the EVM log is in local time, which is UTC+11.)

2017-10-24 17:53:35 GMT:[local]:59ef7d8f.6a32:root@vmdb_production:[27186]:LOG:  duration: 21821.927 ms  execute <unnamed>: DELETE FROM event_streams WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 17:53:58 GMT:[local]:59ef7d8f.6a32:root@vmdb_production:[27186]:LOG:  duration: 15914.969 ms  execute <unnamed>: DELETE FROM metric_rollups WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 17:56:30 GMT:[local]:59ef7d8f.6a32:root@vmdb_production:[27186]:LOG:  duration: 132897.955 ms  execute <unnamed>: DELETE FROM vim_performance_states WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 17:57:51 GMT:[local]:59ef7d8f.6a32:root@vmdb_production:[27186]:LOG:  duration: 79110.368 ms  execute <unnamed>: DELETE FROM vmdb_metrics WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 17:57:54 GMT:[local]:59ef7d8f.6a32:root@vmdb_production:[27186]:STATEMENT:  SELECT pglogical.create_subscription($1, $2, $3, $4, $5, $6)


2017-10-24 18:33:50 GMT:[local]:59ef84e5.7624:root@vmdb_production:[30244]:LOG:  duration: 53801.086 ms  bind <unnamed>: DELETE FROM vim_performance_states WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 18:33:58 GMT:[local]:59ef84e5.7624:root@vmdb_production:[30244]:LOG:  duration: 6405.231 ms  execute <unnamed>: DELETE FROM vmdb_metrics WHERE id >= 7000000000000 AND id <= 7999999999999
2017-10-24 18:33:59 GMT:[local]:59ef84e5.7624:root@vmdb_production:[30244]:STATEMENT:  SELECT pglogical.create_subscription($1, $2, $3, $4, $5, $6)
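
For anyone reading the DELETE bounds above: they are consistent with each region owning a contiguous block of 10^12 ids, so region 7 covers 7000000000000 through 7999999999999. The following is only a sketch of that arithmetic. The constant and method names are made up for illustration and are not taken from the ManageIQ code, and the factor is an assumption inferred from the log lines.

```ruby
# Assumption: each region owns a block of 10**12 ids, as the bounds in the
# DELETE statements suggest. The names below are hypothetical.
REGION_ID_FACTOR = 1_000_000_000_000

# Returns the inclusive range of ids that belong to the given region number.
def region_id_range(region)
  first = region * REGION_ID_FACTOR
  first..(first + REGION_ID_FACTOR - 1)
end

# Returns the region number that a given id falls into.
def region_of(id)
  id / REGION_ID_FACTOR
end

range = region_id_range(7)
puts range.first # lower bound used in the DELETE statements for region 7
puts range.last  # upper bound used in the DELETE statements for region 7
```

Under this assumption, any row whose id falls inside that range is treated as region 7 data, which is why a cleanup aimed at "remote region 7" wiped the local region 7 rows.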

Comment 10 CFME Bot 2017-11-13 14:51:21 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/f173fdce1ba57458d753150c83d06788c07d9e93

commit f173fdce1ba57458d753150c83d06788c07d9e93
Author:     Gregg Tanzillo <gtanzill>
AuthorDate: Fri Nov 10 14:52:58 2017 -0500
Commit:     Gregg Tanzillo <gtanzill>
CommitDate: Fri Nov 10 17:46:33 2017 -0500

    Prevent replication subscription to the same region as the current region
    
    This change protects a user from accidentally creating a replication subscription to the same region
    he is in which will result in deleting all the data from the current region.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1507323
    
    Fixed typo1
    
    change 1

 app/models/pglogical_subscription.rb       |  7 +++++++
 spec/models/pglogical_subscription_spec.rb | 26 ++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)
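
The commit above is described as preventing a subscription to the current region. A minimal sketch of that kind of guard might look like the following. This is not the actual code from app/models/pglogical_subscription.rb: the class, error name, attributes and message are all hypothetical, and the real model talks to the database rather than taking region numbers as arguments.

```ruby
# Hypothetical stand-in for the subscription model; not the ManageIQ class.
class SameRegionSubscriptionError < StandardError; end

class SubscriptionSketch
  attr_reader :local_region, :remote_region

  def initialize(local_region:, remote_region:)
    @local_region = local_region
    @remote_region = remote_region
  end

  # Refuses to continue when the remote region equals the local region,
  # since the cleanup that precedes subscription creation would otherwise
  # delete the local region's own rows.
  def save!
    if remote_region == local_region
      raise SameRegionSubscriptionError,
            "cannot subscribe to region #{remote_region}: it is the current region"
    end
    :created # placeholder for the real cleanup and subscription creation
  end
end

# Usage: a different region is accepted, the same region is rejected.
SubscriptionSketch.new(local_region: 7, remote_region: 8).save!
begin
  SubscriptionSketch.new(local_region: 7, remote_region: 7).save!
rescue SameRegionSubscriptionError => e
  puts e.message
end
```

The point of checking before any cleanup runs is that the failure happens while the local data is still intact.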

Comment 11 CFME Bot 2017-11-13 14:51:28 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/192b14eaa2ba554b8bf6a56818fe0fe5cfb58bb1

commit 192b14eaa2ba554b8bf6a56818fe0fe5cfb58bb1
Author:     Gregg Tanzillo <gtanzill>
AuthorDate: Fri Nov 10 14:59:28 2017 -0500
Commit:     Gregg Tanzillo <gtanzill>
CommitDate: Fri Nov 10 17:46:35 2017 -0500

    Update existing tests to handle check for subscription being to a different region than the current region
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1507323
    
    Fixed typo2
    
    Change 2

 spec/models/pglogical_subscription_spec.rb | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

