Bug 1622029
| Summary: | [geo-rep]: geo-rep reverse sync in FO/FB can accidentally delete the content at original master in case of gfid conflict in 3.4.0 without explicit user rmdir | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> |
| Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.4 | CC: | amukherj, apaladug, avishwan, csaba, rallan, rhs-bugs, sankarshan, sheggodu, storage-qa-internal |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.4.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.12.2-18 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| | 1622076 (view as bug list) | Environment: | |
| Last Closed: | 2018-09-04 06:52:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1503137, 1622076 | | |
Description
Rahul Hinduja
2018-08-24 08:08:14 UTC
Multiple scenario validation.
Build: glusterfs-geo-replication-3.12.2-18.el7rhgs.x86_64
Normal setup => Original Master and Original Slave

Scenario 1:
-----------
Validating that the gfid conflict resolution functionality works as expected in 3.4.0.

A. Create a geo-rep session between Master and Slave.
B. Create a directory A at Slave and create data inside it.
C. Create a directory with the same name A at Master, without any data in it.
D. Create a file named file.1 at Slave with certain data.
E. Create a file with the same name file.1 at Master with a different data set.

Expectation:

For the directory => The directory created at Master has a different gfid than the one at Slave. The automatic gfid resolution detects this and syncs the content from Master to Slave. In this case, the result is that both Master and Slave have directory A without any data in it.

Log:

    [2018-08-28 08:27:28.479295] I [master(/rhs/brick2/b5):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '99f45f16-5340-4740-a49a-c394f4b2354c', 'gid': 0, 'mode': 16877, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/auto_gfid_default_on', 'op': 'MKDIR'}, 17, {'slave_isdir': True, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'a5b6d78f-0295-4f09-a82b-eb304ebf9d77', 'name_mismatch': False, 'dst': False})
    [2018-08-28 08:27:28.480962] I [master(/rhs/brick1/b1):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '99f45f16-5340-4740-a49a-c394f4b2354c', 'gid': 0, 'mode': 16877, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/auto_gfid_default_on', 'op': 'MKDIR'}, 17, {'slave_isdir': True, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'a5b6d78f-0295-4f09-a82b-eb304ebf9d77', 'name_mismatch': False, 'dst': False})
    [2018-08-28 08:27:28.769546] I [master(/rhs/brick3/b9):1450:crawl] _GM

For the file => The file created at Master has a different gfid than the one at Slave. The automatic gfid resolution detects this and syncs the content from Master to Slave. In this case, the result is that both Master and Slave have the file with the content from Master.

Log:

    [2018-08-28 08:37:57.299675] I [master(/rhs/brick3/b9):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '00131c50-d3f7-4360-866c-3f715a3c36fd', 'gid': 0, 'mode': 33188, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/hosts', 'op': 'CREATE'}, 17, {'slave_isdir': False, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'b2f34de0-d2b5-4c8e-b8b1-0d021fe98094', 'name_mismatch': False, 'dst': False})
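For reference, a gfid mismatch like the ones logged above can be confirmed by hand from the client mounts via the glusterfs.gfid.string virtual xattr. A minimal sketch, assuming the mount points /mnt/master and /mnt/slave and the directory name A for illustration (they are not part of the validation run above):

```bash
# Mount both volumes (mount points are assumptions for illustration).
mkdir -p /mnt/master /mnt/slave
mount -t glusterfs rhsauto032:/Master /mnt/master
mount -t glusterfs rhsauto022:/Slave /mnt/slave

# glusterfs.gfid.string is a virtual xattr exposed on glusterfs client
# mounts; differing values for the same path confirm the gfid mismatch.
getfattr -n glusterfs.gfid.string /mnt/master/A
getfattr -n glusterfs.gfid.string /mnt/slave/A
```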
Scenario 2:
-----------
Validate the config CLI. Build and setup: same as Scenario 1.

A. Setting a value other than a boolean via the CLI should fail.
   Actual: it succeeds => the bug is present (tracked separately as bug 1622957).

B. Resetting the value should work. => Works

    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution rahul
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config \!gfid-conflict-resolution
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    [root@rhsauto032 scripts]#

C. Setting boolean values (on/off, 1/0, true/false) works, and the set value is reflected on query:

    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution on
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    on
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution off
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    off
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution 1
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    1
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution 0
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    0
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution true
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    true
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution false
    geo-replication config updated successfully
    [root@rhsauto032 scripts]# gluster volume geo-replication Master rhsauto022::Slave config gfid-conflict-resolution
    false
    [root@rhsauto032 scripts]#
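The same value matrix can be exercised in a scripted loop. This is a sketch of the intended behavior, under the assumption that non-boolean values are rejected once bug 1622957 is fixed; at the time of this validation they are still accepted, as noted in item A.

```bash
# Exercise the accepted/rejected values for gfid-conflict-resolution.
# Expected behavior after bug 1622957 is fixed: "rahul" takes the
# "rejected" branch; on this build it is still wrongly accepted.
for val in on off 1 0 true false rahul; do
  if gluster volume geo-replication Master rhsauto022::Slave \
      config gfid-conflict-resolution "$val" >/dev/null 2>&1; then
    echo "accepted: $val"
  else
    echo "rejected: $val"
  fi
done

# Reset to the default afterwards (single quotes avoid shell history
# expansion of '!', the escaped form \! used interactively above).
gluster volume geo-replication Master rhsauto022::Slave \
  config '!gfid-conflict-resolution'
```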
Scenario 3:
-----------
Validating the "gfid-conflict-resolution" functionality when it is "false", compared with the default behavior when it is "on". Build and setup: same as Scenario 1.

A. Set up geo-rep between Master and Slave.
B. Set gfid-conflict-resolution to "false".
C. Create a directory (A) with content at Slave.
D. Create a file (file) with content at Slave.
E. Create the same directory (A) at Master with different content.
F. Create the same file (file) with different content at Master.

For files:
==========
gfid-conflict-resolution on: successfully fixes the issue.

    [2018-08-28 08:37:57.299675] I [master(/rhs/brick3/b9):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '00131c50-d3f7-4360-866c-3f715a3c36fd', 'gid': 0, 'mode': 33188, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/hosts', 'op': 'CREATE'}, 17, {'slave_isdir': False, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'b2f34de0-d2b5-4c8e-b8b1-0d021fe98094', 'name_mismatch': False, 'dst': False})
    [2018-08-28 08:37:57.305568] I [master(/rhs/brick3/b9):930:handle_entry_failures] _GMaster: Sucessfully fixed entry ops with gfid mismatch retry_count=1

gfid-conflict-resolution off: logs an error with "ENTRY FAILED" and moves forward; geo-rep does not go to Faulty.

    [2018-08-28 09:40:44.24735] E [master(/rhs/brick1/b1):785:log_failures] _GMaster: ENTRY FAILED data=({'uid': 0, 'gfid': '14eb2254-0ddb-4866-b733-b0f268328bf6', 'gid': 0, 'mode': 33188, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/hosts.allow', 'op': 'CREATE'}, 17, {'slave_isdir': False, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': '7d812da5-3b86-4f10-8234-4f1b4bfaf07f', 'name_mismatch': False, 'dst': False})
    [2018-08-28 09:40:44.648068] I [master(/rhs/brick1/b1):1932:syncjob] Syncer: Sync Time Taken duration=0.1531 num_files=1 job=1 return_code=0

For directories:
================
gfid-conflict-resolution on: successfully fixes the issue.

    [2018-08-28 08:27:28.479295] I [master(/rhs/brick2/b5):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '99f45f16-5340-4740-a49a-c394f4b2354c', 'gid': 0, 'mode': 16877, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/auto_gfid_default_on', 'op': 'MKDIR'}, 17, {'slave_isdir': True, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'a5b6d78f-0295-4f09-a82b-eb304ebf9d77', 'name_mismatch': False, 'dst': False})
    [2018-08-28 08:27:28.480962] I [master(/rhs/brick1/b1):814:fix_possible_entry_failures] _GMaster: Entry not present on master. Fixing gfid mismatch in slave. Deleting the entry retry_count=1 entry=({'uid': 0, 'gfid': '99f45f16-5340-4740-a49a-c394f4b2354c', 'gid': 0, 'mode': 16877, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/auto_gfid_default_on', 'op': 'MKDIR'}, 17, {'slave_isdir': True, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'a5b6d78f-0295-4f09-a82b-eb304ebf9d77', 'name_mismatch': False, 'dst': False})
    [2018-08-28 08:27:28.769546] I [master(/rhs/brick3/b9):1450:crawl] _GMaster: slave's time stime=(1535444200, 0)
    [2018-08-28 08:27:28.798682] I [master(/rhs/brick2/b5):930:handle_entry_failures] _GMaster: Sucessfully fixed entry ops with gfid mismatch retry_count=1

gfid-conflict-resolution off: logs an error with "ENTRY FAILED" and geo-rep remains "FAULTY". It also states that the issue must be fixed to proceed further.

    [2018-08-28 09:55:07.164000] E [master(/rhs/brick3/b9):785:log_failures] _GMaster: ENTRY FAILED data=({'uid': 0, 'gfid': 'be29aa70-e124-4985-9d58-e888b771fd00', 'gid': 0, 'mode': 16877, 'entry': '.gfid/00000000-0000-0000-0000-000000000001/auto_gfid_default_off', 'op': 'MKDIR'}, 17, {'slave_isdir': True, 'gfid_mismatch': True, 'slave_name': None, 'slave_gfid': 'afac7618-a878-4e02-b48c-447a4f8e1d7d', 'name_mismatch': False, 'dst': False})
    [2018-08-28 09:55:07.164477] E [syncdutils(/rhs/brick3/b9):317:log_raise_exception] <top>: The above directory failed to sync. Please fix it to proceed further.
    [2018-08-28 09:55:07.175034] I [syncdutils(/rhs/brick3/b9):289:finalize] <top>: exiting.

After fixing the problematic directories, everything works fine (one possible manual fix is sketched at the end of this report).

Based on comments 7, 8 and 9, moving this bug to the verified state for 3.4.0.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
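As a reference for the "fix it to proceed" step in the Scenario 3 directory case, a minimal sketch of one possible manual resolution follows. It assumes the Slave copy of the conflicting directory is disposable; the mount point is an illustrative assumption, and the directory name is taken from the log entry above.

```bash
# Possible manual fix for the FAULTY directory case in Scenario 3.
# Assumes the Slave copy of the conflicting directory is disposable;
# /mnt/slave is an assumption for illustration.
mount -t glusterfs rhsauto022:/Slave /mnt/slave
rm -rf /mnt/slave/auto_gfid_default_off

# Restart the session so the failed MKDIR is retried; the directory is
# then recreated with the Master's gfid. (The monitor also respawns
# Faulty workers on its own; the restart just forces an immediate retry.)
gluster volume geo-replication Master rhsauto022::Slave stop
gluster volume geo-replication Master rhsauto022::Slave start
```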