Bug 1138952
Summary: | Geo-Rep: Backport of patches to 3.6 branch. | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Kotresh HR <khiremat> |
Component: | geo-replication | Assignee: | bugs <bugs> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.6.0 | CC: | bugs, gluster-bugs |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | glusterfs-3.6.0beta1 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2014-11-11 08:38:20 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1117822 | | |
Description
Kotresh HR
2014-09-06 17:08:38 UTC
REVIEW: http://review.gluster.org/8637 (gluster: Fix the recursive goto outs in the source code.) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8638 (features/changelog: Capture "correct" internal FOPs) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8639 (geo-rep: minimize xsync crawl usage and set upper limit to xsync crawl) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8640 (geo-rep/libgfchangelog: Create working dir during changelog_register if not present.) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8641 (geo-rep/libgfchangelog: Support of symlinks while creation of working dir.) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8642 (feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8643 (geo-rep: Handle RMDIR recursively) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8644 (geo-rep: Fixing issue with xsync upper limit) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8645 (geo-rep/glusterd: API to check active geo-rep session for the volume) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8646 (features/changelog: barrier all entry creation fops) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8647 (features/changelog: Removal of redundant fop color count while draining.) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

REVIEW: http://review.gluster.org/8648 (features/changelog: Crash consistency of changelog wrt snapshot) posted (#1) for review on release-3.6 by Kotresh HR (khiremat)

COMMIT: http://review.gluster.org/8637 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit f32fc33a01d6b199ccecb7cb38eeb773c20585f5
Author: Avra Sengupta <asengupt>
Date: Mon Jul 14 13:07:08 2014 +0000

gluster: Fix the recursive goto outs in the source code.

Added a script check_goto.pl, that when run from the source code root, will scan all .c files to match the following pattern:

    label:
        if (condition)
            goto label;

On finding such a pattern the script will print the file name and the line number. There are certain cases where the above recursive pattern is intended. Hence adding those labels to ignore-labels.

Thanks Vijaikumar Mallikarjuna for the perl script. Also fixed all such existing errors

BUG: 1138952
Change-Id: Ie6b75621711736e7e30f2f9d25e50435d58fc1e2
Signed-off-by: Vijaikumar Mallikarjuna <vmallika>
Signed-off-by: Avra Sengupta <asengupt>
Reviewed-on: http://review.gluster.org/8307
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Jeff Darcy <jdarcy>
Reviewed-by: Krishnan Parthasarathi <kparthas>
Tested-by: Krishnan Parthasarathi <kparthas>
Reviewed-on: http://review.gluster.org/8637
Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/8648 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit c99fe7a6e939f6a961d10a12cb03dcecf4a56885
Author: Ajeet Jha <ajha>
Date: Sat Aug 23 19:06:45 2014 +0530

features/changelog: Crash consistency of changelog wrt snapshot

This patch introduces call-path fop details logging for data operations in CHANGELOG.SNAP. This feature is enabled with barrier-enable notification and disabled with barrier-disable notification.
BUG: 1138952
Change-Id: Ic418dd70b0a0b369202c5b79a6f7f96512821065
Signed-off-by: Ajeet Jha <ajha>
Reviewed-on: http://review.gluster.org/8533
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8648

COMMIT: http://review.gluster.org/8638 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 0db65a084608a9deb0d0917f084e0d55e23e54a7
Author: Venky Shankar <vshankar>
Date: Fri Jul 18 15:36:42 2014 +0530

features/changelog: Capture "correct" internal FOPs

This patch fixes changelog capturing internal FOPs in a cascaded setup, where the intermediate master would record internal FOPs (generated by DHT on link()/rename()). This is due to I/O happening on the intermediate slave on geo-replication's auxiliary mount with client-pid -1.

Currently, the internal FOP capturing logic depends on client pid being non-negative and the presence of a special key in dictionary. Due to this, internal FOPs on an intermediate master would be recorded in the changelog.

Checking client-pid being non-negative was introduced to capture AFR self-heal traffic in changelog, thereby breaking cascading setups. By coincidence, AFR self-heal daemon uses -1 as frame->root->pid, thereby making it hard to differentiate between geo-rep's auxiliary mount and self-heal daemon.
BUG: 1138952
Change-Id: Ia08a2cfa3b02bb785f343794f5b2695d44398c4c
Original-Author: Venky Shankar <vshankar>
Signed-off-by: Kotresh H R <khiremat>
Reviewed-on: http://review.gluster.org/8347
Reviewed-by: Venky Shankar <vshankar>
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8638

COMMIT: http://review.gluster.org/8639 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit fca81b1300e2afdf3eb7cb75428657a31e92bc00
Author: Aravinda VK <avishwan>
Date: Mon Jun 23 13:43:20 2014 +0530

geo-rep: minimize xsync crawl usage and set upper limit to xsync crawl

For effective handling of deletes and renames use history crawl as much as possible. History crawl will run in loop till it syncs all data before live changelog time. When it uses xsync crawl (fallback when changelog not available, or very first crawl) it sets upper limit to crawl.

After completing History crawl, it checks actual end time returned by history api to compare with register time; if actual end is less than register time then run history crawl one more time.

If first turn history processing time is less than the CHANGELOG ROLLOVER TIME then sleep for the difference. After the sleep it is guaranteed that rollover will happen, and it switches to live changelog consumption without switching to xsync. This sleep happens only when history processing completed in less than CHANGELOG_ROLLOVER_TIME, and only after the first turn, so it will not affect the performance.
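The crawl sequence in the commit message above can be read as the following loop. This is only a sketch of one interpretation, not the gsyncd code: `catch_up`, the `history_crawl(start, end)` callable and its return value, and the 15-second rollover value are all assumptions made for illustration.

```python
import time

# Illustrative value only; the real rollover interval is a changelog setting.
CHANGELOG_ROLLOVER_TIME = 15  # seconds


def catch_up(history_crawl, register_time, start_time,
             sleep=time.sleep, clock=time.monotonic):
    """Repeat history crawls until the covered end time reaches register_time.

    history_crawl(start, end) is a hypothetical callable that processes
    changelogs between start and end and returns the end time it actually
    covered. Returns the number of history-crawl turns that were needed.
    """
    turns = 0
    while True:
        began = clock()
        actual_end = history_crawl(start_time, register_time)
        elapsed = clock() - began
        turns += 1

        # Only after the first turn: wait out the remainder of one rollover
        # interval so a changelog rollover is guaranteed to have happened.
        if turns == 1 and elapsed < CHANGELOG_ROLLOVER_TIME:
            sleep(CHANGELOG_ROLLOVER_TIME - elapsed)

        # History covers everything up to registration: the caller can
        # switch to live changelog consumption without falling back to xsync.
        if actual_end >= register_time:
            return turns

        # Otherwise run history crawl again from where the last one ended.
        start_time = actual_end
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.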
BUG: 1138952
Change-Id: Ida024211d312f60f0e8190805e7469b2165f00e1
Signed-off-by: Aravinda VK <avishwan>
Reviewed-on: http://review.gluster.org/8151
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Venky Shankar <vshankar>
Tested-by: Venky Shankar <vshankar>
Reviewed-on: http://review.gluster.org/8639
Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/8640 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 010c82b0027300e45cc8db5d29ad87d39b290147
Author: Kotresh H R <khiremat>
Date: Fri Aug 1 14:08:56 2014 +0530

geo-rep/libgfchangelog: Create working dir during changelog_register if not present.

Earlier, xsync's register was being called first, which was creating working directory before calling changelog_register. Now it is history crawl first. Hence working directory would not have been created. Create it in gf_changelog_register itself if it is not already created.

BUG: 1138952
Change-Id: Ie39b9fd8c1ef7385f76a9b67d0acc3c1c2fd2bb2
Signed-off-by: Kotresh H R <khiremat>
Reviewed-on: http://review.gluster.org/8399
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Aravinda VK <avishwan>
Reviewed-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8640

COMMIT: http://review.gluster.org/8641 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 85275c5f1d9fe120ed45147a15be74b70d4c7958
Author: Kotresh H R <khiremat>
Date: Mon Aug 4 15:33:02 2014 +0530

geo-rep/libgfchangelog: Support of symlinks while creation of working dir.

In gf_changelog_register, enable symlink support while creating working directory if it is not already created.
BUG: 1138952
Change-Id: I8fec52a5768fae46ce30a2331f30f1d8d5e2e173
Signed-off-by: Kotresh H R <khiremat>
Reviewed-on: http://review.gluster.org/8409
Reviewed-by: Venky Shankar <vshankar>
Tested-by: Gluster Build System <jenkins.com>
Tested-by: Venky Shankar <vshankar>
Reviewed-on: http://review.gluster.org/8641
Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/8642 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 81a127513da40424c572d566c1f26a7dfb345037
Author: Kotresh H R <khiremat>
Date: Fri Aug 1 16:12:38 2014 +0530

feature/geo-rep: Keep marker.tstamp's mtime unchangeable during snapshot.

Problem:
Geo-replication does a full xsync crawl after snapshot restoration of slave and master. It does not do history crawl.

Analysis:
Marker creates 'marker.tstamp' file when geo-rep is started for the first time. The virtual extended attribute 'trusted.glusterfs.volume-mark' is maintained and whenever it is queried on gluster mount point, marker fills it on the fly and returns the combination of uuid, ctime of marker.tstamp and others. So ctime of marker.tstamp, in other sense 'volume-mark', marks the geo-rep start time when the session is freshly created.

From the above, after the first filesystem crawl (xsync) is done during first geo-rep start, stime should always be less than 'volume-mark'. So whenever stime is less than volume-mark, it does full filesystem crawl (xsync).

Root Cause:
When snapshot is restored, marker.tstamp file is freshly created, losing the timestamps it was originally created with.

Solution:
1. Change is made to depend on mtime instead of ctime.
2. mtime and atime of marker.tstamp are restored back when snapshot is created and restored.
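The two parts of the solution above can be illustrated with plain file timestamps. This is a minimal sketch under stated assumptions, not the marker or snapshot code: the function names are hypothetical, and the real 'volume-mark' is a virtual extended attribute filled in by marker rather than a direct `stat` of the file.

```python
import os


def needs_full_crawl(stime, volume_mark):
    """Rule quoted in the analysis: stime < volume-mark means a full xsync crawl."""
    return stime < volume_mark


def save_times(path):
    """Record atime and mtime (nanoseconds) before the file is recreated."""
    st = os.stat(path)
    return st.st_atime_ns, st.st_mtime_ns


def restore_times(path, times):
    """Put the recorded atime and mtime back on the recreated file.

    ctime cannot be set from user space, which is why a mark derived from
    mtime can survive a restore while one derived from ctime cannot.
    """
    os.utime(path, ns=times)
```

With the timestamps restored, a mark derived from mtime stays at the original session start, so an stime recorded after the first crawl no longer compares as older than the mark.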
BUG: 1138952
Change-Id: I0e19e1cb2593171b9a2b41d0d303330feb7fd2b3
Signed-off-by: Kotresh H R <khiremat>
Reviewed-on: http://review.gluster.org/8401
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8642

COMMIT: http://review.gluster.org/8643 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit f875a7f82e53349a4a7a88d0eaa41c1485f2a2ba
Author: Aravinda VK <avishwan>
Date: Tue Aug 12 18:19:30 2014 +0530

geo-rep: Handle RMDIR recursively

If RMDIR is recorded in brick changelog which is due to self heal traffic then it will not have UNLINK entries for child files. Geo-rep hangs with ENOTEMPTY error on slave.

Now geo-rep recursively deletes the dir if it gets ENOTEMPTY.

BUG: 1138952
Change-Id: Ie79db90c52103b39fa795bb8a096b363d450b427
Signed-off-by: Aravinda VK <avishwan>
Reviewed-on: http://review.gluster.org/8477
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Venky Shankar <vshankar>
Tested-by: Venky Shankar <vshankar>
Reviewed-on: http://review.gluster.org/8643
Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/8644 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit abf0343e9dada8b119a212db5b24e8a8712a8c4f
Author: Aravinda VK <avishwan>
Date: Fri Aug 8 15:06:11 2014 +0530

geo-rep: Fixing issue with xsync upper limit

While identifying the file/dir to sync, xtime of the file was compared with xsync_upper_limit as `xtime < xsync_upper_limit`

After the sync, xtime of parent directory is updated as stime.
With the upper limit condition, stime is updated as MIN(xtime_parent, xsync_upper_limit)

With this, files will get missed if `xtime_of_file == xsync_upper_limit`

With this patch xtime_of_file is compared as xtime_of_file <= xsync_upper_limit

BUG: 1138952
Change-Id: I469e8638ab6923e518022a539a19e2d040b60eb0
Signed-off-by: Aravinda VK <avishwan>
Reviewed-on: http://review.gluster.org/8439
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Kotresh HR <khiremat>
Reviewed-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8644

COMMIT: http://review.gluster.org/8645 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit d5c5a83d449b3c78405f7cbcab2c4dd6f0b0d896
Author: Kotresh H R <khiremat>
Date: Fri Aug 8 17:17:20 2014 +0530

geo-rep/glusterd: API to check active geo-rep session for the volume

Requirement:
Snapshot needs an API to fail the CLI if any geo-rep session is active for that volume.

Solution:
A function "gd_vol_is_geo_rep_active" is provided to check if any geo-rep session is active for that volume. An in-memory dict called 'gsync_running_slaves' is maintained in 'volinfo' structure to keep track of active geo-rep sessions for the volume. The key 'slavenode::slavevol' with value 'running' is added to the dict whenever geo-rep is started/resumed, and the same is removed if stopped/paused. So the 'count' in dict is used to decide whether the geo-rep is active or not for that volume.

Also added "this->name" in gf_log in routines that this patch touches.
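The bookkeeping described in the solution above is small enough to model directly. The following is a Python model of the idea only, not the glusterd C code: `VolInfo`, `mark_started` and `mark_stopped` are hypothetical names, while `gsync_running_slaves` and the 'slavenode::slavevol' key format come from the commit message.

```python
class VolInfo:
    """Stand-in for glusterd's volinfo structure (hypothetical model)."""

    def __init__(self):
        # Key 'slavenode::slavevol' -> 'running' for every active session.
        self.gsync_running_slaves = {}


def _session_key(slavenode, slavevol):
    return f"{slavenode}::{slavevol}"


def mark_started(volinfo, slavenode, slavevol):
    """Record a session on geo-rep start or resume."""
    volinfo.gsync_running_slaves[_session_key(slavenode, slavevol)] = "running"


def mark_stopped(volinfo, slavenode, slavevol):
    """Drop a session on geo-rep stop or pause; unknown sessions are ignored."""
    volinfo.gsync_running_slaves.pop(_session_key(slavenode, slavevol), None)


def vol_is_geo_rep_active(volinfo):
    """Mirror of the described check: active if the dict count is non-zero."""
    return len(volinfo.gsync_running_slaves) > 0
```

A snapshot command could then refuse to proceed while `vol_is_geo_rep_active(volinfo)` is true, which matches the requirement stated in the commit.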
BUG: 1138952
Change-Id: Ib13aeb509a56edf510651b77e20bf3cc43a3e763
Signed-off-by: Kotresh HR <khiremat>
Reviewed-on: http://review.gluster.org/8459
Reviewed-by: Krishnan Parthasarathi <kparthas>
Tested-by: Krishnan Parthasarathi <kparthas>
Reviewed-on: http://review.gluster.org/8645
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>

COMMIT: http://review.gluster.org/8646 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit ee1ce68362ddbd54a4ad33b76dc72a2c0623d646
Author: Vijay Bellur <vbellur>
Date: Fri Aug 22 17:15:13 2014 +0530

features/changelog: barrier all entry creation fops

When a snapshot is taken, there are chances of entry creation fops not being recorded either in changelog or through the recursive ancestry xtime updation by marker. This causes consumers of changelog (primarily geo-replication as of today) to not be aware of these entries after a snapshot is restored. This can lead to inconsistencies.

This patch is an interim workaround to barrier creates till changelog becomes completely crash consistent.

BUG: 1138952
Change-Id: Idd5e690a05fe2c7c5d32d1541a0d9b5132881ea7
Signed-off-by: Vijay Bellur <vbellur>
Reviewed-on: http://review.gluster.org/8517
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: ajeet jha <ajha>
Reviewed-by: Aravinda VK <avishwan>
Reviewed-by: Kotresh HR <khiremat>
Reviewed-on: http://review.gluster.org/8646

COMMIT: http://review.gluster.org/8647 committed in release-3.6 by Vijay Bellur (vbellur)
------
commit 6f1b9d91cb10e40c7bbdf17dd8cb5a5644d41fd5
Author: Ajeet Jha <ajha>
Date: Tue Aug 26 14:39:24 2014 +0530

features/changelog: Removal of redundant fop color count while draining.
BUG: 1138952
Change-Id: I594be0d09c6af2e4a34da3e819d1ab6fd85e34c4
Signed-off-by: Ajeet Jha <ajha>
Reviewed-on: http://review.gluster.org/8542
Reviewed-by: Vijay Bellur <vbellur>
Tested-by: Gluster Build System <jenkins.com>
Reviewed-on: http://review.gluster.org/8647

A beta release for GlusterFS 3.6.0 has been released. Please verify if the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users