Description of problem:
======================
On QE setups like the one used for longevity and stress testing, we are seeing a lot of "promote/demote failed" error messages in the tier log. If they are genuine, we need to find the cause and fix it. If they are spurious, we need to remove them before the product release, because they make it really difficult to debug or root-cause other issues.

Version-Release number of selected component (if applicable):
=======================================================
glusterfs-server-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64

Steps Carried:
==============
1. Created a 12-node cluster
2. Created a tiered volume with Hot tier as (6 x 2) and Cold tier as (2 x (6 + 2) = 16)
3. Fuse-mounted the volume on 3 clients: RHEL7.2, RHEL7.1 and RHEL6.7
4. Started creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, data creation from client 1 and client 2 should have completed, while data creation from client 3 will still be in progress
6. At this point, data creation is down to 1 file from client 3 every 10 seconds
7. Monitor the CPU usage using top
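To put a number on "a lot of" failed messages while the workload above runs, something like the following could be pointed at the tier log. This is only a rough sketch: the regular expression and the function name count_migration_failures are assumptions made for illustration, since the exact wording of the tier log lines is not quoted in this report.

```python
import re
from collections import Counter

# Assumed pattern: a promote/demote word followed later on the same line by
# a "fail..." word. The real tier log wording may differ.
FAILED = re.compile(r"\b(promot\w*|demot\w*)\b.*\bfail\w*", re.IGNORECASE)


def count_migration_failures(lines):
    """Count lines that look like promote or demote failures."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            word = match.group(1).lower()
            kind = "promote" if word.startswith("promot") else "demote"
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_failures.py <path-to-tier-log>
    with open(sys.argv[1], errors="replace") as log:
        print(dict(count_migration_failures(log)))
```

Running it before and after a fix on the same workload gives a simple way to compare how noisy the log is.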
REVIEW: http://review.gluster.org/12394 (cluster/tier remove suprious log messages on valid failed migration) posted (#1) for review on release-3.7 by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/12395 (cluster/tier update man pages for tier feature) posted (#1) for review on release-3.7 by Dan Lambright (dlambrig)
COMMIT: http://review.gluster.org/12394 committed in release-3.7 by Dan Lambright (dlambrig)
------
commit 6fe5d09826542c37626f8f63299d6bce4671c34f
Author: Dan Lambright <dlambrig>
Date:   Mon Oct 19 09:04:07 2015 -0400

    cluster/tier remove suprious log messages on valid failed migration

    Backport fix 12391

    > On a write to a replica volume, we record in all brick's databases an entry.
    > When the tier daemon runs, it will only move the file if it is the true
    > owner of the file as defined by the XATTR_NODE_UUID_KEY.
    > Change-Id: Ib82717f87a3f94f3d0d9f969773de9e88d6aaf22
    > BUG: 1273043
    > Signed-off-by: Dan Lambright <dlambrig>
    > Reviewed-on: http://review.gluster.org/12391
    > Reviewed-by: Joseph Fernandes
    > Tested-by: NetBSD Build System <jenkins.org>
    > Tested-by: Gluster Build System <jenkins.com>

    Signed-off-by: Dan Lambright <dlambrig>
    Change-Id: I12147f878cd1927f845867fb7c0b84c4db017ee1
    BUG: 1272398
    Reviewed-on: http://review.gluster.org/12394
    Reviewed-by: Joseph Fernandes
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Dan Lambright <dlambrig>
    Tested-by: Dan Lambright <dlambrig>
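The ownership rule described in this commit message can be illustrated with a small sketch. It is not the actual tier code (which is C inside glusterfs); DbEntry, owner_node_uuid and files_to_migrate are hypothetical names, and owner_node_uuid merely stands in for the value the daemon would read via XATTR_NODE_UUID_KEY. The point is only that every replica brick records the file, but a node that is not the owner skips it quietly instead of reporting a failed migration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEntry:
    """One record from a brick database (hypothetical shape)."""
    path: str
    # Stand-in for the value stored under XATTR_NODE_UUID_KEY.
    owner_node_uuid: str


def files_to_migrate(entries, local_node_uuid):
    """Return the paths this node owns; entries owned elsewhere are
    skipped silently rather than treated as migration errors."""
    return [e.path for e in entries if e.owner_node_uuid == local_node_uuid]
```

For example, with two replica nodes that both recorded the same write, only the node whose UUID matches the owner would attempt the move; the other produces an empty list and logs nothing.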
COMMIT: http://review.gluster.org/12395 committed in release-3.7 by Dan Lambright (dlambrig)
------
commit 05ad7bc4e15b1b0d50d406cdc26402963b22ac77
Author: Dan Lambright <dlambrig>
Date:   Mon Oct 19 14:16:42 2015 -0400

    cluster/tier update man pages for tier feature

    Add to gluster man pages instructions for tier commands.

    Backport fix 12391

    > Change-Id: I0918460eeaba22bb6a11238d4f5501fa8e61da88
    > BUG: 1272557
    > Signed-off-by: Dan Lambright <dlambrig>
    > Reviewed-on: http://review.gluster.org/12380
    > Tested-by: NetBSD Build System <jenkins.org>
    > Reviewed-by: N Balachandran <nbalacha>

    Change-Id: I2cc16defb2eeb56075357c32d4ef71d6869891bb
    BUG: 1272398
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/12395
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
Dan, the patch that was merged is only for the man page. Does it complete the bug fix? If not, we can move the state of this bug back to ASSIGNED.
There was a clerical error on my part. The man page fix should not have been associated with this bug. I will move the bug back to ASSIGNED and propagate the correct fix, which has already been merged upstream, to 3.7.
REVIEW: http://review.gluster.org/12465 (cluster/tier do not log error message on lookup heal for files on hot tier) posted (#1) for review on release-3.7 by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/12465 (cluster/tier dont log error on lookup heal for files on hot tier) posted (#2) for review on release-3.7 by Dan Lambright (dlambrig)
COMMIT: http://review.gluster.org/12465 committed in release-3.7 by Dan Lambright (dlambrig)
------
commit c360e8d3e33ac02a3bdb11d16fa4f638fc7dea9c
Author: Dan Lambright <dlambrig>
Date:   Mon Oct 26 14:19:24 2015 -0400

    cluster/tier dont log error on lookup heal for files on hot tier

    This is a backport of 12430

    On fix-layout heal, files are scanned. Files found exist on either
    the hot or the cold subvolume. Those not found in the cold tier would
    exist on the hot tier. They should not be flagged as an error.

    Replace INFO with TRACE for common tier migration logs. Frequent
    migration was growing the log files too quickly.

    On migration failures, do not accrue files towards the cycle limit's
    budget.

    > Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615
    > BUG: 1275383
    > Signed-off-by: Dan Lambright <dlambrig>
    > Reviewed-on: http://review.gluster.org/12430
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Joseph Fernandes

    Signed-off-by: Dan Lambright <dlambrig>
    Change-Id: Ia1ce5c3ac9c8c43cf3f3f7e0bd6161aa13affe5f
    BUG: 1272398
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/12465
    Tested-by: Gluster Build System <jenkins.com>
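The last point of this commit message (failed migrations should not use up the per-cycle budget) can be sketched as follows. This is an illustration only, not the patch itself: run_cycle, migrate and max_files are hypothetical names, the budget is assumed here to be a simple file count, and Python's DEBUG level stands in for the TRACE level mentioned in the commit.

```python
import logging

logger = logging.getLogger("tier_sketch")


def run_cycle(candidates, migrate, max_files):
    """Migrate candidates until max_files have actually moved.

    migrate(path) is a caller-supplied function returning True on success.
    Only successful migrations are charged against the budget.
    """
    migrated = []
    for path in candidates:
        if len(migrated) >= max_files:
            break  # budget for this cycle is used up
        if migrate(path):
            migrated.append(path)
            # Routine success is logged at a low level so that frequent
            # migration does not grow the log file quickly.
            logger.debug("migrated %s", path)
        # On failure nothing is appended, so the budget is not charged.
    return migrated
```

With a budget of two files and one failure in between, the cycle still moves two files rather than stopping after one success and one failure.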
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.6, please open a new bug report.

glusterfs-3.7.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-users/2015-November/024359.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user