+++ This bug was initially created as a clone of Bug #1214289 +++

Description of problem:
I/O failure on attaching tier

Version-Release number of selected component (if applicable):
glusterfs-server-3.7dev-0.994.git0d36d4f.el6.x86_64

How reproducible:

Steps to Reproduce:
1. Create a replica volume
2. Start 100% writes I/O on the volume
3. Attach a tier while the I/O is in progress
4. Attach tier is successful, but I/O fails

Actual results:
The I/Os fail. Here is the console output:

linux-2.6.31.1/arch/ia64/include/asm/sn/mspec.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/mspec.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/nodepda.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/nodepda.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/pcibr_provider.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/pcibr_provider.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/pcibus_provider_defs.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/pcibus_provider_defs.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/pcidev.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/pcidev.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/pda.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/pda.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/pic.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/pic.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/rw_mmr.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/rw_mmr.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/shub_mmr.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/shub_mmr.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/shubio.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/shubio.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/simulator.h
tar: linux-2.6.31.1/arch/ia64/include/asm/sn/simulator.h: Cannot open: Stale file handle
linux-2.6.31.1/arch/ia64/include/asm/sn/sn2/

Expected results:
I/O should continue normally while the tier is being added. Additionally, all new writes after the tier addition should go to the hot tier.

Additional info:

--- Additional comment from Anoop on 2015-04-22 07:05:58 EDT ---

Volume info before attach:

Volume Name: vol1
Type: Replicate
Volume ID: b77d4050-7fdc-45ff-a084-f85eec2470fc
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.56:/rhs/brick1
Brick2: 10.70.35.67:/rhs/brick1

Volume info post attach:

Volume Name: vol1
Type: Tier
Volume ID: b77d4050-7fdc-45ff-a084-f85eec2470fc
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.35.67:/rhs/brick2
Brick2: 10.70.35.56:/rhs/brick2
Brick3: 10.70.35.56:/rhs/brick1
Brick4: 10.70.35.67:/rhs/brick1

--- Additional comment from Dan Lambright on 2015-04-22 15:46:08 EDT ---

When we attach a tier, the newly added translator has no cached subvolume for I/Os in flight, so I/Os to open files fail. The solution, I believe, is to recompute the cached subvolume for all open FDs with a lookup in tier_init; working on a fix.
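The failure mode Dan describes can be sketched as a toy model: every open fd routes I/O through a cached subvolume, attaching a tier inserts a translator whose per-fd cache is empty, and I/O on those fds then fails until a lookup repopulates the cache. All identifiers below (`fd_state_t`, `attach_tier`, `fd_write`, `tier_lookup`) are hypothetical illustrations, not the actual GlusterFS translator code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of the diagnosis in the comment above; not real gluster code. */

enum subvol { SUBVOL_NONE = 0, SUBVOL_COLD, SUBVOL_HOT };

typedef struct {
    enum subvol cached_subvol;  /* where in-flight I/O for this fd is routed */
} fd_state_t;

/* Attaching a tier inserts a new translator whose per-fd cache is empty. */
static void attach_tier(fd_state_t *fds, size_t n) {
    for (size_t i = 0; i < n; i++)
        fds[i].cached_subvol = SUBVOL_NONE;   /* cache lost for open fds */
}

/* I/O on an fd with no cached subvolume fails, mirroring the stale-handle
 * errors in the tar output above. */
static int fd_write(fd_state_t *fd) {
    if (fd->cached_subvol == SUBVOL_NONE)
        return -ESTALE;
    return 0;
}

/* The proposed remedy: a fresh lookup recomputes the cached subvolume
 * (the data still lives on the original, now-cold, tier). */
static void tier_lookup(fd_state_t *fd) {
    fd->cached_subvol = SUBVOL_COLD;
}
```

In this model, writes on already-open fds succeed before the attach, fail with `-ESTALE` right after it, and succeed again once a lookup has refreshed the cache.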
--- Additional comment from Anand Avati on 2015-04-28 16:28:27 EDT ---

REVIEW: http://review.gluster.org/10435 (cluster/tier: don't use hot tier until subvolumes ready (WIP)) posted (#1) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-04-29 16:22:55 EDT ---

REVIEW: http://review.gluster.org/10435 (cluster/tier: don't use hot tier until subvolumes ready (WIP)) posted (#2) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-04-29 18:05:44 EDT ---

REVIEW: http://review.gluster.org/10435 (cluster/tier: don't use hot tier until subvolumes ready (WIP)) posted (#3) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Anand Avati on 2015-05-04 14:55:52 EDT ---

REVIEW: http://review.gluster.org/10435 (cluster/tier: don't use hot tier until subvolumes ready) posted (#4) for review on master by Dan Lambright (dlambrig)

--- Additional comment from Dan Lambright on 2015-05-04 14:57:34 EDT ---

There may still be windows in which an I/O error can happen, but this fix should close most of them. The remaining window can be completely closed once BZ 1156637 is resolved.

--- Additional comment from Anand Avati on 2015-05-05 11:36:32 EDT ---

COMMIT: http://review.gluster.org/10435 committed in master by Kaleb KEITHLEY (kkeithle)

------

commit 377505a101eede8943f5a345e11a6901c4f8f420
Author: Dan Lambright <dlambrig>
Date:   Tue Apr 28 16:26:33 2015 -0400

    cluster/tier: don't use hot tier until subvolumes ready

    When we attach a tier, the hot tier becomes the hashed subvolume, but
    directories may not yet have been replicated by the fix-layout process.
    Hence lookups to those directories will fail on the hot subvolume. We
    should only go to the hashed subvolume once the layout has been fixed.
    This is known if the layout for the parent directory does not have an
    error. If there is an error, the cold tier is considered the hashed
    subvolume.
    The exception to this rule is ENOCON, in which case we do not know
    where the file is and must abort.

    Note we may revalidate a lookup for a directory even if the inode has
    not yet been populated by FUSE. This case can happen in tiering (where
    one tier has completed a lookup but the other has not, in which case we
    revalidate one tier when we call lookup the second time). Such inodes
    are still invalid and should not be consulted for validation.

    Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
    BUG: 1214289
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/10435
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: N Balachandran <nbalacha>
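The selection rule the commit message describes can be condensed into a small decision function: pick the hot tier as the hashed subvolume only when the parent directory's layout carries no error, fall back to the cold tier on any other error, and abort when the error means we cannot know where the file is. This is an illustrative toy model, not the actual tier translator code; `pick_hashed_subvol` and the `SUBVOL_*` names are invented, and reading the commit's "ENOCON" as the standard errno ENOTCONN is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Toy model of the commit's rule for choosing the hashed subvolume.
 * Hypothetical names; "ENOCON" in the commit text is assumed to mean
 * ENOTCONN. */

enum subvol { SUBVOL_COLD, SUBVOL_HOT, SUBVOL_ABORT };

/* layout_err is the error recorded in the parent directory's layout:
 * 0 once fix-layout has replicated the directory to the hot tier. */
static enum subvol pick_hashed_subvol(int layout_err) {
    if (layout_err == 0)
        return SUBVOL_HOT;    /* layout fixed: hot tier is the hashed subvol */
    if (layout_err == ENOTCONN)
        return SUBVOL_ABORT;  /* cannot tell where the file is: abort */
    return SUBVOL_COLD;       /* any other error: cold tier is hashed subvol */
}
```

Under this sketch, directories not yet handled by fix-layout keep routing to the cold tier instead of producing failed lookups on the hot tier.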
COMMIT: http://review.gluster.org/10649 committed in release-3.7 by Vijay Bellur (vbellur)

------

commit d4e9c501a2b949909c4eb0be4cdedb30648cc895
Author: Dan Lambright <dlambrig>
Date:   Thu May 7 12:27:49 2015 -0400

    cluster/tier: don't use hot tier until subvolumes ready

    This is a backport of fix 10435 to Gluster 3.7.

    When we attach a tier, the hot tier becomes the hashed subvolume, but
    directories may not yet have been replicated by the fix-layout process.
    Hence lookups to those directories will fail on the hot subvolume. We
    should only go to the hashed subvolume once the layout has been fixed.
    This is known if the layout for the parent directory does not have an
    error. If there is an error, the cold tier is considered the hashed
    subvolume. The exception to this rule is ENOCON, in which case we do
    not know where the file is and must abort.

    Note we may revalidate a lookup for a directory even if the inode has
    not yet been populated by FUSE. This case can happen in tiering (where
    one tier has completed a lookup but the other has not, in which case we
    revalidate one tier when we call lookup the second time). Such inodes
    are still invalid and should not be consulted for validation.

    > http://review.gluster.org/#/c/10435/
    > Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
    > BUG: 1214289
    > Signed-off-by: Dan Lambright <dlambrig>
    > Reviewed-on: http://review.gluster.org/10435
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Reviewed-by: N Balachandran <nbalacha>
    > Signed-off-by: Dan Lambright <dlambrig>

    Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
    BUG: 1219547
    Signed-off-by: Dan Lambright <dlambrig>
    Reviewed-on: http://review.gluster.org/10649
    Tested-by: NetBSD Build System
    Reviewed-by: Joseph Fernandes
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
Reproduced this on the BETA2 build too, hence moving it to ASSIGNED.
I am unable to reproduce this. Can you help?
1. What tool do you use for 100% writes?
2. What errors do you see?

The way I tried to reproduce this was:
1. Create a replica volume
2. Start compiling SSL on the volume
3. Attach a tier while the I/O is in progress
4. Attach tier is successful
I was able to recreate it doing this:

for i in {1..10000}; do echo "Build $i"; dd if=/dev/urandom of=f$i bs=100M count=1; done
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user