Created attachment 558611
Replace brick logs

Description of problem:

Striped replicate volume info:

Volume Name: vol
Type: Striped-Replicate
Volume ID: ba41b542-bdbd-4691-989b-6103301135fa
Status: Started
Number of Bricks: 1 x 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: dagobah:/data/export1
Brick2: dagobah:/data/export2
Brick3: dagobah:/data/export3
Brick4: dagobah:/data/export4
Options Reconfigured:
performance.stat-prefetch: off
cluster.stripe-block-size: 32KB
performance.write-behind: off

1 client was running multiple levels of directory creation & another client was running rdd [./rdd -f 2GB -i rdd.in -o /data/mounts/fuse/rddfile -r 100 -t 4]

Issued replace-brick operation,

root@Dagobah:/data# gluster volume replace-brick vol dagobah:/data/export1/ dagobah:/data/export5 start
replace-brick started successfully
root@Dagobah:/data# gluster volume replace-brick vol dagobah:/data/export1/ dagobah:/data/export5 status
Number of files migrated = 106 Current file= /playground/crawl/dir.1/dir.1/dir.2
root@Dagobah:/data# gluster volume replace-brick vol dagobah:/data/export3/ dagobah:/data/export7 start
replace-brick failed to start
root@Dagobah:/data# gluster volume replace-brick vol dagobah:/data/export3/ dagobah:/data/export7 status
incorrect source or destination brick
root@Dagobah:/data# gluster volume replace-brick vol dagobah:/data/export1/ dagobah:/data/export5 status
Source brick dagobah:/data/export1 is not online.

Source brick had crashed with following backtrace:

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id vol.dagobah.data-export1 -'.
Program terminated with signal 11, Segmentation fault.
#0 __opendir (name=0x0) at ../sysdeps/unix/opendir.c:86
86 ../sysdeps/unix/opendir.c: No such file or directory.
in ../sysdeps/unix/opendir.c
(gdb) bt
#0 __opendir (name=0x0) at ../sysdeps/unix/opendir.c:86
#1 0x00007fa1fbd1a8a7 in posix_opendir (frame=0x7fa1ff990ba0, this=0x23d8040, loc=0x7fa1fe7e9904, fd=0x7fa1f87045b0) at ../../../../../xlators/storage/posix/src/posix.c:568
#2 0x00007fa1fbb07112 in posix_acl_opendir (frame=0x7fa1ff9b0638, this=0x23d95a0, loc=0x7fa1fe7e9904, fd=0x7fa1f87045b0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:1067
#3 0x00007fa1fb8eddd0 in pl_opendir (frame=0x7fa1ff992930, this=0x23da7d0, loc=0x7fa1fe7e9904, fd=0x7fa1f87045b0) at ../../../../../xlators/features/locks/src/posix.c:388
#4 0x00007fa1fb6d78d8 in iot_opendir_wrapper (frame=0x7fa1ff996904, this=0x23db9e0, loc=0x7fa1fe7e9904, fd=0x7fa1f87045b0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1469
#5 0x00007fa201340066 in call_resume_wind (stub=0x7fa1fe7e98cc) at ../../../libglusterfs/src/call-stub.c:2359
#6 0x00007fa201347409 in call_resume (stub=0x7fa1fe7e98cc) at ../../../libglusterfs/src/call-stub.c:3932
#7 0x00007fa1fb6ce8ef in iot_worker (data=0x23ed940) at ../../../../../xlators/performance/io-threads/src/io-threads.c:138
#8 0x00007fa200cb6efc in start_thread (arg=0x7fa1f9d37700) at pthread_create.c:304
#9 0x00007fa2009f189d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()
(gdb) f 1
#1 0x00007fa1fbd1a8a7 in posix_opendir (frame=0x7fa1ff990ba0, this=0x23d8040, loc=0x7fa1fe7e9904, fd=0x7fa1f87045b0) at ../../../../../xlators/storage/posix/src/posix.c:568
568 dir = opendir (real_path);
(gdb) p real_path
$1 = 0x0

I have attached the logs for further investigation along with the replace-brick logs.
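For illustration only: frame 1 shows real_path being NULL when it is handed to opendir(), and frame 0 shows the segfault inside __opendir. A minimal sketch of a defensive check is below. The helper name opendir_checked is hypothetical; it is not an existing GlusterFS function or patch, and it only turns the crash into an error return. It does not explain why real_path was NULL in the first place.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper (not GlusterFS code): reject a NULL path up front
 * instead of passing it on to opendir(). */
DIR *
opendir_checked (const char *real_path)
{
        if (real_path == NULL) {
                /* Report the bad argument to the caller via errno. */
                errno = EINVAL;
                return NULL;
        }

        /* Non-NULL path: behave exactly like opendir(). */
        return opendir (real_path);
}
```

A caller would treat a NULL return the same way as any other opendir() failure and unwind with the errno value, so the brick process stays up.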
*** Bug 788476 has been marked as a duplicate of this bug. ***
This also happens with the latest git head. Replace-brick does not work and crashes the source brick.
Please test with 3.3.0qa24.
CHANGE: http://review.gluster.com/2950 (afr: Copy loc->gfid independent of lookup being fresh or otherwise) merged in master by Anand Avati (avati)
Works: the source brick no longer crashes. The console just prints "replace-brick failed to start".