Description of problem:

After hardlink syncing to the slave, arequal on the slave failed with a short read for some files.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# ./arequal-checksum /mnt/slave
md5sum: /mnt/slave/level00/level10/level20/level30/level40/level50/level60/level70/level80/level90/hardlink_to_files/52df9997%%J1LWI41EU7: No data available
/mnt/slave/level00/level10/level20/level30/level40/level50/level60/level70/level80/level90/hardlink_to_files/52df9997%%J1LWI41EU7: short read
ftw (/mnt/slave) returned -1 (Success), terminating
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Geo-rep client logs for the file 52df9997%%J1LWI41EU7 on the slave:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# grep 52df9997%%J1LWI41EU7 /var/log/glusterfs/geo-replication-slaves/205daa96-ff92-45be-96bc-3e6bd5c0f631\:gluster%3A%2F%2F127.0.0.1%3Aslave.gluster.log
[2014-01-22 10:31:06.323927] W [client-rpc-fops.c:256:client3_3_mknod_cbk] 0-slave-client-0: remote operation failed: File exists. Path: <gfid:1c6e53ad-56ac-40b6-bf01-4994c0648493>/52df9997%%J1LWI41EU7
[2014-01-22 10:31:06.324255] W [client-rpc-fops.c:256:client3_3_mknod_cbk] 0-slave-client-1: remote operation failed: File exists. Path: <gfid:1c6e53ad-56ac-40b6-bf01-4994c0648493>/52df9997%%J1LWI41EU7
[2014-01-22 10:31:06.324300] I [fuse-bridge.c:3516:fuse_auxgfid_newentry_cbk] 0-fuse-aux-gfid-mount: failed to create the entry <gfid:1c6e53ad-56ac-40b6-bf01-4994c0648493>/52df9997%%J1LWI41EU7 with gfid (a69bd8bf-8d1f-4b9f-95d9-d6296ed7befb): File exists
[2014-01-22 10:31:06.324329] W [fuse-bridge.c:1628:fuse_err_cbk] 0-glusterfs-fuse: 1046: MKNOD() <gfid:1c6e53ad-56ac-40b6-bf01-4994c0648493>/52df9997%%J1LWI41EU7 => -1 (File exists)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Version-Release number of selected component (if applicable): glusterfs-server-3.4.0.57rhs-1

How reproducible:
Didn't try to reproduce.

Steps to Reproduce:
1. Create and start a geo-rep relationship between master (dist-rep) and slave (dist-rep).
2. Create some data on the master using the command "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 /mnt/master/" and let it sync.
3. Create symlinks to those files with "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=symlink /mnt/master/" and let them sync.
4. Stop the geo-rep session.
5. Create hardlinks to the regular files with "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink /mnt/master/".
6. Start the geo-rep session.
7. Check the geo-rep log files.

Actual results:
arequal on the slave mount failed with a short read for some files.

Expected results:
Reads on the slave shouldn't fail.

Additional info:
Created attachment 853859 [details]
Client log file on which the arequal failed.
Created attachment 853917 [details]
Sosreports of all the machines involved in the volume.
This is happening consistently. It happened again in build glusterfs-server-3.4.0.58rhs-1.
This is also happening in both hybrid crawl and changelog crawl.

This was the error log of arequal:

ftw (-p) returned -1 (No data available), terminating

Client logs:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-01-31 07:17:36.210805] I [fuse-bridge.c:4811:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2014-01-31 07:17:36.210905] I [client-handshake.c:450:client_set_lk_version_cbk] 0-slave-client-11: Server lk version = 1
[2014-01-31 07:18:30.338235] E [dht-helper.c:777:dht_migration_complete_check_task] 0-slave-dht: /level05/level15/level25/level35/level45/level55/level65/level75/level85/52ea540f%%QMDHDDXP81: failed to get the 'linkto' xattr No data available
[2014-01-31 07:18:30.338365] W [fuse-bridge.c:1134:fuse_attr_cbk] 0-glusterfs-fuse: 21456: STAT() /level05/level15/level25/level35/level45/level55/level65/level75/level85/52ea540f%%QMDHDDXP81 => -1 (No data available)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Steps:
1. Create and start a geo-rep relationship between master (dist-rep) and slave (dist-rep).
2. Create some data on the master using the command "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 /mnt/master/" and let it sync.
3. Create symlinks to those files with "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=symlink /mnt/master/" and let them sync.
4. Create hardlinks to the regular files with "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink /mnt/master/".
5. Run arequal-checksum on the slave mount point.
Vijaykumar, Was rebalance invoked on the volume by any chance? Also, can you please upload/paste the brick logs where the file in question resides?
There was no rebalance happening. It was plain start of the geo-rep session and syncing regular files and hardlinks.
Created attachment 857753 [details]
Slave brick log file.

Attaching the slave brick log file from the brick where the file in question was residing.
Thanks Vijaykumar,

I suspect we synced the sticky bit file (not sure). For hardlinks in a DHT volume, we create a sticky bit file (with the linkto attribute pointing to the correct subvolume when the hashed and cached subvolumes differ for that file). During hybrid crawl, there would be a race between syncing the actual hardlink and the sticky bit file. If the sticky bit file gets synced (which is what I see in the logs you've given [failed to get the 'linkto' xattr No data available]), it may result in this issue.

Can you confirm whether we have sticky bit files on the bricks without the linkto xattr?
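The check asked for above can be scripted. The following is a hypothetical diagnostic sketch (not a tool from this report): it walks a brick directory looking for zero-byte regular files whose only mode bit is the sticky bit (the `---------T` shape of a DHT linkto file) and reports those missing the `trusted.glusterfs.dht.linkto` xattr. The function name is mine; run it as root on the brick, since trusted.* xattrs are visible only to root.

```python
import os
import stat

LINKTO_XATTR = "trusted.glusterfs.dht.linkto"

def find_broken_linkto_files(brick_root):
    """Return linkto-shaped files under brick_root that lack the linkto xattr.

    A linkto-shaped file is a zero-byte regular file whose mode is exactly
    the sticky bit (displayed by ls as ---------T).
    """
    broken = []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            linkto_shape = (
                stat.S_ISREG(st.st_mode)
                and st.st_size == 0
                and st.st_mode & stat.S_ISVTX
                and not st.st_mode & 0o777
            )
            # os.listxattr is Linux-only; as non-root, trusted.* xattrs
            # are simply not listed, so this must run as root on the brick.
            if linkto_shape and LINKTO_XATTR not in os.listxattr(path):
                broken.append(path)
    return broken
```

Any path it prints is a sticky bit file that DHT cannot resolve, which would explain the "failed to get the 'linkto' xattr" errors on the slave.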
Looks like it's the issue mentioned in Comment #9.

Master:
# ls -l /bricks/master_brick*/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
---------T 2 41940 5065    0 Jan 31 12:46 /bricks/master_brick1/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
-rwx--xrwx 3 41940 5065 1767 Jan 31 12:29 /bricks/master_brick9/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P

# ls -l /bricks/master_brick*/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
-rwx--xrwx 3 41940 5065 1767 Jan 31 12:29 /bricks/master_brick10/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
---------T 2 41940 5065    0 Jan 31 12:46 /bricks/master_brick2/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P

Slave:
[root@redlemon ~]# ls -l /bricks/slave_brick*/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
---------T 2 41940 41940 0 Jan 31 12:46 /bricks/slave_brick11/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P

# ls -l /bricks/slave_brick*/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P
---------T 2 41940 41940 0 Jan 31 12:46 /bricks/slave_brick12/level04/level14/level24/level34/level44/level54/level64/level74/level84/level94/hardlink_to_files/52ea580b%%65F0TGQK1P

Vijaykumar, can you confirm that this does _not_ happen in changelog mode? I see the file names coming up in the xsync changelog, which implies there were some files synced in xsync mode (which has this issue).
If it happens only in xsync mode, then the fix is here: http://review.gluster.org/#/c/6792/
If I try it in changelog mode, I hit Bug 1003020, which crashes gsyncd; the restarted gsyncd then syncs the rest of the hardlinks through the hybrid crawl (xsync), which results in the above issue. I am able to hit this consistently with a 6x2 volume, but the behavior is inconsistent with a 2x2 volume.
Vijaykumar, Maybe you can try this in a pure replicated volume on the slave.
This has happened in the build glusterfs-3.6.0.16-1.el6rhs, with 6x2 volume.
This has happened in a cascaded setup on the slave level 2 volume while syncing hardlinks. There are more files on the slave level 2 volume than on the master and the slave level 1 volume. This has happened on the slave level 2 volume only.

===============================================================================
file count on master is 17456
file count on slave is 17489
===============================================================================

There was an error while calculating md5sum:

===============================================================================
Calculating slave checksum ...
Failed to get the checksum of slave with following error
md5sum: /tmp/tmpZUlbzy/thread3/level01/level11/53c7ad33%%TI64COMAMS: No data available
/tmp/tmpZUlbzy/thread3/level01/level11/53c7ad33%%TI64COMAMS: short read
ftw (-p) returned -1 (Success), terminating
===============================================================================

There are a few files with 2 entries in the directory, and we can also see sticky bit files on the mount point.
===============================================================================
# ls /mnt/slave/thread0/level02/level12/level22/level32/hardlink_to_files/ -l
total 8
---------T 1 root  root      0 Jul 17 18:08 53c7c386%%0OUTYNSNBL
-r-------- 2 60664 2735   1266 Jul 17 16:32 53c7c386%%5UI8FJ3P3V
---------T 1 root  root      0 Jul 17 18:08 53c7c386%%7323VONN1K
-rw--wxrwx 2 50486 51232  1461 Jul 17 16:41 53c7c386%%OZV5T9I51D
---------T 1 root  root      0 Jul 17 18:08 53c7c387%%1M171U4F6V
---------T 1 root  root      0 Jul 17 18:08 53c7c387%%2O0FVVBHUZ
--wx-wx--x 2 42173 37786  1222 Jul 17 16:32 53c7c387%%67QTB5HYS3
---xr-xrwx 2 7886  62050  1514 Jul 17 16:41 53c7c387%%7B9NWNYBGV
---xr-xrwx 2 7886  62050  1514 Jul 17 16:41 53c7c387%%7B9NWNYBGV
---------T 1 root  root      0 Jul 17 18:08 53c7c387%%9F3CMK6ZLX
---------T 1 root  root      0 Jul 17 18:08 53c7c387%%SM0CONAEGX

# ls /mnt/slave/thread0/level02/level12/level22/level32/hardlink_to_files/53c7c387%%7B9NWNYBGV -l
---------T 1 root root 0 Jul 17 18:08 /mnt/slave/thread0/level02/level12/level22/level32/hardlink_to_files/53c7c387%%7B9NWNYBGV
===============================================================================

In the paste above, the file "53c7c387%%7B9NWNYBGV" has 2 directory entries, and there are also some files with the sticky bit set.
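The "short read" that arequal reports can also be reproduced without arequal by comparing, for every file under the mount, the number of bytes actually readable against the size lstat reports. This is a hypothetical diagnostic sketch (the function name is mine, not from arequal):

```python
import os
import stat

def find_short_reads(mount_root):
    """Report files that read fewer bytes than st_size, or fail to read.

    These are the two symptoms md5sum hit above: "short read" and
    "No data available" (ENODATA).
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(mount_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, etc.
            try:
                with open(path, "rb") as f:
                    nread = len(f.read())
            except OSError as err:
                suspects.append((path, st.st_size, repr(err)))
                continue
            if nread < st.st_size:
                suspects.append((path, st.st_size, nread))
    return suspects
```

On a healthy mount this returns an empty list; on the affected slave it should flag the same files that made arequal terminate.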
On the intermediate master (the slave level 1 volume), the active node that has the sticky bit file for 53c7c386%%0OUTYNSNBL has entries like this in its changelogs:

=============================================================================
# grep -r "d90aff2a-d55f-454f-9794-df4eefd1b82d" *
1f8a8e6b046b00c682675ebf692f5968/.processed/CHANGELOG.1405600673:E d90aff2a-d55f-454f-9794-df4eefd1b82d MKNOD 33280 0 0 28571791-a541-4ab2-8e38-ca5924308b57%2F53c7c386%25%250OUTYNSNBL
1f8a8e6b046b00c682675ebf692f5968/.processed/CHANGELOG.1405600673:M d90aff2a-d55f-454f-9794-df4eefd1b82d NULL
=============================================================================

This changelog entry shouldn't be present on the node that has the sticky bit file for 53c7c386%%0OUTYNSNBL.
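Decoding the MKNOD mode in that E record shows why the entry is suspicious: 33280 is 0o101000, a regular file whose only permission bit is the sticky bit (---------T), i.e. exactly the shape of a DHT linkto file. A small sketch of the decoding, assuming the record layout visible above ("E <gfid> MKNOD <mode> <uid> <gid> <url-encoded pgfid/basename>"; the field names are my own):

```python
import stat
from urllib.parse import unquote

def decode_mknod_record(record):
    """Split a changelog E/MKNOD record into named fields and decode the mode."""
    _tag, gfid, op, mode, uid, gid, entry = record.split()
    mode = int(mode)
    pgfid, basename = unquote(entry).split("/", 1)
    return {
        "gfid": gfid,
        "op": op,
        "uid": int(uid),
        "gid": int(gid),
        "parent_gfid": pgfid,
        "basename": basename,
        "is_regular": stat.S_ISREG(mode),
        # ---------T shape: sticky bit set, no rwx bits at all
        "sticky_only": bool(mode & stat.S_ISVTX) and not mode & 0o777,
    }

rec = ("E d90aff2a-d55f-454f-9794-df4eefd1b82d MKNOD 33280 0 0 "
       "28571791-a541-4ab2-8e38-ca5924308b57%2F53c7c386%25%250OUTYNSNBL")
info = decode_mknod_record(rec)
# 33280 == 0o101000: regular file with only the sticky bit -> a DHT linkto file
```

So the changelog on that node recorded the creation of the linkto file itself, which is what then gets replayed to the slave level 2 volume.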
Seems like the issue mentioned in comment 14 is due to some other issue, though the effects are the same. Hence it is being tracked with Bug 1121059.
This got merged before 3.0 was branched out, so it was already present in the 3.0 branch.
Verified on build glusterfs-3.6.0.27. Tried a couple of times; did not observe any of the issues mentioned in the description.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html