Bug 1001498 - Dist-geo-rep : geo-rep truncates some files to zero byte on slave which were synced to slave, after creating hardlinks to those files on master
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Venky Shankar
QA Contact: Vijaykumar Koppad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-08-27 07:31 UTC by Vijaykumar Koppad
Modified: 2014-08-25 00:50 UTC
CC: 6 users

Fixed In Version: glusterfs-3.4.0.31rhs-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:29:54 UTC
Embargoed:


Attachments

Description Vijaykumar Koppad 2013-08-27 07:31:45 UTC
Description of problem: If hardlinks are created on the master to files that have already been synced to the slave, a few of those files on the slave get truncated to zero-byte files.


Version-Release number of selected component (if applicable): glusterfs-3.4.0.23rhs-1.el6rhs.x86_64


How reproducible: Happens every time.


Steps to Reproduce:
1. Create and start a geo-rep relationship between a master (dist-rep) and a slave (dist-rep) volume.
2. Create a few files on the master using "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 /mnt/master/".
3. Let them sync to the slave.
4. Then create hardlinks to all those files using "./crefi.py -n 10 --multi -b 10 -d 10 --random --max=500K --min=10 --fop=hardlink /mnt/master/".

Actual results: geo-rep truncates a few of the already-synced files on the slave to zero bytes.


Expected results: It shouldn't truncate any files on the slave.
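The truncation can be detected by comparing per-file checksums across the two sides, which is how the md5sum listings below were produced. A minimal sketch (the mount points /mnt/master and /mnt/slave in the usage comment are assumptions, not paths from the bug):

```shell
#!/bin/sh
# compare_trees MASTER_DIR SLAVE_DIR: print md5sum entries that differ
# between the two trees; matching files produce no output.
compare_trees() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k2 ) > /tmp/a.md5
    ( cd "$2" && find . -type f -exec md5sum {} + | sort -k2 ) > /tmp/b.md5
    diff /tmp/a.md5 /tmp/b.md5 || true
}
# Usage (assumed mount points):
#   compare_trees /mnt/master /mnt/slave
```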


Additional info:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
md5sums of a few files on the master

ab20f38f6fafc79a7b8dccba80eeb6fa  ./level00/521c4377~~PQ0P2ZU4HC
efce893ac80fee1c282ae2bee68e570c  ./level00/521c4377~~WTTWI8AD4D
863961bd84430791fcd913c73070affd  ./level00/521c4378~~MNDNL43BEX
3b0dd3eb503267c5c2a8fb8f605f232d  ./level00/521c4378~~TB9ZTYX576
0d47edb19d060d9b5aa7e3076c1a4a88  ./level00/521c4378~~U9PUHXWK0U
d1dbf799db856fbcc368001ea9d30cf6  ./level00/521c4378~~UMOIKBI3I9
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
md5sums of the same files on the slave
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ab20f38f6fafc79a7b8dccba80eeb6fa  ./level00/521c4377~~PQ0P2ZU4HC
d41d8cd98f00b204e9800998ecf8427e  ./level00/521c4377~~WTTWI8AD4D
d41d8cd98f00b204e9800998ecf8427e  ./level00/521c4378~~MNDNL43BEX
d41d8cd98f00b204e9800998ecf8427e  ./level00/521c4378~~TB9ZTYX576
d41d8cd98f00b204e9800998ecf8427e  ./level00/521c4378~~U9PUHXWK0U
d41d8cd98f00b204e9800998ecf8427e  ./level00/521c4378~~UMOIKBI3I9

Except for 521c4377~~PQ0P2ZU4HC, all of these files have a different md5sum on the slave, and all of them are zero-byte files.
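For reference, the repeated checksum d41d8cd98f00b204e9800998ecf8427e is the md5sum of empty input, which is consistent with the files having been truncated to zero bytes:

```shell
# md5sum of empty input matches the checksum reported for the
# truncated files on the slave.
printf '' | md5sum
# d41d8cd98f00b204e9800998ecf8427e  -
```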


The slave brick logs contain these errors:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2013-08-27 06:28:32.181666] E [posix.c:838:posix_mknod] 0-slave-posix: setting gfid on /bricks/brick1/level00/level10/hardlink_to_files/521c4707~~2C6MZNTCQR failed
[2013-08-27 06:28:32.239086] W [posix-handle.c:582:posix_handle_hard] 0-slave-posix: mismatching ino/dev between file /bricks/brick1/level00/level10/level20/hardlink_to_files/521c4707~~SPKWM6AKEQ (14000/64770) and handle /bricks/brick1/.glusterfs/72/6c/726c813e-1066-49ff-8d4d-3f1cfff682b3 (41962117/64770)
[2013-08-27 06:28:32.239137] E [posix.c:838:posix_mknod] 0-slave-posix: setting gfid on /bricks/brick1/level00/level10/level20/hardlink_to_files/521c4707~~SPKWM6AKEQ failed
[2013-08-27 06:28:32.279886] W [posix-handle.c:582:posix_handle_hard] 0-slave-posix: mismatching ino/dev between file /bricks/brick1/level00/level10/level20/level30/hardlink_to_files/521c4707~~ZLBB5V3C8U (33555470/64770) and handle /bricks/brick1/.glusterfs/11/a9/11a908f4-f834-424e-9d56-7ec4fc63005e (16783428/64770)
[2013-08-27 06:28:32.279937] E [posix.c:838:posix_mknod] 0-slave-posix: setting gfid on /bricks/brick1/level00/level10/level20/level30/hardlink_to_files/521c4707~~ZLBB5V3C8U failed
[2013-08-27 06:28:32.282507] W [posix-handle.c:582:posix_handle_hard] 0-slave-posix: mismatching ino/dev between file /bricks/brick1/level00/level10/level20/level30/hardlink_to_files/521c4707~~X7ZG9IZ5Y2 (33555471/64770) and handle /bricks/brick1/.glusterfs/9a/db/9adb676c-9ab5-4b00-8e6d-9020ce10ab34 (16783429/64770)
[2013-08-27 06:28:32.282551] E [posix.c:838:posix_mknod] 0-slave-posix: setting gfid on /bricks/brick1/level00/level10/level20/level30/hardlink_to_files/521c4707~~X7ZG9IZ5Y2 failed
[2013-08-27 06:28:32.285121] W [posix-handle.c:582:posix_handle_hard] 0-slave-posix: mismatching ino/dev between file /bricks/brick1/level00/level10/level20/level30/hardlink_to_files/521c4707~~QCD1KG0PMO (33555472/64770) and handle /bricks/brick1/.glusterfs/7e/80/7e804332-0fd6-4752-8244-bcca412458c5 (16783430/64770)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
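The "mismatching ino/dev" warnings say that the file under the brick and its handle under .glusterfs no longer share an inode, even though the handle is normally a hardlink to the file. That condition can be checked by hand with stat; a hedged sketch (the check below approximates what posix_handle_hard verifies, and is not quoted from the GlusterFS source):

```shell
#!/bin/sh
# same_inode FILE HANDLE: succeed if FILE and HANDLE are hardlinks to
# the same inode on the same device, as the posix translator expects
# for a brick file and its .glusterfs gfid handle.
same_inode() {
    test "$(stat -c '%i %d' "$1")" = "$(stat -c '%i %d' "$2")"
}
# Usage (paths taken from the first warning in the log above):
#   same_inode /bricks/brick1/level00/level10/level20/hardlink_to_files/521c4707~~SPKWM6AKEQ \
#       /bricks/brick1/.glusterfs/72/6c/726c813e-1066-49ff-8d4d-3f1cfff682b3
```

On a healthy brick the function succeeds; for the files in the log it would fail, matching the (14000/64770) vs (41962117/64770) pairs reported.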

Comment 4 Vijaykumar Koppad 2013-08-30 17:26:25 UTC
Tried on glusterfs-3.4.0.30rhs-2.el6rhs.x86_64. Still seeing some zero-byte files on the slave.

Comment 5 Amar Tumballi 2013-09-03 14:20:27 UTC
https://code.engineering.redhat.com/gerrit/#/c/12415

Comment 6 Vijaykumar Koppad 2013-09-06 09:33:53 UTC
Verified on glusterfs-3.4.0.31rhs-1.

Comment 7 Scott Haines 2013-09-23 22:29:54 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

