Bug 1144117
| Field | Value |
| --- | --- |
| Summary | dist-geo-rep: zero byte files created in slave even before creating any data in master by processing stale changelogs in working dir |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Reporter | M S Vishwanath Bhat <vbhat> |
| Component | geo-replication |
| Assignee | Aravinda VK <avishwan> |
| Status | CLOSED ERRATA |
| QA Contact | shilpa <smanjara> |
| Severity | low |
| Docs Contact | |
| Priority | medium |
| Version | rhgs-3.0 |
| CC | aavati, annair, avishwan, bmohanra, csaba, khiremat, mzywusko, nlevinki, nsathyan, sanandpa, smanjara, ssamanta |
| Target Milestone | --- |
| Keywords | ZStream |
| Target Release | RHGS 3.0.4 |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | glusterfs-3.6.0.46-1 |
| Doc Type | Bug Fix |
| Doc Text | Previously, geo-replication consumed unprocessed changelogs left in the working directory from a previous run; replaying them on the slave created empty files and directories. The fix cleans up the working directory before geo-replication starts. |
| Story Points | --- |
| Clone Of | |
| | 1162057 (view as bug list) |
| Environment | |
| Last Closed | 2015-03-26 06:34:40 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 1162057, 1182947 |
Description
M S Vishwanath Bhat
2014-09-18 17:25:22 UTC
The geo-rep status shows that there are files to be synced:

```
MASTER NODE                MASTER VOL    MASTER BRICK      SLAVE                 STATUS     CHECKPOINT STATUS    CRAWL STATUS       FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spitfire.blr.redhat.com    master        /bricks/brick0    falcon::slave         Active     N/A                  Changelog Crawl    0              265              0                0                  0
typhoon.blr.redhat.com     master        /bricks/brick3    lightning::slave      Passive    N/A                  N/A                0              0                0                0                  0
mustang.blr.redhat.com     master        /bricks/brick1    interceptor::slave    Passive    N/A                  N/A                0              0                0                0                  0
harrier.blr.redhat.com     master        /bricks/brick2    hornet::slave         Active     N/A                  Changelog Crawl    0              320              0                0
```

But there are no files. There are a lot of error messages like this:

```
[2014-09-18 22:45:28.304728] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390414] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.305028] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390429] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.305279] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390457] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.305529] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390470] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.305809] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390483] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.306107] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390494] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[2014-09-18 22:45:28.306365] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390642] T [cli.c:264:cli_rpc_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2014-09-18 22:45:28.306638] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390748] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2014-09-18 22:45:28.306934] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390771] I [socket.c:2246:socket_event_handler] 0-transport: disconnecting now
[2014-09-18 22:45:28.307188] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.390801] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2014-09-18 22:45:28.307437] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.391571] T [rpc-clnt.c:1381:rpc_clnt_record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2014-09-18 22:45:28.307732] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.391708] T [rpc-clnt.c:1238:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 68, payload: 4, rpc hdr: 64
[2014-09-18 22:45:28.308032] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.392263] T [socket.c:2715:socket_connect] (-->/usr/sbin/gluster(cli_cmd_submit+0x8e) [0x40b3ce] (-->/usr/sbin/gluster(cli_submit_request+0xdb) [0x40a5cb] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x468) [0x7f111c953f38]))) 0-glusterfs: connect () called on transport already connected
[2014-09-18 22:45:28.308284] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.392351] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 26) to rpc-transport (glusterfs)
[2014-09-18 22:45:28.308535] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.392382] D [rpc-clnt-ping.c:231:rpc_clnt_start_ping] 0-glusterfs: ping timeout is 0, returning
[2014-09-18 22:45:28.308872] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.393779] T [rpc-clnt.c:660:rpc_clnt_reply_init] 0-glusterfs: received rpc message (RPC XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 26) from rpc-transport (glusterfs)
[2014-09-18 22:45:28.309135] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.393827] I [cli-rpc-ops.c:5324:gf_cli_getwd_cbk] 0-cli: Received resp to getwd
[2014-09-18 22:45:28.309403] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.394217] D [cli-cmd.c:384:cli_cmd_submit] 0-cli: Returning 0
[2014-09-18 22:45:28.309679] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.394256] D [cli-rpc-ops.c:5353:gf_cli_getwd] 0-cli: Returning 0
[2014-09-18 22:45:28.309979] E [resource(/bricks/brick0):225:logerr] Popen: ssh> [2014-09-18 15:38:28.394270] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2014-09-18 22:45:28.310240] E [resource(/bricks/brick0):225:logerr] Popen: ssh> Killed by signal 15.
[2014-09-18 22:45:28.310914] I [syncdutils(/bricks/brick0):214:finalize] <top>: exiting.
[2014-09-18 22:45:28.322437] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2014-09-18 22:45:28.323137] I [syncdutils(agent):214:finalize] <top>: exiting.
[2014-09-18 22:45:28.354732] I [monitor(monitor):109:set_state] Monitor: new state: faulty
```

I will attach the geo-rep logs from the two active nodes and copy the sosreports to somewhere accessible.

And it seems like geo-rep delete does not delete the temp working directory. In the logs on one of the nodes:

```
[2014-09-18 21:08:34.186109] I [master(/bricks/brick1):1281:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/master/ssh%3A%2F%2Froot%4010.70.42.217%3Agluster%3A%2F%2F127.0.0.1%3Aslave/bd42ad17ef8864d51407b1c6478f5dc6/xsync
```

After geo-rep delete:

```
[root@mustang ~]# ls /var/lib/misc/glusterfsd/master/ssh%3A%2F%2Froot%4010.70.42.156%3Agluster%3A%2F%2F127.0.0.1%3Aslave/
bd42ad17ef8864d51407b1c6478f5dc6
```

The dir is still present.

This is a setup issue, not a problem of the working directory not being deleted: old bricks were reused without cleanup, and the .glusterfs/changelogs directory is not deleted when the same brick is used for a new volume.

Fixes required:

- Glusterd: while creating a volume, it should clean up the .glusterfs directory inside the brick.
- Georep: delete the working directory when the geo-rep session is deleted.
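For illustration, the geo-rep side of the fix amounts to clearing leftover changelog state from the session working directory before the worker starts consuming changelogs, so stale changelogs from a previous run cannot be replayed on the slave. The sketch below is not the actual gsyncd code from the patches referenced later in this bug; the subdirectory names and the helper are assumptions.

```python
# Minimal sketch, assuming a working-directory layout with .processing/.processed
# subdirectories; not the actual gsyncd implementation.
import os
import shutil

def cleanup_working_dir(working_dir):
    """Drop leftover changelog state from a previous geo-rep run."""
    for subdir in (".processing", ".processed"):  # assumed layout
        path = os.path.join(working_dir, subdir)
        if os.path.isdir(path):
            shutil.rmtree(path)  # remove stale, unprocessed changelogs
            os.makedirs(path)    # recreate empty for the new session

# Hypothetical session working directory (URL-encoded slave URL plus brick hash),
# matching the shape of the paths shown in the logs above:
# /var/lib/misc/glusterfsd/master/ssh%3A%2F%2F...%3Aslave/bd42ad17ef8864d51407b1c6478f5dc6/
```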
Observed an issue related to this bug during the following test case:

1. Create a dist-rep 2*2 volume.
2. Start geo-rep.
3. Start untarring a big file (the Linux kernel source, in my case).
4. While the untar is in progress, reboot an active node in a replica pair on the master cluster.
5. Once it comes back up, check the geo-rep status.

The passive node which switched to active at the time of the active node reboot did not switch back to passive. Hence both nodes in the replica pair remained active.

Before rebooting pinkfloyd:

```
# gluster volume geo-replication master acdc::slave status

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE               STATUS     CHECKPOINT STATUS    CRAWL STATUS
------------------------------------------------------------------------------------------------------------------
ccr            master        /bricks/brick0    nirvana::slave      Active     N/A                  Changelog Crawl
beatles        master        /bricks/brick3    rammstein::slave    Passive    N/A                  N/A
metallica      master        /bricks/brick1    acdc::slave         Passive    N/A                  N/A
pinkfloyd      master        /bricks/brick2    led::slave          Active     N/A                  Changelog Crawl
```

After rebooting pinkfloyd:

```
# gluster v geo master acdc::slave status detail

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE               STATUS     CHECKPOINT STATUS    CRAWL STATUS       FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
metallica      master        /bricks/brick1    acdc::slave         Passive    N/A                  N/A                0              0                0                0                  0
pinkfloyd      master        /bricks/brick2    led::slave          Active     N/A                  Changelog Crawl    24691          0                0                0                  0
beatles        master        /bricks/brick3    rammstein::slave    Active     N/A                  Changelog Crawl    0              245              0                0                  0
ccr            master        /bricks/brick0    nirvana::slave      Active     N/A                  Changelog Crawl    24461          0                0                0                  0
```

The node that was rebooted was pinkfloyd, and beatles was passive before the reboot.
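As an aside, a check like this can be automated by parsing the status output and flagging replica pairs where both bricks report Active. The sketch below is illustrative only, not part of GlusterFS or this bug's test suite; the replica pairing is an assumption (in practice it would come from `gluster volume info`).

```python
# Hedged QA-style sketch: detect replica pairs where both nodes are Active.
import subprocess

REPLICA_PAIRS = [("ccr", "metallica"), ("pinkfloyd", "beatles")]  # assumed pairing

def node_states(master, slave):
    """Map master node name -> Active/Passive from geo-rep status output."""
    out = subprocess.check_output(
        ["gluster", "volume", "geo-replication", master, slave, "status"],
        text=True)
    states = {}
    for line in out.splitlines():
        cols = line.split()
        # status rows look like: NODE VOL BRICK SLAVE STATUS ...
        if len(cols) >= 5 and cols[4] in ("Active", "Passive"):
            states[cols[0]] = cols[4]
    return states

def check_one_active_per_pair(states):
    for a, b in REPLICA_PAIRS:
        if states.get(a) == states.get(b) == "Active":
            print(f"both {a} and {b} are Active; expected one to be Passive")

check_one_active_per_pair(node_states("master", "acdc::slave"))
```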
Saw these error messages in the logs on the slave node connected to beatles:

```
[pid 8859] lsetxattr(".gfid/9f78f98a-2042-4a75-a180-c5af76b35a98", "glusterfs.gfid.newfile", "\x00\x00\x00\x00\x00\x00\x00\x005f1deefd-bf1a-4554-b699-1e41de9d2d3e\x00\x00\x00\x81\x80mac-celtic.c\x00\x00\x00\x01\x80\x00\x00\x00\x00\x00\x00\x00", 74, 0) = -1 ENOENT (No such file or directory)
[pid 8859] select(0, NULL, NULL, NULL, {0, 250000}) = 0 (Timeout)
[pid 8859] lsetxattr(".gfid/9f78f98a-2042-4a75-a180-c5af76b35a98", "glusterfs.gfid.newfile", "\x00\x00\x00\x00\x00\x00\x00\x005f1deefd-bf1a-4554-b699-1e41de9d2d3e\x00\x00\x00\x81\x80mac-celtic.c\x00\x00\x00\x01\x80\x00\x00\x00\x00\x00\x00\x00", 74, 0) = -1 ENOENT (No such file or directory)
[pid 8859] select(0, NULL, NULL, NULL, {0, 250000}) = 0 (Timeout)
[pid 8859] lsetxattr(".gfid/9f78f98a-2042-4a75-a180-c5af76b35a98", "glusterfs.gfid.newfile", "\x00\x00\x00\x00\x00\x00\x00\x005f1deefd-bf1a-4554-b699-1e41de9d2d3e\x00\x00\x00\x81\x80mac-celtic.c\x00\x00\x00\x01\x80\x00\x00\x00\x00\x00\x00\x00", 74, 0) = -1 ENOENT (No such file or directory)
[pid 8859] select(0, NULL, NULL, NULL, {0, 250000}) = 0 (Timeout)
[pid 8859] lsetxattr(".gfid/9f78f98a-2042-4a75-a180-c5af76b35a98", "glusterfs.gfid.newfile", "\x00\x00\x00\x00\x00\x00\x00\x005f1deefd-bf1a-4554-b699-1e41de9d2d3e\x00\x00\x00\x81\x80mac-celtic.c\x00\x00\x00\x01\x80\x00\x00\x00\x00\x00\x00\x00", 74, 0) = -1 ENOENT (No such file or directory)
[pid 8859] select(0, NULL, NULL, NULL, {0, 250000} <unfinished ...>
[pid 8843] <... select resumed> ) = 0 (Timeout)
[pid 8843] select(0, [], [], [], {1, 0} <unfinished ...>
[pid 8859] <... select resumed> ) = 0 (Timeout)
```

Another issue is that all the data was not synced to the slave volume after the reboot. The checksums do not match.

arequal-checksum of master:

```
Entry counts
Regular files   : 44590
Directories     : 2871
Symbolic links  : 3
Other           : 0
Total           : 47464

Metadata checksums
Regular files   : 2cb0
Directories     : 24d74c
Symbolic links  : 5a815a
Other           : 3e9

Checksums
Regular files   : 99f622aeeb5d95599be432f82b0d3195
Directories     : 100d663f041c7b61
Symbolic links  : 4d47000a1d0f3d7c
Other           : 0
Total           : 5f587663d943e2d1
```

arequal-checksum of geo_rep_slave slave:

```
Entry counts
Regular files   : 44490
Directories     : 2871
Symbolic links  : 3
Other           : 0
Total           : 47364

Metadata checksums
Regular files   : 49bc
Directories     : 24c719
Symbolic links  : 5a815a
Other           : 3e9

Checksums
Regular files   : b7f34c6812d966cb3ed6d65f78fdc2a5
Directories     : a3f0e6514757768
Symbolic links  : 4d47000a1d0f3d7c
Other           : 0
Total           : ce5d9458635eee7a
```
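For context, arequal-checksum compares entry counts plus data and metadata checksums of two trees. A much-simplified, illustrative stand-in that reproduces only the entry-count comparison that mismatches above (mount points are assumptions) could look like:

```python
# Illustrative sketch only; real arequal also checksums file data and metadata.
import os

def entry_counts(mount):
    """Count regular files, directories, and symlinks under a mount."""
    counts = {"files": 0, "dirs": 0, "symlinks": 0}
    for root, dirs, files in os.walk(mount):
        # Skip gluster's internal metadata tree if walking a brick directly;
        # on a regular client mount this directory is not visible anyway.
        dirs[:] = [d for d in dirs if d != ".glusterfs"]
        counts["dirs"] += len(dirs)
        for name in files:
            if os.path.islink(os.path.join(root, name)):
                counts["symlinks"] += 1
            else:
                counts["files"] += 1
    return counts

master = entry_counts("/mnt/master")  # assumed mount points
slave = entry_counts("/mnt/slave")
if master != slave:
    print("entry counts differ:", master, slave)
```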
Downstream patch sent: https://code.engineering.redhat.com/gerrit/#/c/41583/

Upstream patches:

- http://review.gluster.org/9082
- http://review.gluster.org/9495

Tested and verified on glusterfs-3.6.0.46-1.

Test case:

- Create files, sync a few of them, and stop geo-rep.
- Due to the incomplete sync, we should find a changelog file in the .processing directory listing the files pending sync.
- Now start geo-rep.
- At this point the old changelog file in the .processing dir should be cleaned up.

```
# for i in {1..30}; do touch f$i; done
# gluster v geo-rep master acdc::slave stop
Stopping geo-replication session between master & acdc::slave has been successful
# ls /mnt/slave
f1  f10  f13  f16  f18  f19  f2  f20  f21  f22  f23  f25  f26  f27  f30  f6  f7  f9
# ls /mnt/slave | wc -l
18
# ls /mnt/master | wc -l
30
```

Working directory changelog file in .processing:

```
./6ba51664ad363afecedbc65396e2c80c/.history/.processing/CHANGELOG.1424944982
```

After stop/start of geo-rep:

```
./6ba51664ad363afecedbc65396e2c80c/.history/.processing/CHANGELOG.1425025496
```

Hi Aravinda, the doc text is updated. Please review it and sign off if it looks OK.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0682.html