Bug 802629

Summary: [6a8fcff3fb6955162dc4eeaeaa627bb31311627e]: posix compliance tests hang with graph change
Product: [Community] GlusterFS
Reporter: Raghavendra Bhat <rabhat>
Component: nfs
Assignee: Vivek Agarwal <vagarwal>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: mainline
CC: gluster-bugs, sankarshan, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 17:27:02 UTC
Type: ---

Description Raghavendra Bhat 2012-03-13 06:07:44 UTC
Description of problem

Replicate volume with replica count 2, mounted via NFS. Running the POSIX compliance tests while graph changes happen in parallel causes the tests to hang.



Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
1. Create a volume and mount it via NFS.
2. Run the POSIX compliance tests.
3. In parallel, make graph changes.
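
Step 3 could be driven by something like the following minimal sketch, which assumes the graph changes are triggered by toggling a volume option with `gluster volume set`. The volume name `mirror` is taken from the log under "Additional info"; the option `performance.write-behind`, the number of rounds, and the helper names are illustrative assumptions, not what the original run used.

```python
import subprocess
import time


def graph_change_commands(volume, option="performance.write-behind", rounds=4):
    """Build `gluster volume set` commands that flip an option off/on.

    Each option change makes glusterd regenerate the volfile, which the
    clients pick up as a new graph. The default option is an assumption.
    """
    values = ("off", "on")
    return [
        ["gluster", "volume", "set", volume, option, values[i % 2]]
        for i in range(rounds)
    ]


def run_graph_changes(commands, interval=0,
                      runner=subprocess.check_call, sleep=time.sleep):
    """Run each command in order, pausing `interval` seconds between them.

    `runner` and `sleep` are injectable so the loop can be dry-run
    without a GlusterFS installation.
    """
    for index, command in enumerate(commands):
        if index and interval:
            sleep(interval)
        runner(command)
```

Calling `run_graph_changes(graph_change_commands("mirror"))` while the test suite runs on the NFS mount approximates back-to-back graph changes; passing a non-zero `interval` spaces them out.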
  
Actual results:

The POSIX compliance tests hang.

Expected results:

The POSIX compliance tests should not hang.

Additional info:

space and port: 24007
[2012-03-13 11:33:46.948686] D [client.c:1966:client_rpc_notify] 0-mirror-client-1: got RPC_CLNT_CONNECT
[2012-03-13 11:33:46.948867] D [client-handshake.c:193:client_start_ping] 0-mirror-client-1: returning as transport is already disconnected OR there are no frames (1 || 1)
[2012-03-13 11:33:46.948941] W [client.c:1992:client_rpc_notify] 0-mirror-client-1: Cancelling the grace timer
[2012-03-13 11:33:46.949053] I [client-handshake.c:1533:select_server_supported_programs] 0-mirror-client-1: Using Program GlusterFS 3git, Num (1298437), Version (330)
[2012-03-13 11:33:46.949237] D [client-handshake.c:193:client_start_ping] 0-mirror-client-1: returning as transport is already disconnected OR there are no frames (1 || 1)
[2012-03-13 11:33:46.949434] I [client-handshake.c:1308:client_setvolume_cbk] 0-mirror-client-0: clnt-lk-version = 1, server-lk-version = 0
[2012-03-13 11:33:46.949550] I [client-handshake.c:1334:client_setvolume_cbk] 0-mirror-client-0: Connected to 127.0.0.1:24009, attached to remote volume '/mnt/sda7/export3'.
[2012-03-13 11:33:46.949610] D [client-handshake.c:1195:client_post_handshake] 0-mirror-client-0: no fds to open - notifying all parents child up
[2012-03-13 11:33:46.949670] D [client-handshake.c:452:client_set_lk_version] 0-mirror-client-0: Sending SET_LK_VERSION
[2012-03-13 11:33:46.949918] I [afr-common.c:3498:afr_notify] 0-mirror-replicate-0: Subvolume 'mirror-client-0' came back up; going online.
[2012-03-13 11:33:46.950051] I [client-handshake.c:1308:client_setvolume_cbk] 0-mirror-client-1: clnt-lk-version = 1, server-lk-version = 0
[2012-03-13 11:33:46.950091] I [client-handshake.c:1334:client_setvolume_cbk] 0-mirror-client-1: Connected to 127.0.0.1:24010, attached to remote volume '/mnt/sda8/export3'.
[2012-03-13 11:33:46.950122] D [client-handshake.c:1195:client_post_handshake] 0-mirror-client-1: no fds to open - notifying all parents child up
[2012-03-13 11:33:46.950150] D [client-handshake.c:452:client_set_lk_version] 0-mirror-client-1: Sending SET_LK_VERSION
[2012-03-13 11:33:46.950950] D [client-handshake.c:429:client_set_lk_version_cbk] 0-mirror-client-0: Server lk version = 1
[2012-03-13 11:33:46.951409] D [client-handshake.c:429:client_set_lk_version_cbk] 0-mirror-client-1: Server lk version = 1
[2012-03-13 11:33:46.951578] I [afr-common.c:1857:afr_set_root_inode_on_first_lookup] 0-mirror-replicate-0: added root inode
[2012-03-13 11:33:46.951736] D [afr-self-heal-common.c:148:afr_sh_print_pending_matrix] 0-mirror-replicate-0: pending_matrix: [ 0 0 ]
[2012-03-13 11:33:46.951781] D [afr-self-heal-common.c:148:afr_sh_print_pending_matrix] 0-mirror-replicate-0: pending_matrix: [ 0 0 ]
[2012-03-13 11:33:46.951812] D [afr-self-heal-common.c:753:afr_mark_sources] 0-mirror-replicate-0: Number of sources: 0
[2012-03-13 11:33:46.951849] D [afr-self-heal-data.c:799:afr_lookup_select_read_child_by_txn_type] 0-mirror-replicate-0: returning read_child: 1
[2012-03-13 11:33:46.951885] D [afr-common.c:1275:afr_lookup_select_read_child] 0-mirror-replicate-0: Source selected as 1 for /
[2012-03-13 11:33:46.951922] D [afr-common.c:1082:afr_lookup_build_response_params] 0-mirror-replicate-0: Building lookup response from 1
[2012-03-13 11:33:46.952008] D [nfs.c:237:nfs_subvolume_set_started] 0-nfs: Starting up: mirror , vols started till now: 1
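
To make an excerpt like the one above easier to filter (for example, by translator or log level), the lines can be split into fields. The sketch below assumes the layout seen in this log, `[timestamp] LEVEL [file:line:function] translator: message`; the regular expression and the `parse_log_line` helper are illustrative and not part of GlusterFS.

```python
import re

# Pattern inferred from the log lines in this report; other GlusterFS
# versions may format their logs differently.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\] "
    r"(?P<xlator>\S+): "
    r"(?P<message>.*)$"
)


def parse_log_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])  # source line number as an integer
    return record
```

Lines that do not follow the layout, such as the truncated first line of the excerpt, yield `None` and can be skipped.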

Comment 1 Vijay Bellur 2012-03-15 09:47:23 UTC
Do you have a sleep between the graph changes? If not, can you try with sleep 30 and see if the behavior persists?

Comment 2 Krishna Srinivas 2012-08-14 10:40:24 UTC
Raghavendra, I am moving this to on_qa; you can close it if it is not a valid bug.