Bug 787123

Summary: [glusterfs-3.3.0qa21]: replace brick operation failing due to an error in getting the socket address in brick process
Product: [Community] GlusterFS
Reporter: Raghavendra Bhat <rabhat>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Version: 3.2.5
CC: gluster-bugs, nsathyan
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-05-03 07:17:49 UTC

Description Raghavendra Bhat 2012-02-03 08:55:19 UTC
Description of problem:
Setup: a replicate volume with replica count 3, mounted by one FUSE client and one NFS client. The FUSE client was running the sanity script and the NFS client was running rdd when replace-brick was issued.
replace-brick start reported that the operation had started, but replace-brick status reported that the source brick was not online.

The source brick had not restarted with pump loaded (its log file said "address already in use"), so volume start force was run. replace-brick abort was then issued, and that command hung.


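Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa21


How reproducible:


Steps to Reproduce:
1. Create a replicate volume with replica count 3 and mount it on one FUSE client and one NFS client. Run the sanity script on the FUSE client and rdd on the NFS client.
2. Start a replace-brick operation, then check its status.
3. If the source brick does not come back up with pump loaded, run volume start force.
4. Run replace-brick abort.
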
Actual results:

Replace-brick commands fail: the gluster CLI hangs until it times out, then returns to the prompt without printing any message.

Expected results:
Replace-brick operations issued from the CLI should succeed.

Additional info:

Log file information:

GLUSTERD LOGS:

[2012-02-03 03:44:39.273765] I [glusterd-op-sm.c:2085:glusterd_op_txn_complete] 0-glusterd: Cleared local lock
[2012-02-03 03:44:49.250147] I [glusterd-replace-brick.c:98:glusterd_handle_replace_brick] 0-glusterd: Received replace brick req
[2012-02-03 03:44:49.250197] I [glusterd-replace-brick.c:147:glusterd_handle_replace_brick] 0-glusterd: Received replace brick abort request
[2012-02-03 03:44:49.256549] I [glusterd-utils.c:262:glusterd_lock] 0-glusterd: Cluster lock held by 64565e7a-656f-494e-9fbc-8fc298b4d34d
[2012-02-03 03:44:49.256566] I [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Acquired local lock
[2012-02-03 03:44:49.257018] I [glusterd-rpc-ops.c:534:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 85b4f943-e2fa-4ca5-899a-78248455727c
[2012-02-03 03:44:49.257075] I [glusterd-rpc-ops.c:534:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 8045dfcd-f3c7-4979-aadd-ef6a907bf3b7
[2012-02-03 03:44:49.257108] I [glusterd-rpc-ops.c:534:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 20479dd8-3dd4-4766-a949-aa1c8b5a89aa
[2012-02-03 03:44:49.264321] I [glusterd-utils.c:781:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 10.1.11.130:/export-xfs/mirror
[2012-02-03 03:44:49.264514] I [glusterd-utils.c:218:glusterd_is_local_addr] 0-glusterd: 10.1.11.130 is local
[2012-02-03 03:44:49.264547] I [glusterd-utils.c:741:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-02-03 03:44:49.264592] I [glusterd-utils.c:218:glusterd_is_local_addr] 0-glusterd: 10.1.11.130 is local
[2012-02-03 03:44:49.264766] I [glusterd-op-sm.c:1703:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 3 peers
[2012-02-03 03:44:49.266664] I [glusterd-rpc-ops.c:863:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 20479dd8-3dd4-4766-a949-aa1c8b5a89aa
[2012-02-03 03:44:49.337955] I [glusterd-rpc-ops.c:863:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 8045dfcd-f3c7-4979-aadd-ef6a907bf3b7
[2012-02-03 03:44:50.001625] I [glusterd-rpc-ops.c:863:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 85b4f943-e2fa-4ca5-899a-78248455727c
[2012-02-03 03:44:50.001720] I [glusterd-utils.c:781:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 10.1.11.130:/export-xfs/mirror
[2012-02-03 03:44:50.001898] I [glusterd-utils.c:218:glusterd_is_local_addr] 0-glusterd: 10.1.11.130 is local
[2012-02-03 03:44:50.001921] I [glusterd-utils.c:741:glusterd_volume_brickinfo_get] 0-management: Found brick
[2012-02-03 03:44:50.001955] I [glusterd-utils.c:218:glusterd_is_local_addr] 0-glusterd: 10.1.11.130 is local
[2012-02-03 03:44:50.001970] I [glusterd-replace-brick.c:1243:rb_update_srcbrick_port] 0-: adding src-brick port no

SOURCE BRICK LOGS:

[2012-02-03 02:34:07.965108] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.104:1021 (version: 3.3.0qa21)
[2012-02-03 02:34:07.969485] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.104:1020 (version: 3.3.0qa21)
[2012-02-03 02:34:09.085444] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x135) [0x7fde82b2ccc8]))) 0-dict: data is NULL
[2012-02-03 02:34:09.085487] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x144) [0x7fde82b2ccd7]))) 0-dict: data is NULL
[2012-02-03 02:34:09.085500] E [name.c:151:client_fill_address_family] 0-mirror-replace-brick: transport.address-family not specified and not able to determine the same from other options (remote-host:(null) and transport.unix.connect-path:(null))
[2012-02-03 02:34:09.303922] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362555: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.304856] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362556: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.305446] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362557: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.310385] I [server3_1-fops.c:262:server_inodelk_cbk] 0-mirror-server: 362558: INODELK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.310976] I [server3_1-fops.c:262:server_inodelk_cbk] 0-mirror-server: 362559: INODELK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.320655] I [server3_1-fops.c:262:server_inodelk_cbk] 0-mirror-server: 362561: INODELK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.321510] I [server3_1-fops.c:262:server_inodelk_cbk] 0-mirror-server: 362562: INODELK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.327338] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362564: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.328149] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362565: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.474724] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.130:998 (version: 3.3.0qa21)
[2012-02-03 02:34:09.730825] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362566: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.731448] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362567: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.737098] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362568: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-03 02:34:09.737644] I [server3_1-fops.c:345:server_entrylk_cbk] 0-mirror-server: 362569: ENTRYLK (null) (--) ==> -1 (No such file or directory)


[2012-02-03 02:34:12.085812] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x135) [0x7fde82b2ccc8]))) 0-dict: data is NULL
[2012-02-03 02:34:12.085860] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x144) [0x7fde82b2ccd7]))) 0-dict: data is NULL
[2012-02-03 02:34:12.085880] E [name.c:151:client_fill_address_family] 0-mirror-replace-brick: transport.address-family not specified and not able to determine the same from other options (remote-host:(null) and transport.unix.connect-path:(null))
[2012-02-03 02:34:12.147971] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.130:988 (version: 3.3.0qa21)
[2012-02-03 02:34:15.086311] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x135) [0x7fde82b2ccc8]))) 0-dict: data is NULL
[2012-02-03 02:34:15.086366] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x144) [0x7fde82b2ccd7]))) 0-dict: data is NULL
[2012-02-03 02:34:15.086389] E [name.c:151:client_fill_address_family] 0-mirror-replace-brick: transport.address-family not specified and not able to determine the same from other options (remote-host:(null) and transport.unix.connect-path:(null))
[2012-02-03 02:34:16.847291] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.130:996 (version: 3.3.0qa21)
[2012-02-03 02:34:16.866889] I [server.c:556:server_rpc_notify] 0-mirror-server: disconnected connection from 10.1.11.130:996
[2012-02-03 02:34:16.866923] I [server-helpers.c:763:server_connection_destroy] 0-mirror-server: destroyed connection of node130-1643-2012/02/03-02:34:16:791421-mnt-client
[2012-02-03 02:34:18.086703] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x135) [0x7fde82b2ccc8]))) 0-dict: data is NULL
[2012-02-03 02:34:18.086750] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x144) [0x7fde82b2ccd7]))) 0-dict: data is NULL
[2012-02-03 02:34:18.086763] E [name.c:151:client_fill_address_family] 0-mirror-replace-brick: transport.address-family not specified and not able to determine the same from other options (remote-host:(null) and transport.unix.connect-path:(null))
[2012-02-03 02:34:19.684061] I [server-handshake.c:540:server_setvolume] 0-mirror-server: accepted client from 10.1.11.130:987 (version: 3.3.0qa21)
[2012-02-03 02:34:19.695127] I [server.c:556:server_rpc_notify] 0-mirror-server: disconnected connection from 10.1.11.130:987
[2012-02-03 02:34:19.695168] I [server-helpers.c:763:server_connection_destroy] 0-mirror-server: destroyed connection of node130-1658-2012/02/03-02:34:19:637790-mnt-client
[2012-02-03 02:34:21.087067] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x135) [0x7fde82b2ccc8]))) 0-dict: data is NULL
[2012-02-03 02:34:21.087115] W [dict.c:1220:data_to_str] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_connect+0x261) [0x7fde82b2a38c] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x16a) [0x7fde82b2dcb7] (-->/usr/local/lib/glusterfs/3.3.0qa21/rpc-transport/socket.so(client_fill_address_family+0x144) [0x7fde82b2ccd7]))) 0-dict: data is NULL

Comment 1 krishnan parthasarathi 2012-02-08 05:21:11 UTC
Raghavendra,
Could you attach the entire glusterd and source brick logs?

Comment 2 krishnan parthasarathi 2012-05-03 07:17:49 UTC

*** This bug has been marked as a duplicate of bug 816915 ***