Description of problem: sometimes volume start failed. Version-Release number of selected component (if applicable): glusterfs 3.10.3 How reproducible: sometimes Steps to Reproduce: gluster cluster has 2 nodes, and want to create a replica volume,but start failed log as follows: 1.cmd history: [2018-01-31 15:27:29.310233] : volume delete test0019_component_volume : SUCCESS [2018-01-31 15:27:31.018337] : volume create test0021-component_volume replica 2 188.188.188.84:/root/data/platform/test0021-component/test0021-component_volume 188.188.188.83:/root/data/platform/test0021-component/test0021-component_volume force : SUCCESS [2018-01-31 15:27:31.495940] : volume set test0021-component_volume nfs.disable off : SUCCESS [2018-01-31 15:27:31.574767] : volume set test0021-component_volume nfs.rpc-auth-allow 188.188.0.0/16,172.17.0.0/16 : SUCCESS [2018-01-31 15:27:31.651966] : volume set test0021-component_volume diagnostics.brick-log-level WARNING : SUCCESS [2018-01-31 15:27:31.728427] : volume set test0021-component_volume diagnostics.brick-sys-log-level WARNING : SUCCESS [2018-01-31 15:27:31.805802] : volume set test0021-component_volume diagnostics.client-log-level ERROR : SUCCESS [2018-01-31 15:27:31.885078] : volume set test0021-component_volume diagnostics.client-sys-log-level ERROR : SUCCESS [2018-01-31 15:27:31.963931] : volume set test0021-component_volume cluster.server-quorum-type server : SUCCESS [2018-01-31 15:27:31.987742] : volume set all cluster.server-quorum-ratio 60% : SUCCESS [2018-01-31 15:27:32.065940] : volume set test0021-component_volume cluster.quorum-count 2 : SUCCESS [2018-01-31 15:27:32.146743] : volume set test0021-component_volume cluster.quorum-type fixed : SUCCESS [2018-01-31 15:27:32.186824] : volume start test0021-component_volume : FAILED : Commit failed on localhost. Please check log file for details. 2.glusterd.log [2018-01-31 15:25:10.892912] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /var/run/gluster/test0019_component_volume_quota_list/ The message "E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /var/run/gluster/test0019_component_volume_quota_list/" repeated 5 times between [2018-01-31 15:25:10.892912] and [2018-01-31 15:26:38.510580] [2018-01-31 15:27:11.417708] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /var/run/gluster/test0019_component_volume_quota_list/ [2018-01-31 15:27:20.847870] E [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x36ba5) [0x7f406ea5cba5] -->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdb79b) [0x7f406eb0179b] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f407403e385] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh --volname=test0019_component_volume --last=no [2018-01-31 15:27:24.918501] W [socket.c:593:__socket_rwv] 0-management: readv on /var/run/gluster/f4add0eba7cd3f12c9f6faad73592c1a.socket failed (No data available) [2018-01-31 15:27:24.918783] W [glusterd-handler.c:5687:__glusterd_brick_rpc_notify] 0-management: got disconnect from stale rpc on /root/data/platform/test0019_component/test0019_component_volume [2018-01-31 15:27:24.919171] W [socket.c:593:__socket_rwv] 0-socket.management: writev on 188.188.188.84:49151 failed (Broken pipe) [2018-01-31 15:27:32.186287] E [MSGID: 106005] [glusterd-utils.c:5525:glusterd_brick_start] 0-management: Unable to start brick 188.188.188.84:/root/data/platform/test0021-component/test0021-component_volume [2018-01-31 15:27:32.186378] E [MSGID: 106123] [glusterd-mgmt.c:323:gd_mgmt_v3_commit_fn] 0-management: Volume start commit failed. [2018-01-31 15:27:32.186395] E [MSGID: 106123] [glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Start on local node [2018-01-31 15:27:32.186409] E [MSGID: 106123] [glusterd-mgmt.c:2145:glusterd_mgmt_v3_initiate_all_phases] 0-management: Commit Op Failed [2018-01-31 15:27:32.260834] E [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x36ba5) [0x7f406ea5cba5] -->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdb79b) [0x7f406eb0179b] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f407403e385] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh --volname=test0021-component_volume --last=yes 3.cli.log [2018-01-31 15:27:31.018453] I [cli-rpc-ops.c:1163:gf_cli_create_volume_cbk] 0-cli: Received resp to create volume [2018-01-31 15:27:31.018602] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.030488] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.037326] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.037671] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.496040] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.496214] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.506834] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.513681] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.514030] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.574845] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.574971] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.584757] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.591421] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.591755] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.652050] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.652182] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.661223] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.667749] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.668525] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.728504] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.728634] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.738082] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.744711] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.745048] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.805869] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.806021] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.815240] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.821861] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.822207] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.885159] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.885299] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.894963] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.901576] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.901921] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.964013] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.964153] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.973426] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:31.979980] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:31.980316] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:31.987820] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:31.987948] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:31.996924] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:32.003351] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:32.003694] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:32.066022] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:32.066174] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:32.075567] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:32.082204] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:32.082536] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:32.146825] I [cli-rpc-ops.c:2329:gf_cli_set_volume_cbk] 0-cli: Received resp to set [2018-01-31 15:27:32.146972] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2018-01-31 15:27:32.156844] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3 [2018-01-31 15:27:32.163480] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed [2018-01-31 15:27:32.163825] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2018-01-31 15:27:32.186925] I [cli-rpc-ops.c:1469:gf_cli_start_volume_cbk] 0-cli: Received resp to start volume [2018-01-31 15:27:32.186973] I [input.c:31:cli_batch] 0-: Exiting with: -1 Actual results: volume start fail Expected results: volume start ok Additional info:
Could you attach the glusterd and the respective brick logs from the node where this command failed?
Created attachment 1389836 [details] gluster node2 log gluster node2 log
Created attachment 1389837 [details] gluser node 1 log gluser node 1 log
Sanju - Can you please take a look at the logs and check why the volume start commit failed?
I can't find the respective brick logs on the attachment you provided. Can you please provide the brick logs of the volume which is failed to start.
Can you please run the glusterd in debug mode and provide the logs
Hi Sanju modify gluserd debug mode (/usr/lib/systemd/system/glusterd.service : Environment="LOG_LEVEL=DEBUG") then get the log: gluster1.zip gluster2.zip please have a look
Created attachment 1400087 [details] gluser node 2 log-new
Created attachment 1400088 [details] gluser node 1 log-new
add debug gluserd log,please have a look
(In reply to Sanju from comment #5) > I can't find the respective brick logs on the attachment you provided. Can > you please provide the brick logs of the volume which is failed to start. I'm waiting for brick logs
This bug reported is against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained. As a result this bug is being closed. If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days