Bug 1044110

Summary: [Scale] 64 Brick Volume Create Failed Message - Volume Still Created
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Matt Mahoney <mmahoney>
Component: rhsc
Assignee: Timothy Asir <tjeyasin>
Status: CLOSED WONTFIX
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: low
Docs Contact:
Priority: unspecified
Version: 2.1
CC: dpati, kmayilsa, rhs-bugs
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-01 13:53:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Create Volume Fail Message (flags: none)

Description Matt Mahoney 2013-12-17 20:09:04 UTC
Description of problem:
While creating a 64-brick volume spanning 8 servers with 8 bricks on each server, the following event messages were logged:


2013-Dec-17, 14:13 Gluster Volume Vol-6 created.
2013-Dec-17, 14:12 Creation of Gluster Volume Vol-6 failed.

Note the failure message (it also appears in engine.log), even though the volume was created. I verified via glusterfs that the volume was created and that it could be started and stopped with no issues.
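
The contradictory pair of events above can be flagged mechanically. The following is a minimal Python sketch, assuming the event lines have exactly the format quoted in this description; the function and pattern names are hypothetical and are not part of RHSC or the engine:

```python
import re
from datetime import datetime

# Patterns assume the two event message formats quoted in this report.
CREATED = re.compile(
    r"^(?P<ts>\S+, \d{2}:\d{2}) Gluster Volume (?P<vol>\S+) created\.$")
FAILED = re.compile(
    r"^(?P<ts>\S+, \d{2}:\d{2}) Creation of Gluster Volume (?P<vol>\S+) failed\.$")


def parse_ts(text):
    """Parse a timestamp such as '2013-Dec-17, 14:13'."""
    return datetime.strptime(text, "%Y-%b-%d, %H:%M")


def contradictory_volumes(lines):
    """Return volumes with a 'failed' event and a same-or-later 'created' event."""
    failed, created = {}, {}
    for line in lines:
        line = line.strip()
        m = FAILED.match(line)
        if m:
            failed[m["vol"]] = parse_ts(m["ts"])
            continue
        m = CREATED.match(line)
        if m:
            created[m["vol"]] = parse_ts(m["ts"])
    return sorted(v for v in failed
                  if v in created and created[v] >= failed[v])


events = [
    "2013-Dec-17, 14:13 Gluster Volume Vol-6 created.",
    "2013-Dec-17, 14:12 Creation of Gluster Volume Vol-6 failed.",
]
print(contradictory_volumes(events))
```

For the two events in this report, the sketch reports Vol-6 as contradictory.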


Version-Release number of selected component (if applicable):
cb12

How reproducible:
Happened once in 8 volume creations

Steps to Reproduce:
1. Create 64 brick volume (8-servers, 8-bricks on each server)
2.
3.
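
For reference, the 8 x 8 brick layout used in step 1 can be generated as follows. This is only a sketch: the server addresses and the /bricks/bN naming are taken from the volumeCreate call in the VDSM log in comment 3, and build_bricks is a hypothetical helper, not a VDSM or gluster function:

```python
# Server addresses as they appear in the VDSM volumeCreate call (comment 3).
SERVERS = [
    "10.16.159.41", "10.16.159.23", "10.16.159.7", "10.16.159.106",
    "10.16.159.67", "10.16.159.200", "10.16.159.118", "10.16.159.43",
]


def build_bricks(servers, bricks_per_server=8, brick_dir="/bricks"):
    """Return 'host:/dir/bN' entries, grouped by server, N starting at 1."""
    return [
        "%s:%s/b%d" % (server, brick_dir, n)
        for server in servers
        for n in range(1, bricks_per_server + 1)
    ]


bricks = build_bricks(SERVERS)
print(len(bricks))
print(bricks[0], bricks[-1])
```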

Actual results:
Creation of Gluster Volume <volumeName> failed.

Expected results:
No error message

Additional info:

Comment 1 Matt Mahoney 2013-12-17 20:11:47 UTC
Created attachment 837938 [details]
Create Volume Fail Message

Comment 3 Kanagaraj 2013-12-18 05:33:37 UTC
VDSM Error 

Thread-30560::DEBUG::2013-12-17 13:29:14,855::BindingXMLRPC::984::vds::(wrapper) client [10.16.159.68]::call volumeCreate with ('Vol-1', ['10.16.159.41:/bricks/b1', '10.16.159.41:/bricks/b2', '10.16.159.41:/bricks/b3', '10.16.159.41:/bricks/b4', '10.16.159.41:/bricks/b5', '10.16.159.41:/bricks/b6', '10.16.159.41:/bricks/b7', '10.16.159.41:/bricks/b8', '10.16.159.23:/bricks/b1', '10.16.159.23:/bricks/b2', '10.16.159.23:/bricks/b3', '10.16.159.23:/bricks/b4', '10.16.159.23:/bricks/b5', '10.16.159.23:/bricks/b6', '10.16.159.23:/bricks/b7', '10.16.159.23:/bricks/b8', '10.16.159.7:/bricks/b1', '10.16.159.7:/bricks/b2', '10.16.159.7:/bricks/b3', '10.16.159.7:/bricks/b4', '10.16.159.7:/bricks/b5', '10.16.159.7:/bricks/b6', '10.16.159.7:/bricks/b7', '10.16.159.7:/bricks/b8', '10.16.159.106:/bricks/b1', '10.16.159.106:/bricks/b2', '10.16.159.106:/bricks/b3', '10.16.159.106:/bricks/b4', '10.16.159.106:/bricks/b5', '10.16.159.106:/bricks/b6', '10.16.159.106:/bricks/b7', '10.16.159.106:/bricks/b8', '10.16.159.67:/bricks/b1', '10.16.159.67:/bricks/b2', '10.16.159.67:/bricks/b3', '10.16.159.67:/bricks/b4', '10.16.159.67:/bricks/b5', '10.16.159.67:/bricks/b6', '10.16.159.67:/bricks/b7', '10.16.159.67:/bricks/b8', '10.16.159.200:/bricks/b1', '10.16.159.200:/bricks/b2', '10.16.159.200:/bricks/b3', '10.16.159.200:/bricks/b4', '10.16.159.200:/bricks/b5', '10.16.159.200:/bricks/b6', '10.16.159.200:/bricks/b7', '10.16.159.200:/bricks/b8', '10.16.159.118:/bricks/b1', '10.16.159.118:/bricks/b2', '10.16.159.118:/bricks/b3', '10.16.159.118:/bricks/b4', '10.16.159.118:/bricks/b5', '10.16.159.118:/bricks/b6', '10.16.159.118:/bricks/b7', '10.16.159.118:/bricks/b8', '10.16.159.43:/bricks/b1', '10.16.159.43:/bricks/b2', '10.16.159.43:/bricks/b3', '10.16.159.43:/bricks/b4', '10.16.159.43:/bricks/b5', '10.16.159.43:/bricks/b6', '10.16.159.43:/bricks/b7', '10.16.159.43:/bricks/b8'], 0, 0, ['TCP'], True) {} flowID [42792772]
Thread-30560::ERROR::2013-12-17 13:29:15,007::BindingXMLRPC::1000::vds::(wrapper) vdsm exception occured
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 989, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/gluster/api.py", line 53, in wrapper
    rv = func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/api.py", line 85, in volumeCreate
    transportList, force)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in glusterVolumeCreate
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
GlusterVolumeCreateFailedException: Volume create failed
error: Staging failed on 10.16.159.41. Error: Failed to create brick directory for brick 10.16.159.41:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.67. Error: Failed to create brick directory for brick 10.16.159.67:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.23. Error: Failed to create brick directory for brick 10.16.159.23:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.106. Error: Failed to create brick directory for brick 10.16.159.106:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.7. Error: Failed to create brick directory for brick 10.16.159.7:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.118. Error: Failed to create brick directory for brick 10.16.159.118:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.200. Error: Failed to create brick directory for brick 10.16.159.200:/bricks/b1. Reason : No such file or directory 
Staging failed on 10.16.159.43. Error: Failed to create brick directory for brick 10.16.159.43:/bricks/b1. Reason : No such file or directory 
return code: 115
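
To summarize which hosts and bricks failed staging, the exception text above can be parsed. This is a rough Python sketch that assumes the message wording shown in this traceback; the regular expression and the staging_failures helper are illustrative and not part of VDSM or glusterd:

```python
import re

# Matches one "Staging failed on ..." message as worded in the traceback above.
STAGING_FAILURE = re.compile(
    r"Staging failed on (?P<host>\S+)\. "
    r"Error: Failed to create brick directory for brick (?P<brick>\S+)\. "
    r"Reason : (?P<reason>.+?)\s*$",
    re.MULTILINE,
)


def staging_failures(text):
    """Return (host, brick, reason) tuples for every staging failure in text."""
    return [
        (m["host"], m["brick"], m["reason"])
        for m in STAGING_FAILURE.finditer(text)
    ]


sample = (
    "error: Staging failed on 10.16.159.41. Error: Failed to create brick "
    "directory for brick 10.16.159.41:/bricks/b1. Reason : No such file or directory \n"
    "Staging failed on 10.16.159.67. Error: Failed to create brick "
    "directory for brick 10.16.159.67:/bricks/b1. Reason : No such file or directory \n"
    "return code: 115\n"
)

for host, brick, reason in staging_failures(sample):
    print(host, brick, reason)
```

Applied to the full error text above, this would list one entry per server, each for brick b1 with the reason "No such file or directory".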


Gluster Error

[2013-12-17 18:29:15.001523] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.41. Error: Failed to create brick directory for brick 10.16.159.41:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.001691] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.67. Error: Failed to create brick directory for brick 10.16.159.67:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.001902] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.23. Error: Failed to create brick directory for brick 10.16.159.23:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.002295] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.106. Error: Failed to create brick directory for brick 10.16.159.106:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.002350] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.7. Error: Failed to create brick directory for brick 10.16.159.7:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.002383] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.118. Error: Failed to create brick directory for brick 10.16.159.118:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.002416] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.200. Error: Failed to create brick directory for brick 10.16.159.200:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:29:15.002715] E [glusterd-syncop.c:102:gd_collate_errors] 0-: Staging failed on 10.16.159.43. Error: Failed to create brick directory for brick 10.16.159.43:/bricks/b1. Reason : No such file or directory 
[2013-12-17 18:31:36.763529] I [glusterd-handler.c:1018:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2013-12-17 18:31:37.089730] I [glusterd-handler.c:1073:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2013-12-17 18:36:29.805126] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=Vol-1 -o nfs.disable=off
[2013-12-17 18:36:53.196168] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=Vol-1 -o user.cifs=enable
[2013-12-17 18:37:13.281808] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=Vol-1 -o auth.allow=*
[2013-12-17 18:39:11.232694] I [rpc-clnt.c:977:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2013-12-17 18:39:11.232760] I [socket.c:3505:socket_init] 0-management: SSL support is NOT enabled
[2013-12-17 18:39:11.232769] I [socket.c:3520:socket_init] 0-management: using system polling thread
[2013-12-17 18:39:11.233336] I [socket.c:2235:socket_event_handler] 0-transport: disconnecting now
[2013-12-17 18:39:11.237452] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=Vol-1 --first=yes --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2013-12-17 18:39:11.257430] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=Vol-1 --first=yes --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2013-12-17 18:39:35.109709] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh --volname=Vol-1 --last=yes
[2013-12-17 18:39:35.124108] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh --volname=Vol-1 --last=yes
[2013-12-17 18:40:17.166131] E [glusterd-utils.c:3873:glusterd_nodesvc_unlink_socket_file] 0-management: Failed to remove /var/run/813402e4dc3236c4c4a378a286ac9746.socket error: Permission denied
[2013-12-17 18:40:17.166497] I [glusterd-utils.c:3907:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3 successfully
[2013-12-17 18:40:17.166610] I [glusterd-utils.c:3912:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1 successfully
[2013-12-17 18:40:17.166730] I [glusterd-utils.c:3917:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3 successfully
[2013-12-17 18:40:17.166883] I [glusterd-utils.c:3922:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v4 successfully
[2013-12-17 18:40:17.166994] I [glusterd-utils.c:3927:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v1 successfully
[2013-12-17 18:40:17.167117] I [glusterd-utils.c:3932:glusterd_nfs_pmap_deregister] 0-: De-registered ACL v3 successfully
[2013-12-17 18:40:17.167287] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2013-12-17 18:40:17.167308] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=124 max=0 total=0

Comment 4 Dusmant 2015-09-01 13:53:18 UTC
We are not planning to fix this, hence closing this BZ. If you think it is applicable to the 3.x release and would have an impact on customers, please open a new BZ with the appropriate version. Thanks, -Dusmant