Bug 920332 - Mounting issues
Summary: Mounting issues
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rdma
Version: 3.3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kaleb KEITHLEY
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-11 20:39 UTC by Dean Bruhn
Modified: 2013-07-24 17:22 UTC
2 users

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:22:05 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Dean Bruhn 2013-03-11 20:39:04 UTC
Description of problem:
After you have created an RDMA volume, trying to mount it from a client causes the mount command to just sit there: it never exits, and it never mounts the file system.


Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 6.4 x86_64
Gluster 3.3.1 (EPEL 6)

Additional InfiniBand-specific packages installed (see the install sketch after the list):
ibutils
libibverbs
libnes
opensm
libibmad
infiniband-diags
libibverbs-utils
libibverbs-devel
perftest
libmlx4
openmpi
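
Something along these lines pulls them in on RHEL 6 (the exact repositories and package versions were not recorded, so treat this as a sketch):

# install the InfiniBand packages listed above
yum install ibutils libibverbs libnes opensm libibmad infiniband-diags \
    libibverbs-utils libibverbs-devel perftest libmlx4 openmpi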



How reproducible:


Steps to Reproduce:
1. Build an RDMA volume (see the sketch below).
2. Attempt to mount the RDMA volume from a client.
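
Roughly, the commands involved look like this (reconstructed after the fact; the mount point /mnt/ENTV03 is made up, and the brick order simply follows the volume info output further down):

# on one of the servers: create and start the RDMA volume
gluster volume create ENTV03 replica 2 transport rdma \
    ENTSNV03001EP:/var/brick01 ENTSNV03002EP:/var/brick03 \
    ENTSNV03002EP:/var/brick04 ENTSNV03001EP:/var/brick02 \
    ENTSNV03003EP:/var/brick05 ENTSNV03004EP:/var/brick07 \
    ENTSNV03004EP:/var/brick08 ENTSNV03003EP:/var/brick06
gluster volume start ENTV03

# on a client: this is the mount that hangs (transport=rdma is one way to
# request the RDMA transport; the exact option spelling may vary by version)
mount -t glusterfs -o transport=rdma ENTSNV03001EP:/ENTV03 /mnt/ENTV03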
  
Actual results:
The mount command hangs and does not mount the volume; the command never exits either.


Expected results:
The volume is expected to mount normally.


Additional info:
I was able to fix the mount by turning on debug logging at the brick level to determine which ports the bricks are listening on, for each of the servers and bricks.
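
Roughly what that looked like (the brick log path and the grep pattern are approximations, not exact):

# enable brick-level debug logging (this is the diagnostics.brick-log-level
# option still visible under "Options Reconfigured" below)
gluster volume set ENTV03 diagnostics.brick-log-level DEBUG

# then, on each server, hunt through the brick log for the port the brick
# process is listening on; the log path and pattern here are guesses
grep -i port /var/log/glusterfs/bricks/var-brick01.log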

I was then able to edit the <volname>-fuse.vol and trusted-<volname>-fuse.vol files, adding a line for the port of each brick:

volume ENTV03-client-0
    type protocol/client
    option remote-host ENTSNV03001EP
    option remote-subvolume /var/brick01
    option transport-type rdma
    option remote-port 24009
    option username XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    option password XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
end-volume

After making this change for each of the bricks on each of the clients, I was able to mount the volume using RDMA.


Here is the output from various gluster and InfiniBand commands, showing that the rest of the system is working as expected.



[root@ENTSNV03001EP /]# gluster volume info
 
Volume Name: ENTV03
Type: Distributed-Replicate
Volume ID: ba4c0970-2b5b-456b-86aa-0e7bc93e9ecf
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: rdma
Bricks:
Brick1: ENTSNV03001EP:/var/brick01
Brick2: ENTSNV03002EP:/var/brick03
Brick3: ENTSNV03002EP:/var/brick04
Brick4: ENTSNV03001EP:/var/brick02
Brick5: ENTSNV03003EP:/var/brick05
Brick6: ENTSNV03004EP:/var/brick07
Brick7: ENTSNV03004EP:/var/brick08
Brick8: ENTSNV03003EP:/var/brick06
Options Reconfigured:
diagnostics.brick-log-level: DEBUG

[root@ENTSNV03001EP /]# gluster peer status
Number of Peers: 3

Hostname: ENTSNV03002EP
Uuid: b3e979af-d236-43fb-b939-56579bebc1a8
State: Peer in Cluster (Connected)

Hostname: ENTSNV03004EP
Uuid: 5f191c05-ada6-451b-aa66-19f98b15029a
State: Peer in Cluster (Connected)

Hostname: ENTSNV03003EP
Uuid: 780f9da4-0e17-4e6e-a4d4-ba88d581a0fe
State: Peer in Cluster (Connected)

[root@ENTSNV03001EP ENTV03]# ibnodes
Ca	: 0x0002c90300222ee0 ports 1 "ENTSNV03002EP mlx4_0"
Ca	: 0x0002c90300223400 ports 1 "ENTSNV03003EP mlx4_0"
Ca	: 0x0002c90300223420 ports 1 "ENTSNV03004EP mlx4_0"
Ca	: 0x0002c90300222f30 ports 1 "ENTSNV03001EP mlx4_0"
Switch	: 0x0002c90200477480 ports 36 "MF0;INFIBINF001EP:IS5030/U1" enhanced port 0 lid 1 lmc 0

[root@ENTSNV03001EP ENTV03]# ibv_devinfo
hca_id:	mlx4_0
	transport:			InfiniBand (0)
	fw_ver:				2.11.500
	node_guid:			0002:c903:0022:2f30
	sys_image_guid:			0002:c903:0022:2f33
	vendor_id:			0x02c9
	vendor_part_id:			4099
	hw_ver:				0x0
	board_id:			MT_1060110018
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		4096 (5)
			active_mtu:		4096 (5)
			sm_lid:			1
			port_lid:		3
			port_lmc:		0x00
			link_layer:		InfiniBand

Comment 1 Kaleb KEITHLEY 2013-04-12 20:12:45 UTC
This appears to already be fixed on the HEAD of the release-3.3 branch, in commit
e02171f1b86cfb3cd365c4c47edc83b8230985bd

