Bug 763587 (GLUSTER-1855)

Summary: Initial server in a cluster not a friend
Product: [Community] GlusterFS
Reporter: Mike Robbert <mrobbert>
Component: cli
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Version: 3.1.0
CC: amarts, divya, gluster-bugs, lakshmipathi, pkarampu, vijay
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Documentation: DA
Attachments: glusterd logs from all servers (flags: none)

Description Mike Robbert 2010-10-07 23:08:27 UTC
After installing the 3.1 beta RPMs on four RHEL 4 update 5 hosts with InfiniBand, I started the gluster CLI on the first host and added the others as peers, but was then unable to create a volume that included the host I was on. This is the output I got:


[root@iosrv-7-1 ~]# gluster
gluster> peer status
No peers present
gluster> peer probe iosrv-7-2
Probe successful
gluster> peer probe iosrv-7-3
Probe successful
gluster> peer probe iosrv-7-4
Probe successful
gluster> peer status
Number of Peers: 3
hostname:iosrv-7-2, uuid:5268b25f-8533-43fb-ba79-0c26a15ee069, state:3 (connected)
hostname:iosrv-7-3, uuid:19be1f71-ab84-461e-a3ae-988f5e704143, state:3 (connected)
hostname:iosrv-7-4, uuid:3c7fddcf-3edc-4fa0-b6b7-ab15cc3360ce, state:3 (connected)
gluster> volume create gluster-test transport rdma iosrv-7-1:/mnt/brick1 iosrv-7-2:/mnt/brick2 iosrv-7-3:/mnt/brick3 iosrv-7-4:/mnt/brick4
Creation of volume gluster-test has been unsuccessful
Host iosrv-7-1 not a friend
gluster> 

I tried to probe the local machine's name, but that didn't work:

gluster> peer probe iosrv-7-1
iosrv-7-1 is already part of another cluster
Probe unsuccessful
Probe failed
gluster> 

Finally I started the CLI on the second host and found this:

gluster> peer status
Number of Peers: 3
hostname:172.16.2.4, uuid:4c649194-b909-4b27-bddc-e11000455fdb, state:3 (connected)
hostname:iosrv-7-3, uuid:19be1f71-ab84-461e-a3ae-988f5e704143, state:3 (connected)
hostname:iosrv-7-4, uuid:3c7fddcf-3edc-4fa0-b6b7-ab15cc3360ce, state:3 (connected)
gluster> peer probe iosrv-7-1
Probe on host iosrv-7-1 port 6969 already a friend
gluster> 

172.16.2.4 is the IP address of the first host. Based on the documentation, I expected 'volume create' to just work without any additional probing.

Comment 1 Pranith Kumar K 2010-10-08 02:20:31 UTC
Hi Michael,
     Could you please attach the 'uname -a' output and the log files located in /usr/local/var/log/glusterfs from all of these machines? This should help us find the root cause of the issue.

Thanks
Pranith.

Comment 2 Mike Robbert 2010-10-08 13:23:19 UTC
Created attachment 341


Each log is prefixed with the name of the server it came from.

Comment 3 Mike Robbert 2010-10-08 13:24:10 UTC
(In reply to comment #1)
> hi Michael,
>      Could you please attach the 'uname -a' output, the log files located in
> /usr/local/var/log/glusterfs on all these machines. This should help us find
> the root cause of the issue.
> 
> Thanks
> Pranith.

uname -a from all 4 servers

iosrv-7-1: Linux iosrv-7-1.local 2.6.18-128.7.1.el5.ddn3.l1.6.7.2.ddn5smp #1 SMP Sat Jun 19 01:54:00 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux
iosrv-7-3: Linux iosrv-7-3.local 2.6.18-128.7.1.el5.ddn3.l1.6.7.2.ddn5smp #1 SMP Sat Jun 19 01:54:00 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux
iosrv-7-4: Linux iosrv-7-4.local 2.6.18-128.7.1.el5.ddn3.l1.6.7.2.ddn5smp #1 SMP Sat Jun 19 01:54:00 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux
iosrv-7-2: Linux iosrv-7-2.local 2.6.18-128.7.1.el5.ddn3.l1.6.7.2.ddn5smp #1 SMP Sat Jun 19 01:54:00 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux

Comment 4 Mike Robbert 2010-10-12 20:41:56 UTC
Any updates on this? I am unable to create any volumes. Today I updated to the latest code from the git.gluster.com repo and the status is the same.

Comment 5 Vijay Bellur 2010-10-13 00:32:40 UTC
(In reply to comment #4)
> Any updates on this? I am unable to create any volumes. Today I updated to the
> latest code from the git.gluster.com repo and the status is the same.

Mike,

As a workaround, add an entry for the initial server to /etc/hosts on the initial server.

Comment 6 Mike Robbert 2010-10-13 14:27:28 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Any updates on this? I am unable to create any volumes. Today I updated to the
> > latest code from the git.gluster.com repo and the status is the same.
> 
> Mike,
> 
> As a workaround, add an entry for the initial server to /etc/hosts on the
> initial server.

This is the contents of our /etc/hosts without any changes:

127.0.0.1 localhost.localdomain localhost
138.67.1.104 ra.mines.edu
172.16.2.4  iosrv-7-1.local  iosrv-7-1

Do you see any problems with these entries?

Mike
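
One way to double-check the entries quoted above is to parse the hosts-file text and confirm that both the short name and the .local name of the first server map to the same address. The sketch below is only an illustration: parse_hosts is a hypothetical helper, not part of GlusterFS, and it works on the three lines pasted in this comment rather than reading /etc/hosts itself.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue  # skip blank or comment-only lines
        ip, *hostnames = line.split()
        for name in hostnames:
            names.setdefault(name, ip)  # keep the first entry seen for a name
    return names


# The /etc/hosts contents quoted in this comment.
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
138.67.1.104 ra.mines.edu
172.16.2.4  iosrv-7-1.local  iosrv-7-1
"""

entries = parse_hosts(HOSTS)
print(entries["iosrv-7-1"], entries["iosrv-7-1.local"])
```

Both names resolve to 172.16.2.4 here, which suggests the hosts file itself already contains the entry the workaround asks for.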

Comment 7 Mike Robbert 2010-10-21 17:33:10 UTC
Is there anything I can do to help debug this and get a fix committed to git? I'm sitting dead in the water right now and I'm not seeing any evidence of progress. I'm willing to work on this, but need some assistance from a developer to know how to do it.

Comment 8 Lakshmipathi G 2010-10-22 00:15:43 UTC
(In reply to comment #7)
> Is there anything I can do to help debug this and get a fix committed to git?
> I'm sitting dead in the water right now and I'm not seeing any evidence of
> progress. I'm willing to work on this, but need some assistance from a
> developer to know how to do it.

Please check with the latest 3.1. If the issue still exists, as a workaround please use IP addresses instead of hostnames.
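
For anyone scripting this workaround, the brick arguments can be rewritten from host:/path to ip:/path before they are passed to the CLI. This is a minimal sketch under stated assumptions: bricks_by_ip is a hypothetical helper, and the lookup table contains only the one address that appears in this report (172.16.2.4 for iosrv-7-1). By default the helper falls back to Python's socket.gethostbyname.

```python
import socket


def bricks_by_ip(bricks, resolve=socket.gethostbyname):
    """Rewrite 'host:/path' brick specs as 'ip:/path' using `resolve`."""
    rewritten = []
    for brick in bricks:
        host, path = brick.split(":", 1)  # split on the first colon only
        rewritten.append(f"{resolve(host)}:{path}")
    return rewritten


# Stand-in resolver built from the single address known from this report.
known = {"iosrv-7-1": "172.16.2.4"}
print(bricks_by_ip(["iosrv-7-1:/mnt/brick1"], resolve=known.__getitem__))
```

Passing the resolver as a parameter keeps the example independent of whatever DNS or /etc/hosts setup is present on the machine running it.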

Comment 9 Mike Robbert 2010-10-25 13:46:15 UTC
The issue is still there with the latest 3.1, but the workaround does help: I was able to probe peers and create a volume using IP addresses rather than hostnames. Before doing that, I determined that the problem arises because the hostname command, which I believe uses the same underlying system call as the glusterd code, returns iosrv-7-1.local, while the code expects no domain name to be attached. It would be nice to find a fix for this, but for now I can move on, and I will open a new bug report on the next problem that I've hit (hint: client hangs on ls of a glusterfs native mount).

Mike

(In reply to comment #8)
> (In reply to comment #7)
> > Is there anything I can do to help debug this and get a fix committed to git?
> > I'm sitting dead in the water right now and I'm not seeing any evidence of
> > progress. I'm willing to work on this, but need some assistance from a
> > developer to know how to do it.
> 
> Please check with latest 3.1 - if issue still existing ,as a work around
> ,please use ipaddress instead of hostnames.
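
The mismatch described in this comment can be illustrated with a plain string comparison. This is only a sketch of the reporter's diagnosis, not glusterd's actual logic; short_name is a hypothetical helper introduced for the illustration.

```python
def short_name(hostname):
    """Return the part of a hostname before the first dot, lowercased."""
    return hostname.split(".", 1)[0].lower()


reported = "iosrv-7-1.local"  # what the hostname command returns on the first server
probed = "iosrv-7-1"          # the name used in the volume create command

# An exact comparison treats the two spellings as different hosts.
print(reported == probed)
# Comparing only the short names treats them as the same host.
print(short_name(reported) == short_name(probed))
```

Comparing short names is itself fragile, since two hosts in different domains can share a short name, so a check based on what the names resolve to is likely the more robust approach.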

Comment 10 Mike Robbert 2010-10-25 14:11:09 UTC
*** Bug 2009 has been marked as a duplicate of this bug. ***

Comment 11 Anand Avati 2010-10-27 08:13:42 UTC
PATCH: http://patches.gluster.com/patch/5582 in master (mgmt/glusterd: glusterd_is_local_addr implementation)

Comment 12 Pranith Kumar K 2010-10-27 08:47:21 UTC
Fixed in the 3.1.1qa1 release.
http://ftp.gluster.com/pub/gluster/glusterfs/qa-releases/glusterfs-3.1.1qa1.tar.gz

Comment 13 Pranith Kumar K 2010-10-28 00:36:06 UTC
The cause of the bug is that the previous implementation did not handle aliases of the local host properly. Gluster handles them now.
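
To make the idea of recognizing every alias of the local host concrete, here is a minimal sketch of one common technique: resolve the name and try to bind a socket to each resulting address, which normally succeeds only for addresses configured on the local machine. This is not the glusterd_is_local_addr code from the patch; the function below is a hypothetical Python stand-in that merely borrows the name, and it assumes the default Linux behaviour in which binding to a non-local address fails.

```python
import socket


def is_local_addr(name):
    """Return True if any address that `name` resolves to can be bound locally."""
    try:
        infos = socket.getaddrinfo(name, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # the name does not resolve at all
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                # Keep the resolved address but use port 0 so the kernel
                # picks any free port; IPv6 extras are passed through.
                sock.bind((sockaddr[0], 0) + tuple(sockaddr[2:]))
                return True
        except OSError:
            continue  # not a local address, or this family is unavailable
    return False


print(is_local_addr("127.0.0.1"))
```

Because the check works on resolved addresses rather than on the spelling of the name, iosrv-7-1, iosrv-7-1.local and 172.16.2.4 would all be treated the same way on the first server.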

Comment 14 Mike Robbert 2010-10-28 15:09:53 UTC
This patch works for me. I'm able to probe peers by hostname and create volumes, and everything works. There is still one odd issue, though. I start with a clean install on iosrv-7-1 and first probe iosrv-7-2. It shows up on iosrv-7-1 as iosrv-7-2, but when I look at the peer status on iosrv-7-2, it shows its only peer as 172.16.2.4. This is reproducible with any of my hosts. Let me know if you need any debugging info to fix this minor annoyance.

Thanks,
Mike

(In reply to comment #13)
> The reason for the bug is that in the previous implementation, the aliases are
> not handled properly for localhost. Gluster handles them now.

Comment 15 Pranith Kumar K 2010-10-29 01:35:40 UTC
(In reply to comment #14)
> This patch works for me. I'm able to probe peers by hostname and create volumes
> and everything works. There is still one odd issue though. I start with a clean
> install on iosrv-7-1 and first probe iosrv-7-2. It shows up on iosrv-7-1 as
> iosrv-7-2, but when I look at the peer status on iosrv-7-2 it shows it only
> peer as being 172.16.2.4. This is reproducible with any of my hosts. Let me
> know if you need any debugging info to fix this minor annoyance.
> 
> Thanks,
> Mike
> 
> (In reply to comment #13)
> > The reason for the bug is that in the previous implementation, the aliases are
> > not handled properly for localhost. Gluster handles them now.

Hi Michael,
    This is a known issue, bug 1995, so I am marking this bug as resolved.

Pranith

Comment 16 Amar Tumballi 2011-02-15 06:04:16 UTC
Need to check with Pranith/Vijay about what to update.

Comment 17 Divya 2011-04-13 06:25:25 UTC
The following information has been added to the Creating Trusted Storage Pools section: After a peer probe, the remote machine stores the probing machine's information with its IP address instead of its hostname.