Bug 1073071 - Glusterfs 3.4.2 data replication doesn't work for cinder backend in RDO Havana on Fedora 20 two node cluster
Summary: Glusterfs 3.4.2 data replication doesn't work for cinder backend in RDO Havan...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.4.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-03-05 17:23 UTC by Boris Derzhavets
Modified: 2015-12-01 16:45 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-19 09:46:12 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Boris Derzhavets 2014-03-05 17:23:00 UTC
Description of problem:

[root@dallas1 Data001(keystone_admin)]$ gluster peer status
Number of Peers: 1

Hostname: dallas2.localdomain
Uuid: b3b1cf43-2fec-4904-82d4-b9be03f77c5f
State: Peer in Cluster (Connected)

[root@dallas1 ~(keystone_admin)]$ gluster volume info

Volume Name: cinder-volumes002
Type: Replicate
Volume ID: 732da540-2eef-4842-90d5-55a657bcf4e6
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: dallas1.localdomain:/RPL/Data001
Brick2: dallas2.localdomain:/RPL/Data001
Options Reconfigured:
auth.allow: 192.168.1.*

[root@dallas1 ~(keystone_admin)]$ cd /RPL/Data001
[root@dallas1 Data001(keystone_admin)]$ ls -l
total 636492
-rw-rw-rw-. 2 root root 7516192768 Mar  5 20:43 volume-b3fe6e53-de83-4eb5-be7b-eded741c98dc

[root@dallas1 ~]# ssh dallas2
Last login: Wed Mar  5 19:07:39 2014

[root@dallas2 Data001]# gluster peer status
Number of Peers: 1

Hostname: 192.168.1.130
Uuid: a57433dd-4a1a-4442-a5ae-ba2f682e5c79
State: Peer in Cluster (Connected)

[root@dallas2 ~]# cd /RPL/Data001
[root@dallas2 Data001]# ls -la
total 16
drwxr-xr-x. 3 root root 4096 Mar  5 20:25 .
drwxr-xr-x. 3 root root 4096 Mar  5 20:25 ..
drw-------. 5 root root 4096 Mar  5 20:26 .glusterfs

[root@dallas2 Data001]# date
Wed Mar  5 20:53:24 MSK 2014


Version-Release number of selected component (if applicable):

[root@dallas1 Data001(keystone_admin)]$ rpm -qa | grep gluster

glusterfs-libs-3.4.2-1.fc20.x86_64
glusterfs-cli-3.4.2-1.fc20.x86_64
glusterfs-3.4.2-1.fc20.x86_64
glusterfs-server-3.4.2-1.fc20.x86_64
glusterfs-fuse-3.4.2-1.fc20.x86_64
glusterfs-api-3.4.2-1.fc20.x86_64


How reproducible:

Create a replica 2 volume on the two nodes as above :-

[root@dallas1 ~(keystone_admin)]$  gluster volume create cinder-volumes002  replica 2 dallas1.localdomain:/RPL/Data001  dallas2.localdomain:/RPL/Data001 force
volume create: cinder-volumes002: success: please start the volume to access data
[root@dallas1 ~(keystone_admin)]$  gluster volume start cinder-volumes002
volume start: cinder-volumes002: success

and observe that the data does not get replicated


Steps to Reproduce:
1.   yum install glusterfs glusterfs-server glusterfs-fuse
2.   gluster peer probe host2 ( iptables already tuned)
3.   gluster peer status
4.   create replica 2 volume
5.   observe that the data does not get replicated to the second node.

Actual results:

Data doesn't get replicated to second host

Expected results:

Data gets replicated to second host

Additional info:

Glusterfs 3.4.1 on F19 worked fine.

Comment 1 Boris Derzhavets 2014-03-06 05:42:08 UTC
Another attempt :-

[root@dallas1 ~(keystone_admin)]$ service glusterd status -l
Redirecting to /bin/systemctl status  -l glusterd.service
glusterd.service - GlusterFS an clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled)
   Active: active (running) since Wed 2014-03-05 21:59:54 MSK; 3h 18min ago
  Process: 2580 ExecStart=/usr/sbin/glusterd -p /run/glusterd.pid (code=exited, status=0/SUCCESS)
 Main PID: 2589 (glusterd)
   CGroup: /system.slice/glusterd.service
           ├─ 2589 /usr/sbin/glusterd -p /run/glusterd.pid
           ├─15412 /usr/sbin/glusterfsd -s dallas1.localdomain --volfile-id cinder-volumes012.dallas1.localdomain.FDR-Replicate -p /var/lib/glusterd/vols/cinder-volumes012/run/dallas1.localdomain-FDR-Replicate.pid -S /var/run/8ce78c26e525c50cc10b72362863e173.socket --brick-name /FDR/Replicate -l /var/log/glusterfs/bricks/FDR-Replicate.log --xlator-option *-posix.glusterd-uuid=a57433dd-4a1a-4442-a5ae-ba2f682e5c79 --brick-port 49155 --xlator-option cinder-volumes012-server.listen-port=49155
           ├─15424 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/2e81d8930636bcf11b9ff2c39a16bb8b.socket
           ├─15428 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/d76b52c3f5f530727ca59045ef42b023.socket --xlator-option *replicate*.node-uuid=a57433dd-4a1a-4442-a5ae-ba2f682e5c79
           └─15452 /sbin/rpc.statd

Mar 06 00:44:14 dallas1.localdomain systemd[1]: Started GlusterFS an clustered file-system server.
Mar 06 00:52:01 dallas1.localdomain rpc.statd[10223]: Version 1.2.9 starting
Mar 06 00:52:01 dallas1.localdomain sm-notify[10224]: Version 1.2.9 starting
Mar 06 01:18:20 dallas1.localdomain rpc.statd[15452]: Version 1.2.9 starting
Mar 06 01:18:20 dallas1.localdomain sm-notify[15453]: Version 1.2.9 starting

# gluster volume info cinder-volumes012

Volume Name: cinder-volumes012
Type: Replicate
Volume ID: 9ee31c6c-0ae3-4fee-9886-b9cb6a518f48
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: dallas1.localdomain:/FDR/Replicate
Brick2: dallas2.localdomain:/FDR/Replicate
Options Reconfigured:
auth.allow: 192.168.1.*

[root@dallas1 ~]# cd /FDR/Replicate
[root@dallas1 Replicate]# pwd
/FDR/Replicate
[root@dallas1 Replicate]# vi test
[root@dallas1 Replicate]# ls -l
total 4
-rw-r--r--. 1 root root 60 Mar  6 00:54 test

[root@dallas1 Replicate]# ssh 192.168.1.140
Last login: Thu Mar  6 00:59:55 2014 from dallas1.localdomain
[root@dallas2 ~]# cd /FDR/Replicate
[root@dallas2 Replicate]# ls -la
total 16
drwxr-xr-x. 3 root root 4096 Mar  6 00:52 .
drwxr-xr-x. 3 root root 4096 Mar  6 00:50 ..
drw-------. 5 root root 4096 Mar  6 00:52 .glusterfs

/var/log/glusterfs/glustershd.log

Given volfile:
+------------------------------------------------------------------------------+
  1: volume cinder-volumes012-client-0
  2:     type protocol/client
  3:     option password 6d588916-1f8b-4af5-9cc7-79a257bdbc2d
  4:     option username 85039418-e593-4bdf-be9b-647ac7e8a7fc
  5:     option transport-type tcp
  6:     option remote-subvolume /FDR/Replicate
  7:     option remote-host dallas1.localdomain
  8: end-volume
  9: 
 10: volume cinder-volumes012-client-1
 11:     type protocol/client
 12:     option password 6d588916-1f8b-4af5-9cc7-79a257bdbc2d
 13:     option username 85039418-e593-4bdf-be9b-647ac7e8a7fc
 14:     option transport-type tcp
 15:     option remote-subvolume /FDR/Replicate
 16:     option remote-host dallas2.localdomain
 17: end-volume
 18: 
 19: volume cinder-volumes012-replicate-0
 20:     type cluster/replicate
 21:     option iam-self-heal-daemon yes
 22:     option self-heal-daemon on
 23:     option entry-self-heal on
 24:     option data-self-heal on
 25:     option metadata-self-heal on
 26:     option background-self-heal-count 0
 27:     subvolumes cinder-volumes012-client-0 cinder-volumes012-client-1
 28: end-volume
 29: 
 30: volume glustershd
 31:     type debug/io-stats
 32:     subvolumes cinder-volumes012-replicate-0
 33: end-volume

+------------------------------------------------------------------------------+
[2014-03-05 20:52:01.228469] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-cinder-volumes012-client-0: changing port to 49155 (from 0)
[2014-03-05 20:52:01.228548] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-0: readv failed (No data available)
[2014-03-05 20:52:01.231478] I [client-handshake.c:1659:select_server_supported_programs] 0-cinder-volumes012-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-03-05 20:52:01.234667] I [client-handshake.c:1456:client_setvolume_cbk] 0-cinder-volumes012-client-0: Connected to 192.168.1.130:49155, attached to remote volume '/FDR/Replicate'.
[2014-03-05 20:52:01.234729] I [client-handshake.c:1468:client_setvolume_cbk] 0-cinder-volumes012-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2014-03-05 20:52:01.234806] I [afr-common.c:3698:afr_notify] 0-cinder-volumes012-replicate-0: Subvolume 'cinder-volumes012-client-0' came back up; going online.
[2014-03-05 20:52:01.234864] I [client-handshake.c:450:client_set_lk_version_cbk] 0-cinder-volumes012-client-0: Server lk version = 1
[2014-03-05 20:52:01.287385] I [afr-self-heald.c:1180:afr_dir_exclusive_crawl] 0-cinder-volumes012-replicate-0: Another crawl is in progress for cinder-volumes012-client-0
**[2014-03-05 20:52:01.287428] E [afr-self-heald.c:1067:afr_find_child_position] 0-cinder-volumes012-replicate-0: getxattr failed on cinder-volumes012-client-1 - (Transport endpoint is not connected)**
[2014-03-05 20:52:02.233076] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-cinder-volumes012-client-1: changing port to 49154 (from 0)
[2014-03-05 20:52:02.233161] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
--------------------------------------------------------------------------
[2014-03-05 20:52:02.236990] E [socket.c:2157:socket_connect_finish] 0-cinder-volumes012-client-1: connection to 192.168.1.140:49154 failed (No route to host)
-------------------------------------------------------------------------------
[2014-03-05 20:52:02.237033] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:06.152366] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:09.156512] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:12.160603] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:15.164733] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:18.168909] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:21.173047] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:24.179558] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:27.181559] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:30.185859] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:33.190036] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)
[2014-03-05 20:52:36.193647] W [socket.c:514:__socket_rwv] 0-cinder-volumes012-client-1: readv failed (No data available)

Comment 2 Joe Julian 2014-03-06 05:55:02 UTC
First issue:

GlusterFS is not a file replication service. It is a clustered filesystem. You must mount that filesystem to take advantage of any of the features.

Writing directly to the bricks is similar to writing to /dev/sda1 with dd and expecting to find the file under your ext4 filesystem.

Please mount your volume and use that client mountpoint for your application.

Do not write directly to the bricks.
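
For example (the mountpoint /mnt/gluster below is only illustrative):

# mkdir -p /mnt/gluster
# mount -t glusterfs dallas1.localdomain:/cinder-volumes002 /mnt/gluster
# echo test > /mnt/gluster/testfile
# ls -l /RPL/Data001/testfile               <- shows up on the local brick
# ssh dallas2 ls -l /RPL/Data001/testfile   <- should also show up here once both bricks are reachable

A file written through the client mount is replicated to both bricks (provided both brick processes are reachable); a file written straight into /RPL/Data001 is not.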

----

Second issue:

Note the errors in your logs " E " such as, "Transport endpoint is not connected" and "No route to host". Firewall maybe?

This is not a bug. Please feel free to use the IRC channel and/or mailing list for support on either of these issues.

Comment 3 Boris Derzhavets 2014-03-06 08:50:41 UTC
No firewall.

OK. When I configured (on 01/23/14)  Cinder on F20 to work with similar  volume :-

# openstack-config --set /etc/cinder/cinder.conf DEFAULT volume_driver cinder.volume.drivers.glusterfs.GlusterfsDriver

# openstack-config --set /etc/cinder/cinder.conf DEFAULT glusterfs_shares_config /etc/cinder/shares.conf

# openstack-config --set /etc/cinder/cinder.conf DEFAULT glusterfs_mount_point_base /var/lib/cinder/volumes

# vi /etc/cinder/shares.conf
    192.168.1.127:cinder-volumes05

:wq

# for i in api scheduler volume ; do service openstack-cinder-${i} restart ; done
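
Those openstack-config calls should leave a fragment in /etc/cinder/cinder.conf roughly like the following (a sketch; only the touched keys are shown):

[DEFAULT]
volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config = /etc/cinder/shares.conf
glusterfs_mount_point_base = /var/lib/cinder/volumes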

Everything worked fine: the Compute Node mounted 192.168.1.127:cinder-volumes05
on /var/lib/nova/mnt/zzzzzzzzzzzzzzzzzz. The Cinder volume created on the Controller was there. Now I cannot reproduce this setup because the mounted directory is empty.

Build as of 01/23/14 :-

http://bderzhavets.blogspot.com/2014/01/setting-up-two-physical-node-openstack.html

After cinder service restart with new /etc/cinder/cinder.conf

[root@dfw02 cinder(keystone_admin)]$ df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/fedora00-root        96G  7.4G   84G   9% /
devtmpfs                        3.9G     0  3.9G   0% /dev
tmpfs                           3.9G  152K  3.9G   1% /dev/shm
tmpfs                           3.9G  1.2M  3.9G   1% /run
tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                           3.9G  184K  3.9G   1% /tmp
/dev/sda5                       477M  101M  347M  23% /boot
/dev/mapper/fedora00-data1       77G   53M   73G   1% /data1
tmpfs                           3.9G  1.2M  3.9G   1% /run/netns
192.168.1.127:cinder-volumes05   77G   52M   73G   1% /var/lib/cinder/volumes/62f75cf6996a8a6bcc0d343be378c10a

At runtime on Compute Node :-

[root@dfw01 ~]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/fedora-root          96G   54G   38G  59% /
devtmpfs                        3.9G     0  3.9G   0% /dev
tmpfs                           3.9G  484K  3.9G   1% /dev/shm
tmpfs                           3.9G  1.3M  3.9G   1% /run
tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                           3.9G   36K  3.9G   1% /tmp
/dev/sda5                       477M  121M  327M  27% /boot
/dev/mapper/fedora-data1         77G  6.7G   67G  10% /data1
192.168.1.127:cinder-volumes05   77G  6.7G   67G  10% /var/lib/nova/mnt/62f75cf6996a8a6bcc0d343be378c10a

Now /var/lib/nova/mnt/62f75cf6996a8a6bcc0d343be378c10a is empty. /var/log/nova/compute.log reports no volumes in this directory; however, on the Controller
/var/lib/cinder/volumes/62f75cf6996a8a6bcc0d343be378c10a does contain a cinder
volume. The RDO openstack-cinder-volume service is responsible for this mount, not me, and the mount appears to be useless now.


Please view how a replicated Gluster volume works on CentOS 6.5 (Glusterfs 3.4.1 on F19 worked in the same way - exactly as a replication service, when a replicated volume had been created for this purpose).

I just followed Andrew Klau on CentOS 6.5 and Fedora 20 (01/23/2014) and succeeded in creating a replica 2 volume as backend for the cinder service in RDO Havana. Moreover, I manually created a new file in the replicated directory and it was mirrored to the second host as well.

http://www.andrewklau.com/getting-started-with-multi-node-openstack-rdo-havana-gluster-backend-neutron/

Comment 4 Boris Derzhavets 2014-03-06 08:57:17 UTC
Fixing typos  above:-

Please view how a replicated Gluster volume works on CentOS 6.5 (Glusterfs 3.4.1 on F19 worked in the same way - exactly as a replication service, when a replicated volume had been created for this purpose).

I just followed Andrew Klau on CentOS 6.5 and Fedora 20 (01/23/2014) and succeeded in creating a replica 2 volume as backend for the cinder service in RDO Havana.  Moreover, I manually created a new file in the replicated directory and it was mirrored to the second host as well.

http://www.andrewklau.com/getting-started-with-multi-node-openstack-rdo-havana-gluster-backend-neutron/

If 3.4.2 works in a different way than 3.4.1, please let me know where the instructions are located.

Comment 5 Niels de Vos 2014-03-06 10:16:09 UTC
(In reply to Boris Derzhavets from comment #4)
> If 3.4.2 works in different way then 3.4.1 . Please, let me know where
> instructions are located.

Gluster does not work any differently. There could be a change in OpenStack that introduces this regression.

You should verify that you can mount the Gluster volume on the system. Try manually with a command like this:

  # mount -t glusterfs dallas1.localdomain:cinder-volumes012 /mnt

If this fails, make sure you have the glusterfs-fuse package installed, and check for any errors in the log for this (/mnt) mountpoint: /var/log/glusterfs/mnt.log
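
If the mount succeeds, a quick sanity check could look like this (/mnt is just an example mountpoint; /FDR/Replicate is the brick directory of this volume):

  # df -h /mnt
  # touch /mnt/healthcheck
  # ls -l /FDR/Replicate/healthcheck    <- on each brick host, the file should appear inside the brick directory
  # tail -n 20 /var/log/glusterfs/mnt.log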

Comment 6 Boris Derzhavets 2014-03-06 10:42:45 UTC
Report from the Two Node Cluster built on 01/23/14 and yum updated about 10 days ago :-

On Controller :-

[root@dfw02 ~(keystone_boris)]$ uname -a
Linux dfw02.localdomain 3.13.4-200.fc20.x86_64 #1 SMP Thu Feb 20 23:00:47 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux


    # gluster volume info cinder-volumes05
    Volume Name: cinder-volumes05
    Type: Replicate
    Volume ID: 029b210d-1fd6-4276-b6d1-86b75078113b
    Status: Started
    Number of Bricks: 1 x 2 = 2
    Transport-type: tcp
    Bricks:
    Brick1: dfw02.localdomain:/data1/cinder5
    Brick2: dfw01.localdomain:/data1/cinder5
    Options Reconfigured:
    auth.allow: 192.168.1.*
    storage.owner-uid: 165
    storage.owner-gid: 165

    [root@dfw02 ~(keystone_admin)]$ cd /data1/cinder5
    [root@dfw02 cinder5(keystone_admin)]$ ls -la
    total 8822912
    drwxr-xr-x.  3 cinder cinder       4096 Mar  3 17:15 .
    drwxr-xr-x.  7 root   root         4096 Jan 25 06:59 ..
    drw-------. 65 root   root         4096 Mar  3 17:15 .glusterfs
    -rw-rw-rw-.  2 root   root   7516192768 Mar  2 01:20 volume-1a4f3831-417e-4002-8302-0473494416b7
    -rw-rw-rw-.  2 root   root   7516192768 Mar  2 10:02 volume-6d51b40b-7a91-4fb6-97b6-725858bbc035
    -rw-rw-rw-.  2 root   root   7516192768 Mar  1 16:29 volume-9b0dcb5a-5baa-4f7a-9163-0dea3b848ac1
    -rw-rw-rw-.  2 root   root   7516192768 Mar  3 18:17 volume-a3764041-cf99-4928-93af-21a13707349d
    -rw-rw-rw-.  2 root   root   7516192768 Feb 27 20:24 volume-de8451cf-df5e-4e7b-9cc0-182d85d7fe40

    [root@dfw02 cinder5(keystone_admin)]$ rpm -qa | grep gluster

    glusterfs-api-3.4.2-1.fc20.x86_64
    glusterfs-server-3.4.2-1.fc20.x86_64
    glusterfs-fuse-3.4.2-1.fc20.x86_64
    glusterfs-3.4.2-1.fc20.x86_64
    glusterfs-cli-3.4.2-1.fc20.x86_64
    glusterfs-libs-3.4.2-1.fc20.x86_64


    On Compute :-

    [root@dfw02 ~]# ssh dfw01.localdomain
    The authenticity of host 'dfw01.localdomain (192.168.1.137)' can't be established.
    ECDSA key fingerprint is e8:8a:71:91:c0:3a:41:6f:8c:11:dc:52:5a:a8:84:73.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'dfw01.localdomain' (ECDSA) to the list of known hosts.
    Last login: Mon Mar  3 17:09:04 2014

    [root@dfw01 ~]# cd /data1/cinder5

    [root@dfw01 cinder5]# ls -la
    total 8823088
    drwxr-xr-x.  3 cinder cinder       4096 Mar  3 17:15 .
    drwxr-xr-x.  7 root   root         4096 Jan 25 06:59 ..
    drw-------. 65 root   root         4096 Mar  3 17:15 .glusterfs
    -rw-rw-rw-.  2 root   root   7516192768 Mar  2 01:20 volume-1a4f3831-417e-4002-8302-0473494416b7
    ----------------------------------------------------------------------------
    -rw-rw-rw-.  2 qemu   qemu   7516192768 Mar  6 14:23 volume-6d51b40b-7a91-4fb6-97b6-725858bbc035 <-  owner switched from root to qemu
    ---------------------------------------------------------------------------
    -rw-rw-rw-.  2 root   root   7516192768 Mar  1 16:29 volume-9b0dcb5a-5baa-4f7a-9163-0dea3b848ac1
    -rw-rw-rw-.  2 root   root   7516192768 Mar  3 18:17 volume-a3764041-cf99-4928-93af-21a13707349d
    -rw-rw-rw-.  2 root   root   7516192768 Feb 27 20:24 volume-de8451cf-df5e-4e7b-9cc0-182d85d7fe40

    It's obvious now that volume-6d51b40b-7a91-4fb6-97b6-725858bbc035 is attached to a running instance (all instances are cinder-volume based).

    Runtime snapshot from compute :-

    [root@dfw01 cinder5]# df -h
    Filesystem                      Size  Used Avail Use% Mounted on
    /dev/mapper/fedora-root          96G   50G   43G  54% /
    devtmpfs                        3.9G     0  3.9G   0% /dev
    tmpfs                           3.9G   84K  3.9G   1% /dev/shm
    tmpfs                           3.9G  1.3M  3.9G   1% /run
    tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
    tmpfs                           3.9G   16K  3.9G   1% /tmp
    /dev/sda5                       477M  122M  327M  28% /boot
    /dev/mapper/fedora-data1         77G  8.5G   65G  12% /data1
    192.168.1.127:cinder-volumes05   77G  8.5G   65G  12% /var/lib/nova/mnt/62f75cf6996a8a6bcc0d343be378c10a

    Runtime snapshot on Controller :-

    [root@dfw02 ~(keystone_admin)]$ df -h
    Filesystem                      Size  Used Avail Use% Mounted on
    /dev/mapper/fedora00-root        96G   38G   54G  41% /
    devtmpfs                        3.9G     0  3.9G   0% /dev
    tmpfs                           3.9G   92K  3.9G   1% /dev/shm
    tmpfs                           3.9G  9.1M  3.9G   1% /run
    tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
    tmpfs                           3.9G  140K  3.9G   1% /tmp
    /dev/sda5                       477M  122M  327M  28% /boot
    /dev/mapper/fedora00-data1       77G  8.5G   65G  12% /data1
    192.168.1.127:cinder-volumes05   77G  8.5G   65G  12% /var/lib/cinder/volumes/62f75cf6996a8a6bcc0d343be378c10a
    tmpfs                           3.9G  9.1M  3.9G   1% /run/netns


    [root@dfw02 ~(keystone_boris)]$ nova list
    +--------------------------------------+---------------+-----------+------------+-------------+------------------------------+
    | ID                                   | Name          | Status    | Task State | Power State | Networks                     |
    +--------------------------------------+---------------+-----------+------------+-------------+------------------------------+
    | 6aeef7ca-0b52-4668-b342-6a9c2299d6e5 | UbuntuGLS     | SUSPENDED | None       | Shutdown    | int1=40.0.0.7, 192.168.1.103 |
    | 4c2609c1-2ade-4992-bf3a-d55337cc4430 | UbuntuTBeta01 | SUSPENDED | None       | Shutdown    | int1=40.0.0.5, 192.168.1.104 |
    | 7953950c-112c-4c59-b183-5cbd06eabcf6 | VF19WXL       | SUSPENDED | None       | Shutdown    | int1=40.0.0.6, 192.168.1.121 |
    | 784e8afc-d41a-4c2e-902a-8e109a40f7db | VF20GLS       | ACTIVE    | None       | Running     | int1=40.0.0.4, 192.168.1.102 |
    | 376cff66-efb8-4a92-8611-3887346acdbb | VF20TST       | SUSPENDED | None       | Shutdown    | int1=40.0.0.2, 192.168.1.105 |
    +--------------------------------------+---------------+-----------+------------+-------------+------------------------------+
    [root@dfw02 ~(keystone_boris)]$ nova show 784e8afc-d41a-4c2e-902a-8e109a40f7db
    +--------------------------------------+----------------------------------------------------------+
    | Property                             | Value                                                    |
    +--------------------------------------+----------------------------------------------------------+
    | status                               | ACTIVE                                                   |
    | updated                              | 2014-03-06T10:16:50Z                                     |
    | OS-EXT-STS:task_state                | None                                                     |
    | key_name                             | None                                                     |
    | image                                | Attempt to boot from volume - no image supplied          |
    | hostId                               | 73ee4f5bd4da8ad7b39d768d0b167a03ac0471ea50d9ded6c6190fb1 |
    | OS-EXT-STS:vm_state                  | active                                                   |
    | OS-SRV-USG:launched_at               | 2014-02-28T07:35:51.000000                               |
    | flavor                               | m1.small (2)                                             |
    | id                                   | 784e8afc-d41a-4c2e-902a-8e109a40f7db                     |
    | security_groups                      | [{u'name': u'default'}]                                  |
    | OS-SRV-USG:terminated_at             | None                                                     |
    | user_id                              | 162021e787c54cac906ab3296a386006                         |
    | name                                 | VF20GLS                                                  |
    | created                              | 2014-02-28T07:35:46Z                                     |
    | tenant_id                            | 4dacfff9e72c4245a48d648ee23468d5                         |
    | OS-DCF:diskConfig                    | MANUAL                                                   |
    | metadata                             | {}                                                       |
    | os-extended-volumes:volumes_attached | [{u'id': u'6d51b40b-7a91-4fb6-97b6-725858bbc035'}]   <-- "Volume's id is shown"  |
    | accessIPv4                           |                                                          |
    | accessIPv6                           |                                                          |
    | progress                             | 0                                                        |
    | OS-EXT-STS:power_state               | 1                                                        |
    | OS-EXT-AZ:availability_zone          | nova                                                     |
    | int1 network                         | 40.0.0.4, 192.168.1.102                                  |
    | config_drive                         |                                                          |
    +--------------------------------------+----------------------------------------------------------+

Comment 7 Boris Derzhavets 2014-03-06 11:07:31 UTC
(In reply to Niels de Vos from comment #5)
> (In reply to Boris Derzhavets from comment #4)
> > If 3.4.2 works in different way then 3.4.1 . Please, let me know where
> > instructions are located.
> 
> Gluster does not work any different. There could be a change in OpenStack
> that introduces this regression.
> 
> You should verify that you can mount the Gluster volume on the system. Try
> manually with a command like this:
> 
>   # mount -t glusterfs dallas1.localdomain:cinder-volumes012 /mnt
> 
> If this fails, make sure you have the glusterfs-fuse package installed, and
> check for any errors in the log for this (/mnt) mountoint:
> /var/log/glusterfs/mnt.log

Compute node mounts :-

dallas1.localdomain:cinder-volumes012 on /var/lib/cinder/volumes/xxxxxxxxxxxxxxx

and then reports in /var/log/nova/compute.log on the Compute node that no cinder volumes are inside that folder. Looks like you are correct:
the regression seems to be in RDO OpenStack Havana on F20.

I am changing the header of the bug, which will readdress the issue.

Comment 8 Boris Derzhavets 2014-03-08 08:27:05 UTC
(In reply to Niels de Vos from comment #5)
> (In reply to Boris Derzhavets from comment #4)
> > If 3.4.2 works in different way then 3.4.1 . Please, let me know where
> > instructions are located.
> 
> Gluster does not work any different. There could be a change in OpenStack
> that introduces this regression.
> 
> You should verify that you can mount the Gluster volume on the system. Try
> manually with a command like this:
> 
>   # mount -t glusterfs dallas1.localdomain:cinder-volumes012 /mnt
> 
> If this fails, make sure you have the glusterfs-fuse package installed, and
> check for any errors in the log for this (/mnt) mountoint:
> /var/log/glusterfs/mnt.log

Attempt to mount a one-brick volume on the server (no replication involved) :-

# mount -t glusterfs 192.168.1.130:cinder-volume01 /mnt

Failure

glusterfs-fuse 3.4.2 is installed

[root@dallas2 ~]# cat /var/log/glusterfs/mnt.log

[2014-03-08 07:44:03.842208] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=cinder-volume01 --volfile-server=192.168.1.130 /mnt)
[2014-03-08 07:44:03.864046] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-08 07:44:03.864087] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-08 07:44:03.891999] I [socket.c:3480:socket_init] 0-cinder-volume01-client-0: SSL support is NOT enabled
[2014-03-08 07:44:03.892022] I [socket.c:3495:socket_init] 0-cinder-volume01-client-0: using system polling thread
[2014-03-08 07:44:03.892040] I [client.c:2154:notify] 0-cinder-volume01-client-0: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume cinder-volume01-client-0
  2:     type protocol/client
  3:     option transport-type tcp
  4:     option remote-subvolume /rhs/brick1/cinder-volume01
  5:     option remote-host 192.168.1.130
  6: end-volume
  7: 
  8: volume cinder-volume01-dht
  9:     type cluster/distribute
 10:     subvolumes cinder-volume01-client-0
 11: end-volume
 12: 
 13: volume cinder-volume01-write-behind
 14:     type performance/write-behind
 15:     subvolumes cinder-volume01-dht
 16: end-volume
 17: 
 18: volume cinder-volume01-read-ahead
 19:     type performance/read-ahead
 20:     subvolumes cinder-volume01-write-behind
 21: end-volume
 22: 
 23: volume cinder-volume01-io-cache
 24:     type performance/io-cache
 25:     subvolumes cinder-volume01-read-ahead
 26: end-volume
 27: 
 28: volume cinder-volume01-quick-read
 29:     type performance/quick-read
 30:     subvolumes cinder-volume01-io-cache
 31: end-volume
 32: 
 33: volume cinder-volume01-open-behind
 34:     type performance/open-behind
 35:     subvolumes cinder-volume01-quick-read
 36: end-volume
 37: 
 38: volume cinder-volume01-md-cache
 39:     type performance/md-cache
 40:     subvolumes cinder-volume01-open-behind
 41: end-volume
 42: 
 43: volume cinder-volume01
 44:     type debug/io-stats
 45:     option count-fop-hits off
 46:     option latency-measurement off
 47:     subvolumes cinder-volume01-md-cache
 48: end-volume

+------------------------------------------------------------------------------+
[2014-03-08 07:44:03.895591] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-cinder-volume01-client-0: changing port to 49152 (from 0)
[2014-03-08 07:44:03.895624] W [socket.c:514:__socket_rwv] 0-cinder-volume01-client-0: readv failed (No data available)
[2014-03-08 07:44:03.899006] E [socket.c:2157:socket_connect_finish] 0-cinder-volume01-client-0: connection to 192.168.1.130:49152 failed (No route to host)
[2014-03-08 07:44:03.902745] I [fuse-bridge.c:4769:fuse_graph_setup] 0-fuse: switched to graph 0
[2014-03-08 07:44:03.902861] W [socket.c:514:__socket_rwv] 0-cinder-volume01-client-0: readv failed (No data available)
[2014-03-08 07:44:03.903259] I [fuse-bridge.c:3724:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.22
[2014-03-08 07:44:03.904812] W [fuse-bridge.c:705:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (No such file or directory)
[2014-03-08 07:44:03.922641] I [fuse-bridge.c:4628:fuse_thread_proc] 0-fuse: unmounting /mnt
[2014-03-08 07:44:03.922955] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libc.so.6(clone+0x6d) [0x7f8c0fdcbded] (-->/usr/lib64/libpthread.so.0(+0x3a5c407f33) [0x7f8c10484f33] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f8c11177f75]))) 0-: received signum (15), shutting down
[2014-03-08 07:44:03.922978] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt'.

Comment 9 Niels de Vos 2014-03-08 13:45:22 UTC
It seems that your glusterfs client on dallas2 cannot connect to 192.168.1.130. This IP-address was used to create the volume, and all clients require a direct connection to the bricks. Please make sure that dallas2 has a route to 192.168.1.130. If the server hosting the brick has multiple network interfaces, you can use the FQDN, or an accessible IP-address.

Additional information and hints can be found here:
http://hekafs.org/index.php/2013/01/split-and-secure-networks-for-glusterfs/

Hint from the log:
[2014-03-08 07:44:03.899006] E [socket.c:2157:socket_connect_finish]
0-cinder-volume01-client-0: connection to 192.168.1.130:49152 failed (No route
to host)
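
A quick check from dallas2 could be (output will of course differ per setup):

  # ip route get 192.168.1.130
  # ping -c 2 192.168.1.130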

Comment 10 Niels de Vos 2014-03-08 13:51:08 UTC
Oh, wait... scratch my last comment.

You're mounting from 192.168.1.130, and the volume layout is returned (by the glusterd service listening on port 24007). It is awkward that port 49152 (the brick process) on the same server returns "No route to host".

Could you please verify the following?

1. firewall should allow access to port 49152
2. 'gluster volume status cinder-volume01'
   - the brick should be marked as Online=Y and have a port and PID
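
For check 1, a simple probe from the client side (dallas2) can tell whether the brick port is reachable, for example with nc/ncat if it is installed:

  # nc -v 192.168.1.130 24007
  # nc -v 192.168.1.130 49152

If 24007 connects but 49152 reports "No route to host" or "Connection refused", the firewall on the brick host is the most likely culprit.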

Comment 11 Boris Derzhavets 2014-03-08 15:16:46 UTC
(In reply to Niels de Vos from comment #10)
> Oh, wait... scratch my last comment.
> 
> You're mounting from 192.168.1.130, and the volume layout is returned (by
> the glusterd service listening on port 24007). It is awkward that port 49152
> (the brick process) on the same server returns "No route to host".
> 
> Could you please verify the following?
> 
> 1. firewall should allow access to port 49152
> 2. 'gluster volume status cinder-volume01'
>    - the brick should be marked as Online=Y and have a port and PID

Here we go :-

On Controller (192.168.1.130)
-------------------------------------------------------------------------------
[root@dallas1 ~]# gluster volume status cinder-volume01
Status of volume: cinder-volume01
Gluster process 	                 		Port	Online	Pid
------------------------------------------------------------------------------
Brick 192.168.1.130:/rhs/brick1/cinder-volume01		49152	Y	9645
NFS Server on localhost 				2049	Y	9656
NFS Server on dallas2.localdomain			2049	Y	12741
 
There are no active volume tasks
[root@dallas1 ~]# netstat -lntp | grep 49152
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      9645/glusterfsd     
[root@dallas1 ~]# netstat -lntp | grep 24007
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      9132/glusterd   
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      9656/glusterfs 

--------------------------------------------------------------------------------

On Compute (192.168.1.140):-

--------------------------------------------------------------------------------
[root@dallas2 ~]# mount -t glusterfs 192.168.1.130:cinder-volume01 /mnt
Mount failed. Please check the log file for more details.
--------------------------------------------------------------------------------
Log file :-

[2014-03-08 08:29:03.024204] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=cinder-volume01 --volfile-server=192.168.1.130 /mnt)
[2014-03-08 08:29:03.033256] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-08 08:29:03.033311] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-08 08:29:03.045233] I [socket.c:3480:socket_init] 0-cinder-volume01-client-0: SSL support is NOT enabled
[2014-03-08 08:29:03.045255] I [socket.c:3495:socket_init] 0-cinder-volume01-client-0: using system polling thread
[2014-03-08 08:29:03.045274] I [client.c:2154:notify] 0-cinder-volume01-client-0: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume cinder-volume01-client-0
  2:     type protocol/client
  3:     option transport-type tcp
  4:     option remote-subvolume /rhs/brick1/cinder-volume01
  5:     option remote-host 192.168.1.130
  6: end-volume
  7: 
  8: volume cinder-volume01-dht
  9:     type cluster/distribute
 10:     subvolumes cinder-volume01-client-0
 11: end-volume
 12: 
 13: volume cinder-volume01-write-behind
 14:     type performance/write-behind
 15:     subvolumes cinder-volume01-dht
 16: end-volume
 17: 
 18: volume cinder-volume01-read-ahead
 19:     type performance/read-ahead
 20:     subvolumes cinder-volume01-write-behind
 21: end-volume
 22: 
 23: volume cinder-volume01-io-cache
 24:     type performance/io-cache
 25:     subvolumes cinder-volume01-read-ahead
 26: end-volume
 27: 
 28: volume cinder-volume01-quick-read
 29:     type performance/quick-read
 30:     subvolumes cinder-volume01-io-cache
 31: end-volume
 32: 
 33: volume cinder-volume01-open-behind
 34:     type performance/open-behind
 35:     subvolumes cinder-volume01-quick-read
 36: end-volume
 37: 
 38: volume cinder-volume01-md-cache
 39:     type performance/md-cache
 40:     subvolumes cinder-volume01-open-behind
 41: end-volume
 42: 
 43: volume cinder-volume01
 44:     type debug/io-stats
 45:     option count-fop-hits off
 46:     option latency-measurement off
 47:     subvolumes cinder-volume01-md-cache
 48: end-volume

+------------------------------------------------------------------------------+
[2014-03-08 08:29:03.048820] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-cinder-volume01-client-0: changing port to 49152 (from 0)
[2014-03-08 08:29:03.048847] W [socket.c:514:__socket_rwv] 0-cinder-volume01-client-0: readv failed (No data available)
[2014-03-08 08:29:03.052236] E [socket.c:2157:socket_connect_finish] 0-cinder-volume01-client-0: connection to 192.168.1.130:49152 failed (No route to host)
[2014-03-08 08:29:03.056117] I [fuse-bridge.c:4769:fuse_graph_setup] 0-fuse: switched to graph 0
[2014-03-08 08:29:03.056228] W [socket.c:514:__socket_rwv] 0-cinder-volume01-client-0: readv failed (No data available)
[2014-03-08 08:29:03.056322] I [fuse-bridge.c:3724:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.22
[2014-03-08 08:29:03.057786] W [fuse-bridge.c:705:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (No such file or directory)
[2014-03-08 08:29:03.069326] I [fuse-bridge.c:4628:fuse_thread_proc] 0-fuse: unmounting /mnt
[2014-03-08 08:29:03.069639] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libc.so.6(clone+0x6d) [0x7f2a5239eded] (-->/usr/lib64/libpthread.so.0(+0x3a5c407f33) [0x7f2a52a57f33] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f2a5374af75]))) 0-: received signum (15), shutting down
[2014-03-08 08:29:03.069663] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt'.
[2014-03-08 14:50:59.570357] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=cinder-volume01 --volfile-server=192.168.1.130 /mnt)
[2014-03-08 14:50:59.602928] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-08 14:50:59.602970] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-08 14:50:59.612334] E [socket.c:2157:socket_connect_finish] 0-glusterfs: connection to 192.168.1.130:24007 failed (No route to host)
[2014-03-08 14:50:59.612392] E [glusterfsd-mgmt.c:1875:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Transport endpoint is not connected
[2014-03-08 14:50:59.612418] I [glusterfsd-mgmt.c:1878:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2014-03-08 14:50:59.612623] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7fb731f0d513] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7fb731f11140] (-->/usr/sbin/glusterfs(+0xd49a) [0x7fb7325cb49a]))) 0-: received signum (1), shutting down
[2014-03-08 14:50:59.612645] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt'.
[2014-03-08 14:54:35.757439] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=cinder-volume01 --volfile-server=192.168.1.130 /mnt)
[2014-03-08 14:54:35.770182] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-08 14:54:35.770225] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-08 14:54:35.775459] E [socket.c:2157:socket_connect_finish] 0-glusterfs: connection to 192.168.1.130:24007 failed (No route to host)
[2014-03-08 14:54:35.775494] E [glusterfsd-mgmt.c:1875:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Transport endpoint is not connected
[2014-03-08 14:54:35.775511] I [glusterfsd-mgmt.c:1878:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2014-03-08 14:54:35.775668] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f2b072cd513] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7f2b072d1140] (-->/usr/sbin/glusterfs(+0xd49a) [0x7f2b0798b49a]))) 0-: received signum (1), shutting down
[2014-03-08 14:54:35.775698] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt'.
[2014-03-08 14:54:35.781716] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libc.so.6(clone+0x6d) [0x7f2b065d9ded] (-->/usr/lib64/libpthread.so.0(+0x3a5c407f33) [0x7f2b06c92f33] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f2b07985f75]))) 0-: received signum (15), shutting down

[2014-03-08 15:12:10.976891] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.4.2 (/usr/sbin/glusterfs --volfile-id=cinder-volume01 --volfile-server=192.168.1.130 /mnt)
[2014-03-08 15:12:10.984602] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-08 15:12:10.984642] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-08 15:12:10.989434] E [socket.c:2157:socket_connect_finish] 0-glusterfs: connection to 192.168.1.130:24007 failed (No route to host)
[2014-03-08 15:12:10.989490] E [glusterfsd-mgmt.c:1875:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Transport endpoint is not connected
[2014-03-08 15:12:10.989507] I [glusterfsd-mgmt.c:1878:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2014-03-08 15:12:10.990164] W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f4cfe595513] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7f4cfe599140] (-->/usr/sbin/glusterfs(+0xd49a) [0x7f4cfec5349a]))) 0-: received signum (1), shutting down
[2014-03-08 15:12:10.990192] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt'.

Comment 12 Boris Derzhavets 2014-03-08 15:21:10 UTC
On Compute:-

[root@dallas2 ~]# rpm -qa | grep glusterfs
glusterfs-api-3.4.2-1.fc20.x86_64
glusterfs-fuse-3.4.2-1.fc20.x86_64
glusterfs-server-3.4.2-1.fc20.x86_64
glusterfs-3.4.2-1.fc20.x86_64
glusterfs-cli-3.4.2-1.fc20.x86_64
glusterfs-libs-3.4.2-1.fc20.x86_64

Comment 13 Joe Julian 2014-03-08 16:23:24 UTC
EHOSTUNREACH is not a bug. Fix your firewall/network. Gluster is receiving an ICMP type 3 datagram which causes the error message, "no route to host".
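
That ICMP reply can be watched directly on the client while retrying the mount, e.g. (the interface name here is just an example):

# tcpdump -ni eth0 'icmp[icmptype] == 3'

A line such as "ICMP host 192.168.1.130 unreachable - admin prohibited" would point at an iptables REJECT rule on the server rather than an actual routing problem.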

Comment 14 Boris Derzhavets 2014-03-08 17:27:36 UTC
The first box (Havana RDO Controller Node, F20) is running :-

[root@dallas1 ~(keystone_admin)]$ openstack-status
== Nova services ==
openstack-nova-api:                     active
openstack-nova-cert:                    inactive  (disabled on boot)
openstack-nova-compute:                 inactive  (disabled on boot)
openstack-nova-network:                 inactive  (disabled on boot)
openstack-nova-scheduler:               active
openstack-nova-volume:                  inactive  (disabled on boot)
openstack-nova-conductor:               active
== Glance services ==
openstack-glance-api:                   active
openstack-glance-registry:              active
== Keystone service ==
openstack-keystone:                     active
== neutron services ==
neutron-server:                         active
neutron-dhcp-agent:                     active
neutron-l3-agent:                       active
neutron-metadata-agent:                 active
neutron-lbaas-agent:                    inactive  (disabled on boot)
neutron-openvswitch-agent:              active
neutron-linuxbridge-agent:              inactive  (disabled on boot)
neutron-ryu-agent:                      inactive  (disabled on boot)
neutron-nec-agent:                      inactive  (disabled on boot)
neutron-mlnx-agent:                     inactive  (disabled on boot)
== Cinder services ==
openstack-cinder-api:                   active
openstack-cinder-scheduler:             active
openstack-cinder-volume:                active
== Support services ==
mysqld:                                 inactive  (disabled on boot)
libvirtd:                               active
openvswitch:                            active
dbus:                                   active
tgtd:                                   active
qpidd:                                  active
== Keystone users ==
+----------------------------------+---------+---------+-------+
|                id                |   name  | enabled | email |
+----------------------------------+---------+---------+-------+
| 871cf99617ff40e09039185aa7ab11f8 |  admin  |   True  |       |
| df4a984ce2f24848a6b84aaa99e296f1 |  boris  |   True  |       |
| 57fc5466230b497a9f206a20618dbe25 |  cinder |   True  |       |
| cdb2e5af7bae4c5486a1e3e2f42727f0 |  glance |   True  |       |
| adb14139a0874c74b14d61d2d4f22371 | neutron |   True  |       |
| 2485122e3538409c8a6fa2ea4343cedf |   nova  |   True  |       |
+----------------------------------+---------+---------+-------+
== Glance images ==
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| ID                                   | Name                | Disk Format | Container Format | Size      | Status |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| 592faef8-308a-4438-867a-17adf685cde4 | CirrOS 31           | qcow2       | bare             | 13147648  | active |
| d0e90250-5814-4685-9b8d-65ec9daa7117 | Fedora 20 x86_64    | qcow2       | bare             | 214106112 | active |
| 3e6eea8e-32e6-4373-9eb1-e04b8a3167f9 | Ubuntu Server 13.10 | qcow2       | bare             | 244777472 | active |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
== Nova managed services ==
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Binary         | Host                | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| nova-scheduler | dallas1.localdomain | internal | enabled | up    | 2014-03-08T15:49:49.000000 | None            |
| nova-conductor | dallas1.localdomain | internal | enabled | up    | 2014-03-08T15:49:49.000000 | None            |
| nova-compute   | dallas2.localdomain | nova     | enabled | down  | 2014-03-08T15:43:12.000000 | None            |
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
== Nova networks ==
+--------------------------------------+-------+------+
| ID                                   | Label | Cidr |
+--------------------------------------+-------+------+
| 0ed406bf-3552-4036-9006-440f3e69618e | ext   | None |
| 166d9651-d299-47df-a5a1-b368e87b612f | int   | None |
+--------------------------------------+-------+------+
== Nova instance flavors ==
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| ID | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
| 1  | m1.tiny   | 512       | 1    | 0         |      | 1     | 1.0         | True      |
| 2  | m1.small  | 2048      | 20   | 0         |      | 1     | 1.0         | True      |
| 3  | m1.medium | 4096      | 40   | 0         |      | 2     | 1.0         | True      |
| 4  | m1.large  | 8192      | 80   | 0         |      | 4     | 1.0         | True      |
| 5  | m1.xlarge | 16384     | 160  | 0         |      | 8     | 1.0         | True      |
+----+-----------+-----------+------+-----------+------+-------+-------------+-----------+
== Nova instances ==
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+

The following is run under another tenant :-

[root@dallas1 ~(keystone_boris)]$ nova list
+--------------------------------------+-------------+-----------+------------+-------------+-----------------------------+
| ID                                   | Name        | Status    | Task State | Power State | Networks                    |
+--------------------------------------+-------------+-----------+------------+-------------+-----------------------------+
| 3a818be8-1d3a-4179-a3be-adf878a93f1a | UbuntuRS003 | SUSPENDED | None       | Shutdown    | int=10.0.0.5, 192.168.1.105 |
| f6382840-aec5-46c2-a0d7-6702686e71fe | VF20RS001   | SUSPENDED | None       | Shutdown    | int=10.0.0.2, 192.168.1.103 |
| 99ec4dc6-c82a-47d4-81e3-9b01c0bf7676 | VF20RS003   | ACTIVE    | None       | Running     | int=10.0.0.4, 192.168.1.104 |
+--------------------------------------+-------------+-----------+------------+-------------+-----------------------------+


The second box (Havana RDO Compute Node) is running :-

[boris@dallas1 ~]$ ssh 192.168.1.140
boris@192.168.1.140's password: 
Last login: Sat Mar  8 20:01:17 2014

[boris@dallas2 ~]$ ps -ef | grep nova
nova      1608     1  0 19:44 ?        00:00:20 /usr/bin/python /usr/bin/nova-compute --logfile /var/log/nova/compute.log
qemu      2510     1 22 19:52 ?        00:10:17 /usr/bin/qemu-system-x86_64 -name instance-0000005c -S -machine pc-i440fx-1.6,accel=tcg,usb=off -cpu Penryn,+osxsave,+xsave,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 99ec4dc6-c82a-47d4-81e3-9b01c0bf7676 -smbios type=1,manufacturer=Fedora Project,product=OpenStack Nova,version=2013.2.2-1.fc20,serial=6050001e-8c00-00ac-818a-90e6ba2d11eb,uuid=99ec4dc6-c82a-47d4-81e3-9b01c0bf7676 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-0000005c.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/disk/by-path/ip-192.168.1.130:3260-iscsi-iqn.2010-10.org.openstack:volume-3ec33eaf-86c5-4a59-82df-5a647277927a-lun-1,if=none,id=drive-virtio-disk0,format=raw,serial=3ec33eaf-86c5-4a59-82df-5a647277927a,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=25,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:4d:f6:0a,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/99ec4dc6-c82a-47d4-81e3-9b01c0bf7676/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming fd:23 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
boris    13607 13515  0 20:38 pts/2    00:00:00 grep --color=auto nova

[boris@dallas2 ~]$ ps -ef | grep neutron
neutron   1610     1  0 19:44 ?        00:00:26 /usr/bin/python /usr/bin/neutron-openvswitch-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini --log-file /var/log/neutron/openvswitch-agent.log
boris    13656 13515  0 20:38 pts/2    00:00:00 grep --color=auto neutron


 In other words, cloud instances are running on the second host, getting their metadata from the Controller host and booting on the Compute node from thin-LVM-based cinder volumes located on the Controller.

 The network between these two Neutron GRE+OVS Havana RDO Controller & Compute boxes has been tuned to make it work. Another cluster of instances, dual-booting alongside the first couple (which is the subject of this bug report), is also running a similar two-node config; it had been set up on 01/23/14 with a gluster replica 2 volume sharing the instance disks (different drives) and is still working OK, even though it was yum updated yesterday up to the most recent Fedora kernel "3.13.5-202". It is a development environment with no dedicated firewall: IPv4 iptables rules support the two couples of Havana RDO instances (dual-booting on the same physical boxes) on F20, and the firewalld daemon is disabled
everywhere.
--------------------------------------------------------------------------------
Since 01/23/14 something has changed, either in Gluster or in OpenStack RDO Havana, which currently makes their cooperation impossible. On 01/23 Gluster 3.4.2 and
OpenStack RDO Havana (native for F20) were happily installed to work via a replicated gluster volume; they are still happy with each other and have come through all "yum updates" since the mentioned date. Same physical network cards in each of the boxes; I didn't touch the hardware. I may now reboot into the other couple of instances and everything will start to work fine, because for some reason something there has not been updated since 01/23. Tested yesterday night.

Comment 15 Boris Derzhavets 2014-03-08 17:40:46 UTC
Double checked :-

== Nova managed services ==
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Binary         | Host                | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| nova-scheduler | dallas1.localdomain | internal | enabled | up    | 2014-03-08T15:49:49.000000 | None            |
| nova-conductor | dallas1.localdomain | internal | enabled | up    | 2014-03-08T15:49:49.000000 | None            |
| nova-compute   | dallas2.localdomain | nova     | enabled | up    | 2014-03-08T15:43:12.000000 | None            |
+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+

[root@dallas1 ~(keystone_admin)]$ nova-manage service list
Binary           Host                                 Zone             Status     State Updated_At
nova-scheduler   dallas1.localdomain                  internal         enabled    :-)   2014-03-08 17:38:00
nova-conductor   dallas1.localdomain                  internal         enabled    :-)   2014-03-08 17:38:00
nova-compute     dallas2.localdomain                  nova             enabled    :-)   2014-03-08 17:38:02

Comment 16 Boris Derzhavets 2014-03-09 07:45:09 UTC
(In reply to Joe Julian from comment #13)
> EHOSTUNREACH is not a bug. Fix your firewall/network. Gluster is receiving
> an ICMP type 3 datagram which causes the error message, "no route to host".

Setting up the Gluster-backed Havana RDO instances on F20 (01/23), I was able to comment out the lines:

# -A INPUT -j REJECT --reject-with icmp-host-prohibited
# -A FORWARD -j REJECT --reject-with icmp-host-prohibited

Actually, I followed http://www.andrewklau.com/getting-started-with-multi-node-openstack-rdo-havana-gluster-backend-neutron/

So the /etc/sysconfig/iptables fragment for Havana & Gluster still looks like this :

-A INPUT -p tcp -m multiport --dport 24007:24047 -j ACCEPT
-A INPUT -p tcp --dport 111 -j ACCEPT
-A INPUT -p udp --dport 111 -j ACCEPT
-A INPUT -p tcp -m multiport --dport 38465:38485 -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m multiport --dports 3260 -m comment --comment "001 cinder incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 80 -m comment --comment "001 horizon incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 9292 -m comment --comment "001 glance incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 5000,35357 -m comment --comment "001 keystone incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 3306 -m comment --comment "001 mariadb incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 6080 -m comment --comment "001 novncproxy incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 8770:8780 -m comment --comment "001 novaapi incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 9696 -m comment --comment "001 neutron incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 5672 -m comment --comment "001 qpid incoming" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 8700 -m comment --comment "001 metadata incoming" -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 5900:5999 -j ACCEPT
# -A INPUT -j REJECT --reject-with icmp-host-prohibited
-A INPUT -p gre -j ACCEPT
-A OUTPUT -p gre -j ACCEPT
# -A FORWARD -j REJECT --reject-with icmp-host-prohibited

and it still works with a gluster 3.4.2 replica 2 volume. Both instances, Controller & Compute, are yum updated up to the most recent status on F20.

In the meantime, if I comment out the following lines when doing a fresh setup from scratch :-

-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited

the Havana Controller Node fails to work, so I am forced to keep them in /etc/sysconfig/iptables on Fedora 20,
as shown in http://kashyapc.fedorapeople.org/virt/openstack/neutron-configs-GRE-OVS-two-node.txt
-------------------------------------------------------------------------------
Per your recent notice, this seems to be the cause of the problem.
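
A possible way to keep those REJECT lines and still let Gluster through would be to accept the brick ports explicitly above them; the brick processes in the logs above listen on 49152-49155, so a rule roughly like this (the upper bound of the range is arbitrary, it only has to cover one port per brick), placed before the REJECT rule, should be enough:

-A INPUT -p tcp -m multiport --dports 49152:49251 -m comment --comment "glusterfs bricks" -j ACCEPT

followed by a restart of the iptables service on both nodes.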

Comment 17 Boris Derzhavets 2014-03-09 08:44:55 UTC
(In reply to Joe Julian from comment #13)
> EHOSTUNREACH is not a bug. Fix your firewall/network. Gluster is receiving
> an ICMP type 3 datagram which causes the error message, "no route to host".

As long as these lines are commented out on 192.168.1.137 :-

[root@dfw01 ~]#  mount -t glusterfs 192.168.1.127:cinder-volume /mnt01

[root@dfw01 ~]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/fedora-root          96G   47G   45G  51% /
devtmpfs                        3.9G     0  3.9G   0% /dev
tmpfs                           3.9G   92K  3.9G   1% /dev/shm
tmpfs                           3.9G  9.4M  3.9G   1% /run
tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                           3.9G   24K  3.9G   1% /tmp
/dev/sda5                       477M  122M  326M  28% /boot
/dev/mapper/fedora-data1         77G  2.1G   71G   3% /data1
192.168.1.127:cinder-volumes05   77G  2.1G   71G   3% /var/lib/nova/mnt/62f75cf6996a8a6bcc0d343be378c10a
192.168.1.127:cinder-volume      96G   37G   55G  40% /mnt01

I didn't touch /etc/sysconfig/iptables any more. It's just as before.

RDO Havana allows commenting these lines out on CentOS 6.5, and in January it allowed doing the same on F20. Now it doesn't.

Comment 18 Boris Derzhavets 2014-03-09 08:48:41 UTC
Typos

As long as these lines are commented out on 192.168.1.127 :-

Mount is running OK on 192.168.1.137

[root@dfw01 ~]#  mount -t glusterfs 192.168.1.127:cinder-volume /mnt01

[root@dfw01 ~]# df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/fedora-root          96G   47G   45G  51% /
devtmpfs                        3.9G     0  3.9G   0% /dev
tmpfs                           3.9G   92K  3.9G   1% /dev/shm
tmpfs                           3.9G  9.4M  3.9G   1% /run
tmpfs                           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs                           3.9G   24K  3.9G   1% /tmp
/dev/sda5                       477M  122M  326M  28% /boot
/dev/mapper/fedora-data1         77G  2.1G   71G   3% /data1
192.168.1.127:cinder-volumes05   77G  2.1G   71G   3% /var/lib/nova/mnt/62f75cf6996a8a6bcc0d343be378c10a
192.168.1.127:cinder-volume      96G   37G   55G  40% /mnt01

Comment 19 Boris Derzhavets 2014-03-09 11:36:35 UTC
I want to thank Joe Julian for his patience and attention to my problems

Comment 20 Niels de Vos 2014-03-19 09:46:12 UTC
(In reply to Boris Derzhavets from comment #18)
> 
> Mount is running OK on 192.168.1.137
> 

I'll close this now, please let us know if you still hit this problem.

