Bug 1473168 - [Gluster-block] VM hangs, if duplicate IPs are given in 'gluster-block create'
Summary: [Gluster-block] VM hangs, if duplicate IPs are given in 'gluster-block create'
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-block
Version: cns-3.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Prasanna Kumar Kalever
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-07-20 07:33 UTC by Sweta Anandpara
Modified: 2018-11-19 08:20 UTC (History)
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-19 08:20:21 UTC
Embargoed:



Description Sweta Anandpara 2017-07-20 07:33:52 UTC
Description of problem:
=======================
Had a 1x3 volume and executed 'gluster-block create <volname>/<blockname> ha 3 <IP1>,<IP2>,<IP1> <size>'. Please note that 'IP1' was given twice.

The first time, a VM core was generated [Bug 1473162]. Subsequent attempts to reproduce it (tried three times, on different nodes) resulted in a VM hang. I am assuming the code checks only the number of addresses given and does not validate the addresses or check for duplicates. I still do not understand the reason for the hang, but it would be good to sanity-check the IPs at the CLI level itself.

I am guessing this might not actually be hit in a CNS environment, if heketi does the check or has no way of passing wrong IPs. Will need to confirm this.
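
For illustration only (this is not the gluster-block source; the function and variable names here are hypothetical), a minimal C sketch of the kind of CLI-side check being suggested: reject a comma-separated host list that contains duplicate entries before dispatching the create request.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if the comma-separated host list contains a duplicate entry,
 * 0 otherwise. Purely illustrative; not taken from gluster-block. */
static int
has_duplicate_hosts(const char *hostlist)
{
    char *copy = strdup(hostlist);
    char *seen[64];            /* more than enough for any realistic HA count */
    int   count = 0, dup = 0;

    for (char *tok = strtok(copy, ","); tok && count < 64;
         tok = strtok(NULL, ",")) {
        for (int i = 0; i < count; i++) {
            if (strcmp(seen[i], tok) == 0) {
                dup = 1;
                break;
            }
        }
        if (dup)
            break;
        seen[count++] = tok;
    }

    free(copy);
    return dup;
}

int
main(void)
{
    const char *hosts = "10.70.47.116,10.70.47.117,10.70.47.116";

    if (has_duplicate_hosts(hosts)) {
        fprintf(stderr, "duplicate address in host list: %s\n", hosts);
        return EXIT_FAILURE;   /* fail fast instead of hanging later */
    }
    return EXIT_SUCCESS;
}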


Version-Release number of selected component (if applicable):
============================================================
glusterfs-3.8.4-33 and gluster-block-0.2.1-6


How reproducible:
================
3:3


Steps to Reproduce:
===================
1. Create a 1x3 volume on a 3-node cluster setup.
2. Create a block with 'ha 3' and pass the addresses of node1, node2, and node1 again.


Actual results:
==============
The command exits after a long time with a broken pipe. Logging in to the console shows that the CLI is unresponsive. Force-rebooting the VM brings it back up.


Expected results:
================
The block create command should check the validity of the addresses before proceeding with the actual block creation.
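
As a hedged illustration of such a validity check (again, not the actual gluster-block implementation; names are made up), each entry could be required to parse as an IPv4/IPv6 literal via inet_pton() before the create proceeds. A real check might also resolve hostnames.

#include <arpa/inet.h>
#include <stdio.h>

/* Return 1 if 'addr' is a well-formed IPv4 or IPv6 literal, 0 otherwise. */
static int
is_valid_ip(const char *addr)
{
    struct in_addr  v4;
    struct in6_addr v6;

    return inet_pton(AF_INET, addr, &v4) == 1 ||
           inet_pton(AF_INET6, addr, &v6) == 1;
}

int
main(void)
{
    const char *addrs[] = { "10.70.47.116", "10.70.47.117", "10.70.47.999" };

    for (size_t i = 0; i < sizeof(addrs) / sizeof(addrs[0]); i++)
        printf("%s -> %s\n", addrs[i],
               is_valid_ip(addrs[i]) ? "ok" : "invalid");
    return 0;
}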


Additional info:
================
Did not get much information from dmesg or /var/log/messages.

[root@dhcp47-116 abrt]# gluster-block create nash/nb55 ha 3 10.70.47.116,10.70.47.117,10.70.47.116 1M
packet_write_wait: Connection to 10.70.47.116 port 22: Broken pipe
bash-4.3$ ssh root.47.116
^C
bash-4.3$ 

[root@dhcp47-115 ~]# cat /mnt/nash/block-meta/nb55
VOLUME: nash
GBID: 5115979d-9a8d-4ea8-bcae-23280e9c6e2f
SIZE: 1048576
HA: 3
ENTRYCREATE: INPROGRESS
ENTRYCREATE: SUCCESS
10.70.47.116: CONFIGINPROGRESS
10.70.47.116: CONFIGINPROGRESS
10.70.47.117: CONFIGINPROGRESS
10.70.47.116: CONFIGSUCCESS
10.70.47.117: CONFIGSUCCESS
10.70.47.116: CONFIGFAIL
10.70.47.117: CLEANUPINPROGRESS
[root@dhcp47-115 ~]#
[root@dhcp47-115 ~]# rpm -qa | grep gluster
glusterfs-cli-3.8.4-33.el7rhgs.x86_64
glusterfs-rdma-3.8.4-33.el7rhgs.x86_64
python-gluster-3.8.4-33.el7rhgs.noarch
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
glusterfs-client-xlators-3.8.4-33.el7rhgs.x86_64
glusterfs-fuse-3.8.4-33.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-events-3.8.4-33.el7rhgs.x86_64
gluster-block-0.2.1-6.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7.x86_64
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
samba-vfs-glusterfs-4.6.3-3.el7rhgs.x86_64
glusterfs-3.8.4-33.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-26.el7rhgs.x86_64
glusterfs-api-3.8.4-33.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-33.el7rhgs.x86_64
glusterfs-libs-3.8.4-33.el7rhgs.x86_64
glusterfs-server-3.8.4-33.el7rhgs.x86_64
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# gluster peer status
Number of Peers: 5

Hostname: dhcp47-121.lab.eng.blr.redhat.com
Uuid: 49610061-1788-4cbc-9205-0e59fe91d842
State: Peer in Cluster (Connected)
Other names:
10.70.47.121

Hostname: dhcp47-113.lab.eng.blr.redhat.com
Uuid: a0557927-4e5e-4ff7-8dce-94873f867707
State: Peer in Cluster (Connected)

Hostname: dhcp47-114.lab.eng.blr.redhat.com
Uuid: c0dac197-5a4d-4db7-b709-dbf8b8eb0896
State: Peer in Cluster (Connected)
Other names:
10.70.47.114

Hostname: dhcp47-116.lab.eng.blr.redhat.com
Uuid: a96e0244-b5ce-4518-895c-8eb453c71ded
State: Peer in Cluster (Disconnected)
Other names:
10.70.47.116

Hostname: dhcp47-117.lab.eng.blr.redhat.com
Uuid: 17eb3cef-17e7-4249-954b-fc19ec608304
State: Peer in Cluster (Connected)
Other names:
10.70.47.117
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# gluster v info nash
 
Volume Name: nash
Type: Replicate
Volume ID: f1ea3d3e-c536-4f36-b61f-cb9761b8a0a6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.47.115:/bricks/brick4/nash0
Brick2: 10.70.47.116:/bricks/brick4/nash1
Brick3: 10.70.47.117:/bricks/brick4/nash2
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.open-behind: off
performance.readdir-ahead: off
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
server.allow-insecure: on
cluster.brick-multiplex: disable
cluster.enable-shared-storage: enable
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]#

Comment 3 Sweta Anandpara 2017-07-20 08:37:10 UTC
Not proposing this as a blocker for RHGS 3.3, as I have input from CNS QE that heketi will not execute the 'gluster-block create' command with duplicate IPs.

Comment 10 Amar Tumballi 2018-11-19 08:20:21 UTC
Moving this to 'UPSTREAM' as this is not the intended use case, and the tools that create block files do not call the CLI like this.

