Bug 1180015 - rebooting a node while glusterd/glusterfsd/glusterfs services are running
Summary: rebooting a node while glusterd/glusterfsd/glusterfs services are running
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-01-08 04:46 UTC by zhangyongsheng
Modified: 2023-09-14 02:53 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-19 06:24:15 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description zhangyongsheng 2015-01-08 04:46:37 UTC
Description of problem:
Three nodes form one cluster. I created a disperse volume named "test" with redundancy 1 on this cluster and started it. After I execute "init 6" or "reboot" on node2 and then restart the volume "test" and the related services, there is often some trouble and the volume stops working. For example, all xattrs of the brick directory (xfs file system) are lost, and clients then fail to mount the volume.
I would like to ask: is it acceptable to execute "reboot" or "init 6" while the volume has not been stopped?
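For comparison, one way to take a single node down cleanly, instead of rebooting with the volume still running, is sketched below (a minimal sketch assuming a stock gluster install; with this rebuilt "digioceanfs" package the service and binary names would differ):

# Stop gluster services on the node being rebooted so the brick
# shuts down cleanly instead of being killed mid-operation.
service glusterd stop
pkill -TERM glusterfsd    # brick processes
pkill -TERM glusterfs     # client and auxiliary processes
reboot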


Version-Release number of selected component (if applicable):
glusterfs-3.6.1

How reproducible:
node1
[root@node-1]# gluster volume create test redundancy 1 node1:/brick1 node2:/brick2 node3:/brick3
[root@node-1]# /sbin/mount.digioceanfs 127.0.0.1:/test /cluster2/test
Then /cluster2/test/samba is exported via Samba, and a Windows client connects to that Samba share.
node2
[root@node-2]# /usr/sbin/digioceanfs --volfile-server=127.0.0.1 --volfile-id=/test /cluster2/test
node3
[root@node-3]# /usr/sbin/digioceanfs --volfile-server=127.0.0.1 --volfile-id=/test /cluster2/test

Steps to Reproduce:
1. Reboot node2:
[root@node-2]# reboot
2. After node2 comes back up, the issues below appear.

Actual results:
node3 brick glusterfsd log:
[2015-01-08 11:50:16.053723] E [posix.c:5618:init] 0-test-posix: Extended attribute trusted.digioceanfs.volume-id is absent
[2015-01-08 11:50:16.053776] E [xlator.c:425:xlator_init] 0-test-posix: Initialization of volume 'test-posix' failed, review your volfile again
[2015-01-08 11:50:16.053795] E [graph.c:322:digioceanfs_graph_init] 0-test-posix: initializing translator failed
[2015-01-08 11:50:16.053807] E [graph.c:525:digioceanfs_graph_activate] 0-graph: init failed

Check the brick xattrs:
Brick3: node-2.aaa.bbb.ccc:/digioceanfs/wwn-0x6000c29a2efdcab8ebe89f9298246f79
[root@node-3 ~]# getfattr -d -m ".*" -e hex /digioceanfs/wwn-0x6000c299dcd6abf74489faac4a2c0afe

There are no xattrs at all, and the glusterfsd process has not been started.
node1:
[root@node-1 ~]# df -h
The df command hangs and never finishes.
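When df hangs on an unresponsive gluster mount, the mount can usually be detached so the node becomes manageable again; a minimal sketch, using the mountpoint from the setup above:

# Lazily detach the unresponsive FUSE mount so df and other stat
# calls stop blocking, then remount once the volume is healthy.
umount -l /cluster2/test
/sbin/mount.digioceanfs 127.0.0.1:/test /cluster2/test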
Expected results:


Additional info:
This situation happens frequently, but not every time.

Comment 1 zhangyongsheng 2015-01-08 04:53:18 UTC
the brick file system is xfs

Comment 2 zhangyongsheng 2015-01-08 04:57:41 UTC
^C^C^C^C
[root@node-1 ~]# uname -a
Linux node-1.aaa.bbb.ccc 3.6.11 #11 SMP Fri Jan 3 11:21:01 CST 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@node-1 ~]# cat /etc/issue
CentOS release 6.4 (Final)
Kernel \r on an \m

Comment 3 Xavi Hernandez 2015-01-13 12:33:42 UTC
It seems that you have made some code modifications and recompiled gluster (logs show renamed functions and even an xattr has been renamed (trusted.digioceanfs.volume-id)).

Can you repeat the test with an official gluster release without modifications ?

Comment 4 zhangyongsheng 2015-01-15 04:36:50 UTC
Yesterday I wrote a shell script that reboots the node automatically (every 30 seconds), and I rebuilt gluster from the glusterfs-3.6.1.tar.gz source downloaded from www.gluster.org, without modifications. The troubles above still occur now and then. After a node starts up, "service glusterd start" and "gluster volume start vol_name" are executed automatically by the shell script. When I reboot node2, node1 shows these troubles.
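The script itself was not attached; a hypothetical reconstruction of the loop described above (the timings and the use of rc.local are assumptions) could look like:

#!/bin/sh
# Hypothetical reconstruction of the reboot-loop test, run from
# rc.local so it executes on every boot.
service glusterd start
sleep 5                          # give glusterd time to come up
gluster volume start test force  # restart the volume's bricks
sleep 30                         # let the volume run briefly
reboot                           # repeat the cycle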

log info:

[2015-01-14 23:02:11.548016] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2015-01-15 00:02:28.372193] I [graph.c:269:gf_add_cmdline_options] 0-test-server: adding option 'listen-port' for volume 'test-server' with value '49153'
[2015-01-15 00:02:28.372233] I [graph.c:269:gf_add_cmdline_options] 0-test-posix: adding option 'glusterd-uuid' for volume 'test-posix' with value 'ef2abf61-6d0e-4edb-af17-41fe991e6419'
[2015-01-15 00:02:28.377067] I [rpcsvc.c:2142:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-01-15 00:02:28.377138] W [options.c:898:xl_opt_validate] 0-test-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2015-01-15 00:02:28.377457] W [socket.c:3599:reconfigure] 0-test-quota: NBIO on -1 failed (Bad file descriptor)
[2015-01-15 00:02:28.400184] E [posix.c:5604:init] 0-test-posix: Extended attribute trusted.glusterfs.volume-id is absent
[2015-01-15 00:02:28.400267] E [xlator.c:425:xlator_init] 0-test-posix: Initialization of volume 'test-posix' failed, review your volfile again
[2015-01-15 00:02:28.400285] E [graph.c:322:glusterfs_graph_init] 0-test-posix: initializing translator failed
[2015-01-15 00:02:28.400295] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2015-01-15 00:02:28.400687] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (0), shutting down
[2015-01-15 00:02:28.606915] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.6.1 (args: /usr/sbin/glusterfsd -s node-1.aaa.bbb.ccc --volfile-id test.node-1.aaa.bbb.ccc.glusterfs-wwn-0x6000c29141020d82685aaf79ffd0a888 -p /var/lib/digioceand/vols/test/run/node-1.aaa.bbb.ccc-glusterfs-wwn-0x6000c29141020d82685aaf79ffd0a888.pid -S /var/run/85bd37fb24cafe6902159834b173c220.socket --brick-name /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888 -l /var/log/glusterfs/bricks/glusterfs-wwn-0x6000c29141020d82685aaf79ffd0a888.log --xlator-option *-posix.glusterd-uuid=ef2abf61-6d0e-4edb-af17-41fe991e6419 --brick-port 49153 --xlator-option test-server.listen-port=49153)
=============================================

[root@node-1 ~]# gluster volume info
 
Volume Name: test
Type: Disperse
Volume ID: e67489f2-8019-4e8e-927d-aa103f8d4502
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: node-3.aaa.bbb.ccc:/glusterfs/wwn-0x6000c299dcd6abf74489faac4a2c0afe
Brick2: node-1.aaa.bbb.ccc:/glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
Brick3: node-2.aaa.bbb.ccc:/glusterfs/wwn-0x6000c29a2efdcab8ebe89f9298246f79
Options Reconfigured:
features.quota: on
performance.high-prio-threads: 64
performance.low-prio-threads: 64
performance.least-prio-threads: 64
performance.normal-prio-threads: 64
performance.io-thread-count: 64
server.allow-insecure: on
features.lock-heal: on
network.ping-timeout: 5
performance.client-io-threads: enable
================================================
Copy of /etc/fstab:


# /etc/fstab
# Created by anaconda on Fri Dec 26 10:06:51 2014
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/VolGroup00-lv_root /                       ext4    defaults        1 1
UUID=ed36c3b0-b5a6-4b06-bd5a-d1d116251262 /boot                   ext4    defaults        1 2
UUID=9b7e072d-4222-45c6-84c4-f8001d537224 /home                   ext4    defaults        1 2
/dev/mapper/VolGroup00-lv_swap swap                    swap    defaults        0 0
UUID=26a4d133-57a4-4408-a438-978a5cfba248 swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/disk/by-id/wwn-0x6000c29141020d82685aaf79ffd0a888 /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888 xfs defaults 0 0
==================================================================
[root@node-1 ~]# df -hT
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-lv_root
              ext4    9.0G  1.4G  7.2G  17% /
tmpfs        tmpfs    2.0G     0  2.0G   0% /dev/shm
/dev/sda1     ext4     97M   42M   51M  46% /boot
/dev/sda2     ext4    4.9G  140M  4.5G   3% /home
/dev/sdb       xfs   1014M   33M  982M   4% /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
====================================================
[root@node-1 ~]# getfattr -d -m ".*" -e hex /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
[root@node-1 ~]#
(no output: the brick directory has no xattrs at all)
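A symptom like this (an empty brick directory with no xattrs) often means the brick filesystem was not mounted when the brick started, so the bare mountpoint directory on the root filesystem was used instead; that is worth ruling out first. When the brick is otherwise intact and has merely lost its volume-id xattr, it can usually be re-set by hand from the volume UUID. A minimal sketch, assuming a stock build (on this rebuilt package the xattr would be trusted.digioceanfs.volume-id instead):

# Read the volume UUID as glusterd reports it.
VOLID=$(gluster volume info test | awk '/^Volume ID:/ {print $3}')
# The xattr stores the UUID as 16 raw bytes: strip the dashes and
# prefix 0x so setfattr treats the value as hex.
setfattr -n trusted.glusterfs.volume-id \
         -v "0x$(echo "$VOLID" | tr -d '-')" \
         /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
# Restart the brick processes for the volume.
gluster volume start test force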


=================================================
If you need any further information, I will provide it as soon as possible.
Thanks for your reply.

Comment 5 zhangyongsheng 2015-01-15 04:46:03 UTC
Providing some more info:
[root@node-3 ~]# ls /digioceanfs/wwn-0x6000c299dcd6abf74489faac4a2c0afe
aaa  bbb

[root@node-1 ~]# ls /digioceanfs/wwn-0x6000c29141020d82685aaf79ffd0a888
[root@node-1 ~]#
node1 is the faulty node: the listing is empty.

Comment 6 zhangyongsheng 2015-01-15 04:50:25 UTC
Providing some more info:
[root@node-3 ~]# ls /glusterfs/wwn-0x6000c299dcd6abf74489faac4a2c0afe
aaa  bbb
[root@node-2 ~]# ls /digioceanfs/wwn-0x6000c29a2efdcab8ebe89f9298246f79
aaa  bbb

[root@node-1 ~]# ls /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
[root@node-1 ~]#
node1 is the faulty node: the listing is empty.

aaa and bbb are directories created through the gluster client.

Comment 7 zhangyongsheng 2015-01-15 04:51:18 UTC
Providing some more info:
[root@node-3 ~]# ls /glusterfs/wwn-0x6000c299dcd6abf74489faac4a2c0afe
aaa  bbb
[root@node-2 ~]# ls /glusterfs/wwn-0x6000c29a2efdcab8ebe89f9298246f79
aaa  bbb

[root@node-1 ~]# ls /glusterfs/wwn-0x6000c29141020d82685aaf79ffd0a888
[root@node-1 ~]#
node1 is the faulty node: the listing is empty.

aaa and bbb are directories created through the gluster client.

Comment 8 zhangyongsheng 2015-01-15 04:56:13 UTC
Comments 5 and 6 can be skipped; please look directly at comment 7.

Comment 9 Xavi Hernandez 2015-01-15 09:33:18 UTC
Somehow it seems something is going wrong when restarting the node. Gluster logs say that you are trying to start a brick on a node that doesn't have any brick information.

Can you attach the scripts you are using to start the volume after restarting the node ?

It would also be very interesting to have a detailed step by step procedure (including exact commands used in each step) starting from the volume creation to see what's happening.

Comment 10 zhangyongsheng 2015-01-19 06:23:58 UTC
I have found that it is probably an issue with our running environment.
Thanks for your replies.

Comment 11 YONGWANG 2016-01-07 07:09:12 UTC
I encountered the same problem with the same operation.
How did you solve this problem?

[root@host-183 peers]# gluster --version
glusterfs 3.6.2 built on Jan  7 2016 10:05:30


thanks


Comment 13 Red Hat Bugzilla 2023-09-14 02:53:03 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

