Bug 1633669
Summary: | Gluster bricks fails frequently | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Jaime Dulzura <jaime.dulzura>
Component: | glusterd | Assignee: | bugs <bugs>
Status: | CLOSED DEFERRED | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 4.1 | CC: | amukherj, bugs, jaime.dulzura, pasik
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-11-22 00:22:12 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments:
- Brick Logs (attachment 1490765)
- Brick Logs (attachment 1490767)
- glusterd.log (attachment 1490768)
- Latest brick process down logs (attachment 1490778)
Description
Jaime Dulzura
2018-09-27 13:58:33 UTC
I forgot to mention: we are aiming to use the Gluster Native Client, but glusterfs seems to consume too much memory on the client side. We added NFS-Ganesha to avoid overloading the Gluster client, but the memory used by NFS-Ganesha keeps accumulating over time.

Hi Jaime, when you mention that "bricks fail frequently", do you mean that brick processes go down, or something else? From the bug description it is not yet clear to me what exact issue is being highlighted here, so we need a bit more detail along with the glusterd log, the brick log files, and the volume status output. Thanks, Atin

Hi Atin, I was referring to brick processes going down, and they never come back up automatically. To make a brick available again, I invoke the "gluster v start <volume> force" command or restart glusterd (see the command sketch below).

Installed packages for Gluster storage:

# rpm -qa | grep -E "gluster|ganesha"
glusterfs-libs-4.1.5-1.el7.x86_64
glusterfs-events-4.1.5-1.el7.x86_64
nfs-ganesha-2.6.3-1.el7.x86_64
glusterfs-cli-4.1.5-1.el7.x86_64
centos-release-gluster41-1.0-1.el7.centos.x86_64
tendrl-gluster-integration-1.6.3-10.el7.noarch
glusterfs-client-xlators-4.1.5-1.el7.x86_64
glusterfs-server-4.1.5-1.el7.x86_64
nfs-ganesha-xfs-2.6.3-1.el7.x86_64
glusterfs-coreutils-0.2.0-1.el7.x86_64
glusterfs-fuse-4.1.5-1.el7.x86_64
python2-gluster-4.1.5-1.el7.x86_64
glusterfs-api-4.1.5-1.el7.x86_64
glusterfs-4.1.5-1.el7.x86_64
glusterfs-extra-xlators-4.1.5-1.el7.x86_64
nfs-ganesha-gluster-2.6.3-1.el7.x86_64

# gluster v info

Volume Name: CL_Shared
Type: Replicate
Volume ID: ac1f0338-2af8-41b3-af61-7eb7f1c3696e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: iahdvlgfsa001:/local/bricks/volume02/CL_Shared
Brick2: iahdvlgfsb001:/local/bricks/volume02/CL_Shared
Brick3: iahdvlgfsc001:/local/bricks/volume02/CL_Shared
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
auth.allow: iahdvlgfsc001,iahdvlgfsb001,iahdvlgfsa001,localhost

Volume Name: tibco
Type: Replicate
Volume ID: abc14a06-852d-46c2-8e70-a1f09136bc08
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: iahdvlgfsa001:/local/bricks/volume01/tibco
Brick2: iahdvlgfsb001:/local/bricks/volume01/tibco
Brick3: iahdvlgfsc001:/local/bricks/volume01/tibco
Options Reconfigured:
performance.stat-prefetch: on
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
auth.allow: 127.0.0.1,10.1.25.*,10.1.26.*,10.1.34.*
nfs.disable: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
performance.strict-o-direct: on
performance.strict-write-ordering: on

How can I upload the brick logs?

Created attachment 1490765 [details]
Brick Logs
required brick logs for this bug report
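For reference, a minimal sketch of the manual recovery steps the reporter describes above, using the volume name from this report (adjust to your own volume):

# gluster v status CL_Shared        # confirm which brick shows Online = N
# gluster v start CL_Shared force   # respawn only the brick processes that are not running; data and volume config are untouched
# systemctl restart glusterd        # alternative: restarting glusterd on the affected node also respawns its local bricks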
Created attachment 1490767 [details]
Brick Logs
Required brick logs for this bug report
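For anyone reproducing this, a sketch of gathering the diagnostics requested above, assuming the default log locations of a standard GlusterFS install (paths may differ on a customized setup):

# gluster v status > volume-status.txt     # volume status output
# gluster v info > volume-info.txt         # volume configuration
# tar czf gluster-logs.tar.gz /var/log/glusterfs/glusterd.log /var/log/glusterfs/bricks/   # glusterd log plus all brick logs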
[root@iahdvlgfsa001 ~]# gluster v status
Status of volume: CL_Shared
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick iahdvlgfsa001:/local/bricks/volume02/
CL_Shared 49152 0 Y 4890
Brick iahdvlgfsb001:/local/bricks/volume02/
CL_Shared 49152 0 Y 1021
Brick iahdvlgfsc001:/local/bricks/volume02/
CL_Shared 49152 0 Y 20186
Self-heal Daemon on localhost N/A N/A Y 32017
Self-heal Daemon on iahdvlgfsc001 N/A N/A Y 20211
Self-heal Daemon on iahdvlgfsb001 N/A N/A Y 1068

Task Status of Volume CL_Shared
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: tibco
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick iahdvlgfsa001:/local/bricks/volume01/
tibco 49153 0 Y 4990
Brick iahdvlgfsb001:/local/bricks/volume01/
tibco 49153 0 Y 1750
Brick iahdvlgfsc001:/local/bricks/volume01/
tibco 49153 0 Y 6873
Self-heal Daemon on localhost N/A N/A Y 32017
Self-heal Daemon on iahdvlgfsc001 N/A N/A Y 20211
Self-heal Daemon on iahdvlgfsb001 N/A N/A Y 1068

Task Status of Volume tibco
------------------------------------------------------------------------------
There are no active volume tasks

Created attachment 1490768 [details]
glusterd.log
glusterd.log
Created attachment 1490778 [details]
Latest brick process down logs.
Status of failing brick:
[root@iahdvlgfsc001 cevaroot]# gluster v status CL_Shared
Status of volume: CL_Shared
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick iahdvlgfsa001:/local/bricks/volume02/
CL_Shared 49152 0 Y 4890
Brick iahdvlgfsb001:/local/bricks/volume02/
CL_Shared 49152 0 Y 1021
Brick iahdvlgfsc001:/local/bricks/volume02/
CL_Shared N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 20211
Self-heal Daemon on iahdvlgfsa001.logistics
.corp N/A N/A Y 32017
Self-heal Daemon on iahdvlgfsb001 N/A N/A Y 1068
Task Status of Volume CL_Shared
------------------------------------------------------------------------------
There are no active volume tasks
glusterd status:
[root@iahdvlgfsc001 cevaroot]# systemctl status glusterd -l
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-10-05 00:43:34 CDT; 2h 53min ago
Process: 1324 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1332 (glusterd)
CGroup: /system.slice/glusterd.service
├─ 1332 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
├─ 6873 /usr/sbin/glusterfsd -s iahdvlgfsc001 --volfile-id tibco.iahdvlgfsc001.local-bricks-volume01-tibco -p /var/run/gluster/vols/tibco/iahdvlgfsc001-local-bricks-volume01-tibco.pid -S /var/run/gluster/843d10f6ac486e3e.socket --brick-name /local/bricks/volume01/tibco -l /var/log/glusterfs/bricks/local-bricks-volume01-tibco.log --xlator-option *-posix.glusterd-uuid=6af863cd-43f6-448e-936d-889766c1a655 --process-name brick --brick-port 49153 --xlator-option tibco-server.listen-port=49153
└─20211 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/8abfe66e3fb78dec.socket --xlator-option *replicate*.node-uuid=6af863cd-43f6-448e-936d-889766c1a655 --process-name glustershd
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: dlfcn 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: libpthread 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: llistxattr 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: setfsid 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: spinlock 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: epoll.h 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: xattr.h 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: st_atim.tv_nsec 1
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: package-string: glusterfs 4.1.5
Oct 05 03:15:11 iahdvlgfsc001.logistics.corp local-bricks-volume02-CL_Shared[20186]: ---------
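The journal lines above look like the tail of a glusterfsd crash dump (configuration summary and package-string footer). A hedged sketch for pulling the full crash signature from the failed brick's log, assuming the default brick log path derived from the brick path in this report:

# If the brick died on a signal, the log usually contains a "signal received:" line followed by a backtrace.
# grep -B5 -A40 "signal received" /var/log/glusterfs/bricks/local-bricks-volume02-CL_Shared.log | tail -60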
I forgot to mention that our initial setup included the "TENDRL" monitoring agent, installed with its default configuration per the installation instructions. We were happy with the monitoring, since it gives us a lot of the information we need. In our observation, there was a core dump indicating that the glusterfsd process had been killed for accessing restricted memory, and /var/log/messages showed the tendrl agent making aggressive, brute-force-like accesses to the shared volume. Based on that, we decided to reinstall everything and take TENDRL monitoring out of the equation. And, voilà, no bricks have failed for more than a month now. It may or may not be a bug, but after removing the tendrl agent the bricks never failed again. However, we are now facing the well-known issue from previous NFS-Ganesha releases with GlusterFS volume exports, where the OOM killer kills the ganesha daemon. I will raise a separate bug report describing how we determined that the ganesha process was being killed by the OOM killer.
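A sketch of how such an OOM kill can be confirmed from the kernel log, assuming a standard syslog/journald setup (any one of these should show which process the kernel killed and why):

# dmesg -T | grep -i -E "out of memory|oom"     # kernel OOM messages with human-readable timestamps
# grep -i -E "oom|out of memory" /var/log/messages   # syslog record of the kill
# journalctl -k | grep -i oom                   # same information via the systemd journal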