Bug 1113954 - glusterd logs are filled with "readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Atin Mukherjee
QA Contact: Byreddy
URL:
Whiteboard:
Duplicates: 1121193
Depends On:
Blocks: 1114847 1299183 1310969
 
Reported: 2014-06-27 10:34 UTC by Rahul Hinduja
Modified: 2019-10-10 09:20 UTC (History)

Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
When a brick process was brought down or killed, the __socket_rwv() call wrote an excessive number of 'readv failed (Invalid argument)' messages to the glusterd log. This update reduces the number of messages in this situation by a factor of 42.
Clone Of:
Clones: 1114847
Environment:
Last Closed: 2016-06-23 04:52:15 UTC
Embargoed:
bsrirama: needinfo-




Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHBA-2016:1240 | 0 | normal | SHIPPED_LIVE | Red Hat Gluster Storage 3.1 Update 3 | 2016-06-23 08:51:28 UTC

Description Rahul Hinduja 2014-06-27 10:34:58 UTC
Description of problem:
=======================

When a brick is brought down, the following message is logged every 3 seconds in the glusterd log:

[2014-06-27 10:28:12.693185] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:15.694036] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:18.694114] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:21.694459] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:24.694963] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:27.695196] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:30.696703] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:33.696101] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:36.696439] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)
[2014-06-27 10:28:39.697021] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)

If a brick stays down for weeks, these messages may fill the root file system and make the system unusable.
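
A rough way to size this growth is sketched below. It assumes one such message stream per down brick and a fixed 3-second interval, as in the excerpt above; the helper names are illustrative only. How long it takes to fill a partition depends on its free space and on how many bricks are down.

```python
# Rough estimate of log growth caused by one repeating warning line.
# Assumption: each down brick produces one such line per interval.
LINE = ("[2014-06-27 10:28:12.693185] W [socket.c:529:__socket_rwv] "
        "0-management: readv on "
        "/var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed "
        "(Invalid argument)")

SECONDS_PER_DAY = 24 * 60 * 60


def lines_per_day(interval_seconds):
    """Number of messages written per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


def bytes_per_day(line, interval_seconds, down_bricks=1):
    """Bytes appended to the log per day (line length plus newline)."""
    line_size = len(line.encode("utf-8")) + 1
    return line_size * lines_per_day(interval_seconds) * down_bricks


if __name__ == "__main__":
    print("lines per day:", lines_per_day(3))
    print("bytes per day, one brick:", bytes_per_day(LINE, 3))
    print("bytes per 30 days, one brick:", 30 * bytes_per_day(LINE, 3))
```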


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.6.0.22-1.el6rhs.x86_64


How reproducible:
=================
1/1


Steps to Reproduce:
====================
1. Create a 4-node cluster.
2. Create a volume vol0 (2*2) from the 4-node cluster.
3. Create a snapshot of the volume.
4. Create another volume vol4 (2*3) from 3 nodes of the cluster.
5. Bring down one of the bricks of vol4 (I brought down the brick on the third node).

Actual results:
===============
[2014-06-27 10:28:39.697021] W [socket.c:529:__socket_rwv] 0-management: readv on /var/run/a30ad20ae7386a2fe58445b1a2b1359c.socket failed (Invalid argument)

The message is logged every 3 seconds, which risks crashing the whole system. The logs and the failure need investigation.

Comment 3 Atin Mukherjee 2014-07-21 06:07:51 UTC
*** Bug 1121193 has been marked as a duplicate of this bug. ***

Comment 6 Vivek Agarwal 2015-02-11 09:40:55 UTC
Marking this, as it is a customer issue and occurs because of excessive logging.

Comment 13 Atin Mukherjee 2016-03-22 12:18:40 UTC
The fix is now available in rhgs-3.1.3 branch, hence moving the state to Modified.
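
This report does not describe how the reduction was implemented. Purely as an illustration, and not as the actual glusterfs patch, the sketch below logs only every Nth occurrence of a repeating event. With an event every 3 seconds and N = 42 (the factor given in the Doc Text), it emits one message every 126 seconds. The class and function names are hypothetical.

```python
class OccasionalLogger:
    """Emit only every Nth occurrence of a repeating message.

    Hypothetical helper for illustration; not taken from glusterfs.
    """

    def __init__(self, every_n):
        self.every_n = every_n
        self.count = 0
        self.emitted = []

    def log(self, timestamp_seconds, message):
        # Emit the 1st, (N+1)th, (2N+1)th ... occurrence; drop the rest.
        if self.count % self.every_n == 0:
            self.emitted.append((timestamp_seconds, message))
        self.count += 1


def simulate(event_interval, every_n, duration):
    """Return the times (in seconds) at which a message is emitted."""
    logger = OccasionalLogger(every_n)
    for t in range(0, duration, event_interval):
        logger.log(t, "readv failed (Invalid argument)")
    return [t for t, _ in logger.emitted]


if __name__ == "__main__":
    # One failure every 3 seconds for 10 minutes, logging every 42nd.
    print(simulate(event_interval=3, every_n=42, duration=600))
```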

Comment 15 Byreddy 2016-04-05 04:35:11 UTC
Verified this bug using the build "glusterfs-3.7.9-1"

Brought down one of the bricks in the volume and observed the rate of readv warning messages in the glusterd log.

The rate of readv warning messages is reduced: the warning now appears in the glusterd log every 2 minutes and 6 seconds.

><<<<<<<<<<<
[2016-04-05 04:24:27.921343] W [socket.c:589:__socket_rwv] 0-management: readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed (Invalid argument)

[2016-04-05 04:26:33.953341] W [socket.c:589:__socket_rwv] 0-management: readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed (Invalid argument)
[2016-04-05 04:28:39.990809] W [socket.c:589:__socket_rwv] 0-management: readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed (Invalid argument)

><<<<<<<<<<<

With the above details, moving the bug to verified state.
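
The interval can be checked directly from the timestamps in the excerpt above. The following is a minimal sketch that assumes every log line starts with a bracketed timestamp in the format shown; the function names are illustrative only.

```python
from datetime import datetime

# The three warning lines quoted in this comment.
LOG_LINES = [
    "[2016-04-05 04:24:27.921343] W [socket.c:589:__socket_rwv] 0-management: "
    "readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed "
    "(Invalid argument)",
    "[2016-04-05 04:26:33.953341] W [socket.c:589:__socket_rwv] 0-management: "
    "readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed "
    "(Invalid argument)",
    "[2016-04-05 04:28:39.990809] W [socket.c:589:__socket_rwv] 0-management: "
    "readv on /var/run/gluster/88f23c9fa290ee8044a4fea42da5ef1a.socket failed "
    "(Invalid argument)",
]


def parse_timestamp(line):
    """Parse the leading '[YYYY-MM-DD HH:MM:SS.ffffff]' timestamp."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def intervals_seconds(lines):
    """Seconds between consecutive readv warnings."""
    times = [parse_timestamp(l) for l in lines if "readv on" in l]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in intervals_seconds(LOG_LINES):
        # Compare against the 3-second interval from the original report.
        print(f"{gap:.3f} s between warnings, ratio to 3 s: {gap / 3:.1f}")
```

Rounded to whole seconds, both gaps are 126 seconds, which is 42 times the original 3-second interval.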

Comment 17 Atin Mukherjee 2016-06-06 07:09:53 UTC
LGTM :)

Comment 19 errata-xmlrpc 2016-06-23 04:52:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240

