Bug 1285200 - Dist-geo-rep : geo-rep worker crashed while init with [Errno 34] Numerical result out of range.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Aravinda VK
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On: 1026780
Blocks: 1294588 1299184 1313311
Reported: 2015-11-25 08:39 UTC by Aravinda VK
Modified: 2016-06-23 04:57 UTC (History)

Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1026780
Clones: 1294588
Environment:
Last Closed: 2016-06-23 04:57:37 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:1240 | 0 | normal | SHIPPED_LIVE | Red Hat Gluster Storage 3.1 Update 3 | 2016-06-23 08:51:28 UTC

Comment 3 Aravinda VK 2015-12-29 05:28:57 UTC
llistxattr takes two syscalls instead of one: the first returns the required buffer size, the second fetches the list.

SIZE = llistxattr(PATH, &VALUE, 0);
_ = llistxattr(PATH, &VALUE, SIZE);

So if another worker adds a new xattr just after the first call, the buffer sized by that call is too small and the second syscall fails with an ERANGE error.

For Geo-replication this is not critical: the Geo-rep worker goes to Faulty and restarts automatically.

The fix is to be made in $SRC/geo-replication/syncdaemon/libcxattr.py.
Handle the ERANGE error in listxattr, llistxattr, getxattr, lgetxattr, setxattr and lsetxattr, and retry 2-3 times when ERANGE is returned.
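
The retry described above could look roughly like the sketch below. This is an illustration only, not the actual patch: query_with_retry, the syscall(path, buf, size) callable and make_racy_listxattr are hypothetical names, and the fake syscall merely simulates another worker adding an xattr between the two calls.

```python
import errno
import os


def query_with_retry(syscall, path, retries=3):
    """Size-then-fetch query that retries when the data grows between calls.

    `syscall(path, buf, size)` is a hypothetical stand-in for a wrapper such
    as llistxattr: with size 0 it returns the required buffer size; otherwise
    it fills `buf` and returns the byte count, raising OSError(ERANGE) when
    `buf` is too small.
    """
    for attempt in range(retries):
        size = syscall(path, None, 0)        # first call: ask for the size
        buf = bytearray(size)
        try:
            used = syscall(path, buf, size)  # second call: fetch the data
            return bytes(buf[:used])
        except OSError as e:
            # Re-raise any other error, or ERANGE once the retries are used up.
            if e.errno != errno.ERANGE or attempt == retries - 1:
                raise


def make_racy_listxattr():
    """Fake syscall: another 'worker' adds an xattr right after the first call."""
    state = {"names": b"user.a\0", "calls": 0}

    def fake(path, buf, size):
        state["calls"] += 1
        if size == 0:
            return len(state["names"])
        if state["calls"] == 2:              # simulate the concurrent setxattr
            state["names"] += b"user.b\0"
        if size < len(state["names"]):
            raise OSError(errno.ERANGE, os.strerror(errno.ERANGE))
        buf[:len(state["names"])] = state["names"]
        return len(state["names"])

    return fake, state


fake, state = make_racy_listxattr()
names = query_with_retry(fake, "/some/path")
print(names.split(b"\0")[:-1])
```

In this run the first attempt hits ERANGE because the list grew after the size query, and the second attempt succeeds with the larger buffer. Without the retry loop the same race would surface as the [Errno 34] crash in the bug title.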

Comment 4 Aravinda VK 2015-12-29 06:00:29 UTC
Upstream Patch sent http://review.gluster.org/#/c/13106/

Comment 6 Aravinda VK 2016-03-23 06:23:25 UTC
Patch for this bug is available in rhgs-3.1.3 branch as part of rebase from upstream release-3.7.9.

Comment 8 Rahul Hinduja 2016-04-18 14:41:01 UTC
Verified with the build: glusterfs-3.7.9-1

Ran the automated geo-rep cases on Tiered and Non-Tiered volumes, and also ran the mountbroker cases. No worker crash with "Numerical result" was seen. Other crashes (IO error) were seen during IO at the slave, not during INIT of geo-replication. Moving this bug to the verified state; will revisit if the crash is seen after the other bz is fixed.

Comment 10 errata-xmlrpc 2016-06-23 04:57:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240

