Bug 1285200 - Dist-geo-rep : geo-rep worker crashed while init with [Errno 34] Numerical result out of range.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 3.1
Hardware: x86_64 Linux
Priority: medium Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.3
Assigned To: Aravinda VK
QA Contact: Rahul Hinduja
Keywords: EasyFix, ZStream
Depends On: 1026780
Blocks: 1294588 1299184 1313311
Reported: 2015-11-25 03:39 EST by Aravinda VK
Modified: 2016-06-23 00:57 EDT
CC List: 13 users

See Also:
Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1026780
Clones: 1294588
Environment:
Last Closed: 2016-06-23 00:57:37 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Comment 3 Aravinda VK 2015-12-29 00:28:57 EST
llistxattr is performed as two syscalls instead of one:

SIZE = llistxattr(PATH, &VALUE, 0);
_ = llistxattr(PATH, &VALUE, SIZE);

So if another worker adds new xattrs just after the first call, the second syscall fails with ERANGE, because the buffer sized by the first call is now too small.

For geo-replication this is not critical: the geo-rep worker goes to Faulty and restarts automatically.

The fix is to be done in $SRC/geo-replication/syncdaemon/libcxattr.py:
handle the ERANGE error in listxattr, llistxattr, getxattr, lgetxattr, setxattr and lsetxattr, and retry 2-3 times when ERANGE is returned.
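
The retry idea above can be sketched roughly as follows. This is only an illustration, not the code from the actual patch: `call_with_erange_retry` and `FakeXattrList` are hypothetical names, and the fake stands in for the real xattr syscalls so that the race can be reproduced without a filesystem.

```python
import errno


def call_with_erange_retry(query, retries=3):
    """Size-then-fetch pattern that retries when the data grew in between.

    `query` is a hypothetical callable (an assumption of this sketch):
    query(0) returns the number of bytes required, and query(size)
    returns the data or raises OSError(ERANGE) if `size` is too small.
    """
    for attempt in range(retries):
        size = query(0)  # first call: ask how large the buffer must be
        if size == 0:
            return b""  # nothing to fetch
        try:
            return query(size)  # second call: fetch with that buffer size
        except OSError as e:
            # Give up on any other error, or once the attempts are used up.
            if e.errno != errno.ERANGE or attempt == retries - 1:
                raise
            # Otherwise the list grew between the two calls; ask again.


class FakeXattrList:
    """In-memory stand-in for llistxattr, used only to simulate the race."""

    def __init__(self, names, grow_once=None):
        self.data = b"".join(n + b"\0" for n in names)
        self.grow_once = grow_once

    def __call__(self, size):
        if size == 0:
            needed = len(self.data)
            if self.grow_once is not None:
                # Simulate another worker adding an xattr right after
                # the size query has been answered.
                self.data += self.grow_once + b"\0"
                self.grow_once = None
            return needed
        if size < len(self.data):
            raise OSError(errno.ERANGE, "Numerical result out of range")
        return self.data


if __name__ == "__main__":
    fake = FakeXattrList([b"user.a"], grow_once=b"user.b")
    print(call_with_erange_retry(fake))
```

With `retries=1` the same fake raises the ERANGE error seen in this bug; with the default of three attempts the second size query picks up the larger list and the fetch succeeds.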
Comment 4 Aravinda VK 2015-12-29 01:00:29 EST
Upstream patch sent: http://review.gluster.org/#/c/13106/
Comment 6 Aravinda VK 2016-03-23 02:23:25 EDT
The patch for this bug is available in the rhgs-3.1.3 branch as part of the rebase from upstream release-3.7.9.
Comment 8 Rahul Hinduja 2016-04-18 10:41:01 EDT
Verified with the build: glusterfs-3.7.9-1

Ran the automated geo-rep cases on tiered and non-tiered volumes, and also ran the mountbroker cases. The worker did not crash with "Numerical result". Other crashes (IO error) were seen during IO at the slave, not during init of geo-replication. Moving this bug to the verified state. Will revisit if the crash is seen after the other BZ is fixed.
Comment 10 errata-xmlrpc 2016-06-23 00:57:37 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240