Bug 823154 - Brick process crashed while rebalancing distributed-replicate volume with kernel untaring on the mount
Status: CLOSED WORKSFORME
Product: GlusterFS
Classification: Community
Component: core
Version: pre-release
Hardware: x86_64 Linux
Priority: high
Severity: urgent
Assigned To: shishir gowda
Keywords: Triaged
Depends On:
Blocks: 854642
Reported: 2012-05-19 09:23 EDT by shylesh
Modified: 2013-12-08 20:32 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 854642
Environment:
Last Closed: 2012-10-09 10:33:08 EDT
Type: Bug
Attachments
brick and rebalance logs (262.27 KB, application/x-gzip)
2012-05-19 09:23 EDT, shylesh
attached the gzippedcore (4.43 MB, application/x-gzip)
2012-05-21 03:00 EDT, shylesh

Description shylesh 2012-05-19 09:23:31 EDT
Created attachment 585576
brick and rebalance logs

Description of problem:

One of the brick processes crashed while rebalancing a distributed-replicate volume.

Version-Release number of selected component (if applicable):
3.3.0qa41

How reproducible:


Steps to Reproduce:
1. Created a 4x2 distributed-replicate volume.
2. Started a kernel untar on the mount.
3. Added a pair of bricks to the volume and initiated a rebalance.
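
The steps above can be sketched as a command sequence. This is only an illustrative sketch, not the reporter's actual commands: the host names, brick directory, tarball path and mount point are hypothetical placeholders, and the gluster CLI syntax should be checked against the installed release before use. The helper only builds the command lines; it does not run them.

```python
def repro_commands(volume, hosts, brick_dir, tarball, mountpoint):
    """Build (but do not run) shell commands for the three reproduction steps.

    All arguments are hypothetical placeholders; none come from the bug report.
    """
    # Ten bricks alternating between two hosts, so that each consecutive
    # pair of bricks (one replica set) spans both hosts.
    bricks = [f"{hosts[i % 2]}:{brick_dir}/brick{i}" for i in range(10)]
    initial, added = bricks[:8], bricks[8:]

    return [
        # Step 1: 4x2 distributed-replicate volume (8 bricks, replica 2).
        f"gluster volume create {volume} replica 2 {' '.join(initial)}",
        f"gluster volume start {volume}",
        # Step 2: mount the volume and untar a kernel tree onto it
        # (in practice the untar keeps running while step 3 is issued).
        f"mount -t glusterfs {hosts[0]}:/{volume} {mountpoint}",
        f"tar -xf {tarball} -C {mountpoint}",
        # Step 3: add one more replica pair and start the rebalance.
        f"gluster volume add-brick {volume} {' '.join(added)}",
        f"gluster volume rebalance {volume} start",
    ]


if __name__ == "__main__":
    for cmd in repro_commands("testvol", ["server1", "server2"],
                              "/bricks", "/tmp/linux.tar", "/mnt/testvol"):
        print(cmd)
```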
  
Actual results:
Eventually one of the brick processes crashed.

 

Additional info:

Loaded symbols for /lib64/libgcc_s.so.1
Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id replica.10.16.157.66.home-'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007ff25ce2d03e in posix_lookup (frame=0x7ff2601f0032, this=0x3831332f31383133, loc=0x2f303831332f3937, 
    xdata=0x31332f383731332f) at posix.c:137
137                     MAKE_ENTRY_HANDLE (real_path, par_path, this, loc, &buf);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.9.x86_64 libgcc-4.4.6-3.el6.x86_64 openssl-1.0.0-20.el6_2.4.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) bt
#0  0x00007ff25ce2d03e in posix_lookup (frame=0x7ff2601f0032, this=0x3831332f31383133, loc=0x2f303831332f3937, 
    xdata=0x31332f383731332f) at posix.c:137
#1  0x00007ff25cc19d96 in posix_acl_lookup (frame=0x7ff2601f55ac, this=0x10dc270, loc=0x7ff25feb7868, xattr=0x7ff25fe645a8)
    at posix-acl.c:798
#2  0x00007ff25ca05e31 in pl_lookup (frame=0x7ff260200b14, this=0x10dd320, loc=0x7ff25feb7868, xdata=0x7ff25fe645a8) at posix.c:1661
#3  0x00007ff25c7e0f01 in iot_lookup_wrapper (frame=0x7ff2602032b8, this=0x10de440, loc=0x7ff25feb7868, xdata=0x7ff25fe645a8)
    at io-threads.c:288
#4  0x00007ff2613e79fc in call_resume_wind (stub=0x7ff25feb7828) at call-stub.c:2689
#5  0x00007ff2613eefea in call_resume (stub=0x7ff25feb7828) at call-stub.c:4151
#6  0x00007ff25c7e08d6 in iot_worker (data=0x10f0550) at io-threads.c:131
#7  0x000000358f2077f1 in start_thread () from /lib64/libpthread.so.0
#8  0x000000358eae5ccd in clone () from /lib64/libc.so.6



(gdb) f 0
#0  0x00007ff25ce2d03e in posix_lookup (frame=0x7ff2601f0032, this=0x3831332f31383133, loc=0x2f303831332f3937, 
    xdata=0x31332f383731332f) at posix.c:137
137                     MAKE_ENTRY_HANDLE (real_path, par_path, this, loc, &buf);


(gdb) p loc
$1 = (loc_t *) 0x2f303831332f3937
(gdb) f 1
#1  0x00007ff25cc19d96 in posix_acl_lookup (frame=0x7ff2601f55ac, this=0x10dc270, loc=0x7ff25feb7868, xattr=0x7ff25fe645a8)
    at posix-acl.c:798
798             STACK_WIND (frame, posix_acl_lookup_cbk,
(gdb) p loc
$2 = (loc_t *) 0x7ff25feb7868



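This looks like possible stack corruption: in frame 0 the this, loc and xdata arguments of posix_lookup are not valid pointers, whereas frame 1 passed loc=0x7ff25feb7868.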
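
One way to sanity-check the corruption theory is to see whether the bad argument values in frame 0 are printable text. The sketch below is not part of the original analysis; it assumes the values were stored little-endian (the report is from x86_64) and decodes each 8-byte value as ASCII. All three decode to digit-and-slash fragments that resemble pieces of a path, which would be consistent with string data having overwritten the arguments. It does not by itself identify the root cause.

```python
def pointer_as_ascii(value: int) -> str:
    """Interpret an 8-byte pointer value as little-endian ASCII text."""
    raw = value.to_bytes(8, byteorder="little")
    return raw.decode("ascii")


# Argument values copied from frame 0 (posix_lookup) of the backtrace.
frame0_args = {
    "this": 0x3831332F31383133,   # decodes to "3181/318"
    "loc": 0x2F303831332F3937,    # decodes to "79/3180/"
    "xdata": 0x31332F383731332F,  # decodes to "/3178/31"
}

if __name__ == "__main__":
    for name, value in frame0_args.items():
        print(f"{name:5} = {value:#018x} -> {pointer_as_ascii(value)!r}")
```

In gdb, the same check can be made on the core by examining the memory around the corrupted frame as a string.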

The brick and rebalance logs are attached.
Comment 1 Amar Tumballi 2012-05-20 23:38:12 EDT
I went through the logs, and they contain no crash dump details. Should I be looking at some other logs?
Comment 2 shylesh 2012-05-21 03:00:58 EDT
Created attachment 585746
attached the gzippedcore

Attached the gzipped core.
Comment 3 shishir gowda 2012-07-06 01:38:07 EDT
Please try to reproduce the bug and save the information.
Comment 4 shishir gowda 2012-10-09 10:33:08 EDT
Closing the bug, as I have not been able to reproduce the issue and there was not enough information to continue the investigation.
Please re-open the bug if it occurs again.