Bug 1401160
Summary: | [Tracker][Ganesha + Multi-Volume/Multi-Mount] : Ganesha crashes during I/O ; I/Os stopped | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ambarish <asoman> | |
Component: | nfs-ganesha | Assignee: | Kaleb KEITHLEY <kkeithle> | |
Status: | CLOSED ERRATA | QA Contact: | Ambarish <asoman> | |
Severity: | urgent | Docs Contact: | ||
Priority: | unspecified | |||
Version: | rhgs-3.2 | CC: | amukherj, asoman, bturner, dang, ffilz, jthottan, mbenjamin, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal | |
Target Milestone: | --- | |||
Target Release: | RHGS 3.2.0 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | nfs-ganesha-2.4.1-4 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1403727 | Environment: | |
Last Closed: | 2017-03-23 06:26:02 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1403714, 1403727 | |||
Bug Blocks: | 1351528 |
Description
Ambarish
2016-12-03 04:15:44 UTC
Log flooding under the same use case is tracked via https://bugzilla.redhat.com/show_bug.cgi?id=1401162

The first two crashes reported in inode_ctx_free are being tracked as part of bug 1403714.
Bug 1403727 has been filed for the 3rd crash (on node gqas015) - memory corruption.

I tried this use case with Dan's fix for the crashes.

Ganesha crashed on 3/4 nodes after ~9 hours of pumping IO (single threaded) from 6 clients. Since pacemaker quorum wasn't met, IO came to a halt on all clients.

It didn't print anything from the code this time, not sure how helpful this is:

***********
On gqas013
***********

[root@gqas013 ~]# /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -F
=================================================================
==16012== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604a00856c20 at pc 0x5a1aa9 bp 0x7f99c6ef4730 sp 0x7f99c6ef4720
WRITE of size 8 at 0x604a00856c20 thread T270
==16012== WARNING: Trying to symbolize code, but external symbolizer is not initialized!
    #0 0x5a1aa8 (/usr/bin/ganesha.nfsd+0x5a1aa8)
    #1 0x5a563a (/usr/bin/ganesha.nfsd+0x5a563a)
    #2 0x4a6542 (/usr/bin/ganesha.nfsd+0x4a6542)
    #3 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #4 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #5 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #6 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #7 0x7f9a82836a97 (/lib64/libasan.so.0+0x19a97)
    #8 0x7f9a82608dc4 (/lib64/libpthread.so.0+0x7dc4)
    #9 0x7f9a8043173c (/lib64/libc.so.6+0xf773c)
0x604a00856c20 is located 1184 bytes inside of 1480-byte region [0x604a00856780,0x604a00856d48)
freed by thread T259 here:
    #0 0x7f9a82833009 (/lib64/libasan.so.0+0x16009)
    #1 0x66b2ad (/usr/bin/ganesha.nfsd+0x66b2ad)
    #2 0x66b32c (/usr/bin/ganesha.nfsd+0x66b32c)
    #3 0x675139 (/usr/bin/ganesha.nfsd+0x675139)
    #4 0x67a988 (/usr/bin/ganesha.nfsd+0x67a988)
    #5 0x68543f (/usr/bin/ganesha.nfsd+0x68543f)
    #6 0x44ddb5 (/usr/bin/ganesha.nfsd+0x44ddb5)
    #7 0x4eb2e4 (/usr/bin/ganesha.nfsd+0x4eb2e4)
    #8 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #9 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #10 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #11 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #12 0x7f9a82836a97 (/lib64/libasan.so.0+0x19a97)
previously allocated by thread T255 here:
    #0 0x7f9a82833225 (/lib64/libasan.so.0+0x16225)
    #1 0x66b262 (/usr/bin/ganesha.nfsd+0x66b262)
    #2 0x66b30e (/usr/bin/ganesha.nfsd+0x66b30e)
    #3 0x672c58 (/usr/bin/ganesha.nfsd+0x672c58)
    #4 0x672f06 (/usr/bin/ganesha.nfsd+0x672f06)
    #5 0x68fd52 (/usr/bin/ganesha.nfsd+0x68fd52)
    #6 0x6922cb (/usr/bin/ganesha.nfsd+0x6922cb)
    #7 0x67ad95 (/usr/bin/ganesha.nfsd+0x67ad95)
    #8 0x68a028 (/usr/bin/ganesha.nfsd+0x68a028)
    #9 0x446cd9 (/usr/bin/ganesha.nfsd+0x446cd9)
    #10 0x44f4b3 (/usr/bin/ganesha.nfsd+0x44f4b3)
    #11 0x4d3280 (/usr/bin/ganesha.nfsd+0x4d3280)
    #12 0x4d680f (/usr/bin/ganesha.nfsd+0x4d680f)
    #13 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #14 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #15 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #16 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #17 0x7f9a82836a97 (/lib64/libasan.so.0+0x19a97)
Thread T270 created by T0 here:
    #0 0x7f9a82827c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f9a8035bb34 (/lib64/libc.so.6+0x21b34)
Thread T259 created by T0 here:
    #0 0x7f9a82827c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f9a8035bb34 (/lib64/libc.so.6+0x21b34)
Thread T255 created by T0 here:
    #0 0x7f9a82827c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f9a8035bb34 (/lib64/libc.so.6+0x21b34)
Shadow bytes around the buggy address:
  0x0c09c0102d30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102d40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102d50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102d60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102d70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c09c0102d80: fa fa fa fa[fa]fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102d90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0102dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c09c0102dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==16012== ABORTING

***********
On gqas011
***********

[root@gqas011 ~]# /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -F
=================================================================
==6683== ERROR: AddressSanitizer: heap-use-after-free on address 0x604a038fc5a0 at pc 0x5a1aa9 bp 0x7f42c16ed730 sp 0x7f42c16ed720
WRITE of size 8 at 0x604a038fc5a0 thread T296
==6683== WARNING: Trying to symbolize code, but external symbolizer is not initialized!
    #0 0x5a1aa8 (/usr/bin/ganesha.nfsd+0x5a1aa8)
    #1 0x5a563a (/usr/bin/ganesha.nfsd+0x5a563a)
    #2 0x4a6542 (/usr/bin/ganesha.nfsd+0x4a6542)
    #3 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #4 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #5 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #6 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #7 0x7f438eeffa97 (/lib64/libasan.so.0+0x19a97)
    #8 0x7f438ecd1dc4 (/lib64/libpthread.so.0+0x7dc4)
    #9 0x7f438cafa73c (/lib64/libc.so.6+0xf773c)
0x604a038fc5a0 is located 1184 bytes inside of 1480-byte region [0x604a038fc100,0x604a038fc6c8)
freed by thread T92 here:
    #0 0x7f438eefc009 (/lib64/libasan.so.0+0x16009)
    #1 0x66b2ad (/usr/bin/ganesha.nfsd+0x66b2ad)
    #2 0x66b32c (/usr/bin/ganesha.nfsd+0x66b32c)
    #3 0x675139 (/usr/bin/ganesha.nfsd+0x675139)
    #4 0x67a988 (/usr/bin/ganesha.nfsd+0x67a988)
    #5 0x68543f (/usr/bin/ganesha.nfsd+0x68543f)
    #6 0x44ddb5 (/usr/bin/ganesha.nfsd+0x44ddb5)
    #7 0x4eb2e4 (/usr/bin/ganesha.nfsd+0x4eb2e4)
    #8 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #9 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #10 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #11 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #12 0x7f438eeffa97 (/lib64/libasan.so.0+0x19a97)
previously allocated by thread T151 here:
    #0 0x7f438eefc225 (/lib64/libasan.so.0+0x16225)
    #1 0x66b262 (/usr/bin/ganesha.nfsd+0x66b262)
    #2 0x66b30e (/usr/bin/ganesha.nfsd+0x66b30e)
    #3 0x672c58 (/usr/bin/ganesha.nfsd+0x672c58)
    #4 0x672f06 (/usr/bin/ganesha.nfsd+0x672f06)
    #5 0x68fd52 (/usr/bin/ganesha.nfsd+0x68fd52)
    #6 0x6922cb (/usr/bin/ganesha.nfsd+0x6922cb)
    #7 0x67ad95 (/usr/bin/ganesha.nfsd+0x67ad95)
    #8 0x67d3a9 (/usr/bin/ganesha.nfsd+0x67d3a9)
    #9 0x449215 (/usr/bin/ganesha.nfsd+0x449215)
    #10 0x4a8c80 (/usr/bin/ganesha.nfsd+0x4a8c80)
    #11 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #12 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #13 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #14 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #15 0x7f438eeffa97 (/lib64/libasan.so.0+0x19a97)
Thread T296 created by T0 here:
    #0 0x7f438eef0c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f438ca24b34 (/lib64/libc.so.6+0x21b34)
Thread T92 created by T0 here:
    #0 0x7f438eef0c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f438ca24b34 (/lib64/libc.so.6+0x21b34)
Thread T151 created by T0 here:
    #0 0x7f438eef0c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f438ca24b34 (/lib64/libc.so.6+0x21b34)
Shadow bytes around the buggy address:
  0x0c09c0717860: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c0717870: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c0717880: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c0717890: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c07178a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c09c07178b0: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c07178c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c09c07178d0: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x0c09c07178e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c07178f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0717900: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==6683== ABORTING

***********
On gqas006
***********

[root@gqas006 ~]# /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -F
=================================================================
==4450== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604a01a04ca0 at pc 0x5a1aa9 bp 0x7f391e825730 sp 0x7f391e825720
WRITE of size 8 at 0x604a01a04ca0 thread T107
==4450== WARNING: Trying to symbolize code, but external symbolizer is not initialized!
    #0 0x5a1aa8 (/usr/bin/ganesha.nfsd+0x5a1aa8)
    #1 0x5a563a (/usr/bin/ganesha.nfsd+0x5a563a)
    #2 0x4a6542 (/usr/bin/ganesha.nfsd+0x4a6542)
    #3 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #4 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #5 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #6 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #7 0x7f3969b4fa97 (/lib64/libasan.so.0+0x19a97)
    #8 0x7f3969921dc4 (/lib64/libpthread.so.0+0x7dc4)
    #9 0x7f396774a73c (/lib64/libc.so.6+0xf773c)
0x604a01a04ca0 is located 1184 bytes inside of 1480-byte region [0x604a01a04800,0x604a01a04dc8)
freed by thread T296 here:
    #0 0x7f3969b4c009 (/lib64/libasan.so.0+0x16009)
    #1 0x66b2ad (/usr/bin/ganesha.nfsd+0x66b2ad)
    #2 0x66b32c (/usr/bin/ganesha.nfsd+0x66b32c)
    #3 0x675139 (/usr/bin/ganesha.nfsd+0x675139)
    #4 0x67a988 (/usr/bin/ganesha.nfsd+0x67a988)
    #5 0x68543f (/usr/bin/ganesha.nfsd+0x68543f)
    #6 0x44ddb5 (/usr/bin/ganesha.nfsd+0x44ddb5)
    #7 0x4eb2e4 (/usr/bin/ganesha.nfsd+0x4eb2e4)
    #8 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #9 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #10 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #11 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #12 0x7f3969b4fa97 (/lib64/libasan.so.0+0x19a97)
previously allocated by thread T107 here:
    #0 0x7f3969b4c225 (/lib64/libasan.so.0+0x16225)
    #1 0x66b262 (/usr/bin/ganesha.nfsd+0x66b262)
    #2 0x66b30e (/usr/bin/ganesha.nfsd+0x66b30e)
    #3 0x672c58 (/usr/bin/ganesha.nfsd+0x672c58)
    #4 0x672f06 (/usr/bin/ganesha.nfsd+0x672f06)
    #5 0x68fd52 (/usr/bin/ganesha.nfsd+0x68fd52)
    #6 0x6922cb (/usr/bin/ganesha.nfsd+0x6922cb)
    #7 0x67ad95 (/usr/bin/ganesha.nfsd+0x67ad95)
    #8 0x68a028 (/usr/bin/ganesha.nfsd+0x68a028)
    #9 0x446cd9 (/usr/bin/ganesha.nfsd+0x446cd9)
    #10 0x44f4b3 (/usr/bin/ganesha.nfsd+0x44f4b3)
    #11 0x4d3280 (/usr/bin/ganesha.nfsd+0x4d3280)
    #12 0x4d680f (/usr/bin/ganesha.nfsd+0x4d680f)
    #13 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
    #14 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
    #15 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
    #16 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
    #17 0x7f3969b4fa97 (/lib64/libasan.so.0+0x19a97)
Thread T107 created by T0 here:
    #0 0x7f3969b40c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f3967674b34 (/lib64/libc.so.6+0x21b34)
Thread T296 created by T0 here:
    #0 0x7f3969b40c3a (/lib64/libasan.so.0+0xac3a)
    #1 0x62d8d6 (/usr/bin/ganesha.nfsd+0x62d8d6)
    #2 0x47bf5a (/usr/bin/ganesha.nfsd+0x47bf5a)
    #3 0x48d962 (/usr/bin/ganesha.nfsd+0x48d962)
    #4 0x48fbd8 (/usr/bin/ganesha.nfsd+0x48fbd8)
    #5 0x41d82d (/usr/bin/ganesha.nfsd+0x41d82d)
    #6 0x7f3967674b34 (/lib64/libc.so.6+0x21b34)
Shadow bytes around the buggy address:
  0x0c09c0338940: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0338950: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0338960: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0338970: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c0338980: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c09c0338990: fa fa fa fa[fa]fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c03389a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c03389b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c03389c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c09c03389d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c09c03389e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==4450== ABORTING

(In reply to Ambarish from comment #8)
> I tried this use case with Dan's fix for the crashes.
>
> Ganesha crashed on 3/4 nodes after ~9 hours of pumping IO (single threaded)
> from 6 clients.Since pacemaker quorum wasn't met,IO came to a halt on all
> clients.
>
> It didn't print anything from code this time,not sure how helpful this is :
>
> ***********
> On gqas013
> ***********
>
> [root@gqas013 ~]# /usr/bin/ganesha.nfsd -L
> /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -F
> =================================================================
> ==16012== ERROR: AddressSanitizer: heap-buffer-overflow on address
> 0x604a00856c20 at pc 0x5a1aa9 bp 0x7f99c6ef4730 sp 0x7f99c6ef4720
> WRITE of size 8 at 0x604a00856c20 thread T270
> ==16012== WARNING: Trying to symbolize code, but external symbolizer is not
> initialized!
> #0 0x5a1aa8 (/usr/bin/ganesha.nfsd+0x5a1aa8)
> #1 0x5a563a (/usr/bin/ganesha.nfsd+0x5a563a)
> #2 0x4a6542 (/usr/bin/ganesha.nfsd+0x4a6542)
> #3 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
> #4 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
> #5 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
> #6 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
> #7 0x7f9a82836a97 (/lib64/libasan.so.0+0x19a97)
> #8 0x7f9a82608dc4 (/lib64/libpthread.so.0+0x7dc4)
> #9 0x7f9a8043173c (/lib64/libc.so.6+0xf773c)

(gdb) l *0x5a1aa8
0x5a1aa8 is in glist_del (/root/ravi/nfs-ganesha/src/include/gsh_list.h:101).
96    {
97        struct glist_head *left = node->prev;
98        struct glist_head *right = node->next;
99
100       if (left != NULL)
101           left->next = right;
102       if (right != NULL)
103           right->prev = left;
104       node->next = NULL;
105       node->prev = NULL;
(gdb) list *0x5a563a
0x5a563a is in state_del_locked (/root/ravi/nfs-ganesha/src/SAL/nfs4_state.c:373).
368        */
369       obj->state_hdl->no_cleanup = true;
370
371       /* Remove from the list of states for a particular file */
372       PTHREAD_MUTEX_lock(&state->state_mutex);
373       glist_del(&state->state_list);
374       memset(&state->state_obj, 0, sizeof(state->state_obj));
375       PTHREAD_MUTEX_unlock(&state->state_mutex);
376
377       if (obj->fsal->m_ops.support_ex(obj)) {
(gdb) l *0x4a6542
0x4a6542 is in nfs4_op_close (/root/ravi/nfs-ganesha/src/Protocols/NFS/nfs4_op_close.c:310).
305
306       /* File is closed, release the corresponding state. If the FSAL
307        * supports extended ops, this will result in closing any open files
308        * the FSAL has for this state.
309        */
310       state_del_locked(state_found);
311
312       /* Poison the current stateid */
313       data->current_stateid_valid = false;
314
(gdb) l *0x4a215b
0x4a215b is in nfs4_Compound (/root/ravi/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:734).
729                       i + 1;
730                   break;
731               }
732           }
733
734           status = (optabv4[opcode].funct) (&argarray[i],
735                                             &data,
736                                             &resarray[i]);
737
738           LogCompoundFH(&data);
(gdb) l *0x47a269
0x47a269 is in nfs_rpc_execute (/root/ravi/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1281).
1276              &reqdata->r_u.req.svc.rq_xprt->blkin.endp,
1277              "export-id",
1278              (op_ctx->export != NULL)
1279              ? op_ctx->export->export_id : -1);
1280  #endif
1281      rc = reqdesc->service_function(arg_nfs, &reqdata->r_u.req.svc,
1282                                     res_nfs);
1283
1284  #ifdef USE_LTTNG
1285      tracepoint(nfs_rpc, op_end, reqdata);
(gdb)

> 0x604a00856c20 is located 1184 bytes inside of 1480-byte region
> [0x604a00856780,0x604a00856d48)
> freed by thread T259 here:
> #0 0x7f9a82833009 (/lib64/libasan.so.0+0x16009)
> #1 0x66b2ad (/usr/bin/ganesha.nfsd+0x66b2ad)
> #2 0x66b32c (/usr/bin/ganesha.nfsd+0x66b32c)
> #3 0x675139 (/usr/bin/ganesha.nfsd+0x675139)
> #4 0x67a988 (/usr/bin/ganesha.nfsd+0x67a988)
> #5 0x68543f (/usr/bin/ganesha.nfsd+0x68543f)
> #6 0x44ddb5 (/usr/bin/ganesha.nfsd+0x44ddb5)
> #7 0x4eb2e4 (/usr/bin/ganesha.nfsd+0x4eb2e4)
> #8 0x4a215b (/usr/bin/ganesha.nfsd+0x4a215b)
> #9 0x47a269 (/usr/bin/ganesha.nfsd+0x47a269)
> #10 0x47b9e7 (/usr/bin/ganesha.nfsd+0x47b9e7)
> #11 0x6257cf (/usr/bin/ganesha.nfsd+0x6257cf)
> #12 0x7f9a82836a97 (/lib64/libasan.so.0+0x19a97)

(gdb) l *0x66b2ad
0x66b2ad is in gsh_free (/root/ravi/nfs-ganesha/src/include/abstract_mem.h:271).
266    * @param[in] p Block of memory to free.
267    */
268   static inline void
269   gsh_free(void *p)
270   {
271       free(p);
272   }
273
274   /**
275    * @brief Free a block of memory with size
(gdb) l *0x66b32c
0x66b32c is in pool_free (/root/ravi/nfs-ganesha/src/include/abstract_mem.h:420).
415    */
416
417   static inline void
418   pool_free(pool_t *pool, void *object)
419   {
420       gsh_free(object);
421   }
422
423   #endif /* ABSTRACT_MEM_H */
(gdb) l *0x675139
0x675139 is in mdcache_lru_unref (/root/ravi/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1456).
1451
1452          if (!qlocked)
1453              QUNLOCK(qlane);
1454
1455          mdcache_lru_clean(entry);
1456          pool_free(mdcache_entry_pool, entry);
1457          freed = true;
1458
1459          (void) atomic_dec_int64_t(&lru_state.entries_used);
1460      } /* refcnt == 0 */
(gdb) l *0x67a988
0x67a988 is in mdcache_put (/root/ravi/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h:186).
181    *
182    * @param[in] entry Cache entry being returned
183    */
184   static inline void mdcache_put(mdcache_entry_t *entry)
185   {
186       mdcache_lru_unref(entry, LRU_FLAG_NONE);
187   }
188
189   /**
190    * Return true if there are FDs available to serve open requests,
(gdb) l *0x68543f
0x68543f is in mdcache_put_ref (/root/ravi/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1508).
warning: Source file is more recent than executable.
1503  static void mdcache_put_ref(struct fsal_obj_handle *obj_hdl)
1504  {
1505      mdcache_entry_t *entry =
1506          container_of(obj_hdl, mdcache_entry_t, obj_handle);
1507
1508      mdcache_put(entry);
1509  }
1510
1511  /**
1512   * @brief Release an object handle
(gdb) l *0x44ddb5
0x44ddb5 is in fsal_remove (/root/ravi/nfs-ganesha/src/FSAL/fsal_helper.c:1599).
1594          goto out;
1595      }
1596
1597  out:
1598
1599      to_remove_obj->obj_ops.put_ref(to_remove_obj);
1600
1601  out_no_obj:
1602
1603      LogFullDebug(COMPONENT_FSAL, "remove %s: status=%s", name,
(gdb) l *0x4eb2e4
0x4eb2e4 is in nfs4_op_remove (/root/ravi/nfs-ganesha/src/Protocols/NFS/nfs4_op_remove.c:104).
99            sizeof(changeid4));
100
101       res_REMOVE4->REMOVE4res_u.resok4.cinfo.before =
102           fsal_get_changeid4(parent_obj);
103
104       fsal_status = fsal_remove(parent_obj, name);
105       if (FSAL_IS_ERROR(fsal_status)) {
106           res_REMOVE4->status = nfs4_Errno_status(fsal_status);
107           goto out;
108       }
(gdb) ^CQuit
(gdb)

This crash looks similar to the one reported in https://bugzilla.redhat.com/show_bug.cgi?id=1403666#c12

It's the same stack trace reported on the other nodes as well.

Potential fix for this: https://review.gerrithub.io/308298

The reported issue was not reproducible on Ganesha 2.4.1-6, Gluster 3.8.4-12 in two tries. Will reopen if hit again during regressions.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0493.html
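
Note for readers following the traces: the gdb listings in this report show the faulting write at `left->next = right` inside `glist_del`, reached from `nfs4_op_close` via `state_del_locked`, while the block being written to had already been released through `fsal_remove` and `mdcache_lru_unref`. One plausible reading (an inference from the traces, not something confirmed in this report) is that the state's list neighbour lives inside the cache entry, so unlinking the state after the entry is freed writes into freed memory. The C sketch below only illustrates that ordering hazard and one way to avoid it. `glist_del` mirrors the listing; `toy_entry`, `toy_state`, the reference counting, and every `toy_*` function are hypothetical stand-ins and are not nfs-ganesha code or the change proposed in the gerrithub review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as the list node seen in the gsh_list.h listing. */
struct glist_head {
	struct glist_head *next;
	struct glist_head *prev;
};

static void glist_init(struct glist_head *head)
{
	head->next = head;
	head->prev = head;
}

static void glist_add_tail(struct glist_head *head, struct glist_head *node)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Mirrors lines 97-105 of the listing: unlinking writes into both neighbours. */
static void glist_del(struct glist_head *node)
{
	struct glist_head *left = node->prev;
	struct glist_head *right = node->next;

	if (left != NULL)
		left->next = right;	/* the 8-byte write that ASan flagged */
	if (right != NULL)
		right->prev = left;
	node->next = NULL;
	node->prev = NULL;
}

/* Hypothetical cache entry: the list head is embedded in the entry itself. */
struct toy_entry {
	int refcnt;
	struct glist_head states;
};

/* Hypothetical open state linked onto its entry's list. */
struct toy_state {
	struct toy_entry *entry;
	struct glist_head state_list;
};

/* Drop one reference; returns 1 if this call freed the entry. */
static int toy_entry_put(struct toy_entry *entry)
{
	if (--entry->refcnt > 0)
		return 0;
	free(entry);
	return 1;
}

/* Assumption: each state pins its entry so the embedded head stays valid. */
static struct toy_state *toy_state_add(struct toy_entry *entry)
{
	struct toy_state *st = calloc(1, sizeof(*st));

	if (st == NULL)
		return NULL;
	st->entry = entry;
	entry->refcnt++;
	glist_add_tail(&entry->states, &st->state_list);
	return st;
}

/* Unlink first (this touches entry->states), then drop the state's pin. */
static int toy_state_del(struct toy_state *st)
{
	struct toy_entry *entry = st->entry;

	glist_del(&st->state_list);
	free(st);
	return toy_entry_put(entry);
}

/* Returns 0 when the entry outlives the remove and is freed by the close. */
int demo_unlink_before_free(void)
{
	struct toy_entry *entry = calloc(1, sizeof(*entry));
	struct toy_state *st;

	if (entry == NULL)
		return -1;
	entry->refcnt = 1;		/* reference held by the "remove" path */
	glist_init(&entry->states);

	st = toy_state_add(entry);
	if (st == NULL) {
		free(entry);
		return -1;
	}
	if (toy_entry_put(entry))	/* "remove": must not free, state pins it */
		return -1;
	return toy_state_del(st) ? 0 : -1;	/* "close": safe unlink, last put frees */
}
```

Calling `demo_unlink_before_free()` from a small `main` and building with `-fsanitize=address` should run clean. Removing the `entry->refcnt++` line in `toy_state_add` recreates the ordering seen in the traces, where the "remove" path frees the block before the "close" path unlinks from it.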