Bug 1749625
Summary: | [GlusterFS 6.1] GlusterFS brick process crash
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | joe.chan
Component: | rpc | Assignee: | Xavi Hernandez <jahernan>
Status: | CLOSED NEXTRELEASE | QA Contact: |
Severity: | urgent | Docs Contact: |
Priority: | urgent
Version: | 6 | CC: | bugs, cfeller, jahernan, motillito1, pasik
Target Milestone: | --- | Keywords: | Triaged
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Whiteboard: |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: |
: | 1782495 | Environment: |
Last Closed: | 2019-12-24 05:20:55 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: |
Bug Depends On: | 1782495
Bug Blocks: |
Attachments: |
|
Description
joe.chan
2019-09-06 03:06:05 UTC
Created attachment 1612174 [details]
glusterd.log
Created attachment 1612175 [details]
glustershd.log
Do you have a coredump? If yes, please install the debuginfo and send the bt for the core.

Thank you for your reply. Unfortunately, I cannot find any coredump file on the servers matching the pattern "core.*".

In that case please use the steps in https://github.com/gluster/glusterfs/blob/master/extras/devel-tools/print-backtrace.sh and the backtrace in the log file to get more information as to what happened.

(In reply to Nithya Balachandran from comment #5)
> In that case please use the steps in
> https://github.com/gluster/glusterfs/blob/master/extras/devel-tools/print-backtrace.sh
> and the backtrace in the log file to get more information as
> to what happened.

Thank you for your suggestion. However, I encountered an error while running the script. Is there something wrong?

[hk2hp060@projadm /home/projadm/user/joechan/gfs] ./print-backtrace_v2.sh /home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64.rpm glusterfs/bricks/20190912.log
eu-addr2line --functions --exe=/home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64/usr/lib/debug/usr/sbin/glusterfsd.debug /usr/sbin/glusterfsd
eu-addr2line: cannot find symbol '/usr/sbin/glusterfsd'
/usr/sbin/glusterfsd
eu-addr2line --functions --exe=/home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64/usr/lib/debug/usr/sbin/glusterfsd.debug /usr/sbin/glusterfsd
eu-addr2line: cannot find symbol '/usr/sbin/glusterfsd'
/usr/sbin/glusterfsd

(In reply to joe.chan from comment #6)
> (In reply to Nithya Balachandran from comment #5)
> > In that case please use the steps in
> > https://github.com/gluster/glusterfs/blob/master/extras/devel-tools/print-backtrace.sh
> > and the backtrace in the log file to get more information as
> > to what happened.
>
> Thank you for your suggestion, however, I have encountered the error while
> running the corresponding script. Is there something wrong?
>
> [hk2hp060@projadm /home/projadm/user/joechan/gfs] ./print-backtrace_v2.sh /home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64.rpm glusterfs/bricks/20190912.log
> eu-addr2line --functions --exe=/home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64/usr/lib/debug/usr/sbin/glusterfsd.debug /usr/sbin/glusterfsd
> eu-addr2line: cannot find symbol '/usr/sbin/glusterfsd'
> /usr/sbin/glusterfsd
> eu-addr2line --functions --exe=/home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64/usr/lib/debug/usr/sbin/glusterfsd.debug /usr/sbin/glusterfsd
> eu-addr2line: cannot find symbol '/usr/sbin/glusterfsd'
> /usr/sbin/glusterfsd

Sorry, that was my mistake. Below is the backtrace from the log file:

[hk2hp060@projadm /home/projadm/user/joechan/gfs] ./print-backtrace.sh /home/projadm/user/joechan/gfs/glusterfs-debuginfo-6.1-1.el7.x86_64.rpm glusterfs/bricks/20190912.log
/usr/lib64/glusterfs/6.1/rpc-transport/socket.so(+0xa4cc)[0x7f5061bdf4cc]
socket_is_connected inlined at /usr/src/debug/glusterfs-6.1/rpc/rpc-transport/socket/src/socket.c:2887 in socket_event_handler
/usr/src/debug/glusterfs-6.1/rpc/rpc-transport/socket/src/socket.c:2619

Can you confirm that you have copied the stacktrace alone to a different file and run the script on that file? You should see one line of output per line in the stacktrace.

(In reply to Nithya Balachandran from comment #8)
> Can you confirm that you have copied the stacktrace alone to a different
> file and run the script on that file? You should be seeing one line per line
> in the stacktrace.

Yes, I confirm that I followed the procedure correctly.
Stacktrace from the log:

/lib64/libglusterfs.so.0(+0x26db0)[0x7f506d81adb0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f506d8257b4]
/lib64/libc.so.6(+0x36340)[0x7f506be5a340]
/usr/lib64/glusterfs/6.1/rpc-transport/socket.so(+0xa4cc)[0x7f5061bdf4cc]
/lib64/libglusterfs.so.0(+0x8c286)[0x7f506d880286]
/lib64/libpthread.so.0(+0x7dd5)[0x7f506c65add5]
/lib64/libc.so.6(clone+0x6d)[0x7f506bf2202d]

Only one frame is resolved when running the script:

/usr/lib64/glusterfs/6.1/rpc-transport/socket.so(+0xa4cc)[0x7f5061bdf4cc]
socket_is_connected inlined at /usr/src/debug/glusterfs-6.1/rpc/rpc-transport/socket/src/socket.c:2887 in socket_event_handler
/usr/src/debug/glusterfs-6.1/rpc/rpc-transport/socket/src/socket.c:2619

Here are more debug results, obtained by manually executing the "eu-addr2line" command:

/lib64/libglusterfs.so.0(+0x26db0)[0x7f506d81adb0]
_gf_msg_backtrace_nomem
/usr/src/debug/glusterfs-6.1/libglusterfs/src/logging.c:1124

/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f506d8257b4]
sprintf inlined at /usr/src/debug/glusterfs-6.1/libglusterfs/src/common-utils.c:954 in gf_print_trace
/usr/include/bits/stdio2.h:33

/lib64/libglusterfs.so.0(+0x8c286)[0x7f506d880286]
event_dispatch_epoll_handler inlined at /usr/src/debug/glusterfs-6.1/libglusterfs/src/event-epoll.c:761 in event_dispatch_epoll_worker
/usr/src/debug/glusterfs-6.1/libglusterfs/src/event-epoll.c:648

I am not able to find "/lib64/libc.so.6" and "/lib64/libpthread.so.0" under "glusterfs-debuginfo-6.1-1.el7.x86_64/usr/lib/".

libc and libpthread are standard libraries and not part of gluster. I'm moving this to the rpc component for someone to look at.

Created attachment 1618042 [details]
2 coredump files
attached the coredump files
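The manual "eu-addr2line" step described earlier in this thread can be scripted. The following is only a minimal sketch, not the logic of print-backtrace.sh: it assumes each stack frame in the log has the form module(location)[address], as in the frames quoted above, and that the caller already knows which .debug file belongs to the module. The names parse_frame and addr2line_command are hypothetical helpers introduced for illustration. It only builds the command line, using the --functions and --exe options that appear in the script output above, and does not run it.

```python
import re
from typing import List, NamedTuple, Optional

# Matches frames such as:
#   /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f506d8257b4]
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)\((?P<location>[^)]*)\)\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str    # path of the binary or shared object
    location: str  # "+0xa4cc" or "symbol+0x334", as printed in the log
    address: int   # absolute runtime address (not useful without the load base)


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None if it is not a frame."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return Frame(match["module"], match["location"], int(match["address"], 16))


def addr2line_command(frame: Frame, debug_file: str) -> List[str]:
    """Build an eu-addr2line invocation for one frame.

    debug_file is supplied by the caller (assumption: the caller has already
    located the matching .debug file inside the extracted debuginfo package).
    """
    if not frame.location:
        raise ValueError("frame has no offset or symbol to resolve")
    # "+0xa4cc" is an offset into the module, so drop the leading "+";
    # "gf_print_trace+0x334" is passed through as symbol+offset.
    target = frame.location.lstrip("+")
    return ["eu-addr2line", "--functions", f"--exe={debug_file}", target]


if __name__ == "__main__":
    line = "/usr/lib64/glusterfs/6.1/rpc-transport/socket.so(+0xa4cc)[0x7f5061bdf4cc]"
    frame = parse_frame(line)
    if frame is not None:
        # Placeholder path; substitute the real .debug file for socket.so.
        print(" ".join(addr2line_command(frame, "socket.so.debug")))
```

Passing the module path itself as the address, as in the failed print-backtrace_v2.sh run, is what produces the "cannot find symbol" error; the offset or symbol+offset taken from the parentheses is what should be resolved. Frames from libc and libpthread need the glibc debuginfo rather than the glusterfs one.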
Program terminated with signal 11, Segmentation fault.
#0 socket_is_connected (this=0x7f90d80018a0) at socket.c:2619
2619 if (priv->use_ssl) {
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.6.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 libacl-2.2.51-14.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libselinux-2.5-14.1.el7.x86_64 libuuid-2.23.2-59.el7_6.1.x86_64 openssl-libs-1.0.2k-16.el7_6.1.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 socket_is_connected (this=0x7f90d80018a0) at socket.c:2619
#1 socket_event_handler (fd=5, idx=11, gen=16, data=0x7f90d80018a0, poll_in=1, poll_out=0, poll_err=0, event_thread_died=0 '\000') at socket.c:2887
#2 0x00007f90ec34c286 in event_dispatch_epoll_handler (event=0x7f90d7ffee70, event_pool=0x5632b0727560) at event-epoll.c:648
#3 event_dispatch_epoll_worker (data=0x5632b077a420) at event-epoll.c:761
#4 0x00007f90eb126dd5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f90ea9ee02d in clone () from /lib64/libc.so.6

@joe, do you have any port scanning tool that could try to probe gluster ports?

(In reply to Xavi Hernandez from comment #14)
> @joe, do you have any port scanning tool that could try to probe gluster
> ports ?

Yes, I use a security scanner called Rapid7 to verify the security hardening of the whole system. I think the root cause of this crash is most likely the scanning, since the crash can be reproduced. Thank you.

REVIEW: https://review.gluster.org/23861 (socket: fix error handling) posted (#1) for review on master by Xavi Hernandez

I've found the problem and sent a patch to fix it. As soon as it is accepted I will backport it so that it is fixed in the next release.
REVISION POSTED: https://review.gluster.org/23861 (socket: fix error handling) posted (#2) for review on master by Xavi Hernandez

Thank you for your support. As a workaround for now, I have blocked the corresponding port so that only the GlusterFS members can access it.

REVIEW: https://review.gluster.org/23872 (socket: fix error handling) posted (#1) for review on release-6 by Xavi Hernandez

REVIEW: https://review.gluster.org/23872 (socket: fix error handling) merged (#2) on release-6 by Xavi Hernandez

*** Bug 1740413 has been marked as a duplicate of this bug. ***

*** Bug 1739884 has been marked as a duplicate of this bug. ***