Red Hat Bugzilla – Bug 961929
NFS - Crash during bonnie++
Last modified: 2016-01-03 23:40:46 EST
Description of problem:
When running bonnie++ on my NFS-mounted 4x1 volume, bonnie hangs. In the nfs log I see:
[2013-05-10 15:34:40.440975] I [client-handshake.c:1468:client_setvolume_cbk] 0-testvol-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2013-05-10 15:34:40.441174] I [client-handshake.c:450:client_set_lk_version_cbk] 0-testvol-client-3: Server lk version = 1
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
signal received: 11
time of crash: 2013-05-10 15:36:19
configuration details:
package-string: glusterfs 18.104.22.168rhs
[2013-05-10 15:37:35.448865] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-testvol-client-1: server 10.8.0.23:49152 has not responded in the last 42 seconds, disconnecting.
[2013-05-10 16:06:21.631308] E [rpc-clnt.c:207:call_bail] 0-testvol-client-1: bailing out frame type(GlusterFS 3.3) op(WRITE(13)) xid = 0x29901x sent = 2013-05-10 15:36:19.843401. timeout = 1800
[2013-05-10 16:06:21.631340] W [client-rpc-fops.c:866:client3_3_writev_cbk] 0-testvol-client-1: remote operation failed: Transport endpoint is not connected
Version-Release number of selected component (if applicable):
I have run this twice; it has been reproducible both times so far.
Steps to Reproduce:
1. Create a 4x1 volume.
2. Mount via NFS on a 6.3 client
3. Run bonnie++
Hang on client side, crash server side.
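The steps above might be scripted roughly as follows; the server hostnames, brick paths, and mount point are hypothetical stand-ins, not taken from this setup:

```shell
# Hypothetical hosts and brick paths; 4 bricks, no replication (4x1 distribute).
gluster volume create testvol \
    server1:/bricks/b1 server2:/bricks/b2 \
    server3:/bricks/b3 server4:/bricks/b4
gluster volume start testvol

# On the RHEL 6.3 client: mount over NFSv3 (Gluster NFS is v3-only) and run bonnie++.
mount -t nfs -o vers=3,tcp server1:/testvol /mnt/testvol
bonnie++ -u root -d /mnt/testvol
```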
Created attachment 746339 [details]
server that crashed sos
Created attachment 746340 [details]
Other server sos
I couldn't find a core anywhere on this system. When I ran gluster volume status I saw:
[root@storage-qe10 core]# gluster volume status
Status of volume: testvol
Gluster process                                     Port   Online  Pid
----------------------------------------------------------------------
vol_brick0                                          49152  Y       6723
vol_brick1                                          49152  Y       5418
vol_brick2                                          49153  Y       6732
vol_brick3                                          49153  Y       5435
NFS Server on localhost                             2049   Y       6742
NFS Server on 6614425e-2a78-41f6-9604-8af602113ba6  2049   Y       5445
There are no active volume tasks
I went to look at the process in GDB and saw:
[root@storage-qe10 core]# gdb -p 6742
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
Attaching to process 6742
/usr/sbin/glusterfsd (deleted): No such file or directory.
#0 0xf8fbb06e in ?? ()
Cannot access memory at address 0xe1adfd98
I didn't see that PID in ps either.
I tried this with a 3x1 distribute volume. It worked fine. I tried with 22.214.171.124rhs. Here are the results:
root - /mnt/r2
11:37:24 :) ⚡ bonnie++ -u root -s 4098 -r 0
Using uid:0, gid:0.
Writing a byte at a time...done
Reading a byte at a time...done
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.96 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
pranithk-lapt 4098M 1928 97 100855 6 50034 12 4886 98 108903 10 5419 45
Latency 14061us 200ms 11236ms 35341us 100ms 4015us
Version 1.96 ------Sequential Create------ --------Random Create--------
pranithk-laptop -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 216 2 289 2 311 2 216 2 982 3 772 3
Latency 37881us 3440ms 2292ms 85868us 11270us 37172us
Let me try to re-create with 126.96.36.199rhs.
Works with 188.8.131.52rhs as well.
Since the bug is not re-creatable for me, I went through the logs. According to the logs, there were around 7394 frames in transit at the time of the crash. We must figure out how frames with type(0) op(0) get into the system.
pranithk@pranithk-laptop - ~/Downloads/sos/961929/storage-
12:37:30 :) ⚡ grep "frame : type(0) op(0)" nfs.log | wc -l
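The same count can be reproduced on a small excerpt; the sample lines below are hypothetical stand-ins for the real nfs.log, which had roughly 7394 matches:

```shell
# Build a tiny sample excerpt of the crash dump section of nfs.log.
cat > /tmp/nfs-excerpt.log <<'EOF'
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(13)
signal received: 11
EOF

# Count pending frames logged with both type and op as 0.
grep -c "frame : type(0) op(0)" /tmp/nfs-excerpt.log   # prints 2
```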
The second thing we need to figure out is why the NFS processes are still running even after a crash. According to .cmd_log_history, the only commands executed were:
[2013-05-10 15:34:37.406987] : volume start testvol : SUCCESS
[2013-05-10 19:01:56.460244] : volume status : SUCCESS
[2013-05-10 19:01:56.462865] : volume status : SUCCESS
So I think maybe we are missing other steps in the bug description which led to NFS still running.
Are you running this whole test as part of some automation?
Yes, I am hitting this in the FS Sanity nightly runs. I hit this in automation both times I tested it before opening the BZ; I haven't attempted a manual reproducer yet. I will run everything manually this morning and document what I am seeing.
This sounds very similar to BZ 961198, for which a fix has already been posted downstream by Rajesh. The issue is explained by that fix.
If there is a core file available, load the core into gdb. If the stack trace looks similar to the following (a heap/arena fast-bin corruption issue), then this is most likely a duplicate:
Program received signal SIGSEGV, Segmentation fault.
0x00000031b7a7be5b in malloc_consolidate (av=av@entry=0x31b7db1740 <main_arena>) at malloc.c:4143
4143 unlink(av, p, bck, fwd);
#0 0x00000031b7a7be5b in malloc_consolidate (av=av@entry=0x31b7db1740 <main_arena>) at malloc.c:4143
#1 0x00000031b7a7d09e in _int_malloc (av=av@entry=0x31b7db1740 <main_arena>, bytes=bytes@entry=6388) at malloc.c:3422
#2 0x00000031b7a802d8 in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3220
#3 0x00007f7a0b81e703 in __gf_default_calloc (cnt=1, size=6388) at mem-pool.h:75
#4 0x00007f7a0b81eb1f in __gf_calloc (nmemb=1, size=6388, type=48) at mem-pool.c:104
#5 0x00007f7a0b81f6b0 in mem_get (mem_pool=0x20cfb90) at mem-pool.c:419
#6 0x00007f7a0b5c42f1 in rpcsvc_request_create (svc=0x20cfad0, trans=0x212a230, msg=0x2f33670) at rpcsvc.c:361
#7 0x00007f7a0b5c483f in rpcsvc_handle_rpc_call (svc=0x20cfad0, trans=0x212a230, msg=0x2f33670) at rpcsvc.c:504
#8 0x00007f7a0b5c4d49 in rpcsvc_notify (trans=0x212a230, mydata=0x20cfad0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2f33670) at rpcsvc.c:644
#9 0x00007f7a0b5c95d8 in rpc_transport_notify (this=0x212a230, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x2f33670) at rpc-transport.c:497
#10 0x00007f7a083d8fa3 in socket_event_poll_in (this=0x212a230) at socket.c:2119
#11 0x00007f7a083d943b in socket_event_handler (fd=18, idx=9, data=0x212a230, poll_in=1, poll_out=0, poll_err=0) at socket.c:2231
#12 0x00007f7a0b848bba in event_dispatch_epoll_handler (event_pool=0x20b0ea0, events=0x20cdd30, i=0) at event-epoll.c:384
#13 0x00007f7a0b848d9f in event_dispatch_epoll (event_pool=0x20b0ea0) at event-epoll.c:445
#14 0x00007f7a0b81e591 in event_dispatch (event_pool=0x20b0ea0) at event.c:113
#15 0x0000000000408608 in main (argc=11, argv=0x7fffae3fbf88) at glusterfsd.c:1902
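If a core file does turn up, a non-interactive batch backtrace is enough to compare against the trace above; the core path below is a hypothetical placeholder:

```shell
# Load the core into gdb and dump the backtrace without an interactive session.
gdb -batch -ex "bt" /usr/sbin/glusterfs /var/core/core.<pid>
```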
I hit this last night on a 6x2 volume on the latest bits:
$ rpm -q glusterfs
[root@storage-qe03 /]# gluster volume info
Volume Name: testvol
Volume ID: 27c4afa8-fadc-4ce5-88e2-8b326d49c46b
Number of Bricks: 6 x 2 = 12
I talked to Saurabh and we may be seeing the same thing as:
It looks like there is a patch attached to that BZ; could we get that patch included in the next downstream build to see if it resolves this issue as well?
@Ben, yes, the patch got into 184.108.40.206rhs; can you please verify whether that build works for you?
I tested this last night on the 220.127.116.11 bits and I can confirm that the crash appears to be fixed, but performance is really poor. Normally my bonnie++ runs take 30-40 minutes on gigabit LAN; the current run has been going for over 16 hours. I looked on all nodes this morning and I don't see any crashes. When I straced the PID of bonnie I saw that it was still working, just slowly. I tried to quantify this with dd and didn't see terribly bad performance:
[root@storage-qe06 gluster-mount]# echo 3 > /proc/sys/vm/drop_caches; time `dd if=/dev/zero of=./myfile.test bs=1024k count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.7489 s, 55.9 MB/s
1 GB in 19 seconds works out to about 53 MB/s, which is not terribly bad. I thought maybe bonnie was using a smaller block size, so I tried:
[root@storage-qe06 gluster-mount]# echo 3 > /proc/sys/vm/drop_caches; time `dd if=/dev/zero of=./myfile.test bs=1024 count=1000000; sync`
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 18.9923 s, 53.9 MB/s
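As a sanity check on the rates dd reported (bytes divided by elapsed seconds, in decimal MB/s, which is the unit dd prints):

```shell
# First run: 1048576000 bytes (bs=1024k count=1000) in 18.7489 s.
awk 'BEGIN { printf "%.1f MB/s\n", 1048576000 / 18.7489 / 1e6 }'   # 55.9 MB/s

# Second run: 1024000000 bytes (bs=1024 count=1000000) in 18.9923 s.
awk 'BEGIN { printf "%.1f MB/s\n", 1024000000 / 18.9923 / 1e6 }'   # 53.9 MB/s
```

Both match dd's own reported figures, so the slowdown bonnie sees is not visible in simple sequential writes.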
That is really close even with the small block size. Given how long bonnie was taking to finish, I would expect single-threaded I/O to be down in the kilobytes per second, but that doesn't seem to be the case here. Were there any other upstream fixes related to NFS performance that need to be plucked along with the patch in 18.104.22.168?
Ben, can we open another bug for the perf issue? As this is a crash bug, can we mark the crash as VERIFIED? Let's tackle the perf issue in a different bug (hence marking this ON_QA).
Anyway, for the performance, do 'gluster volume set <VOL> eager-locking disable' and try it once.
No problem, Amar. Saurabh opened a new BZ for the perf issue: BZ #962510. Marking this BZ as verified on 22.214.171.124.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.