Created attachment 158 [details]
Attached vol files, logs, postmark config and test scripts (updated the error msg).
Hardware summary:
* Dell R610, Xeon E5506 processor (2.13 GHz, 4M cache, 4.86 GT/s QPI, TurboHT)
* 12 GB memory (6 x 2 GB, 1333 MHz dual-ranked RDIMMs) for 2 processors
* Intel Gigabit ET NIC, dual port, copper, PCIe-4
* HP ProCurve 2510G 24-port Gigabit switches
* Intel X25-E 64 GB SSDs

OS: CentOS 5.4 (Final)
Environment:

Version            : glusterfs 3.0.3 built on Mar 14 2010 04:05:39
git                : v3.0.2-41-g029062c
Starting Time      : 2010-03-16 16:10:00
Command line       : /usr/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/glusterfs-ssd2.vol /mnt/glusterfs-ssd2
PID                : 10030
System name        : Linux
Nodename           : node1
Kernel Release     : 2.6.32.8
Hardware Identifier: x86_64

Version            : glusterfs 3.0.3 built on Mar 14 2010 04:05:39
git                : v3.0.2-41-g029062c
Starting Time      : 2010-03-16 16:11:46
Command line       : /usr/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/glusterfs-ssd2.vol /mnt/glusterfs-ssd2
PID                : 7420
System name        : Linux
Nodename           : node2
Kernel Release     : 2.6.32.8
Hardware Identifier: x86_64

Other settings on each node:

Fuse version 2.8.3 - no patches

echo 0 > /sys/block/sdb/queue/rotational
echo 0 > /sys/block/sdd/queue/rotational
echo noop > /sys/block/sdb/queue/scheduler
echo noop > /sys/block/sdd/queue/scheduler
echo 2 > /proc/sys/vm/overcommit_memory

# Nodes & test details

A two-node Gluster setup. Each node has two Intel X25-E 64 GB SSDs and acts as both server and client. ssd1 on node1 is replicated to ssd1 on node2, and ssd2 on node2 is replicated to ssd2 on node1. On node1 the postmark test is run on ssd1, and on node2 the postmark test is run on ssd2. The test spawns 10 postmark threads, each writing to its own directory, i.e. /gluster_mnt/`hostname`/threadID. The postmark config, stress test script, gluster vol files and log files are attached in a rar file (an illustrative volfile sketch for this replicate layout follows the error summary below).

Note that there are out-of-memory errors in the log as well. I am guessing these occurred due to continuous read/write attempts on the client volume while it was experiencing the disconnect issues. Please let me know if any other details are required.

# fstab

/dev/sdb /mnt/ssd1 ext3 defaults,noatime,nodiratime 0 0
/dev/sdd /mnt/ssd2 ext3 defaults,noatime,nodiratime 0 0
/etc/glusterfs/glusterfs-ssd1.vol /mnt/glusterfs-ssd1 glusterfs defaults,_netdev 0 0
/etc/glusterfs/glusterfs-ssd2.vol /mnt/glusterfs-ssd2 glusterfs defaults,_netdev 0 0

# Error summary

a) on client log

[2010-03-16 17:57:27] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.13-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:27] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:27] E [saved-frames.c:165:saved_frames_unwind] 172.16.1.13-ssd1-brick: forced unwinding frame type(1) op(LOOKUP)
[2010-03-16 17:57:27] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:27] W [fuse-bridge.c:521:fuse_entry_cbk] glusterfs-fuse: 6469926: LOOKUP() /node1/thread8/3613 => -1 (Success)
[2010-03-16 17:57:27] N [client-protocol.c:6246:client_setvolume_cbk] 172.16.1.13-ssd1-brick: Connected to 172.16.1.13:6996, attached to remote volume 'ssd1-brick'.
[2010-03-16 17:57:28] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:28] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.13-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:28] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:28] W [fuse-bridge.c:521:fuse_entry_cbk] glusterfs-fuse: 6471149: LOOKUP() /node1/thread10/2964 => -1 (Success)
[2010-03-16 17:57:29] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:30] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.13-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:31] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf
[2010-03-16 17:57:31] E [saved-frames.c:165:saved_frames_unwind] 172.16.1.12-ssd1-brick: forced unwinding frame type(1) op(LOOKUP)
[2010-03-16 17:57:31] E [saved-frames.c:165:saved_frames_unwind] 172.16.1.12-ssd1-brick: forced unwinding frame type(1) op(LOOKUP)
[2010-03-16 17:57:31] N [client-protocol.c:6246:client_setvolume_cbk] 172.16.1.12-ssd1-brick: Connected to 172.16.1.12:6996, attached to remote volume 'ssd1-brick'.
[2010-03-16 17:57:31] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf

[... truncated lines: further "LOOKUP() ... => -1 (Success)" warnings, "attached to remote volume 'ssd1-brick'" reconnect notices, and repeated "proceeding to metadata check" messages for node1/thread5/4794 and node1/thread9/4795 ...]

[2010-03-16 17:57:43] E [client-protocol.c:4952:client_lookup_cbk] 172.16.1.12-ssd1-brick: invalid argument: dictbuf

b) on server log

[2010-03-16 17:57:31] N [server-protocol.c:6788:notify] ssd1-server-tcp: 172.16.1.12:1022 disconnected
[2010-03-16 17:57:31] N [server-protocol.c:5852:mop_setvolume] ssd1-server-tcp: accepted client from 172.16.1.12:1022
[2010-03-16 17:57:44] N [server-protocol.c:6788:notify] ssd1-server-tcp: 172.16.1.12:1022 disconnected
[2010-03-16 17:57:44] N [server-protocol.c:5852:mop_setvolume] ssd1-server-tcp: accepted client from 172.16.1.12:1022
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 1619694: failed to submit. op= 27, type= 4
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 1619695: failed to submit. op= 0, type= 4
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 1619696: failed to submit. op= 23, type= 4
[2010-03-16 17:57:44] N [server-protocol.c:6788:notify] ssd1-server-tcp: 172.16.1.12:1022 disconnected
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 1619814: failed to submit. op= 0, type= 4
[2010-03-16 17:57:44] N [server-protocol.c:6788:notify] ssd1-server-tcp: 172.16.1.12:1023 disconnected
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 10831263: failed to submit. op= 31, type= 4
[2010-03-16 17:57:44] E [server-protocol.c:339:protocol_server_reply] protocol/server: frame 10831613: failed to submit. op= 31, type= 4
[2010-03-16 17:57:44] N [server-helpers.c:842:server_connection_destroy] ssd1-server-tcp: destroyed connection of node1-9955-2010/03/16-16:09:56:210129-172.16.1.12-ssd1-brick
[2010-03-16 17:57:44] N [server-protocol.c:5852:mop_setvolume] ssd1-server-tcp: accepted client from 172.16.1.12:1023
[2010-03-16 17:57:44] N [server-protocol.c:5852:mop_setvolume] ssd1-server-tcp: accepted client from 172.16.1.12:1022
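To make the layout concrete, here is a minimal sketch of the kind of client-side volfile this replicate topology implies (shown for ssd1 only). The actual vol files are in the attached rar; the volume names below are illustrative assumptions, while the IPs and brick name come from the logs above.

volume node1-ssd1
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.1.12
  option remote-subvolume ssd1-brick
end-volume

volume node2-ssd1
  type protocol/client
  option transport-type tcp
  option remote-host 172.16.1.13
  option remote-subvolume ssd1-brick
end-volume

# Replicate (AFR) across the two bricks; performance translators such as
# io-cache, quick-read and stat-prefetch would be stacked on top of this.
volume ssd1-replicate
  type cluster/replicate
  subvolumes node1-ssd1 node2-ssd1
end-volume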
Hi Dushyanth, Can you please let us know the memory usage of glusterfs processes when you observe this problem? Thanks, Vijay
Hi Vijay, Unfortunately I don't have that information. I will repeat the test and monitor per-process memory usage as well. I am attaching system-wide memory and paging statistics from both nodes, covering the period from gluster start to the time the out-of-memory errors occurred. Thanks, Dushyanth
Created attachment 160 [details] Stack trace after fault from gdb
Created attachment 161 [details] An NFS kickstart test case for the lab
Also note that the gluster server, gluster client and postmark are the only active processes running on these nodes.
Hi Dushyanth, Can you please check if commenting out quickread in the volume file would help? Please note that you will also have to change subvolumes associated with volume ssd*-statprefetch to refer to ssd*-iocache. Thanks, Vijay
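For illustration, the suggested edit might look roughly like this in the client volfile; the real file is in the attached rar, and the ssd1-* volume names here are assumptions based on the brick names in the logs:

# Comment out the quick-read translator...
#volume ssd1-quickread
#  type performance/quick-read
#  subvolumes ssd1-iocache
#end-volume

# ...and point stat-prefetch directly at io-cache instead of quick-read.
volume ssd1-statprefetch
  type performance/stat-prefetch
  # previously: subvolumes ssd1-quickread
  subvolumes ssd1-iocache
end-volume

The client reads the volfile at mount time, so the mount would need to be unmounted and remounted for the change to take effect.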
Is this to test the OOM issue or the "invalid argument: dictbuf" errors, which I think are actually causing the disconnects?
We suspect both the OOM issue and the "invalid argument: dictbuf" errors to be due to having the quickread translator in the setup. Please let us know how this goes.
OK - thanks for the clarification. Starting the test with psacct enabled plus ps and top snapshot logging every 2 minutes. Will let you know how this goes.
* Removing quickread solved the problem - thanks!
* What is the drawback of not having quickread?
* Also, quickread has a limitation of 1M - can this be increased to 2M? It would help in certain cases.
(In reply to comment #13)
> * What is the drawback of not having quickread?

The drawback is that reads of small files involve extra network transfers. With quickread, the file data gets cached during the lookup call, so the additional read calls are avoided.

> * Also, quickread has a limitation of 1M - can this be increased to 2M? It
> would help in certain cases.

Right now we have a limitation in our protocol that does not allow us to cache files larger than 1000 KB. We will explore the possibility of doing this.
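For reference, the quick-read translator is normally declared along these lines in the client volfile. This is only a sketch: the option names and values shown are assumptions and may not match 3.0.x exactly.

volume ssd1-quickread
  type performance/quick-read
  # Assumed options: keep data returned with the LOOKUP reply cached for a
  # short time, but only for files up to the ~1000 KB protocol ceiling
  # mentioned above; larger files are not cached.
  option cache-timeout 1
  option max-file-size 1000KB
  subvolumes ssd1-iocache
end-volume

With this in place, an open() and read() of a small file can be served from the data cached at lookup time instead of a separate network read, which is the saving described above.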
Hey, any updates on what the issue with quickread is? Thanks, Dushyanth
(In reply to comment #15)
> Any updates on what the issue with quickread is?

This is the same issue mentioned in enhancement Bug 723:

* An upper limit specifying the maximum memory quick-read can use for caching has to be implemented.

We will be fixing it soon as part of Bug 723.
*** This bug has been marked as a duplicate of bug 723 ***