Bug 806841

Summary:	object-strorage: GET for large data set fails
Product:	[Community] GlusterFS	Reporter:	Saurabh <saujain>
Component:	object-storage	Assignee:	Junaid <junaid>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Saurabh <saujain>
Severity:	medium	Docs Contact:
Priority:	high
Version:	pre-release	CC:	divya, gluster-bugs, mzywusko, redhat, rfortier, vagarwal
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	glusterfs-3.4.0	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2013-07-24 17:57:39 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	DP	CRM:
Verified Versions:	3.3.0qa45	Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	817967

Description Saurabh 2012-03-26 10:41:43 UTC

Description of problem:

GET for large data fails, complaining about the timeout,
Mar 25 23:27:51 QA-50 account-server 127.0.0.1 - - [26/Mar/2012:03:27:51 +0000] "HEAD /sdb1/56806/AUTH_test" 204 - "tx3dab5af984e94e678ebf67ac9c534855" "-" "-" 2.5358 ""
Mar 25 23:28:01 QA-50 proxy-server ERROR with Container server 127.0.0.1:6011/sdb1 re: Trying to GET /v1/AUTH_test/container2: Timeout (10s) (txn: tx3dab5af984e94e678ebf67ac9c534855) (client_ip: 172.17.251.78)
Mar 25 23:28:01 QA-50 proxy-server Container GET returning 503 for [] (txn: tx3dab5af984e94e678ebf67ac9c534855) (client_ip: 172.17.251.78)
Mar 25 23:28:01 QA-50 proxy-server 172.17.251.78 172.17.251.78 26/Mar/2012/03/28/01 GET /v1/AUTH_test/container2 HTTP/1.0 503 - curl/7.19.7%20%28x86_64-redhat-linux-gnu%29%20libcurl/7.19.7%20NSS/3.12.9.0%20zlib/1.2.3%20libidn/1.18%20libssh2/1.2.2 test%2CAUTH_tke0f94be4b83644fa8276b5471a0aed05 - - - tx3dab5af984e94e678ebf67ac9c534855 - 12.6188 -


now, going further if I try to do a GET on container3 which is just having one file, that also fails,

and then after restarting the swift again , the GET to container3 works ,

but will fail again if a GET is tried on container2, which is having more than 60000 files.


Version-Release number of selected component (if applicable):
3.3.0qa30 and swift 1.4.7

How reproducible:
always


Steps to Reproduce:
1.create more than 70000 objects inside a container.
2.try to do a GET for the dataset.
3.
  
Actual results:

GET fails with 503 service unavailable

Expected results:

at least the GET should when "limit" delimiter is set to a lower value

Additional info:

some times I have found these kind of messages also,
[root@QA-50 ~]# tail -f /var/log/messages
Mar 25 23:38:23 QA-50 container-server STDOUT: BaseHTTPServer.BaseHTTPRequestHandler.finish(self) (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: File "/usr/lib64/python2.6/SocketServer.py", line 661, in finish (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: self.wfile.flush() (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: File "/usr/lib64/python2.6/socket.py", line 303, in flush (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: self._sock.sendall(buffer(data, write_offset, buffer_size)) (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 283, in sendall (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: tail = self.send(data, flags) (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 269, in send (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: total_sent += fd.send(data[total_sent:], flags) (txn: tx3dab5af984e94e678ebf67ac9c534855)
Mar 25 23:38:23 QA-50 container-server STDOUT: error: [Errno 32] Broken pipe (txn: tx3dab5af984e94e678ebf67ac9c534855)

Comment 1 Saurabh 2012-03-30 12:43:19 UTC

This issue can be resolved if the changes in configuration files are made, 
this includes adding the variable node_timeout = 60 in the proxy-server and container-server config files.

Comment 2 Junaid 2012-05-25 05:54:12 UTC

Saurabh, the issue was with a bottleneck in the code which is fixed in the new release and also requires some tuning in the configuration files which are available in the new rpm's.

Comment 3 Saurabh 2012-06-01 06:40:33 UTC

Created 20000 files of size 1MB inside a subdir of a container and tried to list the container objects, but the result is this,


[root@QA-39 object-dir]# curl -v -H 'X-Storage-Token: AUTH_tk92ce6a0a1224460dbd30a0146a767877' http://172.17.251.90:8080/v1/AUTH_test/cont1/ 
* About to connect() to 172.17.251.90 port 8080 (#0)
*   Trying 172.17.251.90... connected
* Connected to 172.17.251.90 (172.17.251.90) port 8080 (#0)
> GET /v1/AUTH_test/cont1/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
> Host: 172.17.251.90:8080
> Accept: */*
> X-Storage-Token: AUTH_tk92ce6a0a1224460dbd30a0146a767877
> 
< HTTP/1.1 503 Internal Server Error
< Content-Type: text/html; charset=UTF-8
< Content-Length: 0
< Date: Fri, 01 Jun 2012 00:30:54 GMT
< 
* Connection #0 to host 172.17.251.90 left intact
* Closing connection #0


altogether for large number of objects it is failing to list them, this was not seen earlier.

Comment 4 Junaid 2012-06-05 06:30:32 UTC

Can you please mention the GlusterFS version and gluster-swift version. Also, the type of GlusterFS used and the hierarchy structure of objects(number of subdirs in the object) and the timeout values.

Comment 5 Saurabh 2012-06-05 11:18:14 UTC

GlusterFS version: 3.3.0qa45
gluster-object- swift-rc1-rpms
glusterfs volume: distribute-replicate (2x2)
machine type: four physical nodes with 48GB RAM and 24 cpus

didn't modify the node-timeout values as you mentioned that the issue is been fixed by changes in the code.

Comment 6 Saurabh 2012-06-08 11:32:59 UTC

tried with swift-rc-rpms and 3.3.0qa45
the proxy-server config files are updated with the tunables like,
node_timeout = 120
conn_timeout = 5
workers = 4
backlog = 10000


The GET over 10000 objects works and list all the objects,

Below I am just showing the result of the HEAD

[root@QA-31 ~]# curl -v -H 'X-Auth-Token: AUTH_tk7fa5b73946254642a5c916a296392407' http://10.16.157.63:8080/v1/AUTH_test/cont2/ -X HEAD
* About to connect() to 10.16.157.63 port 8080 (#0)
*   Trying 10.16.157.63... connected
* Connected to 10.16.157.63 (10.16.157.63) port 8080 (#0)
> HEAD /v1/AUTH_test/cont2/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
> Host: 10.16.157.63:8080
> Accept: */*
> X-Auth-Token: AUTH_tk7fa5b73946254642a5c916a296392407
> 
< HTTP/1.1 204 No Content
< X-Container-Object-Count: 10000
< X-Container-Bytes-Used: 10485760000
< Accept-Ranges: bytes
< Content-Length: 0
< Date: Fri, 08 Jun 2012 11:24:33 GMT
< 
* Connection #0 to host 10.16.157.63 left intact
* Closing connection #0

Comment 7 Saurabh 2012-06-11 14:03:23 UTC

I have tried to collect the info for the latest rpms and found with tunables been set as for rc-rpm config the GET for 10000 files work,

but without the tunables the GET still fails.

Although, with stat-prefetch I tried to find the difference in numbers of maximum objects can be listed  with this option on/off.


so, with stat-prefetch on 2000 objects got listed.
stat-prefetch off .. response is "503 Internal server"
again stat-prefetch on ..  response is "503 Internal server"

even if you reduce the number of files, to 1900 or 1800 with stat-prefetch on

the issue remains same.

seems there is some issue using stat-prefetch itself, 

vol config files were updated with correct information of stat-prefetch on/off.

Comment 8 Junaid 2012-06-12 05:56:50 UTC

Since the load that you are applying is high, to successfully complete the request we should tune the conf files.

Comment 9 Saurabh 2012-06-12 06:18:04 UTC

So, I have tested the GET with some tunables like giving larger values for node_timeout, client_timeout, conn_timeout, workers and backlog, the GET passed for 10000 objects.

Comment 10 Shawn Heisey 2012-11-19 19:08:27 UTC

Has anyone thought about what happens if you've got a container with five million files (or more, perhaps 25 million) in it, contained in directory tree that's a few levels deep and has thousands of directories?  Is this completely impossible?