Description of problem:
Cluster info: 4 physical machines
Gluster volume info: Distributed-Replicate (2x2) type

==============================================================================
Aug 8 10:07:12 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: tx695077467d7f4f5d89a9b9c1783b2f7f)
Aug 8 10:07:14 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: tx8aee3d4d2cae44cdaa9b6ca3597a7898)
Aug 8 10:07:14 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: txf534ac48abc74b7fada908d64ece4c0f)
Aug 8 10:07:14 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: tx4cbc78f252f444c7bc45400003445d7c)
Aug 8 10:07:19 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: txcf42532334ee4c5e966e441c31202821)
Aug 8 10:07:20 gqac028 object-server ERROR container update failed with 127.0.0.1:6011/sdb1 (saving for async update later): ConnectionTimeout (5.0s) (txn: tx60bcb4674afd4a748d91c7f59c66159d)

Version-Release number of selected component (if applicable):
[root@gqac028 cont5]# glusterfs -V
glusterfs 3.3.0rhs built on Jul 25 2012 11:21:57
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

How reproducible:
Happened this time.

Steps to Reproduce:
1. Send curl requests to upload files to 2000 different containers (a rough sketch of such an upload loop is shown below, after the volume info).

Actual results:
Container update errors as shown in the log above.

Expected results:
All objects should be uploaded without any error.

Additional info:
[root@gqac028 cont5]# gluster volume info test2

Volume Name: test2
Type: Distributed-Replicate
Volume ID: f049d836-8cc8-47fd-bab1-975be12bf0fd
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.16.157.81:/home/test2-dr
Brick2: 10.16.157.75:/home/test2-drr
Brick3: 10.16.157.78:/home/test2-ddr
Brick4: 10.16.157.21:/home/test2-ddrr
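For reference, a rough sketch of the kind of upload loop described in step 1 (not the exact commands used; the proxy endpoint, account, token, and file name are placeholder assumptions):

#!/bin/bash
# Hypothetical reproduction sketch -- endpoint, account, token, and file are
# placeholder assumptions, not values taken from this report.
STORAGE_URL="http://127.0.0.1:8080/v1/AUTH_test"
TOKEN="AUTH_tk_placeholder"

for i in $(seq 1 2000); do
    # Create the container, then upload one object into it.
    curl -s -o /dev/null -X PUT -H "X-Auth-Token: $TOKEN" "$STORAGE_URL/cont$i"
    curl -s -o /dev/null -w "cont$i: %{http_code}\n" -X PUT \
         -H "X-Auth-Token: $TOKEN" -T ./sample.txt \
         "$STORAGE_URL/cont$i/sample.txt"
done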
This seems to be a performance issue. Can you verify the same with the gluster-swift-1.7.4 RPMs?
This BZ has been verified using the catalyst workload on RHS 2.1. It appears to be fixed, as the new PDQ performance-related changes have been merged into RHS 2.1.

[root@dhcp207-9 ~]# rpm -qa|grep gluster
gluster-swift-object-1.8.0-6.3.el6rhs.noarch
vdsm-gluster-4.10.2-22.7.el6rhs.noarch
gluster-swift-plugin-1.8.0-2.el6rhs.noarch
glusterfs-geo-replication-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-3.4.0.12rhs.beta3-1.el6rhs.x86_64
gluster-swift-1.8.0-6.3.el6rhs.noarch
glusterfs-server-3.4.0.12rhs.beta3-1.el6rhs.x86_64
gluster-swift-proxy-1.8.0-6.3.el6rhs.noarch
gluster-swift-account-1.8.0-6.3.el6rhs.noarch
glusterfs-rdma-3.4.0.12rhs.beta3-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.12rhs.beta3-1.el6rhs.x86_64
gluster-swift-container-1.8.0-6.3.el6rhs.noarch

All performance-related tests (from the QE perspective) will be done using the catalyst workload (and, if required in the future, possibly ssbench). It consists of 15 runs of 10000 requests (PUT/GET/HEAD/DELETE) each, distributed among 10 threads. These comprehensive tests include all file formats and varied sizes. The tests were executed on a machine with the following configuration:

RAM: 7500Gb
CPU: 1

Volume info: all bricks are created as logical volumes (on localhost) of 10G each, and each volume has 4 such bricks.

[root@dhcp207-9 ~]# gluster volume info

Volume Name: test
Type: Distribute
Volume ID: 440fdac0-a3bd-4ab1-a70c-f4c390d97100
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: localhost:/mnt/lv1/lv1
Brick2: localhost:/mnt/lv2/lv2
Brick3: localhost:/mnt/lv3/lv3
Brick4: localhost:/mnt/lv4/lv4

Volume Name: test2
Type: Distribute
Volume ID: 6d922203-6657-4ed3-897a-069ef6c396bf
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: localhost:/mnt/lv5/lv5
Brick2: localhost:/mnt/lv6/lv6
Brick3: localhost:/mnt/lv7/lv7
Brick4: localhost:/mnt/lv8/lv8

PS: Performance Engineering will be responsible for all large-scale tests, which will be done on the BAGL cluster.
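For reference, a rough sketch of how a brick/volume layout like the one above could be prepared. The volume group name (vg_bricks) and filesystem options are assumptions; the brick paths mirror the 'gluster volume info' output in this comment.

#!/bin/bash
# Sketch only: vg_bricks is an assumed volume group name.
for n in 1 2 3 4; do
    lvcreate -L 10G -n lv$n vg_bricks          # one 10G logical volume per brick
    mkfs.xfs -i size=512 /dev/vg_bricks/lv$n   # XFS with 512-byte inodes, commonly recommended for RHS bricks
    mkdir -p /mnt/lv$n
    mount /dev/vg_bricks/lv$n /mnt/lv$n
    mkdir -p /mnt/lv$n/lv$n                    # brick directory inside the mount
done

# 4-brick distribute volume on a single host, matching the 'test' volume above.
gluster volume create test \
    localhost:/mnt/lv1/lv1 localhost:/mnt/lv2/lv2 \
    localhost:/mnt/lv3/lv3 localhost:/mnt/lv4/lv4
gluster volume start test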
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html