Description of problem:
I tried to PUT 1000 objects into a container, one after another in a loop. Before this I had set "workers = 4" in proxy-server.conf. 962 files were actually created, after which the server reported "503 Service Unavailable":

    Mar 26 02:37:25 QA-50 object-server ERROR __call__ error with PUT /sdb1/19407/AUTH_test/container3/file1KB_991 :
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/obj/server.py", line 859, in __call__
        res = getattr(self, req.method)(req)
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/obj/server.py", line 575, in PUT
        obj, self.logger, disk_chunk_size=self.disk_chunk_size)
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/obj/server.py", line 399, in get_DiskFile_obj
        disk_chunk_size, fs_object = self.fs_object);
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/DiskFile.py", line 76, in __init__
        check_valid_account(account, fs_object)
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/utils.py", line 356, in check_valid_account
        return _check_valid_account(account, fs_object)
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/utils.py", line 326, in _check_valid_account
        if not check_account_exists(fs_object.get_export_from_account_id(account), \
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/Glusterfs.py", line 99, in get_export_from_account_id
        for export in self.get_export_list():
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/Glusterfs.py", line 92, in get_export_list
        return self.get_export_list_local()
      File "/usr/lib/python2.6/site-packages/swift-1.4.9-py2.6.egg/swift/plugins/Glusterfs.py", line 52, in get_export_list_local
        raise Exception('Getting volume failed %s', self.name)
    Exception: ('Getting volume failed %s', 'glusterfs') (txn: tx3163382072f4453495bc2693aecc8aff)

Version-Release number of selected component (if applicable):
3.3.0qa30 and swift 1.4.7

How reproducible:
Executed the script once, but the same issue has been seen on more than one occasion.

Steps to Reproduce:
1. Add "workers = 4" to the proxy-server.conf file.
2. Start creating files.

Actual results:
After creating 962 files, the response is "503 Service Unavailable".

Expected results:
All 1000 files should have been created.

Additional info:
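For reference, the worker setting in question lives in the [DEFAULT] section of proxy-server.conf. A minimal sketch (the bind_port value follows the standard Swift sample config and is an assumption, not taken from this report):

    [DEFAULT]
    bind_port = 8080
    # number of proxy worker processes; the failure above was observed with 4
    workers = 4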
*** Bug 782003 has been marked as a duplicate of this bug. ***
I have executed several tests to find the correct configuration of worker threads, node_timeout and other variables. Things still fail after tweaking several variables, although parallel operations work properly on stock Swift, as verified with 100 objects in parallel for both https and http. With the swift plugin, the HEAD usually fails while we try to determine account availability:

    Apr 26 05:47:28 QA-91 account-server ERROR __call__ error with HEAD /sdb1/48757/AUTH_test2 :
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/swift/account/server.py", line 361, in __call__
        res = getattr(self, req.method)(req)
      File "/usr/lib/python2.6/site-packages/swift/account/server.py", line 163, in HEAD
        broker = self._get_account_broker(drive, part, account)
      File "/usr/lib/python2.6/site-packages/swift/account/server.py", line 62, in _get_account_broker
        return DiskAccount(self.root, account, self.fs_object);
      File "/usr/lib/python2.6/site-packages/swift/plugins/DiskDir.py", line 403, in __init__
        check_valid_account(account, fs_object)
      File "/usr/lib/python2.6/site-packages/swift/plugins/utils.py", line 356, in check_valid_account
        return _check_valid_account(account, fs_object)
      File "/usr/lib/python2.6/site-packages/swift/plugins/utils.py", line 326, in _check_valid_account
        if not check_account_exists(fs_object.get_export_from_account_id(account), \
      File "/usr/lib/python2.6/site-packages/swift/plugins/Glusterfs.py", line 99, in get_export_from_account_id
        for export in self.get_export_list():
      File "/usr/lib/python2.6/site-packages/swift/plugins/Glusterfs.py", line 92, in get_export_list
        return self.get_export_list_local()
      File "/usr/lib/python2.6/site-packages/swift/plugins/Glusterfs.py", line 52, in get_export_list_local
        raise Exception('Getting volume failed %s', self.name)
    Exception: ('Getting volume failed %s', 'glusterfs') (txn: tx5d8fdd3b858c4e11817e70604a0fa9b2)

Hopefully load balancing may help for parallel operations.
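Per the traceback, every account-validity check ends in get_export_list_local(), which shells out for the volume list and raises on any failure. A minimal sketch of that pattern (illustrative only, not the actual Glusterfs.py code; the function name and command handling here are assumptions):

```python
import subprocess

def get_export_list(cmd):
    """Run an external command and return its stdout as a list of lines.

    Mirrors the failure pattern seen in Glusterfs.py: any non-zero exit
    raises, which is what ultimately surfaces to the client as a 503
    when the volume-listing call fails under concurrent workers.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    out, _err = proc.communicate()
    if proc.returncode != 0:
        raise Exception('Getting volume failed %s', cmd[0])
    return out.splitlines()
```

Because this runs on every HEAD, a transient failure of the external command under load is enough to fail an otherwise valid request, which is consistent with the parallel-operation failures above.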
Parallel PUT operations also fail because a tmp file is not created in the correct location:

    Apr 26 23:32:09 QA-91 object-server ERROR Container update failed (saving for async update later): 500 response from 127.0.0.1:6011/sdb1 (txn: tx7695bbc2e6034d3fbdc89cb1135e8bfb)
    Apr 26 23:32:09 QA-91 object-server ERROR __call__ error with PUT /sdb1/79140/AUTH_test2/cont1/zero19 :
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/swift/obj/server.py", line 859, in __call__
        res = getattr(self, req.method)(req)
      File "/usr/lib/python2.6/site-packages/swift/obj/server.py", line 655, in PUT
        device)
      File "/usr/lib/python2.6/site-packages/swift/obj/server.py", line 471, in container_update
        contdevice, headers_out, objdevice)
      File "/usr/lib/python2.6/site-packages/swift/obj/server.py", line 449, in async_update
        os.path.join(self.devices, objdevice, 'tmp'))
      File "/usr/lib/python2.6/site-packages/swift/common/utils.py", line 860, in write_pickle
        fd, tmppath = mkstemp(dir=tmp, suffix='.tmp')
      File "/usr/lib64/python2.6/tempfile.py", line 293, in mkstemp
        return _mkstemp_inner(dir, prefix, suffix, flags)
      File "/usr/lib64/python2.6/tempfile.py", line 228, in _mkstemp_inner
        fd = _os.open(file, flags, 0600)
    OSError: [Errno 2] No such file or directory: '/mnt/gluster-object/sdb1/tmp/tmp5aiFgJ.tmp' (txn: tx7695bbc2e6034d3fbdc89cb1135e8bfb)

The path mentioned in the exception does not exist, and the sdb1 translation should be handled so that it is replaced with AUTH_test.
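The mkstemp failure itself is easy to demonstrate: tempfile.mkstemp does not create the directory it is handed, so it raises ENOENT when the tmp directory is missing. A minimal sketch of the kind of guard write_pickle would need (the helper name is made up for illustration, not Swift's code):

```python
import errno
import os
import tempfile

def safe_mkstemp(tmpdir):
    """Create a temp file, creating the parent directory first if needed."""
    try:
        os.makedirs(tmpdir)
    except OSError as err:
        if err.errno != errno.EEXIST:  # an already-existing dir is fine
            raise
    # mkstemp itself never creates tmpdir; without the makedirs above,
    # a missing directory raises OSError [Errno 2], as in the log.
    return tempfile.mkstemp(dir=tmpdir, suffix='.tmp')
```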
The above error (comment 3) is fixed as part of bug https://bugzilla.redhat.com/show_bug.cgi?id=821310.
As per Junaid's comment, the error mentioned in comment 3 is no longer seen with parallel PUT requests, though "503 Service Unavailable" issues may still occur, for which a similar bug is open, i.e. bug 821310. Other issues, like the HEAD-related problems, are only avoided for a limited time by providing larger values for variables like recheck_container_existence and recheck_account_existence. Though the same issue may recur after that time elapses, setting larger values for these variables provides a workaround.
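For the record, the workaround amounts to raising the memcache recheck intervals in the [app:proxy-server] section of proxy-server.conf. A sketch (values are in seconds; Swift's defaults are 60, and the larger values below are just an example, not a tested recommendation):

    [app:proxy-server]
    use = egg:swift#proxy
    # cache account/container existence longer so the failing HEAD path
    # is exercised less often; the issue can still recur on expiry
    recheck_account_existence = 3600
    recheck_container_existence = 3600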