Bug 1668368 - image upload failed even swift-object-server failed on one of controller node in 3 controller nodes environment
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-swift
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Christian Schwede (cschwede)
QA Contact: Gal Amado
Docs Contact: Tana
URL:
Whiteboard:
Duplicates: 1668370
Depends On:
Blocks:
Reported: 2019-01-22 14:56 UTC by Meiyan Zheng
Modified: 2023-07-18 05:12 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-02 11:55:42 UTC
Target Upstream Version:
Embargoed:
cschwede: needinfo-



Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 751278 | 0 | None | MERGED | Add proxy config option to set recoverable_node_timeout | 2020-12-23 07:40:19 UTC
OpenStack gerrit | 751279 | 0 | None | NEW | Decrease Swift proxy timeouts for GET/HEAD requests | 2020-12-23 07:40:51 UTC
Red Hat Issue Tracker | OSP-11149 | 0 | None | None | None | 2021-12-02 11:58:50 UTC

Comment 4 Christian Schwede (cschwede) 2019-01-31 10:33:21 UTC
The problem here is that the container is paused, not stopped.

A paused container is frozen, so any network request to it is simply not answered (the connection is not even reset).

E.g., a request to a stopped swift object server container looks like this:

# curl http://172.17.4.10:6000
curl: (7) Failed connect to 172.17.4.10:6000; Connection refused

However, a curl request to a paused container just hangs - there is no RST sent by the server, it simply waits.
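
The difference between the two cases can be reproduced without curl. The following is a minimal sketch, not part of the original report: the helper name `probe` and its return labels are made up for illustration, and it assumes the paused server's port still accepts the TCP connection but never sends a reply.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify how a TCP endpoint reacts to a plain HTTP request.

    Returns 'refused' (peer sent RST, as with a stopped container),
    'timeout' (nothing came back in time, as with a paused container),
    'closed' (peer closed without sending data) or 'answered'.
    The labels are illustrative only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            # recv() raises socket.timeout if no byte arrives within `timeout`.
            data = sock.recv(1)
            return "answered" if data else "closed"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
```

Calling `probe("172.17.4.10", 6000)` against the stopped container should return 'refused' almost immediately, while against the paused container it should only return 'timeout' after the full timeout has elapsed.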

The same happens in the Swift code, and it seems the client then hits its own timeout before Swift has a chance to try another server.
I think we need a better way to handle errors like this.
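
To make the timing argument concrete, here is a small simulation. It is only a sketch under stated assumptions and not Swift's actual retry logic: the function `first_responder`, the parameters `node_timeout` and `client_timeout`, and all numbers are hypothetical. It models a caller that tries servers one after another and gives up on a silent server only after `node_timeout` seconds.

```python
def first_responder(nodes, node_timeout, client_timeout):
    """Simulate trying servers in order until one answers.

    nodes: list of (name, delay) pairs; delay is the response time in
           seconds, or None for a server that never answers (paused).
    Returns (name, elapsed) for the first server that answers, or
    (None, elapsed) if the client's deadline passes first or no server
    answers. Hypothetical model, not Swift code.
    """
    elapsed = 0.0
    for name, delay in nodes:
        answers = delay is not None and delay <= node_timeout
        wait = delay if answers else node_timeout
        if elapsed + wait > client_timeout:
            # The client gives up before this attempt finishes.
            return None, client_timeout
        elapsed += wait
        if answers:
            return name, elapsed
    return None, elapsed


nodes = [("paused", None), ("healthy", 0.5)]
print(first_responder(nodes, node_timeout=10, client_timeout=5))
print(first_responder(nodes, node_timeout=2, client_timeout=5))
```

In this model the first call never reaches the healthy server because the wait on the paused one outlasts the client's deadline, while the shorter per-server timeout in the second call leaves enough time to fail over.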

If you stop the container and try uploading the image, it will succeed (using the two remaining servers).

Comment 9 Christian Schwede (cschwede) 2020-02-06 15:49:14 UTC
*** Bug 1668370 has been marked as a duplicate of this bug. ***

