Bug 1454338 - first_lookup slow in disperse volumes
Summary: first_lookup slow in disperse volumes
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-22 13:43 UTC by Raghavendra Talur
Modified: 2017-08-17 08:51 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-17 08:51:50 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Talur 2017-05-22 13:43:51 UTC
Description of problem:

When a brick is down or has not been healed yet, the "Going up" message appears in the client logs 10 seconds after the client has connected to most of the bricks. It would be good for clients to come up sooner when the minimum number of bricks is up.

How reproducible:
Have seen multiple times, not sure if 100%

Comment 1 Xavi Hernandez 2017-05-24 06:29:54 UTC
This is by design. It only affects the initial mount from a client, and it is done to avoid unnecessary heals when bricks are busy or are being started at the same time as the mount. Once the initial timeout has expired or all bricks have reported, the volume will work without any delay, even if bricks go down and come back up later (as long as there are enough healthy bricks).

EC waits up to 10 seconds for every brick to report its UP or DOWN state. If all bricks report before the 10-second timeout expires, the volume is brought UP or DOWN at that point, depending on how many bricks are UP. If some bricks have still not reported after 10 seconds, the volume goes UP or DOWN based on the number of bricks that are UP when the timeout expires.
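
The decision described above can be sketched roughly as follows. This is only an illustration of the logic in this comment, not the actual EC code: the function name, its parameters, and the `min_up` threshold (the minimum number of UP bricks the volume needs) are hypothetical names chosen for the example.

```python
from typing import Optional


def initial_volume_state(total: int, up: int, down: int, min_up: int,
                         elapsed: float, timeout: float = 10.0) -> Optional[str]:
    """Sketch of the initial-mount decision (hypothetical helper, not GlusterFS code).

    total   -- number of bricks in the disperse volume
    up/down -- bricks that have reported UP / DOWN so far
    min_up  -- assumed minimum number of UP bricks needed for the volume to work
    elapsed -- seconds since the client started connecting
    Returns "UP", "DOWN", or None while still waiting for reports.
    """
    all_reported = (up + down) == total
    if not all_reported and elapsed < timeout:
        # Some bricks are still silent and the timeout has not expired: keep waiting.
        return None
    # Either every brick has reported or the timeout expired: decide on UP bricks only.
    return "UP" if up >= min_up else "DOWN"


# Example with an assumed 6-brick volume that needs at least 4 bricks UP.
print(initial_volume_state(total=6, up=5, down=0, min_up=4, elapsed=3))   # still waiting
print(initial_volume_state(total=6, up=5, down=1, min_up=4, elapsed=3))   # all reported
print(initial_volume_state(total=6, up=5, down=0, min_up=4, elapsed=10))  # timeout expired
```

In this sketch, a brick that never reports (for example, one that is down and unreachable) is what makes the client wait for the full timeout, which matches the 10-second delay described in the report.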

