Bug 1473762 - [GSS] [RFE] Add Graceful brick shutdown for clients to reduce service loss during reboot.
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 3.2
Hardware: Unspecified Unspecified
Priority: medium Severity: high
Assigned To: Samikshan Bairagya
QA Contact: Bala Konda Reddy M
Keywords: FutureFeature
Duplicates: 1473759
Depends On:
Blocks:
 
Reported: 2017-07-21 10:46 EDT by Paul Armstrong
Modified: 2017-10-21 13:06 EDT
CC List: 6 users

Type: Bug


Attachments: None
Description Paul Armstrong 2017-07-21 10:46:58 EDT
Description of problem:
The customer is working through patching and other maintenance scenarios. When a server is rebooted after patching, client systems lose access to the volume for network.ping-timeout seconds. The customer wants to minimize the loss of volume availability during maintenance, i.e., leave the volume online while patches are being applied, then reboot the system, have glusterd manage the graceful disconnect of clients from the brick on shutdown, and have self-heal initiated when the system comes back online.


Version-Release number of selected component (if applicable):
3.2 (mine)
3.3 (customer)

How reproducible:
Always.

Steps to Reproduce:
1. Run a client in test mode, writing to a replicated volume.
2. Patch one server and reboot it.
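
A minimal sketch of a test client for step 1, assuming the replicated volume is mounted on the client and that plain appends to a file are enough to show the stall. The function name write_probe, the file name probe.log, and the MAX_GAP threshold are hypothetical, not part of gluster. Client-side caching may hide or shorten the stall, so treat the output as indicative only.

```shell
#!/bin/sh
# write_probe DIR COUNT [INTERVAL]
# Appends a timestamp to DIR/probe.log COUNT times, sleeping INTERVAL
# seconds between writes, and reports any write that takes longer than
# MAX_GAP seconds (default 5). All names here are illustrative.
write_probe() {
    dir=$1
    count=$2
    interval=${3:-1}
    max_gap=${MAX_GAP:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s)
        # The append blocks for as long as the mount is unresponsive.
        date +%s >> "$dir/probe.log" || echo "write failed at $start"
        end=$(date +%s)
        elapsed=$((end - start))
        if [ "$elapsed" -gt "$max_gap" ]; then
            echo "stall: write took ${elapsed}s"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, write_probe /mnt/vol_name 600 1 (the mount point is an assumption) writes once per second for about ten minutes; reboot one server during that window and look for "stall" lines.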


Actual results:
3. Client hangs for 42s by default (the default network.ping-timeout).

Expected results:
3. Client continues to write to the available volume.

Additional info:

workaround:
pkill -f volume_path (prior to reboot)
gluster volume start vol_name force (after reboot)
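
The workaround can be sketched as two helper functions, one to run before the reboot and one after the node is back up. The function names run, stop_bricks, and restart_volume and the DRY_RUN switch are hypothetical; only the pkill and gluster volume start commands come from the report. By default the sketch only prints what it would run.

```shell
#!/bin/sh
# Sketch of the workaround as two helpers. The helper names and the
# DRY_RUN switch are illustrative, not gluster features.

# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before the reboot: kill processes whose command line matches the
# brick path. $1 = brick path (volume_path in the report).
stop_bricks() {
    run pkill -f "$1"
}

# After the node is back up: force-start the volume so that its
# bricks are started again. $1 = volume name (vol_name in the report).
restart_volume() {
    run gluster volume start "$1" force
}
```

Note that pkill -f matches against the full command line, so it kills any process whose command line contains the pattern, not only the brick process. Check the dry-run output and the pattern before setting DRY_RUN=0.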

better would be:
gluster volume maintenance vol_name on
gluster peer maintenance peer_name on

This tells an individual volume, or all volumes on a node, to go into maintenance mode: when signalled, they gracefully terminate connections and let clients know that they are unavailable.

gluster volume maintenance vol_name off
gluster peer maintenance peer_name off

This instructs an individual volume, or all volumes on a node, to exit maintenance mode, forcibly start, and perform self-heal operations.

Customer is opening a case.
Comment 2 Rejy M Cyriac 2017-07-26 03:50:39 EDT
*** Bug 1473759 has been marked as a duplicate of this bug. ***
