Description of problem:
Customer is working through patching and other maintenance scenarios. When a system is rebooted after patching, client systems lose access to the volume for network.ping-timeout seconds. Customer is looking to minimize the loss of volume availability during maintenance, i.e., leave the volume online while patches are being applied, then reboot the system; glusterd manages the graceful disconnect of clients from the brick on shutdown, and self-heal is initiated when the system comes back online.

Version-Release number of selected component (if applicable):
3.2 (mine)
3.4 (customer)

How reproducible:
Always.

Steps to Reproduce:
1. Run a client in a test mode writing to a replicated volume.
2. Patch one server and reboot.

Actual results:
3. Client hangs for 42s by default.

Expected results:
3. Client continues to write to the available volume.

Additional info:
workaround:
    pkill -f volume_path                  (prior to reboot)
    gluster volume start vol_name force   (on reboot)

better would be:
    gluster volume maintenance vol_name on
    gluster peer maintenance peer_name on
Tells an individual volume, or all volumes on a node, to go into maintenance mode. When signalled, they gracefully terminate connections and let clients know that they are unavailable.

    gluster volume maintenance vol_name off
    gluster peer maintenance peer_name off
Instructs an individual volume, or all volumes on a node, to exit maintenance mode, forcibly start, and perform self-heal operations.

Customer is opening a case.
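For anyone scripting the workaround above, a minimal sketch follows. It only wraps the two commands already given in this report. The function names (run, pre_reboot, post_reboot) and the DRY_RUN switch are my own additions for illustration and are not part of gluster. The arguments are placeholders, as in the report. It does not use the proposed "maintenance" subcommands, which do not exist.

```shell
#!/bin/sh
# Sketch of the workaround from this report; not an official procedure.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Before the reboot: kill the processes whose command line matches the
# brick path (the "pkill -f volume_path" step of the workaround).
pre_reboot() {
    run pkill -f "$1"
}

# After the reboot: force-start the volume so its bricks come back
# (the "gluster volume start vol_name force" step of the workaround).
post_reboot() {
    run gluster volume start "$1" force
}
```

Usage, assuming the file is sourced on the server being patched: set DRY_RUN=1 and call "pre_reboot volume_path" to check what would run, then repeat without DRY_RUN before rebooting, and call "post_reboot vol_name" once the node is back up. Note that pkill -f matches against the full command line, so the pattern should be specific enough to hit only the intended brick processes.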
*** This bug has been marked as a duplicate of bug 1473762 ***