Bug 155304 - gnbd_monitor doesn't correctly reset after an uncached gnbd has failed and been restored
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gnbd
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Robert Peterson
QA Contact: Cluster QE
Reported: 2005-04-18 17:21 EDT by Ben Marzinski
Modified: 2009-04-16 16:28 EDT
Fixed In Version: RHBA-2006-0170
Doc Type: Bug Fix
Last Closed: 2006-01-06 15:28:59 EST

Description Ben Marzinski 2005-04-18 17:21:39 EDT
When an uncached gnbd fails, gnbd_monitor fences the server if it is
nonresponsive. Then it waits for all current users of the device to close it.
Finally, it tries to contact the server at regular intervals. If the server
comes back up and reexports the device, gnbd_monitor is supposed to reimport it
and start the monitoring all over again.
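
The cycle described above can be read as a small state machine. The following
is only a sketch of that reading, not code from gnbd_monitor: the state names,
the mon_event fields, and mon_step() are all hypothetical and exist purely for
illustration.

```c
#include <stdbool.h>

/* Hypothetical states for one monitored device. These names are
 * illustrative and are NOT taken from the gnbd source. */
enum mon_state {
    MON_WATCHING,     /* device is imported and being monitored */
    MON_WAIT_CLOSE,   /* device failed; waiting for users to close it */
    MON_RETRY_SERVER  /* polling the server at regular intervals */
};

/* Hypothetical snapshot of what the monitor observes on one pass. */
struct mon_event {
    bool device_failed;      /* the uncached gnbd has failed */
    bool server_responsive;  /* the server answers when contacted */
    bool device_open;        /* some user still holds the device open */
    bool reimport_ok;        /* the reimport attempt succeeded */
};

/* Advance the sketch by one step. Sets *fenced when the server is fenced. */
enum mon_state mon_step(enum mon_state s, const struct mon_event *ev,
                        bool *fenced)
{
    switch (s) {
    case MON_WATCHING:
        if (!ev->device_failed)
            return MON_WATCHING;
        if (!ev->server_responsive)
            *fenced = true;          /* fence only a nonresponsive server */
        return MON_WAIT_CLOSE;
    case MON_WAIT_CLOSE:
        return ev->device_open ? MON_WAIT_CLOSE : MON_RETRY_SERVER;
    case MON_RETRY_SERVER:
        if (ev->server_responsive && ev->reimport_ok)
            return MON_WATCHING;     /* reset: monitoring starts over */
        return MON_RETRY_SERVER;
    }
    return s;
}
```

In this sketch, the reset is the transition from MON_RETRY_SERVER back to
MON_WATCHING. If that transition never happens, a later failure is handled
from the retry state and the fencing branch is not reached again.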

Currently, the check that verifies the reimport succeeded is wrong, so
usually, after the device has been successfully reimported, gnbd_monitor will
not reset. The next time the device fails, gnbd_monitor will skip the fence
steps and simply try to reimport the device. This means that in cases where
the gnbd server is nonresponsive but the gnbd server node is still alive,
gnbd_monitor will not fence the server after the first time.
Fixing this problem involves changing the line
if (check_recvd(dev) == 1)
to
if (check_recvd(dev) >= 0)
which is obviously the correct thing to check for.
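
To make the difference between the two conditions concrete, here is a minimal
sketch. It assumes, based only on the description in this report, that
check_recvd() returns a negative value on failure and a non-negative value
(often something other than 1) on success; the real return convention is not
shown here. The helper names reimport_ok_old() and reimport_ok_fixed() are
hypothetical and take the return value as a plain int.

```c
#include <stdbool.h>

/* Old condition: only a return value of exactly 1 counts as success.
 * 'recvd' stands in for the value returned by check_recvd(dev). */
bool reimport_ok_old(int recvd)
{
    return recvd == 1;
}

/* Fixed condition: any non-negative return value counts as success.
 * ASSUMPTION: negative values indicate failure. */
bool reimport_ok_fixed(int recvd)
{
    return recvd >= 0;
}
```

Under that assumption, a successful reimport that returns 0 is rejected by the
old condition and accepted by the fixed one, while a negative (failure) value
is rejected by both.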

A related issue is the requirement that gnbd_monitor waits until all users have
closed the device.  This is an unnecessary requirement, and it makes it much
harder to use dm-multipath, since dm-multipath keeps failed paths open.
Comment 1 Red Hat Bugzilla 2006-01-06 15:28:59 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.