Bug 1124448 - register_bcache return value does not distinguish between transient and permanent failures
Summary: register_bcache return value does not distinguish between transient and perma...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 20
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-07-29 14:02 UTC by Ian Pilcher
Modified: 2014-07-30 00:25 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-30 00:25:14 UTC


Attachments (Terms of Use)
Make register_bcache distinguish (possibly) non-fatal errors (1.08 KB, text/plain)
2014-07-29 14:02 UTC, Ian Pilcher
no flags Details


Links
System ID Priority Status Summary Last Updated
Linux Kernel 80961 None None None Never

Description Ian Pilcher 2014-07-29 14:02:51 UTC
Created attachment 922159 [details]
Make register_bcache distinguish (possibly) non-fatal errors

When registering a bcache device (cache or backing device), register_bcache attempts to acquire exclusive access to the device, and returns an error if it fails to do so.  In the error path, it checks whether the device is already registered as a bcache device, and it logs a different error message in this case ("device already registered" vs. "device busy"), but it does not return a different value.  It returns -EINVAL for all errors.

This means that userspace has no way of distinguishing between the "device busy" case, which should potentially be retried and the "already registered" case (or other errors), which are presumably fatal.

This is a real problem.  I have found that registration of a partition on an MD RAID (imsm) device fails about 50% of the time when triggered by udev.  Changing register_bcache's return value to distinguish between the two cases and modifying the udev helper to retry in the "device busy" case appears to fix the problem.

The attached patch changed register_bcache to return -EBUSY in the case where it cannot obtain exclusive access to the device, but it is not already registered with bcache.  It still returns -EINVAL for all other errors.

Comment 1 Josh Boyer 2014-07-29 14:54:25 UTC
http://thread.gmane.org/gmane.linux.kernel.bcache.devel/2594

Hm.  No response from the upstream maintainers.  A couple things you should probably do here:

1) Make sure you send your patch not as an attachment, with a strip level of 1.
2) Include a changelog and a Signed-off-by line
3) Send it directly to the primary maintainer (Kent Overstreet) with the list CC'd.

It's probably best to review Documentation/SubmittingPatches in the kernel.

Comment 2 Ian Pilcher 2014-07-30 00:25:14 UTC
(In reply to Josh Boyer from comment #1)
> http://thread.gmane.org/gmane.linux.kernel.bcache.devel/2594
> 
> Hm.  No response from the upstream maintainers.  A couple things you should
> probably do here:
> 
> 1) Make sure you send your patch not as an attachment, with a strip level of
> 1.
> 2) Include a changelog and a Signed-off-by line
> 3) Send it directly to the primary maintainer (Kent Overstreet) with the
> list CC'd.
> 
> It's probably best to review Documentation/SubmittingPatches in the kernel.

As I responded in the kernel.org bug ...

It never occurred to me that the patch would go in as is.  I don't even know if it applies against the latest development tree (and I've long since given up playing kernel patch Whack-a-Mole).

This is a bug report, that happens to include a suggested approach to fixing the issue.

Since all roads seem to lead to bcache-devel, and no one there appears interested ...


Note You need to log in before you can comment on or make changes to this bug.