Bug 827538 - DVB USB device firmware requested in module_init()
Summary: DVB USB device firmware requested in module_init()
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: x86_64
OS: Linux
unspecified
low
Target Milestone: ---
Assignee: Mauro Carvalho Chehab
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-06-01 17:57 UTC by Antti Palosaari
Modified: 2013-07-04 22:59 UTC (History)
13 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-24 22:38:13 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
syslog (2.75 KB, text/plain)
2012-06-01 17:57 UTC, Antti Palosaari
no flags Details
Possible fix by doing direct filesystem firmware loading (2.32 KB, patch)
2012-10-03 16:35 UTC, Linus Torvalds
no flags Details | Diff
Improved version of previous patch (2.75 KB, patch)
2012-10-03 18:12 UTC, Linus Torvalds
no flags Details | Diff

Description Antti Palosaari 2012-06-01 17:57:14 UTC
Created attachment 588547 [details]
syslog

Description of problem:

DVB USB device firmware downloading takes 30 seconds.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:

It takes over 30 seconds until device appears.

Expected results:

Device should be ready during 2 seconds when plugged.


Additional info:

errors seen in log:
udevd[423]: worker [1581] timeout, kill it
udevd[423]: worker [1581] terminated by signal 9 (Killed)

Downgrading to Fedora 16 udev packages (udev libudev libgudev1) resolved the problem.

Comment 1 Kay Sievers 2012-06-01 19:44:17 UTC
This is very likely a kernel driver issue.

Drivers must not load firmware in the module_init() path, or device
probe()/bind() path. This creates a deadlock in the event handling.

We used to silently try to work around that, but recently started
to log this error explicitely.

The firmware should in general be requested asynchronously, or at the first
open() of the device.

Details are here:
  http://thread.gmane.org/gmane.linux.network/217729

Comment 2 Mauro Carvalho Chehab 2012-06-01 21:01:36 UTC
(In reply to comment #1)
> This is very likely a kernel driver issue.
> 
> Drivers must not load firmware in the module_init() path, or device
> probe()/bind() path. This creates a deadlock in the event handling.
> 
> We used to silently try to work around that, but recently started
> to log this error explicitely.
> 
> The firmware should in general be requested asynchronously, or at the first
> open() of the device.
> 
> Details are here:
>   http://thread.gmane.org/gmane.linux.network/217729

Key,

There are _several_ media drivers that requires loading the firmware during device probe, as, without a firmware, the driver is not able to detect what ancillary drivers are also required, as, on those drivers, the "main" driver is generally just an USB or PCI bridge for stream, and an I2C bridge for ancillary drivers. The real device control happens at those I2C ancillary drivers, that provide functions for tuning and demodulating the media streams. Without the firmware, it is not possible to know what else is needed.

Comment 3 Kay Sievers 2012-06-02 02:06:53 UTC
(In reply to comment #2)
> There are _several_ media drivers that requires loading the firmware during
> device probe, as, without a firmware, the driver is not able to detect what
> ancillary drivers are also required, as, on those drivers, the "main" driver
> is generally just an USB or PCI bridge for stream, and an I2C bridge for
> ancillary drivers. The real device control happens at those I2C ancillary
> drivers, that provide functions for tuning and demodulating the media
> streams. Without the firmware, it is not possible to know what else is
> needed.

That should still work, if the firmware loading request is done asynchronous.

What we do not support anymore, and what might entirely stop to work in the
future is blocking calls to request_firmware() from module_init() or probe().

It's pretty obvious I think, that the init_module() syscall is not allowed
to call-out to a userspace transaction and block until that transaction is
fulfilled, or it has run into the 60 second default timeout. That behavior
is really broken and needs to be fixed.

Comment 4 Kay Sievers 2012-06-02 13:45:27 UTC
In case it sounded wrong, comment#1 should really read:
"Drivers must not *synchronously* load firmware in the module_init() path, or
device probe()/bind() path."

What we need to avoid is the the "modprobe" process get blocked in the kernel,
waiting for a callout to userspace, which might not be fulfilled. That creates
all sorts of problems, hence udev compalins about that now.

Comment 5 Mauro Carvalho Chehab 2012-06-02 14:09:51 UTC
(In reply to comment #4)
> In case it sounded wrong, comment#1 should really read:
> "Drivers must not *synchronously* load firmware in the module_init() path, or
> device probe()/bind() path."
> 
> What we need to avoid is the the "modprobe" process get blocked in the
> kernel,
> waiting for a callout to userspace, which might not be fulfilled. That
> creates
> all sorts of problems, hence udev compalins about that now.

While it is ok to require that module_init() should not require a firmware, device probe()/bind(), on some media devices, require firmware to happen, as otherwise the i2c bus will not be available, and the media drivers for the I2C drivers can't be probed.

A hack to run the probe asynchronously would mean that the probe would return 0 _even_ if the device driver cannot be completely loaded/binded (as the I2C drivers are required for the device to be available will not be loaded without a firmware at the bridge driver).

Comment 6 Antti Palosaari 2012-06-03 13:06:57 UTC
As Mauro said DVB devices are generally required to download firmware at the very first. This is not only for I2C adapter but remote controller too.

I delayed DVB USB framework firmware downloading and bug disappeared. I did that in USB .probe() delaying real .probe() using workqueue and returning success for the ongoing .probe() in every case.

We have now quite many drivers to fix, not only bus interface drivers but also some demod and tuner drivers that loads firmware on attach().

I also strongly suspect that same issue has been behind all the firmware downloading problems we have had during the last years. Especially resume from the suspend has been broken:
http://www.mail-archive.com/linux-media@vger.kernel.org/msg09974.html

I wonder why these udev requirements and changes are never meet relevant persons from the linux-media.

Comment 7 Hin-Tak Leung 2012-06-12 19:39:34 UTC
Just signing up to Cc. This change is probably part of the general boot-up optimization from the systemd related changes.

Comment 8 Mauro Carvalho Chehab 2012-06-19 09:49:22 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > In case it sounded wrong, comment#1 should really read:
> > "Drivers must not *synchronously* load firmware in the module_init() path, or
> > device probe()/bind() path."
> > 
> > What we need to avoid is the the "modprobe" process get blocked in the
> > kernel,
> > waiting for a callout to userspace, which might not be fulfilled. That
> > creates
> > all sorts of problems, hence udev compalins about that now.
> 
> While it is ok to require that module_init() should not require a firmware,
> device probe()/bind(), on some media devices

To make it clearer, this is what Kernel Documentation/driver-model/driver.txt says about the .probe callback:

	int	(*probe)	(struct device * dev);

   The probe() entry is called in task context, with the bus's rwsem locked
   and the driver partially bound to the device.  Drivers commonly use
   container_of() to convert "dev" to a bus-specific type, both in probe()
   and other routines.  That type often provides device resource data, such
   as pci_dev.resource[] or platform_device.resources, which is used in
   addition to dev->platform_data to initialize the driver.

   This callback holds the driver-specific logic to bind the driver to a
   given device.  That includes verifying that the device is present, that
   it's a version the driver can handle, that driver data structures can
   be allocated and initialized, and that any hardware can be initialized.
   Drivers often store a pointer to their state with dev_set_drvdata().
   When the driver has successfully bound itself to that device, then probe()
   returns zero and the driver model code will finish its part of binding
   the driver to that device.

   A driver's probe() may return a negative errno value to indicate that
   the driver did not bind to this device, in which case it should have
   released all resources it allocated.

On DVB devices, it is not possible to bind, verify that the device is present and that it is a version that the driver can handle without initializing the board's I2C and probing the I2C devices on it. For that to happen, the bridge driver's firmware is required.

So, the DVB drivers are doing the right thing. It is an udev bug to block loading a firmware while the driver's probe method is being called.

Comment 9 Kay Sievers 2012-06-19 10:54:19 UTC
(In reply to comment #8)
> So, the DVB drivers are doing the right thing. It is an udev bug to block
> loading a firmware while the driver's probe method is being called.

It's a bug in the driver. Requesting firmware in a blocking manner in the
module_init() path is a serious bug in the driver. Modprobe must return,
not blindly wait for a userspace transaction to be fulfilled, which might
never happen.

Please fix the driver, or close the bug if you refuse to do it.

Udev behaves correctly and will not be changed. The problem this broken
driver behaviour causes, is bigger than udev, and udev is just the messenger
of the issue, not the cause.

Comment 10 Mauro Carvalho Chehab 2012-06-19 11:32:31 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > So, the DVB drivers are doing the right thing. It is an udev bug to block
> > loading a firmware while the driver's probe method is being called.
> 
> It's a bug in the driver. Requesting firmware in a blocking manner in the
> module_init() path is a serious bug in the driver. Modprobe must return,
> not blindly wait for a userspace transaction to be fulfilled, which might
> never happen.

Firmware load at DVB is not at driver's module_init(). It is at driver's .probe(),
and it is _required_ to fulfill the Linux driver model, as specified at the Kernel's documentation, as shown above.

It is _not_possible_ to probe the device without loading the bridge's firmware, as probing the device requires access to the device's internal buses, in order to inspect and load the required drivers. Typically, a DVB device requires several sub-drivers. The simplest driver requires at least 3/4 drivers:
    - The bridge driver (the one that is modprobed);
    - The frontend driver (with the remaining logic to get a digital channel);
    - The tuner driver (with the code to set the PLL to a given frequency - sometimes part of the frontend);
    - The demod driver (with the code to decode a digital channel).

It is not uncommon to also need drivers for IR, for the analog part and some common drivers with a code shared with other drivers.

So, in any case, probing a DVB device will make the driver to go to disk, in order to load the required modules and firmwares. So, even from booting speedup PoV, not loading a firmware won't make a significant speedup, as firmware is just one (typically small) part of the data that is needed to be loaded during a DVB driver modprobe.

If you want to speed up the boot time, just modprobe those drivers in background, but don't assume that firmware can be loaded later, as it can't.

> Please fix the driver, or close the bug if you refuse to do it.

There are no bugs at the 64 media drivers that require firmware at device's probe time. They're doing the right thing. The bug is at udev (and/or modprobe/systemd) level by forbidding a driver to properly implement the .probe() callback.

> Udev behaves correctly and will not be changed. The problem this broken
> driver behaviour causes, is bigger than udev, and udev is just the messenger
> of the issue, not the cause.

It is stupid to refuse to load a firmware during probe(): even assuming that it would be possible to do it, the firmware size is generally small part (typically 16Kb-64Kb) of what is needed to be load from the disk, when comparing with the size of the driver and media core modules that are required for a DVB device to work.

Comment 11 Kay Sievers 2012-06-19 15:30:20 UTC
(In reply to comment #10)
> Firmware load at DVB is not at driver's module_init(). It is at driver's
> .probe(),

Which is usually the same context, unless you delegate that work.

> and it is _required_ to fulfill the Linux driver model, as specified at the
> Kernel's documentation, as shown above.

"fulfill the model"? - that does not make much sense in that context. The
current driver behaviour is still broken.
 
> It is _not_possible_ to probe the device without loading the bridge's
> firmware, as probing the device requires access to the device's internal
> buses, in order to inspect and load the required drivers. Typically, a DVB
> device requires several sub-drivers. The simplest driver requires at least
> 3/4 drivers:
>     - The bridge driver (the one that is modprobed);
>     - The frontend driver (with the remaining logic to get a digital
> channel);
>     - The tuner driver (with the code to set the PLL to a given frequency -
> sometimes part of the frontend);
>     - The demod driver (with the code to decode a digital channel).
> 
> It is not uncommon to also need drivers for IR, for the analog part and some
> common drivers with a code shared with other drivers.

All that doesn't matter; do whatever you need to do, but load the firmware
async, and continue to do the rest of the job when the firmware arrives, and
do not block modprobe.

> So, in any case, probing a DVB device will make the driver to go to disk, in
> order to load the required modules and firmwares. So, even from booting
> speedup PoV, not loading a firmware won't make a significant speedup, as
> firmware is just one (typically small) part of the data that is needed to be
> loaded during a DVB driver modprobe.

That's all entirely irrelevant.

> If you want to speed up the boot time, just modprobe those drivers in
> background, but don't assume that firmware can be loaded later, as it can't.

Nobody talks about boot times here. We just talk about a buggy driver.

> > Please fix the driver, or close the bug if you refuse to do it.
> 
> There are no bugs at the 64 media drivers that require firmware at device's
> probe time. They're doing the right thing. The bug is at udev (and/or
> modprobe/systemd) level by forbidding a driver to properly implement the
> .probe() callback.

Nope. Don't block in the firmware request, there is an async API for that,
and you need to use that, or run your own workqueue in whatever way.

> It is stupid to refuse to load a firmware during probe()

The only thing stupid here is to block in a syscall with a call from inside
the kernel out to userspace, and block until that is finished.

You can argue as much as you want here, the problem will not go away that
way.

Comment 12 Kay Sievers 2012-06-19 15:31:53 UTC
Mauro, please stop assigning YOUR bugs to me. I can not fix YOUR buggy drivers!

Comment 13 Linus Torvalds 2012-10-03 16:35:28 UTC
Created attachment 621017 [details]
Possible fix by doing direct filesystem firmware loading

This is not in any kind of final form, and may be horribly buggy. It also doesn't participate in the nice firmware caching logic. 

So consider this an RFC and a call for testing

Comment 14 Linus Torvalds 2012-10-03 18:12:07 UTC
Created attachment 621061 [details]
Improved version of previous patch

This patch should work even if you haven't enabled user-space FW loading support (CONFIG_FW_LOADER), and fixes up some VFS complaints that Al had about the first version of the patch.

It still doesn't participate in firmware caching etc, and we should possibly consider throwing out the udev case entirely, but this should be sufficient for testing.

Comment 15 Michael Shigorin 2012-10-03 18:55:49 UTC
(In reply to comment #12)
> I can not fix YOUR buggy drivers!
A billion "thank you", Kay, and another one to Cormier.  Move your titanic on.

Comment 16 Antti Palosaari 2012-10-04 09:20:24 UTC
I just tested latest Kernel master and it works.
Last commit:
commit 612a9aab56a93533e76e3ad91642db7033e03b69
Merge: 3a49431 268d283
Author: Linus Torvalds <torvalds>
Date:   Wed Oct 3 23:29:23 2012 -0700

    Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux

Linux localhost.localdomain 3.6.0+ #1 SMP Thu Oct 4 11:56:13 EEST 2012 x86_64 x86_64 x86_64 GNU/Linux

**

I also tested Fedora 17 latest Kernel and it was still broken.

Linux localhost.localdomain 3.5.4-2.fc17.x86_64 #1 SMP Wed Sep 26 21:58:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Comment 17 Josh Boyer 2012-10-04 12:43:07 UTC
(In reply to comment #16)
> I just tested latest Kernel master and it works.
> Last commit:
> commit 612a9aab56a93533e76e3ad91642db7033e03b69
> Merge: 3a49431 268d283
> Author: Linus Torvalds <torvalds>
> Date:   Wed Oct 3 23:29:23 2012 -0700
> 
>     Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
> 
> Linux localhost.localdomain 3.6.0+ #1 SMP Thu Oct 4 11:56:13 EEST 2012
> x86_64 x86_64 x86_64 GNU/Linux
> 
> **

That contains the revised version of the patch Linus attached here (commit abb139e75c2cdbb95).  Thank you very much for testing.


> I also tested Fedora 17 latest Kernel and it was still broken.

That's to be expected at this point.

Comment 18 Ville Skyttä 2012-10-28 10:24:07 UTC
Still expected to be broken with 3.6.3-1.fc17.x86_64?

Comment 19 Josh Boyer 2012-10-29 15:01:33 UTC
(In reply to comment #18)
> Still expected to be broken with 3.6.3-1.fc17.x86_64?

Yes.  Patch is in 3.7-rc1 and hasn't been brought back to stable yet.

Comment 20 Josh Boyer 2013-01-08 16:07:19 UTC
F17 will be rebased to the 3.7 kernel within the next couple of weeks.  A backport of the patches never made it to the 3.6 stable series.  I'll mark this bug as POST for now.

Comment 21 Fedora Update System 2013-01-22 14:29:32 UTC
kernel-3.7.3-101.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/FEDORA-2013-1025/kernel-3.7.3-101.fc17

Comment 22 Fedora Update System 2013-01-24 22:38:16 UTC
kernel-3.7.3-101.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.