Bug 2152281 - cm32181 module error blocking suspend
Summary: cm32181 module error blocking suspend
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 36
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-12-10 07:40 UTC by ivan
Modified: 2023-02-13 08:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-13 08:27:10 UTC
Type: Bug
Embargoed:


Attachments
dmesg / 6.1.0-0.rc8.20221206gitbce9332220bd.59.fc38.x86_64 (85.02 KB, text/plain)
2022-12-10 07:40 UTC, ivan
oops when loading cm32181 after a suspend/resume cycle (92.20 KB, text/plain)
2022-12-10 07:41 UTC, ivan


Links
System ID Private Priority Status Summary Last Updated
Launchpad.net ubuntu/+source/linux/+bug/1988346 0 None None None 2022-12-10 07:40:17 UTC

Description ivan 2022-12-10 07:40:18 UTC
Created attachment 1931536
dmesg / 6.1.0-0.rc8.20221206gitbce9332220bd.59.fc38.x86_64

With recent F36 kernels, the cm32181 module breaks suspend on a Lenovo X1 Carbon Gen 2:

$ systemctl suspend

dmesg:

cm32181 i2c-CPLM3218:00: PM: dpm_run_callback(): acpi_subsys_suspend+0x0/0x60 returns -121
cm32181 i2c-CPLM3218:00: PM: failed to suspend async: error -121
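
The error number in these lines can be extracted and named with a small helper. This is only a sketch: the function names `pm_error_code` and `errno_name` are hypothetical, and the table covers just the two codes that appear in this report, using the Linux errno numbering (121 is EREMOTEIO, "Remote I/O error"; 16 is EBUSY).

```shell
# Hypothetical helpers, not part of any Fedora tool.

# Print the numeric error from "PM: failed to suspend ...: error -N" lines on stdin.
pm_error_code() {
    sed -n 's/.*PM: failed to suspend.*: error -\([0-9][0-9]*\)$/\1/p'
}

# Name the errno values that appear in this report (Linux numbering).
errno_name() {
    case "$1" in
        16)  echo "EBUSY" ;;
        121) echo "EREMOTEIO" ;;
        *)   echo "unknown" ;;
    esac
}
```

For example, `dmesg | pm_error_code` could be run after a failed suspend, and each number it prints passed to `errno_name`.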

This happens on rawhide (6.1.0-0.rc8.20221206gitbce9332220bd.59.fc38.x86_64), 6.0.11-200.fc36.x86_64, 6.0.10 and 6.0.9.

Suspend used to work with older kernels but I can't say at which version it stopped working (it's not my daily laptop).

Stock Fedora install, no third-party modules, ...

As suggested by Hans in https://bugzilla.redhat.com/show_bug.cgi?id=2029207#c65, blacklisting the cm32181 module fixes the issue.
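
One way to set up that blacklist is a `blacklist` line in a file under /etc/modprobe.d. The sketch below assumes a file name of `blacklist-cm32181.conf` and a helper called `write_blacklist`; both are illustrative, not something prescribed in the linked comment. Note that a `blacklist` entry only stops automatic loading; an explicit `modprobe cm32181` still works.

```shell
# Sketch: write a modprobe.d blacklist entry for one module.
# The file name and the function name are illustrative assumptions.
write_blacklist() {
    module="$1"
    dir="${2:-/etc/modprobe.d}"   # pass another directory to try it without root
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}
```

Run as root, `write_blacklist cm32181` followed by a reboot should leave the module unloaded at boot.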

Other findings:

- I found a similar bug report in Ubuntu (linked above), which is in "confirmed" status and is a bit old (by the kernel's standards), yet a fix seems to be mentioned in the comments.

- Hans' initial fix (https://bugzilla.redhat.com/show_bug.cgi?id=2029207#c59) allows the laptop to suspend but it then freezes on resume.

- with the module blacklisted (= not loaded at boot):

  - `modprobe cm32181` triggers a kernel oops if a suspend/resume cycle was done once (or more).

  - `modprobe cm32181` doesn't trigger an oops if the laptop was never suspended. Then:

    - suspend/resume works if cm32181 is rmmod'ed first

    - without rmmod'ing the driver, the laptop suspends successfully (power LED slowly blinking) but can't be resumed (the LED keeps blinking and I have to hold the power button for a few seconds to power off the laptop).

Comment 1 ivan 2022-12-10 07:41:13 UTC
Created attachment 1931537
oops when loading cm32181 after a suspend/resume cycle

Comment 2 Hans de Goede 2022-12-20 12:26:39 UTC
Thank you for filing a separate bug report for this and sorry for being a bit slow to respond.

Thanks for attaching the oops log. The oops is caused by this error:

[  107.422659] i2c i2c-0: Transfer while suspended

And I was actually recently involved in reviewing a kernel patch which fixes this error on some Intel (i2c-designware) I2C controllers.

I have started a test kernel build with both the cm32181 suspend fix, as well as the fix for the "i2c i2c-0: Transfer while suspended" error:

https://koji.fedoraproject.org/koji/taskinfo?taskID=95562586

As usual this is still building atm, it should be finished in a couple of hours.

For install instructions see: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt

Please test the following after installing + booting this kernel:

1. Boot with cm32181 still blacklisted, do a suspend + resume and then modprobe cm32181 manually after the suspend/resume, the oops should be gone now.

2. If cm32181 successfully modprobe-s after the suspend/resume, try another suspend/resume.

3. Try boot + suspend/resume without cm32181 blacklisted

Hopefully all 3 tests will work :)
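
For test 1, "the oops should be gone" can be checked by searching the kernel log for the error line quoted earlier in this comment. A minimal sketch, assuming a hypothetical helper name `no_transfer_while_suspended` that reads a log on stdin:

```shell
# Hypothetical check: succeed if the kernel log read from stdin does not
# contain the "Transfer while suspended" I2C error.
no_transfer_while_suspended() {
    ! grep -q 'Transfer while suspended'
}
```

After the manual `modprobe cm32181`, something like `dmesg | no_transfer_while_suspended && echo clean` would report whether the error is absent. It only looks for that one message, so the full dmesg is still worth reading.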

Comment 3 ivan 2022-12-21 14:31:59 UTC
(In reply to Hans de Goede from comment #2)
> Thank you for filing a separate bug report for this and sorry for being a
> bit slow to respond.

No problem at all !

> Please test the following after installing + booting this kernel:
> 
> 1. Boot with cm32181 still blacklisted, do a suspend + resume and then
> modprobe cm32181 manually after the suspend/resume, the oops should be gone
> now.

The oops is gone; there's no error in dmesg.

> 2. If cm32181 successfully modprobe-s after the suspend/resume, try another
> suspend resume.
> 
> 3. Try boot + suspend resume without cm32181 blacklisted

In both cases the laptop suspends but is unresponsive after a resume (the display shows the last state but the mouse/keyboard/network/... don't work).

Resume works if cm32181 is rmmod'ed before suspending. But then `modprobe cm32181` triggers these errors in dmesg:

[  108.792976] i2c i2c-0: Failed to register i2c client dummy at 0x48 (-16)
[  108.794391] cm32181: probe of i2c-CPLM3218:00 failed with error -16

Comment 4 Hans de Goede 2022-12-21 22:53:19 UTC
Thank you for testing and good to know that the oops is gone.

The 2nd modprobe failing is due to a resource leak in the driver, so it is not related to suspend/resume not working (it still needs fixing though).

It is unfortunate that suspend/resume still does not work.

Starting tomorrow I'm taking time off from work for 2 weeks. I'm afraid that further debugging / fixing of this issue will have to wait till I'm back. I'll likely have a bit of backlog to go through when I return to work, but I will try to get back to debugging this further soon after I'm back.

Merry Christmas and a happy new year!

Comment 5 ivan 2022-12-22 05:21:26 UTC
(In reply to Hans de Goede from comment #4)

> Starting tomorrow I'm taking time off from work for 2 weeks. I'm afraid that
> further debugging / fixing of this issue will have to wait till I'm back.
> I'll likely have a bit of backlog to go through when I return to work, but I
> will try to get back to debugging this further soon after I'm back.

No worries ! There's a workaround for this bug, so nothing is urgent (and I don't use this light sensor). Other, less proficient users might hit this issue though, so until (or if) the bug is fixed it could make sense to temporarily `rmmod` the driver as a pre-suspend tweak for Fedora users.
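
Such a pre-suspend tweak could be sketched as a systemd system-sleep hook. systemd-sleep runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument. The function name `sleep_hook` and the `RUN` dry-run variable below are assumptions made for illustration. Per comment 3, reloading the module after an rmmod currently fails to bind with error -16, so the "post" branch is of limited use until that is fixed.

```shell
# Sketch of a system-sleep hook body. RUN is a hypothetical knob:
# set RUN=echo to print the commands instead of running them.
sleep_hook() {
    case "${1:-}" in
        pre)  ${RUN:-} modprobe -r cm32181 ;;   # unload before suspending
        post) ${RUN:-} modprobe cm32181 ;;      # try to reload after resume
    esac
}

# A real hook file would end with:  sleep_hook "$@"
```

Calling `RUN=echo sleep_hook pre` shows the command that would run without touching the module.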

> Merry Christmas and a happy new year!

Thanks - same to you !

Comment 6 Hans de Goede 2023-01-17 21:23:01 UTC
A patch has been posted upstream which I think might resolve this:
https://lore.kernel.org/linux-iio/20230117160951.282581-1-kai.heng.feng@canonical.com/

I have started a Fedora kernel scratch-build with the patch added:

https://koji.fedoraproject.org/koji/taskinfo?taskID=96266533

As usual this is still building and should be done in a couple of hours.

Note it would be good if you can first just test the latest 6.1.x kernel from the Fedora updates repo with the cm32181 module still blacklisted before testing this one, to rule out any new issues in 6.1.

And then after confirming 6.1 still suspends/resumes with the module blacklisted, give it a try with this test kernel without the module blacklisted.

Comment 7 Hans de Goede 2023-01-18 10:55:52 UTC
Please ignore the scratch build from my last comment, there is a bug in the patch (which I introduced). I'll start a new scratch build with a fixed patch.

Comment 8 Hans de Goede 2023-01-18 11:16:10 UTC
Here is a new scratch-build with the fixed patch (still building, done in a couple of hours):

https://koji.fedoraproject.org/koji/taskinfo?taskID=96289981

Comment 9 ivan 2023-01-19 10:06:44 UTC
Hi Hans,

I won't have access to the laptop until mid next week (I've downloaded the kernel rpms in case they're automatically deleted before then). I'll report back as soon as I have a chance to test your build, sorry about the delay !

Comment 10 Hans de Goede 2023-01-19 10:51:56 UTC
(In reply to ivan from comment #9)
> I won't have access to the laptop until mid next week (I've downloaded the
> kernel rpms in case they're automatically deleted until then). I'll report
> back as soon as I have a chance to test your build, sorry about the delay !

No problem and thank you for your continued help with resolving this.

Comment 11 ivan 2023-01-26 09:03:39 UTC
Suspend/resume works with your test kernel...

No idea if it's related but `rmmod cm32181` then `modprobe cm32181` shows the following in dmesg:

i2c i2c-0: Failed to register i2c client dummy at 0x48 (-16)
cm32181: probe of i2c-CPLM3218:00 failed with error -16

yet the module is loaded (but unused):

$ lsmod | grep cm

cm32181                16384  0
industrialio          106496  1 cm32181

This happens regardless of suspend/resume (eg. tried just after a clean boot)

No problem to test further if needed. Or just call it quits :)

Cheers
Ivan

Comment 12 Hans de Goede 2023-01-26 09:53:47 UTC
(In reply to ivan from comment #11)
> Suspend/resume works with your test kernel...

Great, thank you for testing.

> No idea if it's related but `rmmod cm32181` then `modprobe cm32181` shows
> the following in dmesg:
> 
> i2c i2c-0: Failed to register i2c client dummy at 0x48 (-16)
> cm32181: probe of i2c-CPLM3218:00 failed with error -16

This is a known issue / bug which we actually found while working on the suspend/resume bug. The CPLM3218 device contains 2 I2C resources and we actually want to use the second one (which was also causing the suspend/resume issue).

The driver creates an extra (dummy) i2c-client for the second address and then talks through that i2c-client, and on rmmod it forgets to unregister the dummy i2c-client. So on a second modprobe we get a -16 aka EBUSY error.

The driver loads the second time, but it does not bind to the device, so you don't get ALS functionality after a rmmod + modprobe. Since you don't really need to do a rmmod + modprobe this is not really an issue though.
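
Whether the driver actually bound to the device can be checked in sysfs, where a bound device has a `driver` symlink in its directory. A minimal sketch; the helper name `i2c_device_bound` and the `SYSFS_ROOT` override are assumptions added for illustration, and the device name is taken from the dmesg lines above:

```shell
# Sketch: succeed if the given I2C device currently has a driver bound.
# SYSFS_ROOT is a hypothetical override so the check can be tried on a fake tree.
i2c_device_bound() {
    dev="$1"
    [ -e "${SYSFS_ROOT:-/sys}/bus/i2c/devices/$dev/driver" ]
}
```

For example, `i2c_device_bound 'i2c-CPLM3218:00' || echo "cm32181 not bound"` should print the message after a rmmod + modprobe on an affected kernel.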

I've added fixing this minor issue to my todo list.

As for the fix for the suspend/resume issue, that one should show up in a 6.2-rc# release soon-ish and then get backported to 6.1.y from there.

Comment 13 ivan 2023-01-26 10:45:09 UTC
(In reply to Hans de Goede from comment #12)

> > No idea if it's related but `rmmod cm32181` then `modprobe cm32181` shows
> > the following in dmesg:
> > 
> > i2c i2c-0: Failed to register i2c client dummy at 0x48 (-16)
> > cm32181: probe of i2c-CPLM3218:00 failed with error -16
> 
> This is a known issue / bug which we actually found while working on the
> suspend/resume bug. The CPLM3218 device contains 2 I2C resources and we
> actually want to use the second one (which was also causing the
> suspend/resume issue).

Thanks for the explanation !

> As for the fix for the suspend/resume issue that one should show up on a
> 6.2-rc# release soon-ish and then get backported to 6.1.y from there.

Should I close the bug as "errata" or will you close it once a new kernel is out with the fix ?

Comment 14 Hans de Goede 2023-01-26 16:22:03 UTC
> Should I close the bug as "errata" or will you close it once a new kernel is out with the fix ?

Let's leave this open till a fixed kernel is in the Fedora updates repositories. This way it will be easier for other users, who may have the same bug, to find this issue.

Comment 15 Hans de Goede 2023-02-06 10:14:56 UTC
FYI a patch to fix modprobe not working again after an rmmod has now been posted upstream:

https://lore.kernel.org/linux-iio/20230206063616.981225-1-kai.heng.feng@canonical.com/

Comment 16 Hans de Goede 2023-02-13 08:27:10 UTC
The fix for this has landed in 6.1.11 which is in updates-testing now, closing.
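
To see whether a running kernel is at or past 6.1.11, the release string can be compared with `sort -V`. This is a rough sketch with a hypothetical helper name, `kernel_at_least`. It only compares version numbers, so it cannot tell whether a particular build carries the patch.

```shell
# Hypothetical helper: succeed if kernel release $1 (as printed by `uname -r`)
# is at least upstream version $2. Relies on `sort -V` from GNU coreutils.
kernel_at_least() {
    have="${1%%-*}"   # drop the distribution suffix, e.g. "-200.fc36.x86_64"
    want="$2"
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

Usage would be along the lines of `kernel_at_least "$(uname -r)" 6.1.11 && echo "new enough"`.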

