Bug 982398 - Lenovo 3000 N500 cannot boot 3.9 series kernels if acpi-cpufreq loaded
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-07-08 23:01 UTC by "FeRD" (Frank Dana)
Modified: 2014-12-29 22:10 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-10 14:43:47 UTC
Type: Bug
Embargoed:


Attachments
lspci output (2.23 KB, text/plain)
2013-07-08 23:01 UTC, "FeRD" (Frank Dana)
sanitized dmidecode output (6.01 KB, text/plain)
2013-07-08 23:02 UTC, "FeRD" (Frank Dana)
dmesg output from kernel 3.8.11-200.fc18.x86_64 with default command line (59.86 KB, text/plain)
2013-07-08 23:03 UTC, "FeRD" (Frank Dana)
dmesg output from booting Fedora 19 LiveCD with acpi=off added to command line (52.51 KB, text/plain)
2013-07-08 23:03 UTC, "FeRD" (Frank Dana)
dmesg output from kernel 3.9.8-200.fc18.x86_64 with acpi=off added to command line (52.00 KB, text/plain)
2013-07-08 23:04 UTC, "FeRD" (Frank Dana)
dmesg output from kernel 3.9.8-200.fc18.x86_64 with acpi=rsdt added to command line (60.71 KB, text/plain)
2013-07-08 23:05 UTC, "FeRD" (Frank Dana)
acpidump output as requested (154.06 KB, text/plain)
2013-10-20 00:01 UTC, "FeRD" (Frank Dana)
output from aaron's modified acpidump, as requested (152.80 KB, text/plain)
2013-10-24 08:10 UTC, "FeRD" (Frank Dana)
ACPI table capture as requested in comment #27 (1/2) (14.71 KB, application/x-bzip)
2013-10-30 00:16 UTC, "FeRD" (Frank Dana)
ACPI table capture as requested in comment #27 (2/2) (14.67 KB, application/x-bzip)
2013-10-30 00:17 UTC, "FeRD" (Frank Dana)

Description "FeRD" (Frank Dana) 2013-07-08 23:01:26 UTC
Created attachment 770702 [details]
lspci output

Description of problem:
I have a Lenovo laptop (non-Thinkpad, and with Intel graphics only — this is not the Thinkpad Nvidia bug) which cannot boot any kernel newer than 3.8.11 using the default command line. The boot process aborts fairly early, with the machine spontaneously rebooting to the BIOS initialization.
This is a problem which began occurring in Fedora 18 with the 3.9.2-200.fc18.x86_64 kernel, and also affects 3.9.6-200.fc18.x86_64, 3.9.8-200.fc18.x86_64 from updates-testing, and the Fedora 19 release kernel (3.9.5-301.fc19.x86_64) from the Fedora 19 LiveCD / fedup upgrade process.

I ignored the issue initially and stayed with 3.8.11-200.fc18.x86_64, as this is a low-use laptop. But it's come to a head with the Fedora 19 release, as I was unable to boot into the fedup upgrade environment to complete the installation. 

The laptop is stock except for a wireless card upgrade (the original b43-based 802.11g miniPCIe card replaced with an iwlwifi-based Intel 5100 802.11n card), but my first action in troubleshooting this issue was to remove the 5100. The failure occurs even with no wireless card installed, and all logs included were captured without the 5100 installed.

After significant experimentation, I appear to have narrowed it down to an ACPI issue with the entire 3.9.x kernel series.


Version-Release number of selected component (if applicable):
kernel-3.9.5-301.fc19.x86_64

(actually affects all kernel-3.9.2-200.fc18.x86_64 and higher from F18 or F19)


How reproducible:
Always, when using default kernel options


Steps to Reproduce:
1. Install 3.9.x series kernel on Lenovo 3000 N500
2. Attempt to boot with default/standard command line

Actual results:
After a few seconds, end up back at BIOS init screen 

Expected results:
Boot into Fedora environment successfully


Additional info:
After discovering that the problem affected the Fedora 19 LiveCD boot (indicating that it was not an issue with my installation), I began experimenting with kernel command lines. I discovered the following, using either the LiveCD 3.9.5 kernel or the 3.9.8 kernel from Fedora 18 updates-testing:

Kernel command line change (result)
-----------------------------------
<none> (failure)
noapic (failure)
acpi=off (SUCCESS)
acpi=noirq (failure)
acpi=strict (failure)
acpi=rsdt (SUCCESS)

So, the problem may be the ACPI XSDT in 3.9.x. However, the included dmesg output from 3.8.11-200 shows that it boots successfully using the XSDT.

Unfortunately, I have no logs from the failed boots, but will include lspci and dmesg output for the laptop, as well as dmesg output from a selection of successful boots.
* lspci output
* sanitized dmidecode output
* dmesg output for normal 3.8.11-200.fc18 boot (no command-line adjustments)
* dmesg output for Fedora 19 LiveCD booted with acpi=off
* dmesg output for 3.9.8-200.fc18 (updates-testing) booted with acpi=off
* dmesg output for 3.9.8-200.fc18 (updates-testing) booted with acpi=rsdt

Comment 1 "FeRD" (Frank Dana) 2013-07-08 23:02:01 UTC
Created attachment 770703 [details]
sanitized dmidecode output

Comment 2 "FeRD" (Frank Dana) 2013-07-08 23:03:06 UTC
Created attachment 770704 [details]
dmesg output from kernel 3.8.11-200.fc18.x86_64 with default command line

Comment 3 "FeRD" (Frank Dana) 2013-07-08 23:03:46 UTC
Created attachment 770705 [details]
dmesg output from booting Fedora 19 LiveCD with acpi=off added to command line

Comment 4 "FeRD" (Frank Dana) 2013-07-08 23:04:41 UTC
Created attachment 770706 [details]
dmesg output from kernel 3.9.8-200.fc18.x86_64 with acpi=off added to command line

Comment 5 "FeRD" (Frank Dana) 2013-07-08 23:05:06 UTC
Created attachment 770707 [details]
dmesg output from kernel 3.9.8-200.fc18.x86_64 with acpi=rsdt added to command line

Comment 6 Tadas Slotkus 2013-07-14 16:44:21 UTC
I have a Lenovo 3000 N500 too, possibly with a different hardware configuration, and I have had to use acpi=rsdt to boot successfully since kernel 3.7.6-102.fc17.x86_64, or maybe an earlier version. To boot Fedora 19 I also have to add nomodeset, to keep nouveau from crashing.

Comment 7 Tadas Slotkus 2013-07-30 14:10:58 UTC
This issue affects only the 64-bit OS: I successfully booted and installed the 32-bit version of Fedora 19 without any acpi=... kernel argument.

Comment 8 "FeRD" (Frank Dana) 2013-07-30 17:17:04 UTC
Having downloaded and burned the Fedora 19 i686 LiveCD, I can confirm the previous comment's claim.

I am able to boot fully into the live environment under its kernel-3.9.5-301.fc19.i686, using no special command-line options. It is apparently only x86_64 kernels in the 3.9.x and later series that cause spontaneous boot-time reboots, unless the command line includes acpi=off or acpi=rsdt.

Comment 9 Aaron Lu 2013-09-23 08:45:49 UTC
Can someone please test an upstream 3.11.x kernel? If there is still a problem on x86_64, then a git bisect is needed to find the offending commit.

Comment 10 "FeRD" (Frank Dana) 2013-09-23 18:17:38 UTC
Not sure I'm clear exactly what you mean by "upstream" kernel 3.11.x, Aaron. I just installed the current kernel-3.11.1-200.fc19.x86_64 package from F19 updates repo and tested; behavior is exactly the same. Attempting to boot with the default kernel command line (no acpi options) results in a near-instantaneous reboot, whereas I am able to boot successfully with "acpi=rsdt".

Does that give you the information you need, or is there another kernel you'd like me to try? I'm happy to experiment, the laptop in question is low-usage for me.

Comment 11 Aaron Lu 2013-09-24 01:40:46 UTC
"Upstream kernel" means the kernel from kernel.org without any distro patches, but it should also be OK to test Fedora's kernel, as there should be very few additional patches applied (I didn't check, though).

So the problem starts to occur with the 3.9 series. Can you please do a git bisect from git tag v3.8 (which should be OK) to tag v3.9 (which is problematic) to find the offending commit? Thanks.

Comment 12 "FeRD" (Frank Dana) 2013-09-25 18:25:17 UTC
Well, this has become more... "interesting".

I cloned the kernel tree and checked out v3.8.11 (which worked for me, from Fedora's rpm builds), only to find that it did NOT successfully boot without "acpi=rsdt". Nor did v3.8. Then, taking comment #6 as a guideline, I tried v3.7.5. No love, either.

Long story short, I ended up having to downgrade all the way to v3.6 to get a vanilla kernel built that would successfully boot without messing with "acpi=" on its command line. I KNOW kernels later than that worked for me when installed from Fedora RPMs, so maybe they were carrying some patch for a while that then got dropped for the 3.9.x series, or something... who knows?

Regardless, the point is to identify what change in the kernel itself is causing this problem. So, I'm in the process now of bisecting from v3.6 to v3.7.5. Will report back after 12 or 13 more build-install-reboot cycles.
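The "12 or 13 more cycles" estimate follows from how bisection works: each round halves the range of candidate commits, so roughly ceil(log2(N)) rounds are needed for N commits. A minimal sketch of that arithmetic, with the manual workflow shown in comments; the function name bisect_steps is hypothetical, and git's own "roughly N steps" estimate may differ slightly:

```shell
#!/bin/sh
# Estimate how many build-install-reboot cycles a bisect needs.
# Bisection halves the candidate range each round, so the count is
# roughly ceil(log2(N)) for N commits between the good and bad points.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Manual workflow (each round needs a reboot, so it cannot be automated
# with "git bisect run" on the machine under test):
#   git rev-list --count v3.6..v3.7.5   # N, the number of candidate commits
#   git bisect start v3.7.5 v3.6        # bad revision first, then good
#   ... build, install, reboot without any acpi= option ...
#   git bisect good                     # if the kernel booted
#   git bisect bad                      # if the machine reset to the BIOS screen
#   git bisect reset                    # when finished
```

For example, `bisect_steps 8000` prints 13, since 2^13 = 8192 is the first power of two that covers 8000 commits.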

Comment 13 "FeRD" (Frank Dana) 2013-10-03 11:03:28 UTC
The bisect is complete, sorry for the delay — the laptop isn't a very fast kernel compilation machine, and I ended up running out of disk space twice.

I'm not sure how conclusive/helpful the results will prove to be. I was able to obtain a first-bad-commit. (It was just a few commits before v3.7.5; I could've saved myself a lot of time by not starting way back at v3.6.) Unfortunately, it's a change in kernel/module.c during a time when that code was greatly in flux, so reverting it on current (v3.11.1) kernel code proved impossible. I did hand-revert it in v3.7.5, but that kernel still failed to boot with the change reverted.

Coupled with the fact that I had to go back so much further than I recall to find a working kernel, and looking at the actual change that the bisect produced as "the culprit", I'm more than a little worried that the whole process was sabotaged by the fact that I was testing under a systemd that's much newer (F19-updates) than the one present (F18) when the problems first started. But, I'll leave that investigation to those who know the kernel code well, as I'm way out of my depth there.

Anyway, the resultant first bad commit was the following:


commit f9586ddba7b93ea72190f79c82079cbfc4bb8730
Author: Rusty Russell <rusty.au>
Date:   Sat Jan 12 13:27:34 2013 +1030

    module: put modules in list much earlier.
    
    commit 1fb9341ac34825aa40354e74d9a2c69df7d2c304 upstream.
    
    Prarit's excellent bug report:
    > In recent Fedora releases (F17 & F18) some users have reported seeing
    > messages similar to
    >
    > [   15.478160] kvm: Could not allocate 304 bytes percpu data
    > [   15.478174] PERCPU: allocation failed, size=304 align=32, alloc from
    > reserved chunk failed
    >
    > during system boot.  In some cases, users have also reported seeing this
    > message along with a failed load of other modules.
    >
    > What is happening is systemd is loading an instance of the kvm module for
    > each cpu found (see commit e9bda3b).  When the module load occurs the kernel
    > currently allocates the modules percpu data area prior to checking to see
    > if the module is already loaded or is in the process of being loaded.  If
    > the module is already loaded, or finishes load, the module loading code
    > releases the current instance's module's percpu data.
    
    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.
    
    Reported-by: Prarit Bhargava <prarit>
    Signed-off-by: Rusty Russell <rusty.au>
    Tested-by: Prarit Bhargava <prarit>
    Signed-off-by: Greg Kroah-Hartman <gregkh>

Comment 14 Aaron Lu 2013-10-08 07:07:21 UTC
Sorry for the late reply, just got back from vacation. It seems that you are using a stable tree, can you please use Linus' git tree instead?
http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

The corresponding commit for your bisected one is:
commit 1fb9341ac34825aa40354e74d9a2c69df7d2c304
Author: Rusty Russell <rusty.au>
Date:   Sat Jan 12 13:27:34 2013 +1030

    module: put modules in list much earlier.

Since your testing of upstream kernels found that only v3.6 works, the above commit shouldn't matter: it landed in v3.8, and thus shouldn't be the culprit.

Comment 15 "FeRD" (Frank Dana) 2013-10-08 17:28:15 UTC
Thanks, Aaron. I've checked out Linus' tree and confirmed that v3.11 does not boot without acpi=rsdt, while v3.7 boots successfully with no acpi= command-line flags. I'll begin bisecting between those two points.

Comment 16 Aaron Lu 2013-10-09 00:42:36 UTC
Hi Frank,

Bisecting from v3.7 to v3.11 would cover too many commits. I think you can first check which kernel version starts to break and then bisect between the two adjacent kernel versions.

Comment 17 Aaron Lu 2013-10-11 01:14:16 UTC
Please also attach acpidump:
# acpidump > acpidump.txt

thanks.

Comment 18 Aaron Lu 2013-10-11 01:20:10 UTC
And did you use the same config file as Fedora's 3.8.11-200.fc18 (which works OK) for your 3.8.11 kernel build (which is problematic)?

Comment 19 "FeRD" (Frank Dana) 2013-10-19 23:59:58 UTC
Apologies for the delay in getting back to you on this, last week didn't provide me with any time to work on this issue. (Apologies, also, for the length of what follows; I tried to be brief, and failed utterly.)

acpidump output will be attached following this comment. 

Your questions about the config I used are astute, as I suspect that's one area where my methodology may have failed me. To answer your specific query: No, I didn't use the Fedora 3.8.11 config for any builds; more on that later, but generally I used configs from later (problematic) Fedora builds.

I didn't use the Fedora 18 3.8.11-200 config for the same reason I haven't been able to re-test with that kernel: Since the laptop's upgrade to Fedora 19, I no longer have access to it. (I'm sure it's somewhere on the Koji buildservers, if I really went digging.) I'd initially been disinclined to attempt shoe-horning an F18 kernel into an F19 system, a concern which in retrospect seems foolish, given the number and range of kernel revs I've since experimented with. Hindsight...


My second bisect attempt has produced more favorable results, flagging a change that, when reverted from v3.9, produces a kernel that boots with no special acpi arguments. The same change reverted from v3.11 was NOT effective, so it appears there is more to the problem. However, this is perhaps a sign of progress. Moreover, this time I at least appear to be in the right neighborhood, as there are other neighboring commits which specifically deal with ACPI, and in 64-bit systems.


This commit was flagged as first-bad after bisecting Linus' tree:

commit efa17194581bdfca0986dabc178908bd7c21ba00 (refs/bisect/bad)
Author: Matthew Garrett <matthew.garrett>
Date:   2013-01-22 22:33:46 +0100

    cpufreq: Add module aliases for acpi-cpufreq
    
    The acpi core will call request_module("acpi-cpufreq") on subsystem init,
    but this will fail if the module isn't available at that stage of boot.
    Add some module aliases to ensure that udev can load the module on Intel
    and AMD systems with the appropriate feature bits - I /think/ that this
    will also work on VIA systems, but haven't verified that.
    
    References: http://lkml.kernel.org/r/1448223.sdUJnNSRz4@vostro.rjw.lan
    Signed-off-by: Matthew Garrett <matthew.garrett>
    Tested-by: Leonid Isaev <lisaev.edu>
    Acked-by: Borislav Petkov <bp>
    Cc: 3.7+ <stable.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki>


As I said, reversing this change on a checked-out v3.9 produced a kernel which booted without requiring acpi=rsdt, an improvement over previous results. I have a branch off of v3.9 containing that one revert, and will continue to investigate (read: flail around), to the extent I'm able to wrap my brain around git. Hopefully "walking" that change forward towards v3.11 will determine where it stops being effective. (Beyond my lack of familiarity with the requisite git gymnastics, my kernel-config methodology is still questionable — see below.)

I noted several more ACPI and/or cpufreq changes almost immediately prior to efa17194 which could also be involved in the larger problem (and the v3.11 failure), in particular these commits:

commit 9855d8c ACPI: Check MSR valid bit before using P-state frequencies
commit 78e8eb8 cpufreq: cpufreq-cpu0: use RCU locks around usage of OPP
commit f44d188 cpufreq: OMAP: use RCU locks around usage of OPP
commit 2521686 ACPI, APEI: Fixup incorrect 64-bit access width firmware bug
commit f427e5f ACPI / processor: Get power info before updating the C-states

2521686, in particular, is a 64-bit only change which could explain why all 32-bit kernels boot on the laptop without issue.

This is now heavily into the realm of conjecture, though, and any input/guidance is more than welcome.


A last note on kernel configs, and how I probably made a mess of things there:

During the first bisect round I was somewhat less than methodical regarding the kernel configs: frequently I copied /boot/config-3.9.9-200.fc19.x86_64 (the earliest one I had available at the time) into the build directory as .config and then ran `make ARCH=x86_64 oldnoconfig`, which is how the kernel RPM spec generates its final config before building.

This second time, I started with /boot/config-3.11.1-200.fc19.x86_64 (as I was building the v3.11 source), and on each iteration of the bisect configured the build tree with `make ARCH=x86_64 olddefconfig` with the previous iteration's .config left in place. As I was mostly working backwards through the git tree this seemed to go fairly smoothly, but perhaps was unwise?

The working v3.9+fix config was olddefconfig'd from the failed v3.11+fix config, which had been olddefconfig'd from /boot/config-3.11.1-200.fc19.x86_64
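One way to check whether carrying a .config forward affected a given build is to look at how each config sets the option that builds acpi-cpufreq. The sketch below assumes CONFIG_X86_ACPI_CPUFREQ is the relevant Kconfig symbol in the tree being built (worth verifying for each revision), and the helper name config_state is hypothetical:

```shell
#!/bin/sh
# Print how a kernel config file sets one option: its value
# (typically "y" or "m"), or "unset" if the option is not assigned.
config_state() {
    file=$1
    opt=$2
    val=$(sed -n "s/^${opt}=\(.*\)$/\1/p" "$file" | head -n 1)
    if [ -n "$val" ]; then
        echo "$val"
    else
        echo "unset"
    fi
}

# Usage, comparing the distro config with the one actually built:
#   config_state /boot/config-3.11.1-200.fc19.x86_64 CONFIG_X86_ACPI_CPUFREQ
#   config_state .config CONFIG_X86_ACPI_CPUFREQ
```

If the two results differ between a "good" and a "bad" bisect step, the config rather than the commit may explain the change in behavior.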

Comment 20 "FeRD" (Frank Dana) 2013-10-20 00:01:21 UTC
Created attachment 814126 [details]
acpidump output as requested

Comment 21 Aaron Lu 2013-10-24 06:26:37 UTC
Hi Frank,

Sorry for the delay.

Looking at the acpidump, the FADT's address differs between the XSDT and the RSDT:
XSDT has FADT's address as:
[024h 0036   8]       ACPI Table Address   0 : 00000000BDBE8000
RSDT has FADT's address as:
[024h 0036   4]       ACPI Table Address   0 : BDBE4000

So I suspect some values in the FADT might differ between the two locations. The previous acpidump contains the FADT at address BDBE8000; I've prepared a new acpidump for you to use here:
https://github.com/aaronlu/linux/tree/ae51ccd74f50eed91a0178298ee887522d16b0c5/tools/power/acpi
Get acpidump.c and the Makefile, build the new acpidump, and use it to get another dump, which should contain the FADT at BDBE4000. Attach the output acpidump.txt here. Thanks.

Comment 22 "FeRD" (Frank Dana) 2013-10-24 08:10:38 UTC
Created attachment 815663 [details]
output from aaron's modified acpidump, as requested

Thanks, Aaron. Here's the output from that modified acpidump, as requested. I do see a few bits difference between the tables at those two locations, though perhaps that's mostly just labeling?

Comment 23 Aaron Lu 2013-10-24 09:05:40 UTC
I saw only the 'PM Profile' changed, but it shouldn't matter. I'll continue the investigation.

Comment 24 Aaron Lu 2013-10-25 02:50:05 UTC
I think the following commit:

commit c655affbd524d0105978ecd696c3bb8a281b418b
Author: Rafael J. Wysocki <rafael.j.wysocki>
Date:   Fri Jun 7 13:13:31 2013 +0200

    ACPI / cpufreq: Add ACPI processor device IDs to acpi-cpufreq

will also have to be reverted to make your system boot a v3.11 kernel, so that the acpi-cpufreq module will not be loaded on your system. But the problem is not with these commits; they only serve to match the ACPI CPU frequency driver to the hardware. The problem, it seems to me, is that as long as acpi-cpufreq is running, the system will hang/reboot. There is something wrong with cpufreq handling.

Comment 25 "FeRD" (Frank Dana) 2013-10-25 10:20:29 UTC
Hmmm. Well, if that's the case, I should be able to boot even a stock Fedora kernel build just by blacklisting the acpi-cpufreq module... correct? I'll give that a try in my next round of testing, as it'd be great to have confirmation that the trigger is somewhere in the module code.

Just as a reminder, in case it helps frame the problem: this problem occurs only with 64-bit kernel builds. There doesn't seem to be any issue booting 32-bit kernels. (At least up through the latest I've tested, which would be 3.9.5-301.fc19.i686 from the Fedora 19 32-bit LiveCD.)

Comment 26 "FeRD" (Frank Dana) 2013-10-27 09:40:02 UTC
Hunch correct! It turns out I can boot seemingly any 64-bit kernel, even the latest 3.11.6-200.fc19.x86_64 from Fedora updates, without needing special command line options or really any trickery, beyond creating a file in /etc/modprobe.d that contains:

blacklist acpi-cpufreq

So, it appears that the bug lives somewhere in the module itself, and it is when the (64-bit) kernel attempts to load it that the spontaneous reboot is triggered.
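The workaround above can be sketched as a small helper. The function name blacklist_module and the file name blacklist-<module>.conf are arbitrary choices for illustration; the only content that matters is the single "blacklist" line:

```shell
#!/bin/sh
# Write a modprobe blacklist entry for one module.
# The second argument (target directory) defaults to /etc/modprobe.d;
# the file name "blacklist-<module>.conf" is an arbitrary choice.
blacklist_module() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}

# Usage (as root):
#   blacklist_module acpi-cpufreq
```

A blacklist entry only stops automatic loading by alias; an explicit `modprobe acpi-cpufreq` still loads the module. The same effect can be had for a single boot with modprobe.blacklist=acpi_cpufreq on the kernel command line.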

Comment 27 Aaron Lu 2013-10-29 07:42:27 UTC
If you boot with acpi=rsdt and do not blacklist acpi-cpufreq, the system would still run normally, right?

If so, please attach the following data:
1. Boot without any acpi kernel command-line options and with acpi-cpufreq blacklisted (or you will not be able to boot), then copy all ACPI tables with:
# cp -r /sys/firmware/acpi/tables tables-xsdt
# tar cvf tables-xsdt.tar tables-xsdt
# bzip2 tables-xsdt.tar
Attach tables-xsdt.tar.bz2
2. Boot with acpi=rsdt and do not blacklist acpi-cpufreq. After boot, confirm with /sbin/lsmod that acpi-cpufreq is loaded, to make sure acpi-cpufreq really works this time. Then copy all ACPI tables again as in step 1, naming the result file tables-rsdt.tar.bz2.

It is possible that some of the dynamic tables which provide cpufreq functionality differ depending on whether the RSDT or the XSDT is used.

Thanks.

Comment 28 "FeRD" (Frank Dana) 2013-10-30 00:14:09 UTC
(In reply to Aaron Lu from comment #27)
> If you boot with acpi=rsdt and do not blacklist acpi-cpufreq, the system
> would still run normally, right?

Correct. Either successful-boot method (acpi=rsdt on the kernel command line, or acpi-cpufreq module blacklisting) results in a running system that appears to be fully functional and operating normally. I can't necessarily say what the state of CPU power management is, especially in blacklist case (where there's no visibility without the module loaded) but ostensibly the laptop is perfectly usable.

In response to your request, I performed the following steps, starting from a running system booted on 3.11.6-200.fc19.x86_64 with acpi-cpufreq blacklisted (and therefore with ACPI using the XSDT table):

1. Captured tables-xsdt.tar.bz2 as instructed
2. Commented out the line blacklisting acpi-cpufreq, in /etc/modprobe.d
3. Ran a `sudo modprobe acpi-cpufreq` — Bang! Instant spontaneous reboot
(This is what I was almost "hoping" would happen. If the reboot occurs even in a fully booted system with several hours of uptime, that's absolute confirmation that the kernel simply attempting to load the acpi-cpufreq module is somehow the trigger.)
4. Added "acpi=rsdt" to the kernel command line when booting back into 3.11.6-200.fc19.x86_64
5. Confirmed that acpi_cpufreq appears in lsmod output[*]
6. Captured tables-rsdt.tar.bz2 as instructed

[*] — In fact, when booted the second time (acpi=rsdt, no blacklist) I even grabbed this:

% sudo cpupower frequency-info
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1000 MHz - 2.17 GHz
  available frequency steps: 2.17 GHz, 1.67 GHz, 1.33 GHz, 1000 MHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 1000 MHz and 2.17 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1000 MHz (asserted by call to hardware).
  boost state support:
    Supported: no
    Active: no

...So, the acpi=rsdt method may be the better workaround for however long this remains an issue, since it lets the OS access normal PM functionality. Booting with the module blacklisted, the same command results in "no or unknown cpufreq driver is active on this CPU", and no runtime control or monitoring is possible.


> It is possible some of the dynamic table which provides cpufreq
> functionality is different when rsdt and xsdt is used.

I'll attach both captures immediately following this comment.

Unfortunately, I can tell you right now that diffing the two trees shows absolutely no differences beyond the one we already know about, the FACP table.
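For anyone repeating the comparison, a byte-for-byte check of the two unpacked captures can be sketched as below. It assumes the directory names tables-xsdt and tables-rsdt from comment #27; the function name changed_tables is hypothetical:

```shell
#!/bin/sh
# List ACPI table files that differ between two captured trees.
# The tables are binary, so files are compared byte-for-byte with cmp.
changed_tables() {
    a=$1
    b=$2
    # Files present in the first tree: report missing or differing ones.
    (cd "$a" && find . -type f | sort) | while IFS= read -r f; do
        if [ ! -f "$b/$f" ]; then
            echo "only in $a: $f"
        elif ! cmp -s "$a/$f" "$b/$f"; then
            echo "differs: $f"
        fi
    done
    # Files present only in the second tree.
    (cd "$b" && find . -type f | sort) | while IFS= read -r f; do
        [ -f "$a/$f" ] || echo "only in $b: $f"
    done
}

# Usage:
#   changed_tables tables-xsdt tables-rsdt
```

A table reported as differing can then be disassembled and inspected field by field.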

I'm stumped... Would it perhaps be possible, now that we know "exactly" where the problem occurs, for me to capture a kernel trace during `modprobe acpi-cpufreq`, see what's going on in the cycles leading up to whatever triggers the reboot? A major hurdle (other than the fact that I'm clueless about this and would likely need hand-holding) would obviously be obtaining the capture output, before it goes *poof* as the machine resets. But, heck, I have two other Fedora boxes on the local LAN — maybe some sort of trace which outputs over the network for remote capture? Is such a thing even possible?

Comment 29 "FeRD" (Frank Dana) 2013-10-30 00:16:31 UTC
Created attachment 817202 [details]
ACPI table capture as requested in comment #27 (1/2)

Comment 30 "FeRD" (Frank Dana) 2013-10-30 00:17:16 UTC
Created attachment 817203 [details]
ACPI table capture as requested in comment #27 (2/2)

Comment 31 Aaron Lu 2013-10-30 07:26:32 UTC
There is netconsole you can try.
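A minimal sketch of a netconsole setup follows. The parameter format is the one described in the kernel's netconsole documentation; the helper name netconsole_param, the IP addresses, interface name, MAC address and port are all placeholders, and there is no guarantee that anything is sent before an instantaneous reset:

```shell
#!/bin/sh
# Build the value for the netconsole= module/kernel parameter.
# Format: <src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
# The same port (default 6666) is used on both ends for simplicity.
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=$4
    port=${5:-6666}
    printf '%s@%s/%s,%s@%s/%s\n' \
        "$port" "$src_ip" "$dev" "$port" "$tgt_ip" "$tgt_mac"
}

# On the laptop (all addresses below are placeholders):
#   p=$(netconsole_param 192.168.1.50 eth0 192.168.1.10 00:11:22:33:44:55)
#   sudo modprobe netconsole "netconsole=$p"
#   sudo dmesg -n 8            # raise the console loglevel so everything is sent
#   sudo modprobe acpi-cpufreq
# On the receiving machine, listen for UDP on the same port, e.g.:
#   nc -u -l 6666              # option syntax varies between netcat variants
```

netconsole needs a wired interface whose driver supports it; the messages arrive as plain UDP text on the receiving machine.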

Comment 32 Justin M. Forbes 2014-01-03 22:10:41 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 33 Justin M. Forbes 2014-03-10 14:43:47 UTC
*********** MASS BUG UPDATE **************

This bug has been in a needinfo state for more than 1 month and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 19, please feel free to reopen the bug and provide the additional information requested.

Comment 34 Len Brown 2014-03-10 20:31:25 UTC
If I had access to this machine,
here is how I'd debug this issue:

boot with "panic=30" to make sure the system doesn't
reboot within 30 seconds of a panic.

have a digital camera ready

from comment #28
2. Commented out the line blacklisting acpi-cpufreq, in /etc/modprobe.d

CTRL-ALT-F1
to get into VGA mode on tty1

3. Ran a `sudo modprobe acpi-cpufreq` — Bang! Instant spontaneous reboot

Then take a photo of the display and attach it to this bug report.

---
If I can't get a photo, I'd modify the driver and put early returns
in its initialization sequence to "sneak up" on where it crashes.

Re: acpi=rsdt and 32-bit mode: it is unclear why those work;
that needs to be explained.

Comment 35 Tadas Slotkus 2014-12-28 22:16:32 UTC
Tried many times with Debian 8 live:

uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64 GNU/Linux

Kernel command line: modprobe.blacklist=acpi_cpufreq panic=30 boot=live live-media-path=live64 components initrd=/live64/initrd.img BOOT_IMAGE=/live64/vmlinuz

Regardless of panic=30, reboot happens instantly without any panic messages/30s delay.

With the same command line it sometimes boots and sometimes does not. Running modprobe acpi_cpufreq usually doesn't trigger a reboot; it did so only once.


Tried even blacklisting nouveau, but no luck.


With the default kernel command line plus acpi=rsdt, it always boots successfully.

Comment 36 Tadas Slotkus 2014-12-29 22:10:54 UTC
Got a bit further with 64-bit kernels. The default kernel command line plus "memmap=99G$0x100000000" is a dirty workaround for the boot and resume-from-suspend problems. Only 3GB of RAM is visible, though. I only checked with the nouveau driver.
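For reference, memmap=nn$ss marks nn bytes starting at address ss as reserved, so this option tells the kernel not to use any RAM from 0x100000000 upward. A small sketch of the arithmetic, with a hypothetical helper name, shows where that boundary lies:

```shell
#!/bin/sh
# Convert a byte address (decimal or 0x-prefixed hex) to whole GiB.
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

# Usage:
#   bytes_to_gib 0x100000000
```

The start address works out to 4 GiB, which would be consistent with only the RAM mapped below that boundary remaining visible. Some bootloaders may need the '$' escaped when the option is written into their configuration files.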

These bugs might be related:
sleep and wake up does not work at Lenovo 3000 N500, model 4233-5MG https://bugzilla.kernel.org/show_bug.cgi?id=69331

Memory corruption on Lenovo t440p with runpm https://bugs.freedesktop.org/show_bug.cgi?id=78530

