| Summary: | sporadic pcieport log spew | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Jason Tibbitts <j> | ||||||||||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||||||||
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||||
| Severity: | unspecified | Docs Contact: | |||||||||||||
| Priority: | unspecified | ||||||||||||||
| Version: | 25 | CC: | gansalmon, ichavero, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab | ||||||||||||
| Target Milestone: | --- | Flags: | jforbes: needinfo? | ||||||||||||
| Target Release: | --- | ||||||||||||||
| Hardware: | Unspecified | ||||||||||||||
| OS: | Unspecified | ||||||||||||||
| Whiteboard: | |||||||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||||||
| Doc Text: | Story Points: | --- | |||||||||||||
| Clone Of: | Environment: | ||||||||||||||
| Last Closed: | 2017-04-28 17:14:56 UTC | Type: | Bug | ||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||
| Documentation: | --- | CRM: | |||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
| Attachments: | | ||||||||||||||
Created attachment 1210222 [details]
lspci -v output
Created attachment 1210223 [details]
lsmod output
Created attachment 1210267 [details]
continuation of my dmesg
So I am now pretty sure that this is correlated with the partial X lockups I'm seeing. I've attached the continuation of the previous kernel log, up to the point where I rebooted the machine. There are some new lines in there.
As this is a new system, it's really quite possible that something is wrong with the hardware. It appears to run fine (until it dies, of course) so if anything is wrong it's relatively subtle. CPU and RAM have undergone days of testing before I even got to the point of trying to use it as a desktop, and there's no display corruption.
And... it just died before I could actually submit this update. While rebooting, I went into the BIOS and disabled all PCIE ASPM options. So far, no pcieport complaints in the kernel log, and I've turned on the various compositor options which seemed to make things worse. I'll attach dmesg from this boot as well, just in case there's something that might make it obvious what's happened (or what the BIOS might have been doing wrong).
Created attachment 1210268 [details]
dmesg on boot without ASPM enabled in the BIOS.
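To compare the boots with and without ASPM, it may help to count the pcieport AER reports per minute in each saved dmesg. The following is only a rough sketch: it assumes the log lines look like the ones quoted in the original report (a `[ seconds ]` timestamp followed by `pcieport <device>: AER: ... Corrected error received`), and the function name `aer_events_per_minute` is made up for illustration.

```python
import re
from collections import Counter

# Matches only the first line of each AER report, e.g.
# "[ 1553.571562] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018"
# The pattern is an assumption based on the lines quoted in this report.
AER_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+pcieport\s+(\S+):\s+AER: .*Corrected error received"
)


def aer_events_per_minute(dmesg_text):
    """Count AER corrected-error reports per (device, minute since boot)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = AER_LINE.match(line.strip())
        if match:
            seconds = float(match.group(1))
            device = match.group(2)
            counts[(device, int(seconds // 60))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python aer_count.py dmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (device, minute), n in sorted(aer_events_per_minute(handle.read()).items()):
            print(f"{device} minute {minute}: {n} report(s)")
```

Running it over the dmesg from each boot should show whether the reports really stop once ASPM is disabled, or merely become less frequent.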
*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 25 kernel bugs.
Fedora 25 has now been rebased to 4.10.9-200.fc25. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 26, and are still experiencing this issue, please change the version to Fedora 26.
If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 2 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously. |
Created attachment 1210221 [details]
dmesg
I'm running a new system (so I don't really have much in the way of history to say when this might have started) on Fedora 24. I found several sets of errors from pcieport in dmesg, accumulating at the rate of a few a minute. I updated to the 4.8.1 kernel from F25 and the messages don't _seem_ to happen at the same rate, but I do still see some. I had made a number of other changes (like turning off vsync in the desktop compositor and updating to the new mesa in testing) in an attempt to get the system to be stable enough for use, so I'm not entirely sure the kernel was the cause but I can work back from here.
The messages look like this:
[ 1553.571562] pcieport 0000:00:03.0: AER: Multiple Corrected error received: id=0018
[ 1553.571574] pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0018(Transmitter ID)
[ 1553.571581] pcieport 0000:00:03.0: device [8086:6f08] error status/mask=00001100/00002000
[ 1553.571585] pcieport 0000:00:03.0: [ 8] RELAY_NUM Rollover
[ 1553.571588] pcieport 0000:00:03.0: [12] Replay Timer Timeout
I'm not sure what device that might be. lspci shows:
00:03.0 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 (rev 01)
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
I'll attach the full dmesg, lspci -v and lsmod.
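For anyone reading the `error status/mask=00001100/00002000` line: the two hex values are bitmasks, and the `[ 8]` and `[12]` lines that follow are simply the set bits of the status value. A minimal sketch of that decoding is below. The bit names are my reading of the PCIe AER Correctable Error Status register layout and should be checked against the spec; `decode_bits` and `parse_status_line` are illustrative names, not kernel or library functions.

```python
import re

# Assumed bit layout of the AER Correctable Error Status/Mask registers.
# Note the kernel log above spells bit 8 "RELAY_NUM Rollover".
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}

STATUS_MASK = re.compile(r"status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def decode_bits(value_hex):
    """Return [(bit, name)] for every bit set in a 32-bit hex string."""
    value = int(value_hex, 16)
    return [
        (bit, CORRECTABLE_BITS.get(bit, "unknown"))
        for bit in range(32)
        if (value >> bit) & 1
    ]


def parse_status_line(line):
    """Extract and decode status and mask from one dmesg line, or return None."""
    match = STATUS_MASK.search(line)
    if match is None:
        return None
    return {"status": decode_bits(match.group(1)), "mask": decode_bits(match.group(2))}


if __name__ == "__main__":
    line = ("[ 1553.571581] pcieport 0000:00:03.0: device [8086:6f08] "
            "error status/mask=00001100/00002000")
    print(parse_status_line(line))
```

Under these assumptions the status decodes to bits 8 and 12, matching the two lines the kernel printed, while the mask has only bit 13 set, so neither reported error is masked.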