980221 – Flood of mei_me errors and possibly a system hang after suspend/resume

Summary: Flood of mei_me errors and possibly a system hang after suspend/resume

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-07-01 18:52 UTC by Adam Williamson
Modified:	2013-07-23 15:41 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-07-23 15:41:25 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
everything in journalctl -a from the time I suspended the system through to the reboot (119.74 KB, text/plain) 2013-07-01 18:54 UTC, Adam Williamson

Description Adam Williamson 2013-07-01 18:52:19 UTC

After upgrading my desktop to Rawhide, I am hitting what appears to be something similar to this problem reported to LKML:

https://lkml.org/lkml/2013/6/3/460

My system survives a suspend/resume cycle, but after I resume, I do have a huge pile of these errors in my logs:

Jul 1 11:19:15 adam kernel: [32032.801584] mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
Jul 1 11:19:15 adam kernel: [32032.801589] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING

/var/log/messages shows a huge number of them around 10:20, which I think is when I resumed the system, and then shows them recurring every 30 seconds until 11:19, around which time the system hung. journald either does not log any of the messages after 10:20 or logs them with bogus timestamps: in journalctl -a output I only see the errors timestamped 10:20, then the log jumps straight to the reboot at 11:20. So this flood of errors clearly causes issues for logging (or, I guess, journald may just be correctly rate-limiting the flood of messages). I'm more worried that it's causing my system to hang an hour after I resume it, though.

I have seen my system hang twice since upgrading to Rawhide. The second time is described above. The first time looks very similar in the logs: a bunch of mei_me errors at 10:09, /var/log/messages then shows a pair every 30 seconds for an hour, and a system reboot (which would be me rebooting after the system hung) at 11:11.

Basically, it looks very much like each time I suspend/resume I get a huge flood of these mei_me messages, then a pair every 30 seconds, then the system hangs an hour later.
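
One way to check this pattern against /var/log/messages is to count mei_me kernel messages per minute. The following is a minimal sketch, assuming the traditional syslog line format shown in the excerpt above. The function names (mei_me_per_minute, summarize) and the regular expression are illustrative only and are not part of any existing tool.

```python
import re
from collections import Counter

# Assumed line format (taken from the excerpt in this report):
#   Jul 1 11:19:15 adam kernel: [32032.801584] mei_me 0000:00:16.0: reset: ...
MEI_ME_LINE = re.compile(
    r"^(?P<mon>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hhmm>\d{2}:\d{2}):\d{2}\s+"
    r"\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s+mei_me\s"
)


def mei_me_per_minute(lines):
    """Count mei_me kernel messages per wall-clock minute.

    Keys look like "Jul 1 11:19"; lines that are not mei_me kernel
    messages are ignored.
    """
    counts = Counter()
    for line in lines:
        match = MEI_ME_LINE.match(line)
        if match:
            key = "{} {} {}".format(
                match.group("mon"), int(match.group("day")), match.group("hhmm")
            )
            counts[key] += 1
    return counts


def summarize(path="/var/log/messages"):
    """Return (minute, count) pairs in the order the minutes first appear."""
    with open(path, errors="replace") as log:
        return list(mei_me_per_minute(log).items())
```

If the behaviour is as described, summarize() should show one minute with a very large count around the resume time, followed by roughly four messages per minute (a pair every 30 seconds) until the hang.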

My system is a self-built one based on an Asus P8P67 Deluxe motherboard. I have mailed the upstream maintainer (Tomas Winkler) to let him know I am also hitting this mei_me problem.

Comment 1 Adam Williamson 2013-07-01 18:54:44 UTC

Created attachment 767497
everything in journalctl -a from the time I suspended the system through to the reboot

Comment 2 Adam Williamson 2013-07-23 15:41:25 UTC

This seems to be better since rc1 or so.
