Bug 1254299 - f23 workstation x86_64 Alpha-2 live locks up
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Assigned To: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-08-17 12:21 EDT by satellitgo
Modified: 2015-09-10 15:35 EDT (History)
Doc Type: Bug Fix
Last Closed: 2015-09-10 15:35:34 EDT
Type: Bug

Attachments
journalctl with standard (non-working) boot (234.24 KB, text/x-vhdl)
2015-08-24 13:30 EDT, Jeremy Rimpo
journalctl with nomodeset (working) boot (818.73 KB, text/x-vhdl)
2015-08-24 13:36 EDT, Jeremy Rimpo
Crash.log wks fereeze f23 (313.50 KB, text/plain)
2015-08-24 21:20 EDT, satellitgo
new 'drm.debug=14" freeze in f23 workstation with intel graphics (643.22 KB, text/plain)
2015-08-25 11:27 EDT, satellitgo

Description satellitgo 2015-08-17 12:21:25 EDT
Description of problem:
F23 Workstation x86_64 Alpha-2 live locks up with Intel graphics.

Version-Release number of selected component (if applicable):
Intel NUC i3
Graphics Card: Haswell ULT Integrated Controller
Processor: Intel Core i3-4010U CPU @ 1.70GHz x2
7.7 GiB Memory
953.0 GiB Hard Drive

How reproducible:
Install F23 Workstation (both BIOS boot and EFI) to HD.
After several minutes the system locks up and stops responding to the keyboard and the M$ wheel mouse.
Powering off is the only way out.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 satellitgo 2015-08-17 12:23:46 EDT
This lockup does not occur with the other F23 Alpha-2 live DE installs (Cinnamon, LXDE).
Comment 2 Fedora Blocker Bugs Application 2015-08-17 12:28:12 EDT
Proposed as a Freeze Exception for 23-beta by Fedora user satellit using the blocker tracking app because:

 Workstation is a blocking desktop and should function without locking up.
Comment 3 Josh Boyer 2015-08-17 16:45:51 EDT
It's probably a bit early to be throwing blocker status at this.  We have a single report on a single machine with zero debug data.  We have no idea how prevalent this really is.

Your report is confusing.  You say the live locks up, but the further data indicates that you installed it to the disk.  So does it lock up when booted in live mode, or does it only lock up when you boot an install from the disk?

After several minutes of what kind of activity does the machine lock up?

When you say EFI and bios, does that mean you changed the boot mode in firmware between boots?

If you boot the debug kernel, is there a backtrace produced?

Did Alpha-1 do this?
Comment 4 satellitgo 2015-08-17 17:15:54 EDT
(In reply to Josh Boyer from comment #3)
> It's probably a bit early to be throwing blocker status at this.  We have a
> single report on a single machine with zero debug data.  We have no idea how
> prevalent this really is.
> 
> Your report is confusing.  You say the live locks up, but the further data
> indicates that you installed it to the disk.  So does it lock up when booted
> in live mode, or does it only lock up when you boot an install from the disk?
> 
> After several minutes of what kind of activity does the machine lock up?
> 
> When you say EFI and bios, does that mean you changed the boot mode in
> firmware between boots?

I did two installs, one with EFI boot and one with BIOS boot, to two different hard disks.
I saw this on both installs.
> 
> If you boot the debug kernel, is there a backtrace produced?
> 
> Did Alpha-1 do this?

I will try what adamw suggested this morning (journalctl -b -l) and see what it shows.
Comment 5 Josh Boyer 2015-08-17 17:30:28 EDT
(In reply to satellitgo from comment #4)
> (In reply to Josh Boyer from comment #3)
> > It's probably a bit early to be throwing blocker status at this.  We have a
> > single report on a single machine with zero debug data.  We have no idea how
> > prevalent this really is.
> > 
> > Your report is confusing.  You say the live locks up, but the further data
> > indicates that you installed it to the disk.  So does it lock up when booted
> > in live mode, or does it only lock up when you boot an install from the disk?
> > 
> > After several minutes of what kind of activity does the machine lock up?
> > 
> > When you say EFI and bios, does that mean you changed the boot mode in
> > firmware between boots?
> 
> I did 2 installs using EFI and then bios boot to 2 different Hard Disks.
> saw this on both installs

Still unclear.  You saw a lockup after you booted the machine into the HD install, or while you were booted in live but after the install completed?

How did you install to two different hard disks?  Did you swap them out in one machine?

> > 
> > If you boot the debug kernel, is there a backtrace produced?
> > 
> > Did Alpha-1 do this?
> 
> I will try to do as adamw suggested this morning:journalctl -b -l and see
> what it shows.

Yes.  We need more data.
Comment 6 satellitgo 2015-08-17 17:42:01 EDT
pastebin.com/nr5LHsBH
Install to external HD from DVD of f23 RC-2 workstation live (bios boot)

Next I will redo an install from a gnome-disks restore USB to another external HD (EFI boot) and wait for the lockup.
Comment 7 satellitgo 2015-08-17 18:41:04 EDT
journalctl -b -l
pastebin.com/ttqtvmxg
Froze up when I tried to use gnome-software to install Thunderbird.

EFI boot of gnome-disks restore USB to HD; saw BZ 124442 and 1244261 on live.
abrt shows no errors after the power-off reboot.
Comment 8 satellitgo 2015-08-17 23:22:56 EDT
journalctl -b -l
http://ur1.ca/ngukq
Froze up after about half an hour with Firefox, Thunderbird, and XChat running.
gnome-disks restore (same USB as previous test)
MSI Windbox
Intel Celeron CPU 1037U @ 1.80GHz x2
Intel Ivybridge Mobile
3.8 GB Memory

It does not look like a hardware problem...
Comment 9 satellitgo 2015-08-17 23:35:15 EDT
(In reply to satellitgo from comment #8)
> journalctl -b -l
> http://ur1.ca/ngukq
> froze up after about 1/2 hr with ff; thunderbird; and xchat running
> gnome disks restore (same usb as previous test)
> MSI Windbox  
> intel celeron cpu1037U@1.80GHz x2
> intel ivybridge mobile
> 3.8 GB Memory
> 
> does not look like it is hardware....

fpaste.org/256082/8686914
extended lines:
http://url.ca/ngups
Comment 10 satellitgo 2015-08-17 23:38:57 EDT
(In reply to satellitgo from comment #9)
> (In reply to satellitgo from comment #8)
> > journalctl -b -l
> > http://ur1.ca/ngukq
> > froze up after about 1/2 hr with ff; thunderbird; and xchat running
> > gnome disks restore (same usb as previous test)
> > MSI Windbox  
> > intel celeron cpu1037U@1.80GHz x2
> > intel ivybridge mobile
> > 3.8 GB Memory
> > 
> > does not look like it is hardware....
> 
> fpaste.org/256082/8686914
> extended lines:
> http://url.ca/ngurq
Comment 11 satellitgo 2015-08-17 23:41:56 EDT
http://ur1.ca/ngurq

Sorry for the error in the address.
Comment 12 Josh Boyer 2015-08-18 08:58:30 EDT
(In reply to satellitgo from comment #6)
> pastebin.com/nr5LHsBH
(In reply to satellitgo from comment #7)
> journalctl -b -l
> pastebin.com/ttqtvmxg
(In reply to satellitgo from comment #8)
> journalctl -b -l
> http://ur1.ca/ngukq

You truncated all the lines.  These aren't useful.

(In reply to satellitgo from comment #9)
> fpaste.org/256082/8686914
Hidden fpaste, can't read.

> extended lines:
> http://url.ca/ngups

Invalid URL.

(In reply to satellitgo from comment #11)
> http://ur1.ca/ngurq
> 
> sorry for error on address

Doesn't really show us anything.
Comment 13 satellitgo 2015-08-18 09:50:16 EDT
Did an F23 Alpha-2 x86_64 netinstall of the default Workstation:
it freezes after boot, login, and GIS. abrt reports a tainted kernel (flags: GD).
BZ 1249048 (cc'd); only 1 of the 3 abrt problems is reportable:

intel NUC efi boot of gnome-disks restore USB install to external USB HD
Comment 14 satellitgo 2015-08-18 11:24:10 EDT
http://paste.fedoraproject.org/256320/11009143/

system76 intel Core i7-363QM CPU @ 2.40GHz x 8
gnome disks restore of f23 alpha-2 netinstall default workstation install to HD
Comment 15 Jeremy Rimpo 2015-08-24 11:11:55 EDT
I'm also experiencing lockups when booting into the Alpha Live ISO, though I can bypass the issue by booting into basic graphics mode. This is on a dual Intel/nVidia laptop, and the Intel driver should be the active driver. However, even running in basic graphics mode, I notice the integrated GPU light is on, indicating the nVidia card is powered on.
Comment 16 Jeremy Rimpo 2015-08-24 11:30:26 EDT
I had previously done a dnf upgrade from 22, decided to remove the bumblebee/nvidia drivers due to occasional, random x server crashes, only to run into lockups during boot. So I decided to start fresh and do a clean root install keeping only my /home partition, only to discover the Alpha ISO was ALSO locking up at boot, as per my last post. (Which I bypassed as noted above.)

I then ran into the Alpha FWRaid anaconda issue, which I've just bypassed via updates, so I should have a freshly installed system to test with shortly.

I'll see if I can pull some non-truncated logs.
Comment 17 Adam Williamson 2015-08-24 12:53:29 EDT
Discussed at 2015-08-24 freeze exception review meeting: http://meetbot-raw.fedoraproject.org/fedora-blocker-review/2015-08-24/f23-blocker-review.2015-08-24-16.03.log.txt . We agreed to punt on this one as there isn't sufficient information to really be able to make a call on how significant the issue is, but those of us with Intel systems are planning to try and see if we can reproduce.
Comment 18 Jeremy Rimpo 2015-08-24 13:28:28 EDT
I'm about to upload two logs: one from a nomodeset boot, which produces a working system albeit with vesa running, and one from the regular boot, which eventually locks up with (if I remember right) my Caps Lock and Scroll Lock keys blinking.

I think some key lines are here:

Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] BIND_ERROR [ UNK00 ]
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] PIO_ERROR
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] FB_FLUSH_TIMEOUT
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] DROPPED_MMU_FAULT 0x00000000
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x002100 [ !ENGINE ]

Meanwhile, the nomodeset/vesa log mostly has errors to do with WAYLAND_DISPLAY not being set, which I'm guessing is expected behavior with the vesa driver.
Comment 19 Jeremy Rimpo 2015-08-24 13:30:58 EDT
Created attachment 1066566 [details]
journalctl with standard (non-working) boot
Comment 20 Jeremy Rimpo 2015-08-24 13:36:00 EDT
Created attachment 1066571 [details]
journalctl with nomodeset (working) boot

Note loading of nouveau driver in an optimus (dual card) system.
The nvidia card should not normally be enabled at boot, but it seems like nouveau is trying to load anyway...
I'm guessing it fails to then bind to a GPU that is off - but maybe there's more at work.
Is the Intel driver even trying to load?

NOTE that I had a booting and seemingly functional system with bumblebee installed with nvidia drivers. (With random X restarts.)

HOWEVER the discrete GPU light, which is supposed to turn on only when the card is on, is ALWAYS on after initial boot. This is true with nomodeset, bumblebee, or a standard wayland/nouveau boot.
Comment 21 Jeremy Rimpo 2015-08-24 13:43:50 EDT
I should point out that I'm also confused by satellitgo's descriptions. This may not actually be the same problem... I'm never able to actually fully boot and log in. It sounds like satellitgo can, but locks up at some point.
Comment 22 Adam Williamson 2015-08-24 14:04:54 EDT
Jeremy: your error messages are in nouveau, not intel. Please file a separate bug under xorg-x11-drv-nouveau , or even upstream (on freedesktop.org) as more devs are likely to see it there. Please include logs from booting with 'drm.debug=14'. thanks!
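
One rough way to see which kernel driver a saved journal is complaining about is to tally the first word after "kernel:" on each line. This is only a sketch: the function name is made up for illustration, and treating that first word as the driver name is an assumption that happens to hold for the nouveau lines quoted in comment 18.

```python
import re
from collections import Counter

# Captures the first word after "kernel: " on a journal line. Treating that
# word as the driver name is an assumption; lines that start with "[" (for
# example "[drm:...]") are deliberately not matched.
KERNEL_SOURCE = re.compile(r" kernel: ([A-Za-z0-9_-]+)")


def count_kernel_sources(journal_text):
    """Count kernel log lines per leading word (hypothetical helper)."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = KERNEL_SOURCE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three of the lines quoted in comment 18.
sample = """\
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] BIND_ERROR [ UNK00 ]
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[   PFIFO][0000:01:00.0] PIO_ERROR
Aug 24 11:48:42 yeesha.rimpo.us kernel: nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x002100 [ !ENGINE ]
"""

print(count_kernel_sources(sample).most_common())
```

On a full journal dump the same tally gives a quick hint about which component a new bug should be filed against.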
Comment 23 Jeremy Rimpo 2015-08-24 14:20:25 EDT
Will do.
Comment 24 satellitgo 2015-08-24 21:20:31 EDT
Created attachment 1066673 [details]
Crash.log wks fereeze f23
Comment 25 satellitgo 2015-08-24 21:22:30 EDT
booted with drm.debug=14

journalctl -b -1 > crash.log
Comment 26 Adam Williamson 2015-08-24 21:27:17 EDT
there is no debug data there and no drm.debug parameter logged in the kernel args. you have to pass drm.debug=14 *on the boot that hangs*, not on the boot where you run the journalctl command.
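
A quick way to confirm that a saved log comes from a boot that had the parameter is to look for the kernel command line the kernel logs at startup. The sketch below assumes the dump contains a "Kernel command line:" (or, on x86, "Command line:") entry; the helper names and the sample lines are made up for illustration and are not taken from the attachments.

```python
import re

# The kernel logs its boot parameters early in each boot. The exact prefix
# can vary, so both common spellings are accepted here.
CMDLINE = re.compile(r"(?:Kernel command|Command) line: (.*)")


def boot_args(journal_text):
    """Return the boot parameters from the first command-line entry found."""
    for line in journal_text.splitlines():
        match = CMDLINE.search(line)
        if match:
            # Naive whitespace split; quoted values with spaces are not handled.
            return match.group(1).split()
    return []


def has_boot_arg(journal_text, arg="drm.debug=14"):
    """True if `arg` was on the kernel command line of the logged boot."""
    return arg in boot_args(journal_text)


# Hypothetical journal lines, for illustration only.
with_debug = (
    "Aug 25 08:00:01 localhost.localdomain kernel: Command line: "
    "BOOT_IMAGE=/vmlinuz-example root=/dev/sda1 ro rhgb quiet drm.debug=14"
)
without_debug = (
    "Aug 25 08:00:01 localhost.localdomain kernel: Command line: "
    "BOOT_IMAGE=/vmlinuz-example root=/dev/sda1 ro rhgb quiet"
)

print(has_boot_arg(with_debug), has_boot_arg(without_debug))
```

If the check fails on crash.log, the log was captured from the wrong boot or the parameter was never added.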
Comment 27 satellitgo 2015-08-25 11:27:13 EDT
Created attachment 1066913 [details]
new 'drm.debug=14" freeze in f23 workstation with intel graphics
Comment 28 Adam Williamson 2015-08-25 11:41:33 EDT
That log still does not have DRM debugging enabled and does not appear to contain any information relating to any crash or hang.

Let me try the instructions once more, step by step:

1. Boot the system
2. At the grub menu, hit 'e'
3. Go down to the line that starts 'linux16' or 'linuxefi'. To the end of the line (after LANG=something, probably) add 'drm.debug=14'
4. Hit F10, let the system boot
5. Log in and wait for the crash/hang to happen
6. Reboot the system (no need to add the drm.debug=14 parameter this time)
7. Log in, open a terminal
8. Become root (e.g. 'sudo su' or just 'su')
9. Run: journalctl -b -1 > crash.log
10. Attach the file crash.log to this bug

thanks.
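
Steps 9 and 10 above can also be scripted. This is a minimal sketch, assuming Python is available on the affected system and that it is run as root after the reboot; the function names are hypothetical, while `-b -1` (previous boot) and `--no-pager` are standard journalctl options.

```python
import subprocess


def journalctl_cmd(boot_offset=-1):
    """Build the journalctl command for a boot relative to the current one.

    -1 selects the previous boot, i.e. the one that hung.
    """
    return ["journalctl", "-b", str(boot_offset), "--no-pager"]


def save_boot_log(path="crash.log", boot_offset=-1):
    """Write the journal of the selected boot to `path`."""
    with open(path, "wb") as out:
        # check=True raises CalledProcessError if journalctl fails,
        # for example when no earlier boot is recorded in the journal.
        subprocess.run(journalctl_cmd(boot_offset), stdout=out, check=True)


print(" ".join(journalctl_cmd()))
```

Calling `save_boot_log()` after the reboot writes crash.log to the current directory, ready to attach to the bug.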
Comment 29 satellitgo 2015-08-25 18:17:40 EDT
Further testing:
This may be related to having IPv6 enabled.
I disconnected wireless and then turned off IPv6 on the wired connection.
No freezes in 8 hrs so far. (The last items in error.log mention IPv6 MTU):

::snip::
Aug 25 08:05:15 localhost.localdomain PackageKit[1837]: uid 1000 obtained auth for org.freedesktop.packagekit.system-sources-refresh
Aug 25 08:05:18 localhost.localdomain gnome-session[2180]: Window manager warning: Invalid WM_TRANSIENT_FOR window 0x562a01a00003 specified for 0x1a00945 (XChat: Edi).
Aug 25 08:05:30 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Lowering IPv6 MTU (1280) to match device MTU (0)
Aug 25 08:05:30 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): IPv6 MTU (0) smaller than 1280, adjusting
Aug 25 08:05:30 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Raising device MTU (0) to match IPv6 MTU (1280)
Aug 25 08:06:11 localhost.localdomain gnome-session[2180]: Window manager warning: Invalid WM_TRANSIENT_FOR window 0x562a01a00003 specified for 0x1a014f5 (XChat: Edi).
Aug 25 08:06:52 localhost.localdomain gnome-session[2180]: Window manager warning: Invalid WM_TRANSIENT_FOR window 0x562a01a0016a specified for 0x1a02f4c (XChat: Pre).
Aug 25 08:06:54 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Lowering IPv6 MTU (1280) to match device MTU (0)
Aug 25 08:06:54 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): IPv6 MTU (0) smaller than 1280, adjusting
Aug 25 08:06:54 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Raising device MTU (0) to match IPv6 MTU (1280)
Aug 25 08:06:59 localhost.localdomain PackageKit[1837]: refresh-cache transaction /4_ebbccedd from uid 1000 finished with success after 103603ms
Aug 25 08:07:05 localhost.localdomain PackageKit[1837]: get-updates transaction /5_dacedaea from uid 1000 finished with success after 4432ms
Aug 25 08:07:05 localhost.localdomain PackageKit[1837]: new update-packages transaction /6_cbbeeacb scheduled from uid 1000
Aug 25 08:08:36 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Lowering IPv6 MTU (1280) to match device MTU (0)
Aug 25 08:08:36 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): IPv6 MTU (0) smaller than 1280, adjusting
Aug 25 08:08:36 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Raising device MTU (0) to match IPv6 MTU (1280)
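
To see how often these warnings repeat, and on which interface, the "Lowering IPv6 MTU" lines can be counted per device. This is a rough sketch: the pattern is written against the message wording in the snippet above, the helper name is made up, and 1280 is the threshold that NetworkManager itself reports in these messages.

```python
import re
from collections import Counter

IPV6_MIN_MTU = 1280  # threshold named in the NetworkManager warnings

# Written against the exact wording of the warnings quoted above.
DEVICE_MTU = re.compile(
    r"\((\w+)\): Lowering IPv6 MTU \(\d+\) to match device MTU \((\d+)\)"
)


def undersized_mtu_reports(journal_text):
    """Count, per interface, warnings that report a device MTU below 1280."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = DEVICE_MTU.search(line)
        if match and int(match.group(2)) < IPV6_MIN_MTU:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the log above.
sample = """\
Aug 25 08:05:30 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Lowering IPv6 MTU (1280) to match device MTU (0)
Aug 25 08:06:54 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Lowering IPv6 MTU (1280) to match device MTU (0)
Aug 25 08:06:54 localhost.localdomain NetworkManager[1069]: <warn>  (wlp2s0): Raising device MTU (0) to match IPv6 MTU (1280)
"""

print(dict(undersized_mtu_reports(sample)))
```

A count like this only shows that the warning recurs on the wireless interface; it does not show that the warning is related to the freeze.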
Comment 30 Adam Williamson 2015-08-25 18:32:27 EDT
Can you try splitting those two things up? Try it without wireless but with IPv6 on wired, and try it with wireless but without IPv6. Always best to change *one* thing at a time. Thanks.
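
The "one thing at a time" approach can be written down as a tiny test plan: start from the configuration that froze and flip exactly one setting per run. The sketch below is illustrative only; the setting names are assumptions chosen to match the two factors discussed here.

```python
def one_factor_variants(baseline):
    """Yield (changed_key, config) pairs that differ from `baseline`
    in exactly one boolean setting."""
    for key in baseline:
        variant = dict(baseline)
        variant[key] = not baseline[key]
        yield key, variant


# The configuration that froze: wireless connected, IPv6 enabled.
baseline = {"wireless": True, "ipv6": True}

for changed, config in one_factor_variants(baseline):
    print(f"changed {changed}: {config}")
```

Each printed configuration corresponds to one of the two test runs requested above; if only one of them stops the freeze, that setting is the likelier suspect.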
Comment 31 Josh Boyer 2015-08-25 19:00:59 EDT
And if you're still doing this with whatever kernel came with Alpha, that's probably not a great idea.  Use the -rc8 kernel in updates-testing as it has numerous fixes.

Or put another way, if you isolate it with the Alpha kernel that is excellent.  However the first thing we will ask is to test an updated kernel.
Comment 32 satellitgo 2015-08-26 06:13:58 EDT
(In reply to Josh Boyer from comment #31)
> And if you're still doing this with whatever kernel came with Alpha, that's
> probably not a great idea.  Use the -rc8 kernel in updates-testing as it has
> numerous fixes.
> 
> Or put another way, if you isolate it with the Alpha kernel that is
> excellent.  However the first thing we will ask is to test an updated kernel.


Thank you
you were correct:

I did a 'dnf update kernel' (the -rc8 kernel in updates-testing) 

I am not experiencing any more lockups.

All tests with ff; thunderbird; xchat running
intel NUC i3; 8 GB memory;internal wireless card; 1 TB internal HD; HDMI monitor; USB mouse and keyboard
efi boot with  'drm.debug=14'

tested with:
no wireless + wired ipv4 only
no wireless + wired ipv4 and ipv6
wireless ipv4 only + wired ipv4 and ipv6
wireless ipv4 and ipv6 + wired ipv4 and ipv6
Comment 33 Jeremy Rimpo 2015-08-26 13:04:30 EDT
(In reply to Josh Boyer from comment #31)
> And if you're still doing this with whatever kernel came with Alpha, that's
> probably not a great idea.  Use the -rc8 kernel in updates-testing as it has
> numerous fixes.
> 
> Or put another way, if you isolate it with the Alpha kernel that is
> excellent.  However the first thing we will ask is to test an updated kernel.

Actually, I just pulled in the rc8 kernel this morning along with several mesa updates, and the nvidia optimus issues appear to be fixed for me. So I suppose I'll forgo the new ticket.
Comment 34 Adam Williamson 2015-09-10 15:35:34 EDT
Discussed at 2015-09-10 freeze exception review meeting: https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2015-09-10/f23-blocker-review.2015-09-10-16.00.log.txt . This sounds like it is already fixed and the appropriate course of action is simply to close the bug.
