Bug 1007976 - Fedora 19 hangs ~50% of the time on boot (requires hard reset)
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-09-13 12:19 EDT by Brian
Modified: 2014-01-04 18:03 EST (History)
Doc Type: Bug Fix
Last Closed: 2014-01-04 18:03:15 EST
Type: Bug

Attachments
boot screen messages with quiet and rhgb removed. booting to runlevel 3 (1.59 MB, image/jpeg)
2013-09-13 12:19 EDT, Brian
dmesg output (66.07 KB, text/plain)
2014-01-04 15:17 EST, Brian
smartctl for sata device (4.22 KB, text/plain)
2014-01-04 15:27 EST, Brian
smartctl for cdrom (492 bytes, text/plain)
2014-01-04 15:38 EST, Brian

Description Brian 2013-09-13 12:19:25 EDT
Created attachment 797437 [details]
boot screen messages with quiet and rhgb removed. booting to runlevel 3

Description of problem:
System hangs during boot about 50% of the time requiring a hard reset.

Version-Release number of selected component (if applicable):
Fedora 19, kernel 3.10.10 & 3.10.11 both attempted

How reproducible:
about 50% of the time on boot

Steps to Reproduce:
1. boot or reboot computer
2. system hangs
3. hard shutdown/reset

Actual results:
system frozen with error message
-----------------------
[drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
ata1.00: exception Emask 0x52 SAct 0x0 SErr 0xffffffff action 0xe frozen
ata1.00 SError: { RecovData RecovComm UnrecovData Persist Proto HostInt PHYRdyChg PHYInt WommWake 10B8B Dispar BadCRC Handshk LinkSeq TrStaTrnsUnrecFIS DevExch }
ata1.00 failed command: IDENTIFY PACKET DEVICE
ata1.00: cmd a1/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
         res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x56 (ATA bus error)
ata1.00: status: { DRDY }
ata1.00: hard resetting link
ata2.00: exception Emask 0x52 SAct 0x1 SErr 0xffffffff action 0xe frozen
ata2.00 SError: { RecovData RecovComm UnrecovData Persist Proto HostInt PHYRdyChg PHYInt WommWake 10B8B Dispar BadCRC Handshk LinkSeq TrStaTrnsUnrecFIS DevExch }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60:08:00:78:03:00/00:00:00:00:00/40 tag0 ncq 4096 in
         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x56 (ATA bus error)
ata2.00 status: { DRDY }
ata2: hard resetting link
-----------------------
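
For readers unfamiliar with the SError lines above: each name stands for one bit of the SATA SError register, and a value of 0xffffffff has every bit set, so every name is printed at once. The following is a minimal Python sketch of that decoding. The bit-to-name table is an assumption based on the SERR_* constants in the Linux libata headers, and `decode_serror` is a hypothetical helper, not part of any kernel tool; check the table against your kernel source before relying on it.

```python
from typing import List

# Assumed mapping of SError bit positions to the short names libata prints.
# Bits not listed here are treated as reserved and ignored.
SERROR_BITS = {
    0: "RecovData", 1: "RecovComm",
    8: "UnrecovData", 9: "Persist", 10: "Proto", 11: "HostInt",
    16: "PHYRdyChg", 17: "PHYInt", 18: "CommWake", 19: "10B8B",
    20: "Dispar", 21: "BadCRC", 22: "Handshk", 23: "LinkSeq",
    24: "TrStaTrns", 25: "UnrecFIS", 26: "DevExch",
}


def decode_serror(value: int) -> List[str]:
    """Return the names of the known SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # The value reported for both ata1.00 and ata2.00 in this report.
    print(" ".join(decode_serror(0xFFFFFFFF)))
```

Passing a value with a single bit set, such as `1 << 21`, yields only the matching name, which makes it easier to read less saturated SErr values from other logs.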
Expected results:
system booting

Additional info:
New install of Fedora 19 on new system built with i7 4770k, Asus Z87 PRO, Samsung 840 PRO ssd, using onboard graphics chipset.

Attempted to use the following boot options to no avail:
acpi=off
libdata.force=noncq
Comment 1 Brian 2013-09-13 12:23:04 EDT
oops, I made a typo in transcription: error message should say "CommWake" not "WommWake".
Comment 2 Josh Boyer 2013-09-18 16:34:53 EDT
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.
Comment 3 Brian 2013-09-18 23:45:29 EDT
Couldn't get kernel 10.11 to finish booting after about 10 tries, then gave up.

I reverted back to 3.10.11 as I can actually fully boot 50% of the time and use my computer...
Comment 4 Brian 2013-09-18 23:46:25 EDT
Another typo... meant to say "kernel 3.11" - sorry
Comment 5 Justin M. Forbes 2014-01-03 17:09:27 EST
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.
Comment 6 Brian 2014-01-03 20:00:52 EST
Same issue with 3.12.6-200.fc19
Comment 7 Michele Baldessari 2014-01-04 09:56:57 EST
Hi Brian,

If you boot with libata.atapi_passthru16=0, does it work?

Thanks,
Michele
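
When testing a boot parameter like the one suggested above, it can help to confirm that the running kernel actually received it. Here is a minimal Python sketch, assuming the command line is available as a string (on a running Linux system it can be read from /proc/cmdline). `cmdline_param` is a hypothetical helper, and it does not handle quoted values that contain spaces.

```python
def cmdline_param(cmdline, name):
    """Look up a parameter in a kernel command line string.

    Returns the value as a string, "" for a bare flag without "=",
    or None if the parameter is absent. If the parameter is repeated,
    the last occurrence wins.
    """
    found = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            found = value if sep else ""
    return found


if __name__ == "__main__":
    # Sample command line for illustration only; not taken from the reporter's system.
    sample = "BOOT_IMAGE=/vmlinuz ro libata.atapi_passthru16=0"
    print(cmdline_param(sample, "libata.atapi_passthru16"))
```

On Fedora, one common way to make such a parameter persistent across reboots is `grubby --update-kernel=ALL --args="libata.atapi_passthru16=0"`, run as root; treat this as a suggestion to verify against your release's documentation.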
Comment 8 Brian 2014-01-04 14:36:22 EST
This seems to be working. I just rebooted 5+ times in a row successfully. 

Could you provide any insight to a kernel layman on what my issue is/was?
Comment 9 Michele Baldessari 2014-01-04 15:10:11 EST
Glad it works.

Well, some buggy devices need the above workaround in order to work.
Most likely, one of the SATA devices on your system is not entirely compliant.

Can you attach the full dmesg output and also the output of 'smartctl -a /dev/<each_sata_device>'?

Thanks,
Michele
Comment 10 Brian 2014-01-04 15:17:42 EST
Created attachment 845509 [details]
dmesg output

dmesg output after booting with libata.atapi_passthru16=0
Comment 11 Brian 2014-01-04 15:27:39 EST
Created attachment 845542 [details]
smartctl for sata device

smartctl for sda
Comment 12 Brian 2014-01-04 15:38:27 EST
Created attachment 845543 [details]
smartctl for cdrom

smartctl for DVD RW drive
Comment 13 Michele Baldessari 2014-01-04 16:37:07 EST
I'd guess that:
[    0.802608] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.806611] ata1.00: ATAPI: ASUS    DRW-24B1ST   c, 1.05, max UDMA/100
[    0.807484] ata1.00: configured for UDMA/100
[    0.809806] scsi 0:0:0:0: CD-ROM            ASUS     DRW-24B1ST   c   1.05 PQ: 0 ANSI: 5
[    0.813477] sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray

is the culprit here.
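
For anyone repeating this diagnosis on another machine, the ATAPI devices can be picked out of a saved dmesg log mechanically. A minimal Python sketch follows, assuming the identification lines have the same shape as the ones quoted above; the regular expression and the `find_atapi_devices` helper are hypothetical and may need adjusting for other kernel versions.

```python
import re

# Matches lines shaped like:
# "[    0.806611] ata1.00: ATAPI: ASUS    DRW-24B1ST   c, 1.05, max UDMA/100"
# Anything after the transfer mode (extra comma-separated notes) is ignored.
ATAPI_LINE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<port>ata\d+\.\d+): ATAPI: "
    r"(?P<model>.+?), (?P<firmware>[^,]+), max (?P<mode>[^,\s]+)"
)


def find_atapi_devices(dmesg_text):
    """Return one dict per ATAPI identification line found in dmesg_text."""
    devices = []
    for line in dmesg_text.splitlines():
        match = ATAPI_LINE.match(line.strip())
        if match:
            devices.append({
                "port": match.group("port"),
                # Collapse the padding spaces inside the model string.
                "model": " ".join(match.group("model").split()),
                "firmware": match.group("firmware").strip(),
                "mode": match.group("mode"),
            })
    return devices


if __name__ == "__main__":
    sample = (
        "[    0.802608] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)\n"
        "[    0.806611] ata1.00: ATAPI: ASUS    DRW-24B1ST   c, 1.05, max UDMA/100\n"
        "[    0.807484] ata1.00: configured for UDMA/100\n"
    )
    for device in find_atapi_devices(sample):
        print(device)
```

Only the line containing "ATAPI:" matches, so link-up and configuration messages for the same port are skipped.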

In any case, the ATA subsystem does not seem to allow setting atapi_passthru16 quirks for specific devices, so I think you'll have to live with this
boot parameter on this system.

Ok to close this BZ?

Thanks,
Michele
Comment 14 Brian 2014-01-04 17:39:36 EST
Sure, close this BZ. Is there a downside to this boot parameter?
Comment 15 Michele Baldessari 2014-01-04 18:02:31 EST
No, AFAIK there should not be.
