Bug 459144

Summary: ahci has timeouts during boot on JMB363-SiI5723 cascade without drives
Product: Red Hat Enterprise Linux 5
Reporter: Jukka Lehtonen <jukka.lehtonen>
Component: kernel
Assignee: David Milburn <dmilburn>
Status: CLOSED WONTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low
Priority: medium
Version: 5.2
CC: jfeeney, peterm
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2013-10-30 22:14:51 UTC
Attachments:
dmesg (flags: none)

Description Jukka Lehtonen 2008-08-14 18:08:26 UTC
Description of problem:
The Gigabyte GA-EP45-DQ6 motherboard (rev 1.0, BIOS F5) has ICH10R and JMicron JMB363 SATA2 controllers.  One Silicon Image 5723 RAID controller is connected to each SATA port of the JMB363.  If no hard drives are connected to the ports of a 5723 chip, the ahci module produces warnings and delays boot while it retries resets on that port.

Version-Release number of selected component (if applicable):
kernel-2.6.18-92.el5
kernel-2.6.18-92.1.10.el5

How reproducible:
Always

Steps to Reproduce:
1. Boot without drives connected to 5723 ports
  
Actual results:
ahci scans the SATA port on the JMB363 and repeatedly retries resets, although no hard drives are connected.

Expected results:
ahci detects the SiI5723 device and determines that no hard drives are connected to it.

Additional info:

# lspci -n
00:00.0 0600: 8086:2e20 (rev 02)
00:01.0 0604: 8086:2e21 (rev 02)
00:1a.0 0c03: 8086:3a37
00:1a.1 0c03: 8086:3a38
00:1a.2 0c03: 8086:3a39
00:1a.7 0c03: 8086:3a3c
00:1b.0 0403: 8086:3a3e
00:1c.0 0604: 8086:3a40
00:1c.4 0604: 8086:3a48
00:1d.0 0c03: 8086:3a34
00:1d.1 0c03: 8086:3a35
00:1d.2 0c03: 8086:3a36
00:1d.7 0c03: 8086:3a3a
00:1e.0 0604: 8086:244e (rev 90)
00:1f.0 0601: 8086:3a16
00:1f.2 0104: 8086:2822
00:1f.3 0c05: 8086:3a30
01:00.0 0300: 10de:0600 (rev a2)
02:00.0 0604: 111d:802d (rev 0d)
03:01.0 0604: 111d:802d (rev 0d)
03:02.0 0604: 111d:802d (rev 0d)
03:03.0 0604: 111d:802d (rev 0d)
03:04.0 0604: 111d:802d (rev 0d)
03:05.0 0604: 111d:802d (rev 0d)
03:06.0 0604: 111d:802d (rev 0d)
05:00.0 0200: 10ec:8168 (rev 02)
0a:00.0 0106: 197b:2363 (rev 02)
0a:00.1 0101: 197b:2363 (rev 02)
0b:06.0 0c00: 104c:8024

# snip of dmesg
SCSI subsystem initialized
libata version 3.00 loaded.
ahci 0000:00:1f.2: version 3.0
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 98
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl RAID mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part 
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606100 irq 106
ata2: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606180 irq 106
ata3: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606200 irq 106
ata4: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606280 irq 106
ata5: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606300 irq 106
ata6: SATA max UDMA/133 abar m2048@0xec606000 port 0xec606380 irq 106
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: HPA detected: current 625140335, native 625142448
ata1.00: ATA-8: ST3320613AS, SD11, max UDMA/133
ata1.00: 625140335 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-8: ST3320613AS, SD11, max UDMA/133
ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
  Vendor: ATA       Model: ST3320613AS       Rev: SD11
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 625140335 512-byte hdwr sectors (320072 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 625140335 512-byte hdwr sectors (320072 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: ST3320613AS       Rev: SD11
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 >
sd 1:0:0:0: Attached scsi disk sdb
ACPI: PCI Interrupt 0000:0a:00.0[A] -> GSI 16 (level, low) -> IRQ 169
ahci 0000:0a:00.0: AHCI 0001.0000 32 slots 2 ports 3 Gbps 0x3 impl SATA mode
ahci 0000:0a:00.0: flags: 64bit ncq pm led clo pmp pio slum part 
PCI: Setting latency timer of device 0000:0a:00.0 to 64
scsi6 : ahci
scsi7 : ahci
ata7: SATA max UDMA/133 abar m8192@0xec400000 port 0xec400100 irq 169
ata8: SATA max UDMA/133 abar m8192@0xec400000 port 0xec400180 irq 169
ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata7.15: Port Multiplier 1.1, 0x1095:0x5723 r33, 2 ports, feat 0x1/0x9
ata7.00: hard resetting link
ata7.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata7.01: hard resetting link
ata7.01: SATA link down (SStatus 0 SControl 320)
ata7.00: ATA-7: External Disk 0, 1.1570, max UDMA/133
ata7.00: 976773168 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata7.00: configured for UDMA/133
ata7: EH complete
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: limiting SATA link speed to 1.5 Gbps
ata8: softreset failed (device not ready)
ata8: reset failed, giving up
  Vendor: ATA       Model: External Disk 0   Rev: 1.15
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdc: 976773168 512-byte hdwr sectors (500108 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 976773168 512-byte hdwr sectors (500108 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
 sdc: sdc1 sdc2
sd 6:0:0:0: Attached scsi disk sdc
device-mapper: uevent: version 1.0.3

# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: ST3320613AS      Rev: SD11
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: ST3320613AS      Rev: SD11
  Type:   Direct-Access                    ANSI SCSI revision: 05
Host: scsi6 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: External Disk 0  Rev: 1.15
  Type:   Direct-Access                    ANSI SCSI revision: 05

# cat /proc/interrupts 
           CPU0       CPU1       
  0:   15500143          0    IO-APIC-edge  timer
  1:        121       7100    IO-APIC-edge  i8042
  6:          5          0    IO-APIC-edge  floppy
  8:          1          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 12:      71594          0    IO-APIC-edge  i8042
 66:     137498          0   IO-APIC-level  ide2
 74:          0          0   IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb5, uhci_hcd:usb8
 82:       3654          0   IO-APIC-level  ehci_hcd:usb2, uhci_hcd:usb6
 90:          0          0   IO-APIC-level  uhci_hcd:usb4
 98:          0          0   IO-APIC-level  uhci_hcd:usb7
106:       7794      36161         PCI-MSI  ahci
114:         16      24466         PCI-MSI  eth0
122:        174          0   IO-APIC-level  HDA Intel
169:        803    1038365   IO-APIC-level  uhci_hcd:usb3, ahci, nvidia
NMI:        365        306 
LOC:   15499836   15499766 
ERR:          0
MIS:          0

scsi[0-5] are the SATA channels on the ICH10R.
scsi6 (ata7) is the first port on JMB363, and the connected SiI5723 is in RAID1 mode with two hard drives connected.
scsi7 (ata8) is the second port on JMB363, and the connected SiI5723 has no drives.

If neither SiI5723 has drives, then both ata7 and ata8 are slow to respond.
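
For anyone triaging a similar log, the pattern above (a port that logs repeated "softreset failed" lines and ends with "reset failed, giving up") can be picked out mechanically. The following is a minimal Python sketch and is not part of the original report. The function name `ports_that_gave_up` and the `SAMPLE` text are illustrative assumptions; the message strings are copied from the dmesg excerpt above, and other kernel versions may word them differently.

```python
import re
from collections import Counter

# Message formats copied from the dmesg excerpt in this report.
SOFTRESET_FAILED = re.compile(r"\b(ata\d+): softreset failed")
GAVE_UP = re.compile(r"\b(ata\d+): reset failed, giving up")


def ports_that_gave_up(dmesg_text):
    """Return {port: failed softreset count} for ports that logged 'giving up'."""
    failures = Counter()
    gave_up = set()
    for line in dmesg_text.splitlines():
        m = SOFTRESET_FAILED.search(line)
        if m:
            failures[m.group(1)] += 1
            continue
        m = GAVE_UP.search(line)
        if m:
            gave_up.add(m.group(1))
    # Only report ports where libata abandoned the reset entirely.
    return {port: failures[port] for port in sorted(gave_up)}


# Lines taken from the ata8 portion of the log above.
SAMPLE = """\
ata7: EH complete
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: port is slow to respond, please be patient (Status 0xd0)
ata8: softreset failed (device not ready)
ata8: limiting SATA link speed to 1.5 Gbps
ata8: softreset failed (device not ready)
ata8: reset failed, giving up
"""

if __name__ == "__main__":
    print(ports_that_gave_up(SAMPLE))
```

On a live system the input could be the saved output of `dmesg` instead of `SAMPLE`.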

Comment 3 David Milburn 2010-07-13 19:11:58 UTC
Would it be possible to try and reproduce this on a more recent kernel and
attach the full dmesg output? Thanks.

http://people.redhat.com/jwilson/el5/206.el5/

Comment 4 Jukka Lehtonen 2010-07-15 15:34:46 UTC
Created attachment 432116
dmesg

Output of 2.6.18-206.el5, as requested.

Comment 5 David Milburn 2010-07-15 18:07:55 UTC
I have backported this upstream patch to RHEL5 and will make a test kernel
available as soon as possible.

commit 5594639aab8b5614cb27a3e5b2b627505cbcd137
Author: Tejun Heo <tj>
Date:   Tue Aug 4 14:30:08 2009 +0900

    ahci: add workaround for on-board 5723s on some gigabyte boards
    
    Some gigabytes have on-board SIMG5723s connected to JMB ahcis.  These
    are used to implement hardware raid.  Unfortunately some firmware
    revisions on these 5723s don't bring the link down when all the
    downstream ports are unoccupied while not responding to reset protocol
    which makes libata think that there's device attached to the port but
    is not responding and retry.  This results in painfully wrong boot
    detection time for these ports when they're empty.
    
    This patch quirks those boards such that ahci gives up after the
    initial timeout.  Combined with parallel probing, this gives quick
    enough probing and also is safe because SIMG5723 will respond to the
    first try if any of the downstream ports is occupied.
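
The effect of the quirk described in the commit message can be illustrated with a toy model. The following Python sketch is not the kernel code and does not reproduce the actual patch. The names `probe_port`, `device_present`, `quirk`, and `max_tries`, and the retry budget of four (chosen to match the four "softreset failed" lines in the dmesg above), are assumptions for illustration only. It shows how giving up after the first timeout removes the retries on an empty port, while an occupied port is unaffected because, per the commit message, the 5723 responds to the first try if any downstream port is occupied.

```python
def probe_port(device_present, quirk, max_tries=4):
    """Toy model of port probing: return (result, reset_attempts).

    device_present: the 5723 has at least one downstream drive and so
        answers the first reset (as stated in the commit message).
    quirk: give up after the first timeout instead of retrying.
    max_tries: hypothetical retry budget, chosen to match the log above.
    """
    for attempt in range(1, max_tries + 1):
        if device_present:
            return ("device found", attempt)
        # Reset timed out: the link reports "up" but nothing answers.
        if quirk:
            return ("treated as empty", attempt)
    return ("reset failed, giving up", max_tries)


if __name__ == "__main__":
    for present in (True, False):
        for quirk in (False, True):
            print(present, quirk, probe_port(present, quirk))
```

In this model only the empty-port case changes: without the quirk it uses the whole retry budget, and with the quirk it stops after one attempt.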

Comment 6 John Feeney 2013-10-30 22:14:51 UTC
This Bugzilla has been reviewed by Red Hat and is not planned on being
addressed in Red Hat Enterprise Linux 5, and therefore is being closed.
If this bug is critical to production systems, please contact your Red
Hat support representative and provide a sufficient business justification
in order to re-open it.