Bug 248271

Summary: Wake up from suspend fails (hard disk and root filesystem are lost)
Product: [Fedora] Fedora Reporter: Philippe Rigault <prigault>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: low    
Version: 7   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.22.4-65.fc7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-08-27 19:37:23 UTC Type: ---

Description Philippe Rigault 2007-07-14 20:23:12 UTC
Description of problem:
Wake-up from suspend fails because access to the hard disk and root filesystem 
is lost.

Version-Release number of selected component (if applicable):
kernel-2.6.21-1.3228.fc7.x86_64

How reproducible:
Always


Steps to Reproduce:
1. Boot laptop, log in (works fine)
2. Suspend laptop (works fine)
3. Wake it up
  
Actual results: 
The laptop _seems_ to wake up OK at first: the screen comes back (in a graphical 
session, the icons disappear from the desktop).
However, the machine is totally unusable at this point. Executing any program 
is impossible because the root filesystem is no longer available (trying to 
diagnose with programs is moot at this point, since programs like 'df' are no 
longer accessible).

The machine still responds to 'ping', however.

Expected results: 
Laptop wakes up OK

Additional info:
1. Machine is a HP Pavilion dv2000, model dv2422ca

2. Error messages: switching to a terminal session (Ctrl-Alt-Fn), one sees the 
following messages:
ATA abnormal status 0x80 on port ...
ata1.00: ...
ata1: failed ...
<verbose output that scrolls too fast for me to catch it>
EXT3-fs error (device sda2) ext3_find_entry: reading directory #1664663 offset
scsi 0:0:0:0: rejecting I/O to dead device
scsi 0:0:0:0: rejecting I/O to dead device
scsi 0:0:0:0: rejecting I/O to dead device
...

This message from syslogd can be trapped on a terminal:
machine_name kernel: journal commit I/O errorRead
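
Since the output scrolls too fast to read, a small filter can help summarize it once the messages are captured somewhere (how to capture them is outside this sketch). The following is a minimal sketch, assuming the log lines look like the ones quoted above; the category names and the `summarize` helper are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Patterns are derived from the messages quoted in this report.
# The category names are hypothetical labels chosen for this sketch.
PATTERNS = {
    "ata": re.compile(r"\bata\d+(\.\d+)?:|ATA abnormal status"),
    "scsi_dead_device": re.compile(r"rejecting I/O to dead device"),
    "ext3": re.compile(r"EXT3-fs error"),
    "journal": re.compile(r"journal commit I/O error"),
}

def summarize(lines):
    """Count how many lines fall into each storage-error category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

# Sample input: the lines quoted in this report, elisions kept as-is.
sample = [
    "ATA abnormal status 0x80 on port ...",
    "ata1.00: ...",
    "ata1: failed ...",
    "EXT3-fs error (device sda2) ext3_find_entry: reading directory #1664663 offset",
    "scsi 0:0:0:0: rejecting I/O to dead device",
    "scsi 0:0:0:0: rejecting I/O to dead device",
    "scsi 0:0:0:0: rejecting I/O to dead device",
]

if __name__ == "__main__":
    for name, count in sorted(summarize(sample).items()):
        print(f"{name}: {count}")
```

The order in which the categories first appear (ATA errors, then filesystem errors, then rejected I/O) is probably more informative than the raw counts, so the filter is only a starting point.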

3. What is more interesting is that on another nearly identical machine 
(dv2000, model dv2310ca, which is the previous model), suspend/resume works 
perfectly (and yes, with the Nvidia driver and ndiswrapper).
The two boxes run the exact same Fedora7 + updates, have the exact same 
hardware (except one has Turion-1.6GHz/1GB RAM and the other Turion-1.8GHz/2GB 
RAM). The BIOS version is different though (F.23 for dv2310ca and F.34 for 
dv2422ca).

4. The symptoms are the same whether suspend is triggered by the GUI (KDE or 
Gnome) or by a terminal ('pm-suspend').

5. I tried different quirks from:
 http://people.freedesktop.org/~hughsient/quirk/quirk-suspend-index.html
 - quirks tried: --quirk-s3-bios
                 --quirk-s3-mode 
                 --quirk-s3-bios --quirk-s3-mode 
                 --quirk-dpms-on
                 --quirk-vbe-post --quirk-vbemode-restore
  None of these changed anything.
  Also, removing nvidia and ndiswrapper does not change anything either.
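
For anyone repeating these attempts, the quirk sets listed above can be written down once and turned into command lines. This is only a dry-run sketch: it prints the commands and does not suspend anything. The `suspend_commands` helper is a hypothetical name introduced for illustration; the flags are the ones listed above.

```python
import shlex

# Quirk combinations exactly as listed in this report.
QUIRK_SETS = [
    ["--quirk-s3-bios"],
    ["--quirk-s3-mode"],
    ["--quirk-s3-bios", "--quirk-s3-mode"],
    ["--quirk-dpms-on"],
    ["--quirk-vbe-post", "--quirk-vbemode-restore"],
]

def suspend_commands(quirk_sets, program="pm-suspend"):
    """Build one shell command line per quirk set (nothing is executed)."""
    return [shlex.join([program, *quirks]) for quirks in quirk_sets]

if __name__ == "__main__":
    for command in suspend_commands(QUIRK_SETS):
        print(command)
```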

6. The filesystems themselves are OK: each time I rebooted after a failed 
suspend/resume, I added 'forcefsck' to the kernel boot options.

7. Lastly, one user-visible difference between the boxes is that the 
configuration menu for 'Lid Switch Close' action is available on the dv2310ca 
but not on the dv2422ca (this menu is a separate tab 'Button Actions' on KDE 
Control Center -> Power Control -> Laptop Battery).

Any ideas on how to pursue this further?

Cheers,

Philippe

Comment 1 Philippe Rigault 2007-07-14 20:31:25 UTC
> The two boxes run the exact same Fedora7 + updates, have the exact same
> hardware (except one has Turion-1.6GHz/1GB RAM and the other
> Turion-1.8GHz/2GB RAM). 

Actually, their hard disks are different:

============================================================
dv2310ca (the one that works): 120GB SAMSUNG
============================================================
$ /sbin/hdparm -i /dev/sda

/dev/sda:

 Model=SAMSUNG HM120JI                         , FwRev=YF100-18, SerialNo=S116J10P403638

 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=34902, SectSize=554, ECCbytes=4
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=234441648
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 0:  ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode

============================================================
dv2422ca (the one that fails): 160GB HITACHI
============================================================
$ /sbin/hdparm -i /dev/sda


/dev/sda:

 Model=Hitachi HTS541616J9SA00                 , FwRev=SB4OC7BP, SerialNo=      SB2404SJJV6GAE
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=7516kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 1:  ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode
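
To make the differences between the two drives easier to spot, the two outputs can be compared field by field. Below is a rough sketch, assuming simple comma-separated `key=value` lines like the ones shown above. The parser is deliberately naive: it does not handle the brace lists (`Config={...}`, `tPIO={...}`) or the `AdvancedPM` line, so the sample is an excerpt with the Model lines shortened. The helper names are made up for this example.

```python
import re

def parse_identity(text):
    """Collect simple key=value fields (comma or newline terminated) into a dict."""
    fields = {}
    for key, value in re.findall(r"(\w+)=([^,\n]*)", text):
        fields[key] = value.strip()
    return fields

def diff_fields(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Excerpts of the two outputs above (padding and serial numbers left out).
SAMSUNG = """
 Model=SAMSUNG HM120JI, FwRev=YF100-18
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=234441648
"""
HITACHI = """
 Model=Hitachi HTS541616J9SA00, FwRev=SB4OC7BP
 BuffType=DualPortCache, BuffSize=7516kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
"""

if __name__ == "__main__":
    differences = diff_fields(parse_identity(SAMSUNG), parse_identity(HITACHI))
    for field, (working, failing) in differences.items():
        print(f"{field}: {working!r} (dv2310ca) vs {failing!r} (dv2422ca)")
```

On this excerpt only the model, firmware revision, buffer size and LBA sector count differ; the lines left out of the excerpt (for example `AdvancedPM`) also differ and would need a less naive parser.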



Comment 2 Philippe Rigault 2007-08-27 19:37:23 UTC
Fixed in kernel-2.6.22.4-65.fc7.x86_64
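
For anyone checking whether a given machine already runs a kernel with the fix, the package names can be compared numerically. This is a simplified sketch, assuming names follow the `kernel-<version>-<release>.fc<N>[.<arch>]` pattern seen in this report; it is not rpm's actual version comparison algorithm, and `kernel_key` and `has_fix` are hypothetical helpers.

```python
import re

FIXED_IN = "kernel-2.6.22.4-65.fc7"  # from this report

def kernel_key(name):
    """Split a Fedora kernel package name into comparable integer tuples."""
    match = re.fullmatch(r"kernel-([\d.]+)-([\d.]+)\.fc\d+(?:\.\w+)?", name)
    if match is None:
        raise ValueError(f"unrecognized kernel package name: {name!r}")
    version, release = match.groups()
    return (
        tuple(int(part) for part in version.split(".")),
        tuple(int(part) for part in release.split(".")),
    )

def has_fix(name, fixed_in=FIXED_IN):
    """True if `name` is at least as new as the kernel that carries the fix."""
    return kernel_key(name) >= kernel_key(fixed_in)

if __name__ == "__main__":
    for name in ("kernel-2.6.21-1.3228.fc7.x86_64", "kernel-2.6.22.4-65.fc7.x86_64"):
        print(name, "has fix" if has_fix(name) else "lacks fix")
```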