Bug 806676 - Regression: SATA hot swap broken - one CPU goes 100% - unable to synchronize and stop disk
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Unspecified
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Duplicates: 808743

Reported: 2012-03-25 23:29 UTC by Dariusz Garbowski
Modified: 2012-04-26 03:28 UTC (History)

Fixed In Version: kernel-2.6.43.2-6.fc15
Doc Type: Bug Fix
Last Closed: 2012-04-22 18:25:38 UTC


Attachments:
Details of failure of eSATA hotswap on 2nd attempt (27.55 KB, text/plain)
2012-04-08 10:16 UTC, Dr J Austin

Description Dariusz Garbowski 2012-03-25 23:29:18 UTC
Description of problem:

Kernel 3.3.0-4.fc16.x86_64 breaks SATA hot swap. This is a regression compared to kernel 3.2.10-3.fc16.x86_64 (and previous ones).

With the new kernel, attempting to "delete" a device does not properly synchronize and stop the disk. Instead, one CPU goes 100% busy (ksoftirqd and two kworker processes), and only after a long time do the processes time out and give up.

Details below.

--------------
Version-Release number of selected component (if applicable):

kernel-3.3.0-4.fc16.x86_64


--------------
How reproducible:


Steps to Reproduce:
1. Start OS with the broken kernel.
2. Pick a SATA drive that is not mounted and, as root, issue the command:

# echo 1 > /sys/block/sdd/device/delete

Expected results:
The disk should sync and power down.
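
Step 2 depends on the drive really being unmounted before the delete is issued. A minimal shell sketch of such a guard follows. The function names disk_is_mounted and safe_delete are hypothetical, and the check only looks at the mount table (/proc/mounts by default), so it would not notice other holders such as device mapper.

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the whole disk or one of
# its partitions is listed as a mount source. Assumes sdX / sdXN naming.
disk_is_mounted() {
    disk=$1
    mounts=${2:-/proc/mounts}   # mount table to inspect
    awk -v dev="/dev/$disk" '
        # Field 1 of each line is the mount source, e.g. /dev/sdb1.
        $1 == dev || $1 ~ ("^" dev "[0-9]+$") { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts"
}

# Hypothetical wrapper around the sysfs delete used in this report.
safe_delete() {
    if disk_is_mounted "$1"; then
        echo "$1 is still mounted, refusing to delete it" >&2
        return 1
    fi
    echo 1 > "/sys/block/$1/device/delete"
}
```

Run as root, "safe_delete sdd" would then perform the same write as step 2, but only when no /dev/sdd* entry appears in the mount table.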

Actual results:
# uname -a
Linux localhost.localdomain 3.3.0-4.fc16.x86_64 #1 SMP Tue Mar 20 18:05:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
# echo 1 > /sys/block/sdd/device/delete
# dmesg
...
[ 5736.061664] sd 3:0:0:0: [sdd] Synchronizing SCSI cache
# top
top - 16:37:02 up  1:37, 28 users,  load average: 2.32, 0.79, 0.32
Tasks: 239 total,   2 running, 237 sleeping,   0 stopped,   0 zombie
Cpu0  :  2.7%us,  0.3%sy,  0.0%ni, 96.3%id,  0.0%wa,  0.3%hi,  0.3%si,  0.0%st
Cpu1  :  3.7%us,  0.3%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  1.3%us,  0.7%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us, 38.4%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.7%hi, 60.9%si,  0.0%st
Mem:   7662892k total,  4288944k used,  3373948k free,   121168k buffers
Swap:  8388604k total,        0k used,  8388604k free,  1354444k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                      
   19 root      20   0     0    0    0 R 42.5  0.0   0:44.92 ksoftirqd/3                                                                                                  
 6128 root      20   0     0    0    0 S 28.9  0.0   0:24.39 kworker/3:2                                                                                                  
 6405 root      20   0     0    0    0 S 27.9  0.0   0:14.86 kworker/3:1                                                                                                  
 3033 thufor    20   0 3516m 762m  33m S  6.3 10.2  16:26.69 firefox                                                                                                      
 1346 root      20   0  394m 171m  46m S  1.3  2.3   4:42.89 X                                                                                                            
 2045 thufor    20   0  558m  46m  24m S  0.7  0.6   0:22.82 konsole                                                                                                      
 2057 thufor     9 -11  449m 5760 4176 S  0.7  0.1   0:47.80 pulseaudio                                                                                                   
    3 root      20   0     0    0    0 S  0.3  0.0   0:02.05 ksoftirqd/0                                                                                                  
 1068 root      20   0  6940  324  220 S  0.3  0.0   0:00.07 gpm                                                                                                          
 2046 thufor    20   0  308m  10m 7860 S  0.3  0.1   0:27.79 gkrellm                                                                                                      
 6363 thufor    20   0 15400 1288  880 R  0.3  0.0   0:00.15 top                                                                                                          
    1 root      20   0 60172  24m 2052 S  0.0  0.3   0:01.08 systemd                                                                                                      
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd                                                                                                 
After quite a long time CPU usage goes down...

# dmesg
...
[ 6096.062007] sd 3:0:0:0: timing out command, waited 360s
[ 6456.063006] sd 3:0:0:0: timing out command, waited 360s
[ 6816.064006] sd 3:0:0:0: timing out command, waited 360s
[ 6816.064021] sd 3:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[ 6816.064026] sd 3:0:0:0: [sdd] Stopping disk
[ 6996.065011] sd 3:0:0:0: timing out command, waited 180s
[ 6996.065036] sd 3:0:0:0: [sdd] START_STOP FAILED
[ 6996.065040] sd 3:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_OK
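
For reference, the waits reported above add up as follows: three 360 s timeouts while synchronizing the cache plus one 180 s timeout while stopping the disk give 1260 s, i.e. 21 minutes, which matches the gap between the dmesg timestamps 5736 and 6996. A small sketch that totals such lines from dmesg output (the function name is hypothetical):

```sh
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print the sum of
# all "timing out command, waited <N>s" waits, in seconds.
total_timeout_seconds() {
    awk '
        /timing out command, waited [0-9]+s/ {
            # The wait is the last field, e.g. "360s"; strip the unit.
            wait = $NF
            sub(/s$/, "", wait)
            total += wait
        }
        END { print total + 0 }
    '
}
```

Used as "dmesg | total_timeout_seconds"; fed the four timeout lines quoted above, it prints 1260.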

---------------
Additional info:
The same sequence works correctly with the previous kernel (and many previous kernels, going back to at least F14):

# uname -a
Linux localhost.localdomain 3.2.10-3.fc16.x86_64 #1 SMP Thu Mar 15 19:39:46 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
# echo 1 > /sys/block/sdd/device/delete
# dmesg
...
[63327.130113] sd 3:0:0:0: [sdd] Synchronizing SCSI cache
[63327.130450] sd 3:0:0:0: [sdd] Stopping disk
[63327.890959] ata3.00: disabled

Motherboard: Asus M4A78T-E

Smolt Data:
General
=================================
UUID: 8bab98ba-aadf-404e-b060-8be7bfebe7f1
OS: Fedora release 16 (Verne)
Default run level: Unknown
Language: en_US.UTF-8
Platform: x86_64
BogoMIPS: 6421.32
CPU Vendor: AuthenticAMD
CPU Model: AMD Phenom(tm) II X4 955 Processor
CPU Stepping: 2
CPU Family: 16
CPU Model Num: 4
Number of CPUs: 4
CPU Speed: 3210
System Memory: 7483
System Swap: 8191
Vendor: System manufacturer
System: System Product Name System Version
Form factor: Desktop
Kernel: 3.3.0-4.fc16.x86_64
SELinux Enabled: 1
SELinux Policy: targeted
SELinux Enforce: Enforcing
MythTV Remote: Unknown
MythTV Role: Unknown
MythTV Theme: Unknown
MythTV Plugin: 
MythTV Tuner: -1


Devices
=================================
(4130:38406:4130:38400) pci, pcieport, PCI/PCI, RS780 PCI to PCI bridge (PCIE port 2)
(4098:17296:4098:17296) pci, ahci, STORAGE, SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]
(4130:38407:4130:38400) pci, pcieport, PCI/PCI, RS780 PCI to PCI bridge (PCIE port 3)
(4098:38420:4163:33613) pci, radeon, VIDEO, Radeon HD 3300 Graphics
(4130:38400:4130:38400) pci, None, HOST/PCI, RS780 Host Bridge
(4130:38402:4130:38402) pci, None, PCI/PCI, RS780/RS880 PCI to PCI bridge (int gfx)
(4358:13315:4163:33668) pci, firewire_ohci, FIREWIRE, P8P67 Deluxe Motherboard
(4130:4612:0:0) pci, None, HOST/PCI, Family 10h Processor Link Control
(4130:38403:4130:38400) pci, pcieport, PCI/PCI, RS780 PCI to PCI bridge (ext gfx port 0)
(4130:4609:0:0) pci, None, HOST/PCI, Family 10h Processor Address Map
(4130:4608:0:0) pci, None, HOST/PCI, Family 10h Processor HyperTransport Configuration
(4130:4611:0:0) pci, k10temp, HOST/PCI, Family 10h Processor Miscellaneous Control
(4130:4610:0:0) pci, None, HOST/PCI, Family 10h Processor DRAM Controller
(4098:38032:6023:8195) pci, radeon, VIDEO, RV730XT [Radeon HD 4670]
(4098:43576:6023:43576) pci, snd_hda_intel, MULTIMEDIA, RV710/730
(6505:4134:4163:33564) pci, ATL1E, ETHERNET, AR8121/AR8113/AR8114 Gigabit or Fast Ethernet
(4098:17308:4098:17308) pci, pata_atiixp, STORAGE, SB7x0/SB8x0/SB9x0 IDE Controller
(4098:17285:4098:17285) pci, None, SERIAL, SBx00 SMBus Controller
(4098:17309:4098:17283) pci, None, PCI/ISA, SB7x0/SB8x0/SB9x0 LPC host controller
(4098:17283:33623:4163) pci, snd_hda_intel, MULTIMEDIA, SBx00 Azalia (Intel HDA)
(4098:17305:4098:17302) pci, ohci_hcd, USB, SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
(4098:17284:0:0) pci, None, PCI/PCI, SBx00 PCI to PCI Bridge
(4098:17302:4098:17302) pci, ehci_hcd, USB, SB7x0/SB8x0/SB9x0 USB EHCI Controller
(4098:38415:4098:38415) pci, snd_hda_intel, MULTIMEDIA, RS780 Azalia controller
(4098:17303:4098:17304) pci, ohci_hcd, USB, SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
(4098:17304:4098:17305) pci, ohci_hcd, USB, SB7x0 USB OHCI1 Controller
(4098:17302:4098:17303) pci, ehci_hcd, USB, SB7x0/SB8x0/SB9x0 USB EHCI Controller
(4098:17304:4098:17304) pci, ohci_hcd, USB, SB7x0 USB OHCI1 Controller
(4098:17303:4098:17303) pci, ohci_hcd, USB, SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
(4279:36949:4279:36949) pci, 3c59x, ETHERNET, 3C905B Fast Etherlink XL 10/100


Filesystem Information
=================================
device mtpt type bsize frsize blocks bfree bavail file ffree favail
-------------------------------------------------------------------
/dev/sda2 / ext4 4096 4096 27920769 10050143 8652036 6995968 6221568 6221568
/dev/sdb1 WITHHELD ext4 4096 4096 487910570 136914508 132030725 122101760 121526855 121526855

Comment 1 Dariusz Garbowski 2012-03-25 23:35:33 UTC
Additionally, after the "delete" operation and timeout, the device is gone (kernel 3.3.0):

# ll /dev/sdd
ls: cannot access /dev/sdd: No such file or directory

# ll /dev/sd*
brw-rw----. 1 root disk 8,   0 Mar 25 16:31 /dev/sda
brw-rw----. 1 root disk 8,   1 Mar 25 15:00 /dev/sda1
brw-rw----. 1 root disk 8,   2 Mar 25 15:00 /dev/sda2
brw-rw----. 1 root disk 8,  16 Mar 25 16:31 /dev/sdb
brw-rw----. 1 root disk 8,  17 Mar 25 15:00 /dev/sdb1
brw-rw----. 1 root disk 8,  32 Mar 25 16:31 /dev/sdc
brw-rw----. 1 root disk 8,  33 Mar 25 15:00 /dev/sdc1
brw-rw----. 1 root disk 8,  64 Mar 25 16:31 /dev/sde
brw-rw----. 1 root disk 8,  80 Mar 25 15:00 /dev/sdf
brw-rw----. 1 root disk 8,  96 Mar 25 15:00 /dev/sdg
brw-rw----. 1 root disk 8, 112 Mar 25 15:00 /dev/sdh
brw-rw----. 1 root disk 8, 128 Mar 25 15:00 /dev/sdi

Comment 2 Matthias Hensler 2012-03-29 07:50:09 UTC
I can verify this problem on a Lenovo ThinkPad T420 with an external eSATA drive. eSATA is also broken after suspend.

This bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=807632. I will add more information there.

Comment 3 Josh Boyer 2012-04-02 13:33:14 UTC
*** Bug 808743 has been marked as a duplicate of this bug. ***

Comment 4 Josh Boyer 2012-04-04 14:48:47 UTC
I've applied the submitted patch to all Fedora branches.  It will be in the next submitted update.

Comment 5 Fedora Update System 2012-04-05 12:50:32 UTC
kernel-3.3.1-3.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.1-3.fc17

Comment 6 Fedora Update System 2012-04-05 12:53:22 UTC
kernel-3.3.1-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.3.1-3.fc16

Comment 7 Fedora Update System 2012-04-05 18:24:48 UTC
Package kernel-3.3.1-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.1-3.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-5346/kernel-3.3.1-3.fc17
then log in and leave karma (feedback).

Comment 8 Dariusz Garbowski 2012-04-07 05:56:26 UTC
I'm afraid I can still reproduce the issue.

Tested with kernel:

# uname -a
Linux localhost.localdomain 3.3.1-3.fc16.x86_64 #1 SMP Wed Apr 4 18:08:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Comment 9 Fedora Update System 2012-04-08 03:27:07 UTC
kernel-3.3.1-3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 10 Dr J Austin 2012-04-08 10:16:15 UTC
Created attachment 576011 [details]
Details of failure of eSATA hotswap on 2nd attempt

I believe this problem is still present in kernel
[root@minix ~]# uname -a
Linux minix 3.3.1-3.fc16.x86_64 #1 SMP Wed Apr 4 18:08:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux


Summary
Plug in eSATA disk - device driver entries are created
/dev/sdb
/dev/sdb1
Disk is mounted

umount the disk
the device driver entries remain
/dev/sdb
/dev/sdb1
unplug the eSATA drive
after a delay the drive entries are removed

Plug in the eSATA drive again - Nothing is seen in dmesg, /dev, /var/log/messages

Attachment shows full story

John

Comment 11 Dariusz Garbowski 2012-04-08 23:16:21 UTC
Reopening since the bug is not fixed in the latest update.

Comment 12 Patrick 2012-04-08 23:35:56 UTC
This issue was fixed for me with kernel-3.3.1-3.fc16.x86_64 from updates-testing.

Dr Austin: are you sure that your eSATA drive is plugged into an eSATA port that is internally (in your PC) hooked up to an eSATA capable SATA port and not a regular SATA port? The reason I'm asking: my Gigabyte motherboard has a ton of SATA ports but only 2 are marked specifically for offering eSATA services. So the eSATA connectors on my PC are hooked up internally to those two specific eSATA-service-offering SATA ports. I'm definitely not very knowledgeable about SATA so I might be wrong in assuming you need a specific SATA port for eSATA services but maybe it's worth checking. Also AFAIK in the BIOS the SATA port needs to be set to AHCI for eSATA to work. Hopefully someone with far better knowledge than I can comment if these points are valid.

Comment 13 Dariusz Garbowski 2012-04-09 00:27:12 UTC
I have done more testing. It seems that the issue is fixed (or at least partially fixed). There still seems to be something funny with device mapper keeping the device busy. I'm 95% certain that I was able to reproduce this issue when testing the updated kernel two days ago, but I can't seem to do it today.

So far it looks like it's fixed for me. Thanks! Sorry for the noise.

(I do however reserve the right to report otherwise if I find a case when this bug shows up again. ;-)

Comment 14 Dr J Austin 2012-04-09 14:09:18 UTC
I have brought out eSATAp (+5V and +12V) sockets to the front of two machines,
the internal connection being to standard SATA motherboard ports.
On one machine the internal ports are 6Gb/s. A 6Gb/s rated SSD will revert to
3Gb/s on this port presumably because of the discontinuity at the plug/socket.
Otherwise the sockets have been working successfully for up to two years with both mechanical and SSD disks

I have just double-checked 3.3.1-3.fc16.x86_64 and the 2nd attempt failed again

I have re-installed and booted 3.2.10-2.fc16.x86_64 and 2nd attempt succeeded with the older kernel

The ONLY change being the booting of a different kernel

The same sequence of events using an Ext USB device instead of eSATA is successful with both kernels

For info - I don't really believe this can make a difference, but ...

I mount/umount eSATA and USB disks as follows (again for about 2 years)
(Maybe there is a better method these days!)
(/usr/share/polkit-1/actions/org.freedesktop.udisks.policy has been hacked to give a normal user permission for this)

udev rules detect the ID_FS_LABEL and set the mount options and disk scheduler
as required calling sd_mount
sd_umount now only removes the mount point and is 90% commented out
-------------------------------------------------------------
root@minix rules.d 1005$ cat /lib/udev/rules.d/79-ja_test.rules 
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cssd4", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime,discard noop",  OPTIONS+="last_rule"

ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cb1", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cb2", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="corsair1", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="throttle", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cr1", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cr2", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} noatime noop",  OPTIONS+="last_rule"

ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="seagate250", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} null cfq",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="crossfire", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} null cfq",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="wd250", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} null cfq",  OPTIONS+="last_rule"
ACTION=="add", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="max80", RUN:="/usr/sbin/sd_mount %k $env{ID_FS_LABEL} null cfq",  OPTIONS+="last_rule"

ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cssd4", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"

ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cb1", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cb2", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="corsair1", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="throttle", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cr1", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="cr2", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
#ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="A740000000000139", RUN:="/usr/sbin/sd_umount %k",  OPTIONS+="last_rule"

ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="seagate250", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="crossfire", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="wd250", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
ACTION=="remove", KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="max80", RUN:="/usr/sbin/sd_umount %k $env{ID_FS_LABEL}",  OPTIONS+="last_rule"
-----------------------------------------------------------------------------

root@minix rules.d 1006$ cat /usr/sbin/sd_mount
#!/usr/bin/perl
#F15 Version July 4th 2011
#sd_mount sdb2 css2 noatime,discard noop
#If there are no mount options for a partition enter null
#Called with four arguments from /lib/udev/rules.d

#Set up debug logging
#$uuid=`uuidgen`;chop $uuid;
#open(MESS,">/tmp/mess_$uuid");
#print MESS `date`;
$dev = $ARGV[0];
$label = $ARGV[1];
$options=$ARGV[2];
$schedule=$ARGV[3];
#print MESS "$dev   $label   $options   $schedule \n";
##########################################################
#Get the "base" device
if ( "$dev" =~ /sd[a-g][1-9]/ ) {$disk = $dev;chop($disk);}
if ( "$dev" =~ /sd[a-g][1-2][0-9]/ ) {$disk = $dev;chop($disk);chop($disk);}
#print MESS "dev= ",$dev,"   ","disk= ",$disk,"\n";
##########################################################
#noop deadline [cfq]
if( "$schedule" eq "noop" ){
        $command="echo noop > /sys/block/"."$disk"."/queue/scheduler";
#       print MESS "$command  \n";
        `$command`;
        }
##########################################################
#Assign the correct mount options for the partition
if( "$options" eq "null" ){$options="";} else {$options="-o $options";}
#print MESS "Mount Options are $options\n";
##########################################################
#Find who is logged into the Console - Only one?
$who = `w`;
#print MESS $who;
@lines = split (/\n/, $who);
#Skip the first line - it has ":" in it !
for ($j =1 ; $j <= $#lines; $j++){
#       Split on 1 or more white spaces
        @line = split (/\s+/, $lines[$j]);
#       Find line with :[0-9] in 2nd column
        if ( $line[1] =~ /\:\d/ ){
#                print MESS "Console User is ",$line[0],"\n";
                $user = $line[0];
                }
        }
($uid,$gid)= (stat("/home/$user"))[4,5];
#print MESS $uid,"  ",$gid,"\n";
##########################################################
#Create the mount point - it will not be deleted by Thunar, PCManfm or Dolphin
#The mount points are deleted by sd_umount when Dolphin umounts the "last" mounted partition
mkdir ("/media/$label",0777);
chmod 0777,"/media/$label";
$no = chown $uid,$gid,"/media/$label";
##########################################################
#Mount the partition as required
`/bin/mount -t auto $options /dev/$dev /media/$label`;
#print MESS "Just past the mount command\n";
##########################################################
#Change uid and gid of the mounted file system
$no = chown $uid,$gid,"/media/$label";
##########################################################
#close MESS;
--------------------------------------------------------------------------
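
The "base" device step in sd_mount above only matches sda through sdg and leaves $disk unset when a whole-disk name is passed. As an alternative sketch in shell, not the script's own method, the partition number can simply be stripped from the end of the name. The function name base_disk is hypothetical, and only sdX / sdXN style names are assumed.

```sh
#!/bin/sh
# Hypothetical helper: print the whole-disk name for a partition name by
# stripping trailing digits (sdb2 -> sdb, sdh12 -> sdh, sdb -> sdb).
# Assumes sdX / sdXN naming; names such as nvme0n1p1 are NOT handled.
base_disk() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}
```

The result could then be used to build the scheduler path, e.g. /sys/block/$(base_disk sdb2)/queue/scheduler.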

root@minix rules.d 1007$ cat /usr/sbin/sd_umount
#!/usr/bin/perl
#sd_umount sdb2 css2
#Called with two arguments from /lib/udev/rules.d
#The current (F15) device removal using Dolphin proceeds as follows
#Individual partitions can be "ejected" ie umounted and /dev/sdxy removed by Dolphin
#When the last one is removed then the 79 rule is activated for each partition that matches
#ie if sdb1, sdb2 and sdb3 are present on the disk then /dev/sdb1,2 & 3 are created
#79 only matches and mounts cr1 and cr2 say (cr3 being swap say) using sd_mount
#Selecting Dolphin "Safely Remove" umounts a partition cr2 say, cr1 remains mounted, sdb1, 2 & 3 remain
#The two mount points remain
#When Dolphin is used to umount the last partition cr1 say then
#cr1 is umounted, Dolphin removes /dev/sdb1, sdb2, sdb3 leaving /dev/sdb
#This script then removes the mount points. This is probably only necessary to permit sd_mount
#to create them again when the disk is plugged in again.  Should sd_mount not check for the presence
#of the mount point and just change the ownership/permissions if required?
#Physically removing the disk deletes /dev/sdb somehow
#USB and eSATA devices behave differently wrt the calling of this script
#USB behaviour is described above. Dolphin/something uses 79 rule when the last partition is umounted
#For an eSATA device the 79 rule is activated up to 30s after the eSATA device is removed!

#$uuid=`uuidgen`;chop $uuid;
#Set up debug logging
#open(MESS,">/tmp/mess1_$uuid");
#print MESS `date`;
$dev = $ARGV[0];
$label = $ARGV[1];
#print MESS "$dev   $label\n";
#print MESS "$dev \n";
#close MESS;
#exit;
##########################################################
#Get the "base" device
#if ( "$dev" =~ /sd[a-g][1-9]/ ) {$disk = $dev;chop($disk);}
#if ( "$dev" =~ /sd[a-g][1-2][0-9]/ ) {$disk = $dev;chop($disk);chop($disk);}
#print MESS "dev= ",$dev,"   ","disk= ",$disk,"\n";
##########################################################
#Remove the mount point
rmdir ("/media/$label") || die "Cannot rmdir /media/$label";

#close MESS;
exit;
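
On the question raised in the script comments above, whether sd_mount should tolerate an existing mount point: as pasted, sd_mount ignores the result of mkdir and then calls chmod and chown, so a leftover directory should already be tolerated. A minimal sketch of an explicitly idempotent variant, written in shell rather than Perl (the function name and the mode argument are hypothetical):

```sh
#!/bin/sh
# Hypothetical helper: create the mount point if it is missing and set its
# permissions either way, so a leftover directory is not an error.
ensure_mount_point() {
    dir=$1
    mode=${2:-0777}   # sd_mount uses 0777
    mkdir -p "$dir" && chmod "$mode" "$dir"
}
```

With something like this in place, removing the directory in sd_umount would be a tidy-up step rather than a precondition for the next mount.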

Comment 15 Fedora Update System 2012-04-11 00:26:59 UTC
kernel-3.3.1-5.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.3.1-5.fc16

Comment 16 Fedora Update System 2012-04-11 00:28:42 UTC
kernel-3.3.1-5.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.1-5.fc17

Comment 17 Fedora Update System 2012-04-11 00:29:32 UTC
kernel-2.6.43.1-5.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.43.1-5.fc15

Comment 18 Dr J Austin 2012-04-11 14:26:43 UTC
ja@minix ~ 1$ uname -a
Linux minix 3.3.1-5.fc16.x86_64 #1 SMP Tue Apr 10 19:56:52 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

I have just installed and tested the above kernel and still have the same
problem when plugging in an eSATA device the second time following a clean reboot

Comment 19 Fedora Update System 2012-04-12 01:09:07 UTC
Package kernel-3.3.1-5.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.1-5.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-5701/kernel-3.3.1-5.fc17
then log in and leave karma (feedback).

Comment 20 Dr J Austin 2012-04-12 09:03:48 UTC
This problem is still present in
kernel-3.3.1-5.fc16 (See Comment 18) and
ja@minix ~ 1$ uname -a
Linux minix 3.3.1-5.fc17.x86_64 #1 SMP Tue Apr 10 20:42:28 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

I have only been providing success/failure feedback

Is there more useful testing I can perform?

Are 807632 (Closed) and this bug the same thing?
My problem is really defined better in 807632

Comment 21 Fedora Update System 2012-04-13 21:32:53 UTC
kernel-3.3.1-5.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 22 Fedora Update System 2012-04-14 00:40:44 UTC
kernel-2.6.43.2-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.43.2-2.fc15

Comment 23 Fedora Update System 2012-04-14 04:33:15 UTC
kernel-3.3.1-5.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 24 Dr J Austin 2012-04-14 13:01:56 UTC
It looks as if this may be my/the problem

http://www.spinics.net/lists/linux-ide/msg43173.html
(Date: Fri, 13 Apr 2012 09:24:07 +0800)

"...
> The fundamental problem with this patch is that all SATA ports are 
> hotpluggable... even the ones the firmware/silicon failed to mark as 
> hotpluggable via AHCI's PORT_CMD_MPSP | PORT_CMD_HPCP

So the acceptable solution is to add runtime pm support for hotpluggable
port.

I'll send new patches.

Thanks,
Lin Ming"
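
The two flags named in the quote are bits of the per-port AHCI command register (PxCMD). As a rough illustration of what "marked as hotpluggable" means, the sketch below tests a raw register value. The bit positions (HPCP = bit 18, MPSP = bit 19) are quoted from memory of the Linux ahci.h definitions and should be treated as an assumption, and the function name is hypothetical.

```sh
#!/bin/sh
# Assumed bit positions in the AHCI PxCMD register (check against ahci.h).
PORT_CMD_HPCP=$((1 << 18))   # Hot Plug Capable Port
PORT_CMD_MPSP=$((1 << 19))   # Mechanical Presence Switch attached to Port

# Hypothetical helper: succeed if firmware marked the port hotpluggable,
# i.e. at least one of the two bits is set in the given register value.
port_marked_hotpluggable() {
    [ $(( $1 & (PORT_CMD_HPCP | PORT_CMD_MPSP) )) -ne 0 ]
}
```

The point of the quoted objection is that a port failing this test may still be physically hot-pluggable, as with the eSATAp sockets wired to standard motherboard ports in Comment 14, so such a check alone cannot decide whether a port may be powered down.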

Comment 25 Fedora Update System 2012-04-21 16:47:41 UTC
kernel-2.6.43.2-6.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/kernel-2.6.43.2-6.fc15

Comment 26 Dariusz Garbowski 2012-04-22 14:56:04 UTC
Unfortunately I can still reproduce the issue with the following kernels:

Linux localhost.localdomain 3.3.2-1.fc16.x86_64 #1 SMP Sat Apr 14 00:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Linux localhost.localdomain 3.3.1-3.fc16.x86_64 #1 SMP Wed Apr 4 18:08:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

I tested first thing after reboot: logged in to KDE and, from Konsole, issued as root:

echo 1 > /sys/block/sde/device/delete

I see exactly what I reported originally when I created this bug report.

Comment 27 Dariusz Garbowski 2012-04-22 15:06:27 UTC
Also, once the issue happens, an attempt to reboot fails -- the system just freezes halfway through the shutdown process and I have to hard power down the system.

As a reference check, I retested with the older kernel and the same command works like a charm, no bugs, on kernel:

Linux localhost.localdomain 3.2.10-3.fc16.x86_64 #1 SMP Thu Mar 15 19:39:46 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Looks like kernel 3.3.x is really at fault.

Comment 28 Dr J Austin 2012-04-22 16:04:21 UTC
I believe that this kernel from koji solves this problem as well as
this very nasty one - Bug 811138
ja@minix ~ 1$ su -
Password: 
[root@minix ~]# uname -a
Linux minix 3.3.2-8.fc17.x86_64 #1 SMP Sat Apr 21 12:44:25 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@minix ~]# ls -l /dev/sdb*
brw-rw----. 1 root disk 8, 16 Apr 22 16:49 /dev/sdb
brw-rw----. 1 root disk 8, 17 Apr 22 16:49 /dev/sdb1
[root@minix ~]# mount|grep sd
/dev/sda4 on / type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
/dev/sda3 on /boot type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
/dev/sdb1 on /media/throttle type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered)
[root@minix ~]# umount /media/throttle
[root@minix ~]# mount|grep sd
/dev/sda4 on / type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
/dev/sda3 on /boot type ext4 (rw,noatime,seclabel,user_xattr,acl,barrier=1,data=ordered,discard)
[root@minix ~]# ls -l /dev/sdb*
brw-rw----. 1 root disk 8, 16 Apr 22 16:49 /dev/sdb
brw-rw----. 1 root disk 8, 17 Apr 22 16:49 /dev/sdb1
[root@minix ~]# echo 1 > /sys/block/sdb/device/delete
[root@minix ~]# ls -l /dev/sdb*
ls: cannot access /dev/sdb*: No such file or directory
[root@minix ~]# dmesg
...
[ 3123.257185] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 3123.286341] ata4.00: ATA-8: THROTTLE, 081016, max UDMA/100
[ 3123.286351] ata4.00: 63078400 sectors, multi 0: LBA 
[ 3123.286359] ata4.00: applying bridge limits
[ 3123.286945] ata4.00: configured for UDMA/100
[ 3123.286961] ata4: EH complete
[ 3123.287234] scsi 3:0:0:0: Direct-Access     ATA      THROTTLE         0810 PQ: 0 ANSI: 5
[ 3123.287700] sd 3:0:0:0: [sdb] 63078400 512-byte logical blocks: (32.2 GB/30.0 GiB)
[ 3123.288034] sd 3:0:0:0: [sdb] Write Protect is off
[ 3123.288049] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 3123.288137] sd 3:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 3123.288671] sd 3:0:0:0: Attached scsi generic sg1 type 0
[ 3123.292416]  sdb: sdb1
[ 3123.293557] sd 3:0:0:0: [sdb] Attached SCSI disk
[ 3123.498620] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 3123.498638] SELinux: initialized (dev sdb1, type ext4), uses xattr
[ 3204.120690] sd 3:0:0:0: [sdb] Stopping disk
[ 3204.125630] ata4.00: disabled

Comment 29 Dariusz Garbowski 2012-04-22 16:22:53 UTC
There's no build 3.3.2-8.fc16.x86_64, for Fedora 16, available :( So I cannot test it. Can we have a build for F16, pretty please?

Comment 30 Dr J Austin 2012-04-22 16:34:18 UTC
I should have installed this one for F16!
(I am using the F17 version on my F16 machine OK - so far)

https://bugzilla.redhat.com/show_bug.cgi?id=807632

--- Comment #30 from Fedora Update System <updates> 2012-04-21 12:26:06 EDT ---
kernel-3.3.2-6.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.3.2-6.fc16

Comment 31 Dariusz Garbowski 2012-04-22 18:25:38 UTC
Thanks for the pointer!

I can confirm that the issue is fixed with kernel-3.3.2-6.fc16.
Thanks!

Comment 32 Fedora Update System 2012-04-26 03:28:55 UTC
kernel-2.6.43.2-6.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

