Bug 130232
Summary: | IDE subsystem causes infinite hotplug add/remove loop
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | rawhide
Hardware: | i686
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | medium
Reporter: | David Zeuthen <davidz>
Assignee: | Arjan van de Ven <arjanv>
QA Contact: | Brian Brock <bbrock>
CC: | alan, kjw, mclasen, wtogami
Doc Type: | Bug Fix
Last Closed: | 2004-11-16 23:09:18 UTC
Description
David Zeuthen
2004-08-18 12:05:03 UTC
Created attachment 102836: /var/log/messages logs
This describes how to reproduce the bug
Oh, '# touch /dev/hde1' just before the Oops should be '# remove PCMCIA card reader'. Sorry about that.

IDE hotplug in current 2.6 is terminally broken. I've been working on fixing it for about a week and it's approaching the point of basic stability. Even then it cannot fully support sysfs, and making it handle sysfs is a major job that would require a full-time engineering resource and a lot of prior upstream discussion and design. Secondly, ide-cs does nothing to deal with hotplug events; you are discussing a property of the core block layer code with removable devices, where each open causes a recheck of the partition data. That's "UPSTREAM" and isn't PCMCIA specific. Deal with it in your HAL design, I suspect.

With my current patches I can rmmod/insmod IDE drivers without crashes, security holes in /proc and so on. I'm now working on getting them into a form the maintainer is happy about and merged upstream. Until then it's Arjan's call, but essentially /proc/ide and IDE hotplugging of any kind are not safe in the FC1/FC2 kernel.

So if hotplug remove/add is a property of the core block layer code when rereading partition data, how come I'm only seeing this for block devices backed by IDE and not USB or IEEE1394? Are you trying to say this is a bug upstream?

I'm not sure why USB doesn't trigger it. Perhaps that uses a different approach for reading partition tables. The IDE layer itself has nothing to do with the hotplug events, however. In fact it's gloriously ignorant on a lot of hotplug issues. One of the problems IDE has is that there isn't a good way to learn about media changes reliably. So each open assumes the media might have changed, much like floppy except we don't partition them.

I wasn't clear; I'm implying it's a bug that hotplug events are triggered from the block layer for IDE devices, not that USB etc. should also trigger them! That would be a nightmare.
Here's why triggering the hotplug events is a bad idea: from userspace we do want to open the device even if it's not mounted, because we want to read the drive_id (serial# etc.) from the top-level block device and the volume_id (label etc.) from the partition-level block devices before even creating a mount point (the mount point may include the label). Just look at some of the callouts included with udev, or look at the hal code. Another general problem is that even though USB Mass Storage devices have mechanisms for reporting media changes, very few vendors implement them correctly. So, in conclusion, all polling for removable media should occur from userspace. Therefore, is it correct to say that these hotplug events from my PCMCIA Compact Flash card readers are an (upstream) bug? That was my original question anyway :-)

When the last user closes an IDE device we discard all information about it. We simply don't know if it's the same CF card on the next open. We don't even get a media changed error on an I/O, because the IDE controller is in the CF card, so it hasn't seen a media change. So every time you are the first opener we will generate a partition table. This, I suspect, generates the hotplug events.

Side item: there are two cases to teach your HAL code about for drive vendor and model where duplicates occur, which might be worth knowing about.

#1 Maxtor in the model and a serial of "M0000000000000000000"
#2 "Integrated Technology Express" in the model/vendor info. h/w IDE raid volumes all with the same id/serial.

Alan

Well, USB and all the other buses (SCSI emulation, I presume) do it somewhat differently [1] and also support removable media. Since the kernel is an abstraction mechanism this is bad: it breaks the abstraction, as the behaviour depends on the physical connection mechanism. I think by now it's sane to assume that some userspace process is polling the devices with removable storage if the system is used as a desktop etc.
(hal does this for Fedora, but right now I have to blacklist the ide-cs stuff, so yeah, it works, but I can't read the volume label before creating the mount point etc.) I guess upstream agrees here as well, cf. the 'removable' file in sysfs for every block device (which is only an approximation, btw), so how about changing the behaviour in the IDE code?

[1]: btw, why is that? Isn't the kernel layered, e.g. shouldn't partition table detection be above the block layer? (I'm just trying to pick up some kernel tricks on the side, thank you :-)

The IDE hardware doesn't support a change in the way the IDE code assumes that media can change without warning. I refuse to break that and let users trash disks without warning just because it gives HAL some hiccups. The partition table scanning is a separate library routine called by various drivers.

Alan

PS: not just ide-cs. All IDE removables will do this, I suspect, e.g. IDE floppies, M/O drives.

As I've already stated, this is not specific to HAL at all. It applies to mount(1) (when using -t auto), udev callouts and anything else that opens the device before it's mounted. While this may not have been an issue in the past, it certainly is now, given that the kernel sends hotplug events and udev creates/removes device nodes based on them. This is a real problem. Even for mount(1).

To me this seems like a split personality :-). On the one hand the kernel refuses to poll for new media (which is fair enough), and on the other hand it sends lots of hotplug events if userspace tries to. For some devices.

Either way, hal already works around this issue, so I'll just shut up now.

Thanks,
David

Reopening this bug as it is the root cause for a regression in FC3.
I note the following behaviour in FC3 (full updates as of 2005.08.14):

    mkdir /var/log/hotplug
    touch /var/log/hotplug/events
    cardctl eject 0
    tail -f /var/log/hotplug/events &
    cardctl insert 0    # slot 0 contains cf card

You should see in the log:

    add for pcmcia
    remove for module /module/ide_cs
    remove for module drivers /bus/pcmcia/drivers/ide-cs
    add for module /module/ide_cs
    add for module drivers /bus/pcmcia/drivers/ide-cs
    (pause as pcmcia scripts run)
    add for ide
    add for block (hdc)
    add for block (hdc1)
    remove for block (hdc1)
    add for block (hdc1)

and I'm not trying to mount the drive or anything. hotplug just seems to thrash excessively on the adding and deleting of hdc1.

Here's some more strangeness:

    # ls -al /dev/hdc1 ; mount -v /dev/hdc1 /mnt/cf ; ls -al /dev/hdc1
    brw-rw---- 1 root disk 22, 1 Aug 14 11:30 /dev/hdc1
    mount: you didn't specify a filesystem type for /dev/hdc1
           I will try all types mentioned in /etc/filesystems or /proc/filesystems
    Trying vfat
    mount: special device /dev/hdc1 does not exist
    ls: /dev/hdc1: No such file or directory

and simultaneously the log shows a remove/add pair for hdc1. So the act of mounting causes a remove? That sure is strange.

However, mount specifying -t vfat seems to work, but only because it seems to beat hotplug to the punch. Note that the trailing ls still fails to find the device, because /dev/hdc1 has been removed by the same hotplug remove/add pair that you saw above:

    # ls -al /dev/hdc1 ; mount -v -t vfat /dev/hdc1 /mnt/cf ; ls -al /dev/hdc1
    brw-rw---- 1 root disk 22, 1 Aug 14 11:33 /dev/hdc1
    /dev/hdc1 on /mnt/cf type vfat (rw)
    ls: /dev/hdc1: No such file or directory

Similarly, hdparm -i /dev/hdc causes hotplug to remove/add hdc1. Indeed, the "touch /dev/hdc1" suggested above triggers a hotplug remove/add pair.

What does this break? /etc/pcmcia/ide for one. Though bug 120486 is suggesting that it should be migrated out of the pcmcia scripts into hotplug.
But while hotplug exhibits this hyper-aggressive remove/add behaviour, that just isn't going to happen.

This is still broken with a fully updated FC3 as of 2005-10-29. *poke* *poke* :)

FC3 won't fix. Upstream changes for this did get discussed and will probably get into FC5, and maybe FC4 as the upstream kernel changes.

It appears this is working now (and probably has been for some time), so I've removed the special handling for ide-cs in hal.

http://gitweb.freedesktop.org/?p=hal;a=commit;h=602bbb270d0851047a0bebc442a1fdc92a4f91c7

Do you know when in 2.6 this change was introduced? The git history doesn't tell me much...

This appears to be the patch:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9810933701a09f9c4dd0ad963d5ec2efb7df07b7

It was done as part of fixing the ide_cs handling logic, unrelated to but happening to fix the problem you see in that case. Hal will, I suspect, still trigger the same behaviour if faced with a true removable such as an Iomega or maybe a clik drive. Actually I've got a Clik; I should see if we still handle that right.

Ok, we handle clik! as ide-floppy and it correctly avoids duplicate scans because it has proper media change logic.

Yup, that's here:

http://gitweb.freedesktop.org/?p=hal;a=blob;h=7cb74ee685d9da847c1060ea1277211ee6b8b657;hb=7b1d143b988b378b3269b767259d387e64b14718;f=fdi/preprobe/10osvendor/10-ide-drives.fdi

Right now we only support partitioned media (with the fs on partition 4) but that's about to change in a release or two...
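
For anyone who wants to check whether a given kernel still shows the remove/add thrash reported earlier in this thread, the following is a minimal sketch that scans a hotplug events log for a block device that is removed and then immediately re-added. It assumes the log lines look like the ones quoted in this report ("add for block (hdc1)"); the real format of /var/log/hotplug/events may differ, and the names `EVENT_RE` and `find_thrash` are hypothetical, not part of hotplug, udev or hal.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the log excerpt quoted in this report,
# e.g. "remove for block (hdc1)". Other event lines are ignored.
EVENT_RE = re.compile(r"^(add|remove) for block \((\w+)\)$")


def find_thrash(lines):
    """Count, per block device, how often a 'remove' event is followed
    by an 'add' event for the same device."""
    last_action = {}           # device name -> last action seen
    thrash = defaultdict(int)  # device name -> number of remove/add pairs
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue
        action, dev = match.groups()
        if action == "add" and last_action.get(dev) == "remove":
            thrash[dev] += 1
        last_action[dev] = action
    return dict(thrash)


if __name__ == "__main__":
    # Block-related tail of the log excerpt from this report.
    sample = [
        "add for ide",
        "add for block (hdc)",
        "add for block (hdc1)",
        "remove for block (hdc1)",
        "add for block (hdc1)",
    ]
    print(find_thrash(sample))
```

Run against the excerpt above, this reports one remove/add pair for hdc1 and none for hdc. Feeding it the lines captured by `tail -f` while running `touch /dev/hdc1` or `hdparm -i /dev/hdc` would show whether opening the device still triggers the pair.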