magicdev-1.1.4-8.0-x86_64 magicdev eats loads of CPU on this machine, as it seems to be taking .5 seconds to open each one of the CD drives. 10:59:25.400380 open("/dev/cdrom", O_RDONLY|O_NONBLOCK|O_EXCL) = 14 <0.519990> 10:59:25.920370 ioctl(14, CDROM_MEDIA_CHANGED, 0x56a9f0) = 1 <0.010000> 10:59:25.930370 ioctl(14, CDROM_DRIVE_STATUS, 0x7fffffff) = 2 <0.000000> 10:59:25.930370 close(14) = 0 <0.279995> 10:59:26.210365 open("/dev/cdrom1", O_RDONLY|O_NONBLOCK|O_EXCL) = 14 <0.539989> 10:59:26.750354 ioctl(14, CDROM_MEDIA_CHANGED, 0x7fffffff) = 1 <0.000000> 10:59:26.750354 ioctl(14, CDROM_DRIVE_STATUS, 0x7fffffff) = 2 <0.000000> 10:59:26.750354 close(14) = 0 <0.279995>
Created attachment 123965 [details] magicdev-ignore-aopen-drives.patch
@Bastien: your fix does not address the problem but blacklists innocent devices This "magicdev eats cpu"-bug seems to have been reported for several years now (google finds reports since year 2000), but was never root-caused. Instead several myths have been propagated.
Hi, I came to this conclusion: 5) per 2GB memory in "Cached state" the open("/dev/cdrom") system call takes 0.1 seconds longer E.g. when you have nearly 16GB "Cached" it will take approx. 0.8 seconds Appendix ======== Experimental setup (here for a 16GB machine): 1) script script-strace-magicdev strace -T -ttt -p 23055 2) script script-meminfo csh while 1 echo `date` `date +%s` `cat /proc/meminfo` sleep 1 end 3) Fill the filesystem cache and empty it for several times, e.g. 3a) 22#: date ; time dd if=/dev/zero of=/usr3/test.mayerg bs=1024k count=7000 ; date Mon Feb 20 12:57:36 MET 2006 7000+0 records in 7000+0 records out 0.010u 30.200s 0:30.34 99.5% 0+0k 0+0io 122pf+0w Mon Feb 20 12:58:06 MET 2006 ... 3b) Empty "Cached memory" with "consmem -m 14000" (allocate 14GB) [use your favourite memory consuming application here]; increase and decrease the number a bit until "cached" as shown by top goes way below <1000MB 3c) repeat at 3a until enough data samples 4) Visualize the interdependecy with gnuplot 4a) prepare the data: script offending lines etc. from the raw data cp script-meminfo memo vi memo grep open script-strace-magicdev> opencd vi opencd replace the first decimal point with blank, because milliseconds are not needed here: %s/\./ / remove the opens of "mtab" which are interspered, remoce < > from last column 4b) gnuplot for memory usage (e.g."cached" memory from /proc/meminfo) plot "memo" using ($7):($20) w lin 4c) gnuplot for the time spent in the open(/dev/cdrom) system call plot "opencd" using ($1):($7) w lin 5) Conclusion: per 2GB memory in "Cached state" the open() takes 0.1 seconds 6) Further investigation: Enable "profile" for the kernel and inspect these numbers: 2718 .text.lock.ioctl 35.7632 11959 invalidate_bdev 28.7476 7) Result: open("/dev/cdrom") triggers invalidate_bdev (and other?) system calls which consume lots of time under certain circumstances 8) Expected result: open("/dev/cdrom") should not take so long (e.g. <0.005 sec; this is possible as shown be the experiment). There is no reason this should take longer when "Cached" memory is large.
Created attachment 124883 [details] Grapg: seconds since epoch vs. time needed for open()
Created attachment 124884 [details] Graph: seconds since epoch vs. Amount of Cached memory
Created attachment 124885 [details] "opencd" raw data from system call trace (edited slighty so gnuplot can read it)
Created attachment 124886 [details] "memo" raw data from /proc/meminfo (with date and sec since epoch)
This is clearly reporting a kernel issue, not a magicdev issue, and it would be good to reassign the component. (magicdev isn't even used on RHEL4, but polling /dev/cdrom is still done I think.) But that being said, I'd really advise running magicdev or the GNOME desktop at all on a production server. (Maybe my imagination just doesn't stretch to workstations with >= 16GB of memory.) For RHEL3 magicdev can be turned off by just turning off automount and autoplay in the GNOME preferences.
I agree this is a kernel issue, but don't know how to reassign the component, can you do this here? And yes, we are using 16 GB workstations with RHEL3 and would like to preserve magicdev for the sake of user-friendliness (because this mounts inserted CDs nicely). Please see my IT#65386 and IT#85873.
Reassigning to kernel as per comment #8
NEEDINFO_ENG has been deprecated in favor of NEEDINFO or ASSIGNED. Changing status to ASSIGNED for ENG review.
Hello Gunther - When cdrom_open() detects that the media has changed it must invalidate all buffers associated with the device, otherwise you run the risk of data corruption due to cached pages being out-of-sync with the disk. A machine with 16GB of memory can have a very large buffercache ("cached state" memory). The large lag on opening the cdrom device is not a bug, but a side-effect of the unusually large (for a workstation) buffercache. The real soure of the problem appears to be that ioctl(CDROM_MEDIA_CHANGED) always returns true and ioctl(CDROM_DISK_STATUS) always returns CDS_TRAY_OPEN in the strace log provided here. 14:50:20.319004 open("/dev/cdrom", O_RDONLY|O_NONBLOCK|O_EXCL) = 14 <0.008773> 14:50:20.327874 ioctl(14, CDROM_MEDIA_CHANGED, 0x56ac10) = 1 <0.002832> 14:50:20.330793 ioctl(14, CDROM_DRIVE_STATUS, 0x7fffffff) = 2 <0.000864> 14:50:20.331741 close(14) = 0 <0.000231> 14:50:20.332055 open("/dev/cdrom1", O_RDONLY|O_NONBLOCK|O_EXCL) = 14 <0.011797> 14:50:20.343942 ioctl(14, CDROM_MEDIA_CHANGED, 0x7fffffff) = 1 <0.004024> 14:50:20.348051 ioctl(14, CDROM_DRIVE_STATUS, 0x7fffffff) = 2 <0.001275> /* drive status possibilities returned by CDROM_DRIVE_STATUS ioctl */ #define CDS_NO_INFO 0 /* if not implemented */ #define CDS_NO_DISC 1 #define CDS_TRAY_OPEN 2 #define CDS_DRIVE_NOT_READY 3 #define CDS_DISC_OK 4 From the sysreport info, it looks like this system is a SUN W1100z/2100z. The hardware cert catalog shows this system to be certified as compatible, but the listed cdrom device is a Samsung model. See: http://bugzilla.redhat.com/hwcert/show.cgi?id=148524 The cdrom drives in the sysreport appear to be Acer/Aopen: hdc: AOPEN COM5232/AAH PRO, ATAPI CD/DVD-ROM drive hdd: AOPEN DUW1608/ARR, ATAPI CD/DVD-ROM drive Is there any way that you can test with another cdrom device(s)? The one that is certified with this system in the HCL is a SAMSUNG CDRW/DVD SM-352F.
The RHEL3 kernel seems to think that there is constantly a new CD when a media is in the drive. Ie. the attached testcase will show 10 occurences of "Detected CD change with CDROM_MEDIA_CHANGED", instead of just the one.
Created attachment 133406 [details] test.c This same code seems to work fine on RHEL4. Am I doing (and magicdev by the same token) something stupid?
Product Management has reviewed and declined this request. You may appeal this decision by reopening this request.