Sorry, "smartd" isn't listed under "Component".

Description of problem:
The VMware IDE CD-ROM causes trouble for smartd (which is started automatically) on reboot. Messages that look like DMA problems are displayed.

Version-Release number of selected component (if applicable):
RHL severn

How reproducible:
Always

Steps to Reproduce:
1. Install severn in VMware, with the CD-ROM defined as an IDE drive
2. Reboot

Actual Results:
smartd shows some strange messages

Expected Results:
smartd should be more smart ;-)

Additional info:
Created attachment 93176 Extract of /var/log/messages after smartd start Normal messages from smartd about failure to use /dev/hda for SMART monitoring.
Blah !!! In addition, I wanted to say that I don't think this is a bug at all. /dev/hda could be anything, so smartd is configured to scan for SMART devices on all possible /dev entries. It finds that /dev/hda is not SMART capable and therefore doesn't use it. /dev/hda just happens to be a VMware virtual CD-ROM. Virtual disks are not SMART capable. These are not weird errors. Unless the original poster sees more than I have shown, this bug is NOTABUG. Just turn off smartd start at boot, and the system starts quicker, with no errors.

Cheers,
Michael
It's the same message as the one Michael posted. But in my opinion smartd should recognize that the drive is not physical (a simple ide/hdx/model lookup would help in this case) and should not try to adjust DMA settings...
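The kind of lookup suggested here could be sketched roughly as follows. This is only an illustration in Python, not smartd's actual code. It assumes the old Linux IDE driver's /proc/ide/<drive>/media file (which reports values such as "disk" or "cdrom"); the function name and the proc_root parameter are made up for the example.

```python
import os

def should_probe_for_smart(drive, proc_root="/proc/ide"):
    """Return False for drives whose media type cannot support ATA SMART.

    Hypothetical helper: reads <proc_root>/<drive>/media and skips
    anything that is not reported as a plain disk.
    """
    media_path = os.path.join(proc_root, drive, "media")
    try:
        with open(media_path) as f:
            media = f.read().strip()
    except OSError:
        # No information available: fall back to probing the device.
        return True
    return media == "disk"
```

A caller would skip the SMART probe for any drive where this returns False. Note that it keys on the media type rather than on the model string, so it would skip every CD-ROM, virtual or not.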
The SYSLOG output that is posted is "to be expected". In particular, the CD-ROM device does not support SMART. So, in scanning the drives, smartd recognized this and did not enable SMART monitoring of the device. A good solution is to edit /etc/smartd.conf, remove the DEVICESCAN directive, and put in "by hand" a list of the devices that you DO want to monitor.
smartd startup should not say [FAILED] if it finds no SMART-capable device. I think that is what pb is complaining about. And if smartd is not running (as in this case, when no SMART-capable devices are found), it should not say [FAILED] at shutdown either, IMHO.
I think that pb's complaint was simply that smartd tried to probe a device (a CD-ROM ATA packet device) that has no SMART capability. And pb is correct that it would not be terribly difficult to auto-detect such devices. But it is hard to do it in a general and portable way. That's why we rely on a configuration file (/etc/smartd.conf) to define the devices to monitor.

It *would* be possible to have smartd exit cleanly (with zero status, so that there was no [FAILED] message) if no devices were found to be monitored. However the better solution in this case is simply to turn off the service:
chkconfig smartd off
will do this in a simple and reversible way.

In general, it IS correct to report failure if no devices are found to monitor. This is because, if the user has NOT turned off smartd, it means that there ARE devices to monitor. In which case, if nothing can be found to monitor, something is wrong. The behavior you are asking for would be incorrect in such cases.

Note that the current release of smartmontools (version 5.1-18) has a new command line option called --quit. By setting this to --quit=never, smartd will NOT exit with an error if no devices are found -- it will continue to run in the background, waiting for a HUP signal to re-read the config file. This may be appropriate if the user has removable devices to monitor. But I think that it's not the behavior that you are asking for.
Starting with smartmontools release 5.22, an attempt to use smartd or smartctl on an IDE packet device (such as a CD-ROM) should not generate syslog warning messages from the kernel, and should give clearer error messages to the user about SMART not being available on packet devices.
Hmm, Fedora Core 1 still ships 5.21, resulting in:

Nov 17 20:44:49 vmfc1 smartd[1131]: smartd version 5.21 Copyright (C) 2002-3 Bruce Allen
Nov 17 20:44:49 vmfc1 smartd[1131]: Home page is http://smartmontools.sourceforge.net/
Nov 17 20:44:49 vmfc1 smartd[1131]: Opened configuration file /etc/smartd.conf
Nov 17 20:44:49 vmfc1 smartd[1131]: Configuration file /etc/smartd.conf parsed.
Nov 17 20:44:49 vmfc1 smartd[1131]: Device: /dev/hda, opened
Nov 17 20:44:49 vmfc1 smartd[1131]: Device: /dev/hda, unable to read Device Identity Structure
Nov 17 20:44:49 vmfc1 smartd[1131]: Unable to register ATA device /dev/hda at line 30 of file /etc/smartd.conf
Nov 17 20:44:49 vmfc1 smartd[1131]: Unable to register device /dev/hda (no Directive -d removable). Exiting.
Nov 17 20:44:49 vmfc1 smartd: smartd startup failed
I'll be issuing smartmontools 5.24 fairly soon. If you could test that under VMware I'd be grateful.

Bruce
I've just issued smartmontools 5.25, which should reduce the number of these error messages. 5.25 is a development/testing release: 5.26 will be the next stable release. If you could test 5.25 on your system that would be helpful.
This version still causes trouble:

Nov 23 14:51:20 vmfc1 smartd[1106]: smartd version 5.25 Copyright (C) 2002-3 Bruce Allen
Nov 23 14:51:20 vmfc1 smartd[1106]: Home page is http://smartmontools.sourceforge.net/
Nov 23 14:51:20 vmfc1 smartd[1106]: Opened configuration file /etc/smartd.conf
Nov 23 14:51:20 vmfc1 smartd[1106]: Drive: DEVICESCAN, implied '-a' Directive on line 21 of file /etc/smartd.conf
Nov 23 14:51:20 vmfc1 smartd[1106]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Nov 23 14:51:20 vmfc1 smartd[1106]: Device: /dev/hda, opened
Nov 23 14:51:30 vmfc1 kernel: hda: irq timeout: status=0xd1 { Busy }
Nov 23 14:51:30 vmfc1 kernel: hda: irq timeout: error=0x04Aborted Command
Nov 23 14:51:30 vmfc1 smartd[1106]: Device: /dev/hda, packet devices [this device CD/DVD] not SMART capable
Nov 23 14:51:30 vmfc1 smartd[1106]: Unable to register ATA device /dev/hda at line 21 of file /etc/smartd.conf

# cat /proc/ide/hda/model
VMware Virtual IDE CDROM Drive

# cat /proc/ide/hda/identify
85c4 0000 0000 0000 0000 0000 0000 0000
0000 0000 3030 3030 3030 3030 3030 3030
3030 3030 3030 3031 0000 0040 0000 3030
3030 3030 3031 564d 7761 7265 2056 6972
7475 616c 2049 4445 2043 4452 4f4d 2044
7269 7665 2000 2020 2020 2020 2020 0000
0000 0f00 0000 0400 0200 0006 0000 0000
0000 0000 0000 0000 0000 0000 0007 0007
0003 0078 0078 0078 0078 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0004 0017 4218 4000 4000 4218 4000 4000
0207 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
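As a cross-check that the identify dump really belongs to the device named in /proc/ide/hda/model, the model-number field can be decoded by hand. The following is a small Python sketch, not code from smartmontools or hdparm. It assumes the usual ATA layout in which the model number occupies IDENTIFY words 27 to 46, two ASCII characters per word with the high byte first; the helper name identify_string is made up for the example.

```python
def identify_string(words, first, last):
    """Decode an ASCII field stored in IDENTIFY words first..last (inclusive).

    Each 16-bit word holds two characters, high byte first.
    """
    raw = bytearray()
    for w in words[first:last + 1]:
        raw.append((w >> 8) & 0xFF)
        raw.append(w & 0xFF)
    # Treat NUL padding like blank padding, then trim both ends.
    return raw.decode("ascii", errors="replace").replace("\x00", " ").strip()

# Words 27..46 copied from the dump above (the model-number field).
MODEL_WORDS = [int(x, 16) for x in (
    "564d 7761 7265 2056 6972 7475 616c 2049 4445 2043 "
    "4452 4f4d 2044 7269 7665 2000 2020 2020 2020 2020").split()]

print(identify_string(MODEL_WORDS, 0, len(MODEL_WORDS) - 1))
# prints: VMware Virtual IDE CDROM Drive
```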
I think the problem is with the VMware Virtual IDE CDROM Drive, and is easily fixed by VMware. This note is intended for the VMware engineers, but is posted here to be publicly visible.

The problem is that the value of IDENTIFY [PACKET] DEVICE Word 80 is 0x0004, and it should be 0x001e (or 0x0010). See the end of this document for the full 256-word IDENTIFY PACKET DEVICE output.

To verify what I am saying, could you please post the output of:
hdparm -i /dev/hda
hdparm -I /dev/hda
smartctl -i /dev/hda

---------------------------------------------------------------------

Details:

Please refer to the ATA-4 revision 17 specification here: http://www.t13.org/project/d1153r17.pdf . Page numbers below refer to the page numbers printed in the document, NOT to the page numbers shown by xpdf or acroread.

NOTE: this document follows the ATA specifications and numbers objects (bits, bytes, words) from 0 to N-1, not from 1 to N.

WORD 81
-------

Let's start with word81 == 0x0017

Section 8.12.44 Word 81: Minor version number, pages 87/88:

"If an implementor claims that the revision of the standard they used to guide their implementation does not need to be reported or if the implementation was based upon a standard prior to the ATA-3 standard, word 81 shall be 0000h or FFFFh. Table 12 defines the value that may optionally be reported in word 81 to indicate the revision of the standard that guided the implementation."

From Table 12: 0x0017 means ATA/ATAPI-4 T13 1153D revision 17

---------------------------------

WORD 80
-------

Now let's look at:
word80 == 0x0004 == 00000000 00000100
THIS HAS THE WRONG VALUE. IT SHOULD HAVE VALUE:
0x001e == 00000000 00011110
or
0x0010 == 00000000 00010000

Section 8.12.43 Word 80: Major version number, page 87. See also page 80:

"If not 0000h or FFFFh, the device claims compliance with the major version(s) as indicated by bits 1 through 4 being equal to one. Values other than 0000h and FFFFh are bit significant. Since ATA standards maintain downward compatibility, it is allowed for a device to set more then one bit."

The given value of word 80, 0x0004, indicates only ATA-2 compliance. This is WRONG. It's not consistent with word 81, which indicates ATA/ATAPI-4 revision 17.

In fact packet interface devices didn't exist before ATA-4, so it doesn't make sense for this virtual CD-ROM packet device to report ATA-2 compliance. In fact it was only with ATAPI-4 revision 7 that the IDENTIFY PACKET DEVICE command (which this virtual VMware device responds to) was introduced.

Since packet interface devices were only introduced with ATA-4, it would make sense to only claim compliance with ATA-4, hence giving word80 a value 0x0010. However many vendors seem to give this word a value that suggests backwards compliance to ATA-1, hence a word80 value of 0x001e would also be OK.

By the way, the VMware engineers who got this wrong may have simply made the elementary mistake of thinking that a value '4' in word 80 indicated ATA-4 compliance!
# cat /proc/ide/hda/model
VMware Virtual IDE CDROM Drive

# cat /proc/ide/hda/identify
85c4 0000 0000 0000 0000 0000 0000 0000
0000 0000 3030 3030 3030 3030 3030 3030
3030 3030 3030 3031 0000 0040 0000 3030
3030 3030 3031 564d 7761 7265 2056 6972
7475 616c 2049 4445 2043 4452 4f4d 2044
7269 7665 2000 2020 2020 2020 2020 0000
0000 0f00 0000 0400 0200 0006 0000 0000
0000 0000 0000 0000 0000 0000 0007 0007
0003 0078 0078 0078 0078 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0004 0017 4218 4000 4000 4218 4000 4000
0207 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
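The bit arithmetic for word 80 can be checked with a few lines of Python. This is only a sketch of the rule quoted from the specification (bits 1 through 4 set mean compliance with ATA-1 through ATA-4), not the code that smartctl or hdparm actually use; the function name major_versions is made up for the example.

```python
def major_versions(word80):
    """Return the ATA/ATAPI major versions claimed by IDENTIFY word 80."""
    if word80 in (0x0000, 0xFFFF):
        return []  # device does not report a major version
    # Bit n set means compliance with ATA-n; this sketch only looks at bits 1-4.
    return [n for n in range(1, 5) if word80 & (1 << n)]

for value in (0x0004, 0x0010, 0x001E):
    print(f"0x{value:04x} -> {major_versions(value)}")
# 0x0004 -> [2]
# 0x0010 -> [4]
# 0x001e -> [1, 2, 3, 4]
```

So the value reported by the virtual drive decodes to ATA-2 only, while either of the two suggested values includes ATA-4.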
Here is the requested data. BTW: a workaround is probably still possible; I don't know how quickly VMware fixes such a bug and releases a new build...

# hdparm -i /dev/hda

/dev/hda:

 Model=VMware Virtual IDE CDROM Drive, FwRev=00000001, SerialNo=00000000000000000001
 Config={ SoftSect Fixed Removeable DTR<=5Mbs DTR>10Mbs nonMagnetic }
 RawCHS=0/0/0, TrkSize=0, SectSize=0, ECCbytes=0
 BuffType=unknown, BuffSize=32kB, MaxMultSect=0
 (maybe): CurCHS=0/0/0, CurSects=0, LBA=yes, LBAsects=0
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 *udma2
 AdvancedPM=no
 Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 2

 * signifies the current active mode

# hdparm -I /dev/hda

/dev/hda:

ATAPI CD-ROM, with removable media
 Model Number: VMware Virtual IDE CDROM Drive
 Serial Number: 00000000000000000001
 Firmware Revision: 00000001
Standards:
 Likely used CD-ROM ATAPI-1
Configuration:
 DRQ response: 50us.
 Packet size: 12 bytes
Capabilities:
 LBA, IORDY(can be disabled)
 Buffer size: 32.0kB
 DMA: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 *udma1 udma2
 Cycle time: min=120ns recommended=120ns
 PIO: pio0 pio1 pio2 pio3 pio4
 Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
 Enabled Supported:
 * NOP cmd
 * DEVICE RESET cmd
 * PACKET command feature set
 * Power Management feature set

# smartctl -i /dev/hda
smartctl version 5.25 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: VMware Virtual IDE CDROM Drive
Serial Number: 00000000000000000001
Firmware Version: 00000001
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 4
ATA Standard is: ATA/ATAPI-4 T13 1153D revision 17
Local Time is: Mon Nov 24 08:25:04 2003 CET
SMART support is: Unavailable - Packet Interface Devices [this device: CD/DVD] don't support ATA SMART

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
Sorry: s/possible/needed/
Here is the 'smoking gun' from hdparm:

Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 2
                                                      ^

This SHOULD say either:

Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 4

to indicate that the drive conforms to ATA/ATAPI-4

OR

Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 1 2 3 4

to indicate that the drive conforms to ATA/ATAPI-1, 2, 3, and 4.

Case closed: the problem is with the VMware virtual device.
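The mismatch being pointed at here can be written down as a simple consistency check between words 80 and 81. The Python below is an illustrative sketch only, not what hdparm or smartctl do internally. The lookup table holds just the single Table 12 entry quoted in this report (0x0017), and the names MINOR_VERSION_MAJOR and versions_consistent are made up for the example.

```python
# Only the single Table 12 entry quoted in this report is included here.
MINOR_VERSION_MAJOR = {0x0017: 4}  # ATA/ATAPI-4 T13 1153D revision 17

def versions_consistent(word80, word81):
    """True if word 80 claims the major version implied by word 81."""
    implied = MINOR_VERSION_MAJOR.get(word81)
    if implied is None:
        return True  # unknown or unreported minor version: nothing to check
    if word80 in (0x0000, 0xFFFF):
        return True  # major version not reported
    return bool(word80 & (1 << implied))

print(versions_consistent(0x0004, 0x0017))  # value from the VMware drive
print(versions_consistent(0x0010, 0x0017))  # first suggested value
print(versions_consistent(0x001E, 0x0017))  # second suggested value
```

The first call reports an inconsistency and the other two do not. The sketch only flags the disagreement; it takes no position on which of the two words a tool should trust when they disagree.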
It will be fixed in the next release, but your detection code is wrong: word 80 has precedence over word 81, so if word 80 says that the device does not support any known ATA/ATAPI standard, you should assume that it supports only the good old SFF8020 document (which stored the version in words 74/75) instead of peeking at word 81 and hoping that it will provide more correct information than word 80.
Petr, You said "word 80 has precedence over word 81". Could you please provide a reference for this claim? Thanks! Bruce
Petr,

I got dragged into this by RedHat! Bruce is correct, you are wrong. Go fix your side. SFF-8020 is dead, retired, and a POS. Using it as more than a reference is "foolish"; this includes Microsoft's requirement for WHQL certs. Since VMware is concerned about WHQL, you have a reason to look at SFF-8020. "Look" and use it as a reference, but it is wrong and does very bad things.

Cheers,
Andre Hedrick
The X(iting)-Linux ATA/ATAPI/SATA guy