Description of problem: I am at a loss to pinpoint just which package has the problem. I have two systems:

#1 -- ABIT mobo with Athlon 64 4400+ (dual) processor, 2GB RAM, one 200GB PATA drive and two 300GB SATA drives configured as a single LVM volume group (all Maxtor), x86_64 system. Works fine with both FC5 and FC6 running the latest kernels in testing -- 2.6.18-1.2224.fc5 for FC5 and 2.6.18-1.2835.fc6 for FC6 ... looking at the (src.rpm) patch list indicates these are pretty close.

#2 -- ASUS A8N-E mobo with Athlon 64 4400+ (dual) processor, 2GB RAM, two 120GB PATA drives and two 500GB SATA drives configured as a single LVM volume group (all Maxtor), x86_64 system. FC5 works fine with both the 2.6.18-1.2200.fc5 and the 2.6.18-1.2224 kernels ... I also tried the 2.6.18-1.2835.fc6 kernel from FC6 and it works fine as well.

However, after installing FC6, I am getting boot-time disk errors (see the attached portion of /var/log/messages, which covers one bootup/shutdown). OK, it has got to be the kernel ... wrong ... on the FC6 system I installed and tried the latest testing update 2.6.18-1.2835.fc6 as well as the FC5 testing update 2.6.18-1.2224.fc5 ... no difference ... I still get the errors.

A portion of the log indicating the errors is:

hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }
ide: failed opcode was: 0xb0
hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdb: drive_cmd: error=0x04 { DriveStatusError }
ide: failed opcode was: 0xb0
...
smartd version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
...
SCSI device sda: drive cache: write back
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata3.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
ata3: EH complete
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata3.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
ata3: EH complete
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata3.00: tag 0 cmd 0xb0 Emask 0x1 stat 0x51 err 0x4 (device error)
ata3: EH complete
ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
...

See the attached log for the whole thing.
Created attachment 140603 [details] bootup/shutdown of /var/log/messages
Definitely not a smartmontools problem. This must be either a hardware, configuration, or kernel problem.
There may be something wrong in the kernel but it is something unique in FC6 that is triggering this error. FC5 did not (and still does not) have problems with this hardware regardless of the kernel whereas FC6 does (regardless of which kernel I tried).
hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }

51/04 is the drive saying "I don't know what that command means" after being told to do something. It's not an error.

ide: failed opcode was: 0xb0

0xb0 happens to be a SMART command. Your hard disk doesn't support SMART.
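For readers following along, here is a minimal sketch of how the 51/04 pair decodes. It assumes the usual ATA status-register bit layout and the bit names the old Linux IDE driver prints in the log above. The helper names (`decode_status`, `is_aborted_command`) are made up for illustration and are not part of any kernel or smartmontools API.

```python
# ATA status register bits, using the names the old Linux IDE driver prints.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Error register bit 2 is ABRT (command aborted); the IDE driver labels it
# "DriveStatusError" in its log output.
ABRT = 0x04


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def is_aborted_command(status, error):
    """True when the drive flagged an error and the error is 'command aborted'."""
    return bool(status & 0x01) and bool(error & ABRT)


print(decode_status(0x51))             # ['DriveReady', 'SeekComplete', 'Error']
print(is_aborted_command(0x51, 0x04))  # True
```

In other words, the drive is ready and idle, and it simply refused the command it was given (opcode 0xb0).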
Dave -- you are correct about the error but incorrect about the cause ... The drive does support SMART, but (apparently) SMART had NOT been enabled by smartd before smartd attempted a SMART-related operation ... I believe there is a race condition in smartd which (for some reason) only shows itself with FC6 (both i386 and x86_64) on this particular system, but not on my other (very, very) similar system ... mobo difference.

I have disabled automatic startup of smartd during bootup. After bootup, I run "tail -f /var/log/messages | tee xxx.log" to capture information.

1. If I start smartd, I get the errors as shown in the attached /var/log/messages.

2. However, if I first enable SMART with:

   smartctl -s on /dev/hda
   smartctl -d ata -s on /dev/sda
   smartctl -d ata -s on /dev/sdb

   and then start smartd, I get no errors (see attached log).

3. After bootup, if I run "smartctl -a /dev/hda" before enabling SMART on the drive, I get the errors you (Dave) indicated.
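The manual step in item 2 could be scripted roughly as follows. This is only a sketch: the device list and the `-d ata` flags are copied from the commands above, while the names `DEVICES`, `enable_command`, and `enable_all` are hypothetical. Nothing runs at import time; `enable_all()` has to be called explicitly (as root, with smartctl on the PATH) before smartd is started.

```python
import subprocess

# (device, driver) pairs taken from the smartctl commands in this comment.
# A driver of None means no -d option is passed.
DEVICES = [
    ("/dev/hda", None),
    ("/dev/sda", "ata"),
    ("/dev/sdb", "ata"),
]


def enable_command(device, driver=None):
    """Build the argument list for 'smartctl [-d DRIVER] -s on DEVICE'."""
    cmd = ["smartctl"]
    if driver:
        cmd += ["-d", driver]
    cmd += ["-s", "on", device]
    return cmd


def enable_all(devices=DEVICES, run=subprocess.call):
    """Run the enable command for every device and return the exit codes.

    The 'run' parameter exists so the function can be exercised without
    touching real hardware; by default it shells out via subprocess.call.
    """
    return [run(enable_command(device, driver)) for device, driver in devices]
```

Passing a stand-in for `run` (for example `print`) gives a dry run that only shows the commands that would be issued.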
Created attachment 140727 [details] log when smart turned on BEFORE starting smartd
Isn't it possible to simply enable SMART on the disks in the BIOS? Please report the problem upstream. http://smartmontools.sourceforge.net/
Also, can you try smartmontools 5.33 from the original FC-5 install?
The hardware system has two (among other) partitions ... one with Fc6 x86_64 installed (hda6), one with FC6 x86_64 installed (hda5), and one with FC6 i386 installed (hda7, just to see if that made a difference). Same BIOS settings, same BIOS, same hardware. I tried FC5 and FC6 kernels on both FC5 and FC6 systems. I also tried smartmontools 5.36-fc5.1 from FC5 on the FC6 system ... no differences: no errors with FC5 and the errors with FC6.

I believe there is something about FC6 (not necessarily smartmontools) in combination with this mobo which causes this problem. At this point I have a simple workaround -- create a simple init script which is started earlier than smartd and enables SMART on all the disks ... kludgy, but it should work, and now that I understand the problem, the errors are not anywhere near as serious.

In the interest of getting things fixed, I intend to:

1. Boot FC5 with autostart of smartd disabled to see if something else is enabling SMART.

2. Report the problem upstream ... ugh ... I can reproduce the problem on my hardware (one system), but it may be a problem for others (unlike the two FC6 systems I have which do not have the problem).

3. Take a look at the smartmontools code to see if another set of eyes can see something.
It is interesting to see the comment in the smartctl man page for the -s option:

"-s VALUE, --smart=VALUE Enables or disables SMART on device. The valid arguments to this option are on and off. Note that the command ´-s on´ (perhaps used with with the ´-o on´ and ´-S on´ options) should be placed in a startup script for your machine, for example in rc.local or rc.sysinit. In principle the SMART feature settings are preserved over power-cycling, but it doesn´t hurt to be sure."

It is a little unfortunate that this piece of advice is so buried in the man page, because without it quite a few people are probably getting these somewhat frightening errors (as I did). Of course, putting the command into rc.local will not help, since the smart daemon is started before rc.local runs.
Okay, I believe I have found the exact problem, and it is not smartd. (First I downloaded the smartd code and looked at it, finding no problem, so I then looked elsewhere.)

The problem is in the smartd init.d file for FC6, at /etc/rc.d/init.d/smartd. It calls a configuration program written in Python, at /usr/sbin/smartd-conf.py. If smartd finds that the disks are not SMART enabled, it will enable them before trying to use them, but smartd-conf.py does not do so. It tries to call smartctl without first enabling the drives. The offending line is:

    status = os.system("/usr/sbin/smartctl -i %s%s 2>&1 >/dev/null" % (driver, drive.device))

On my system I simply changed it to:

    status = os.system("/usr/sbin/smartctl -s on -i %s%s 2>&1 >/dev/null" % (driver, drive.device))

I also note that the version of smartd-conf.py on FC5 does *not* issue any such smartctl calls, which is why the problem popped up with FC6.
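To make the difference between the two lines concrete, here is a small sketch that only builds the command string, with and without the added `-s on`. The function name `probe_command` is hypothetical and not part of smartd-conf.py. The example value for `driver` ("-d ata " with a trailing space, or an empty string) is an assumption, based on the way the original format string joins `driver` and the device without a separator.

```python
def probe_command(driver, device, enable_first=True):
    """Build a smartctl probe string in the style of the line quoted above.

    driver: assumed to be either '' or an option with a trailing space,
            such as '-d ata ' (an assumption, not confirmed by the script).
    device: device node, for example '/dev/sda'.
    enable_first: when True, include '-s on' so SMART is switched on
                  before the identity query is issued.
    """
    enable = "-s on " if enable_first else ""
    return "/usr/sbin/smartctl %s-i %s%s 2>&1 >/dev/null" % (enable, driver, device)


# The original FC6 form and the modified form, side by side.
print(probe_command("-d ata ", "/dev/sda", enable_first=False))
print(probe_command("-d ata ", "/dev/sda"))
```

With `enable_first=False` the result matches the shape of the offending line; with the default it matches the modified one.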
Thanks for the investigation. I've added the '-s on' to smartmontools-5.36-8.fc7.
This fix is not in the latest update smartmontools-5.37-1.1.fc6. Is it in F7t4?