Description of problem:
After a yum update yesterday, my el5 machine no longer boots. The error message is:

ERROR: sil: wrong # of devices in RAID set "sil_agajebcbafbd" [1/2] on /dev/sdb

A web search turned up a similar problem in the Ubuntu bug database, which they tracked back to a recent change to dmraid:
https://bugs.launchpad.net/ubuntu/+source/dmraid/+bug/292302

I can now boot successfully only by adding nodmraid to the kernel boot parameters. The problem may be in the rc.d init scripts or rc.sysinit scripts that invoke dmraid. I tried downgrading dmraid and dmraid-events, to no avail, and booting older kernels did not help either. Something changed in how dmraid is invoked in the boot sequence, and that prevents my machine from booting.

The /dev/sdb device in question passes all fdisk partition and fsck checks with flying colors when nodmraid is passed to the kernel. Without nodmraid, no partition on /dev/sdb can be mounted at all (mount claims they are busy), so fsck will not run either. Mounting them manually fails with the same busy message. Nasty bugger of a change.

Version-Release number of selected component (if applicable):
el5, yum updated yesterday

How reproducible:
The problem appeared immediately after the update. Adding nodmraid to the kernel boot parameters works around it.

Steps to Reproduce:
1. Boot the machine.

Actual results:
See above.

Expected results:
Successful boot.

Additional info:
Hi,

If it helps, here is what running dmraid -rD -d -vvv shows *after* booting with kernel parameter nodmraid:

WARN: locking /var/lock/dmraid/.lock
NOTICE: skipping removable device /dev/hdc
NOTICE: /dev/sda: asr discovering
NOTICE: /dev/sda: ddf1 discovering
NOTICE: /dev/sda: hpt37x discovering
NOTICE: /dev/sda: hpt45x discovering
NOTICE: /dev/sda: isw discovering
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi discovering
NOTICE: /dev/sda: nvidia discovering
NOTICE: /dev/sda: pdc discovering
NOTICE: /dev/sda: sil discovering
NOTICE: /dev/sda: via discovering
NOTICE: /dev/sdb: asr discovering
NOTICE: /dev/sdb: ddf1 discovering
NOTICE: /dev/sdb: hpt37x discovering
NOTICE: /dev/sdb: hpt45x discovering
NOTICE: /dev/sdb: isw discovering
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi discovering
NOTICE: /dev/sdb: nvidia discovering
NOTICE: /dev/sdb: pdc discovering
NOTICE: /dev/sdb: sil discovering
NOTICE: sil: areas 1,2,3,4[4] are valid
NOTICE: writing metadata file "sdb_0.dat"
NOTICE: writing offset to file "sdb_0.offset"
NOTICE: writing metadata file "sdb_1.dat"
NOTICE: writing offset to file "sdb_1.offset"
NOTICE: writing metadata file "sdb_2.dat"
NOTICE: writing offset to file "sdb_2.offset"
NOTICE: writing metadata file "sdb_3.dat"
NOTICE: writing offset to file "sdb_3.offset"
NOTICE: writing size to file "sdb.size"
NOTICE: /dev/sdb: sil metadata discovered
NOTICE: /dev/sdb: via discovering
INFO: RAID device discovered: /dev/sdb: sil, "sil_agajebcbafbd", mirror, ok, 625140400 sectors, data@ 0
WARN: unlocking /var/lock/dmraid/.lock

If you need or want any of the metadata files generated above, just let me know and I would be happy to tgz them up and post them. I would also be happy to run any diagnostic commands that you think might help find out what changed and why I can no longer boot successfully without the nodmraid kernel option.

Thanks,
Kevin
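The line that matters in the dmraid output above is the final INFO line, which names the disk carrying RAID metadata. As a minimal sketch, assuming the INFO line always has the layout shown in Kevin's output (the function name find_discovered and the field names are made up for illustration, not part of dmraid), it can be pulled out of a saved log like this:

```python
import re

# Matches the summary line from the output above, e.g.
# INFO: RAID device discovered: /dev/sdb: sil, "sil_agajebcbafbd", mirror, ok, 625140400 sectors, data@ 0
# Assumption: the field order is the one seen in this report.
INFO_RE = re.compile(
    r'RAID device discovered:\s*(?P<device>/dev/\S+?):\s*(?P<format>\w+),\s*'
    r'"(?P<set_name>[^"]+)",\s*(?P<raid_type>\w+),\s*(?P<status>\w+),\s*'
    r'(?P<sectors>\d+) sectors'
)


def find_discovered(log_text):
    """Return one dict per 'RAID device discovered' line found in log_text."""
    found = []
    for match in INFO_RE.finditer(log_text):
        entry = match.groupdict()
        entry["sectors"] = int(entry["sectors"])
        found.append(entry)
    return found


if __name__ == "__main__":
    line = ('INFO: RAID device discovered: /dev/sdb: sil, '
            '"sil_agajebcbafbd", mirror, ok, 625140400 sectors, data@ 0')
    for entry in find_discovered(line):
        print(entry["device"], entry["format"], entry["set_name"])
```

Only /dev/sdb shows up here, which fits the boot error: the set claims to be a two-disk mirror, but only one member carries the metadata.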
Kevin, this looks like remnant SoftRAID metadata that had not been discovered before and is now being picked up in early boot because of initrd changes applied by the update. Or did you have an operational SoftRAID mirror on /dev/sdb and some other device before? If the former applies, remove the Silicon Image metadata from sdb with "dmraid -rE /dev/sdb"; you will then be able to drop the nodmraid kernel command line argument again. I don't assume the latter applies, but please come back if so.
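For anyone repeating this step, here is a cautious sketch that only prints the commands in order and does not run them. The dump step with -rD is an added precaution (it is the same option Kevin used above to write the sdb_*.dat files), not part of the advice in this comment, and the helper name plan_metadata_removal is hypothetical.

```python
def plan_metadata_removal(device, backup=True):
    """Return the dmraid commands, as argument lists, to run as root in order.

    Nothing is executed here; erasing metadata is destructive, so the
    commands are only assembled for review.
    """
    if not device.startswith("/dev/"):
        raise ValueError("expected a block device path such as /dev/sdb")
    steps = []
    if backup:
        # Optional precaution (an assumption of this sketch): dump the
        # metadata to files before erasing it.
        steps.append(["dmraid", "-rD", device])
    # The command suggested in this comment: erase the stale RAID metadata.
    steps.append(["dmraid", "-rE", device])
    return steps


if __name__ == "__main__":
    for argv in plan_metadata_removal("/dev/sdb"):
        print(" ".join(argv))
```

Double-check the device name before running the erase step by hand. On a disk that really is a member of an active RAID set, erasing the metadata would break the set.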
Hi Heinz, Yes, I believe it was the former, although it was so long ago that I don't really remember whether I first set up the disks under nvraid or not. I must have. Needless to say, I had disabled all RAID in the BIOS long ago (needed the extra disk space), so I should have known something was wrong. The first boot on this machine was with pre Fedora 5 over 6 years ago. I removed the metadata as you suggested and it booted just fine. So put this one down to user error. Scared me, as this machine has been a rock. Thanks, Kevin
Kevin, happy to hear this fixed the situation for you. dmraid is now supposed to discover any RAID devices it supports in early boot, so this is not a bug. If RAID devices are recognized that you do not want recognized, you can work around it either way: boot with nodmraid or erase the metadata.
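To make the nodmraid workaround survive reboots, the parameter has to be added to the kernel line in the boot loader configuration. The sketch below assumes GRUB legacy, whose stanzas contain a line starting with "kernel"; the thread itself does not say which file was edited, and the function name add_nodmraid and the sample stanza are illustrative only. It works on text and writes nothing to disk.

```python
def add_nodmraid(grub_conf_text):
    """Append 'nodmraid' to every GRUB legacy kernel line that lacks it."""
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        # A kernel line looks like: kernel /vmlinuz-<version> ro root=...
        if words[:1] == ["kernel"] and "nodmraid" not in words:
            line = line.rstrip() + " nodmraid"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical stanza, for illustration only.
    sample = (
        "title Example\n"
        "\troot (hd0,0)\n"
        "\tkernel /vmlinuz-example ro root=LABEL=/\n"
        "\tinitrd /initrd-example.img\n"
    )
    print(add_nodmraid(sample), end="")
```

Running the function twice leaves the text unchanged, so it is safe to apply to a file that already carries the parameter on some entries.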