At present bkr machine-test just schedules /distribution/install in its jobs, which is a good way of making sure the system can boot and install a distro. But there are many hardware problems which this will never find, so it's not a good way to test a flakey system.

We could write a new task which performs actual hardware tests, such as:

* check SMART data on all disks
* perform SMART self-tests on all disks
* run bad block checking on all disks
* run a memory tester?
* run some kinds of CPU self-tests?

The bkr machine-test command could have an option --aggressive which adds this task when it schedules a job.
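A rough sketch of how the proposed --aggressive option could extend the task list. This is not the actual bkr implementation: the parser, the tasks_for helper and the task name /distribution/hardware-test are all hypothetical and only illustrate the idea.

```python
import argparse

INSTALL_TASK = "/distribution/install"
# Hypothetical name for the proposed hardware test task (an assumption,
# no such task is named in this discussion).
HWTEST_TASK = "/distribution/hardware-test"


def build_parser():
    """Build a stand-in parser; the real bkr option handling may differ."""
    parser = argparse.ArgumentParser(prog="machine-test-sketch")
    parser.add_argument(
        "--aggressive",
        action="store_true",
        help="also schedule the (hypothetical) hardware test task",
    )
    return parser


def tasks_for(argv):
    """Return the list of tasks a job would contain for the given options."""
    args = build_parser().parse_args(argv)
    tasks = [INSTALL_TASK]
    if args.aggressive:
        # Only add the destructive hardware tests when explicitly requested.
        tasks.append(HWTEST_TASK)
    return tasks


if __name__ == "__main__":
    print(tasks_for(["--aggressive"]))
```

Keeping the extra task behind an explicit flag means the default behaviour of machine-test stays unchanged.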
Some points that might help:

* SMART is available on only a fairly small number of machines, because many machines use either
  - SCSI/SAS drives (no ATA-style SMART attributes), or
  - an additional layer (HW RAID) between the drives and the OS
* bad block checking can be done using the "badblocks" utility
  - make sure to do write testing
  - make sure to specify a larger "N blocks at a time" value, for speed reasons
  - make sure to use `-t' to specify at least one pseudorandom pass; the normal check does NOT detect silent offset pointer corruption (!!)
* memory testing via memtest86+ could be hard to do automatically; a tool called "memtester" [1] can do it while the system is running
  - it doesn't test all memory, just what it can lock, but that is still useful
* CPU stress testing can be done
  - using "cpuburn" (burnMMX, ...) running over some period of time (at least 30 min)
  - using a Prime95 equivalent for Linux, the "mprime" CLI tool, which can also perform stress tests with verification of result correctness; however, it uses rather arch-specific instructions (AVX on Intel), so it might not be a relevant test method

[1] http://pyropus.ca/software/memtester/

All of this would need to be done from an initramfs, as HDD testing would effectively overwrite/erase everything. A few approaches come to mind, but all of them would need all the tools, along with a Beaker-related result uploader, in the initramfs anyway:

* using anaconda and a %pre section
* using a RHEL-based (dracut) initramfs
* using a completely custom (glibc-based) kernel + initramfs pair
  - might be a bit more complex to make architecture-independent
  - not as complex as it seems, I've built several in the past

.. just my $0.02 ..
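The badblocks and memtester suggestions above could be turned into command lines roughly like this. It is only a sketch: the helper names, the default block size and the "blocks at a time" value are assumptions picked for illustration, and the badblocks write test is destructive, so the code only builds the argument lists and does not run anything.

```python
import shlex


def badblocks_command(device, block_size=4096, blocks_at_once=16384):
    """Build a destructive badblocks invocation with one pseudorandom pass.

    The default sizes are arbitrary illustration values, not tuned numbers.
    """
    return [
        "badblocks",
        "-w",                        # write-mode test (erases the device!)
        "-s",                        # show progress
        "-b", str(block_size),       # block size in bytes
        "-c", str(blocks_at_once),   # number of blocks tested at a time
        "-t", "random",              # pseudorandom test pattern
        device,
    ]


def memtester_command(megabytes, loops=1):
    """Build a memtester invocation testing the given amount of memory."""
    return ["memtester", f"{megabytes}M", str(loops)]


if __name__ == "__main__":
    # /dev/sdX is a placeholder device name, not a real disk.
    print(shlex.join(badblocks_command("/dev/sdX")))
    print(shlex.join(memtester_command(1024)))
```

Building argument lists instead of shell strings would let a task log exactly what it is about to run before handing the list to something like subprocess.run.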