Bug 245887 - FEAT RHEL5: Hardware Test Suite for diskless systems
Status: CLOSED CURRENTRELEASE
Product: Red Hat Hardware Certification Program
Classification: Red Hat
Component: Test Suite (harness)
Version: 5
Hardware: x86_64 Linux
Priority: high Severity: high
Assigned To: Greg Nichols
QA Contact: Chris Williams
Keywords: FutureFeature, Reopened
Depends On:
Blocks: 245603
Reported: 2007-06-27 03:35 EDT by George Beshers
Modified: 2008-05-01 11:39 EDT (History)
Doc Type: Enhancement
Last Closed: 2008-03-06 21:08:11 EST

Attachments
Panic when running HTS info test on flipper. (117.41 KB, text/plain)
2007-10-18 22:01 EDT, Jonathan Lim
HTS passed on dopple. (1.12 MB, text/plain)
2007-10-18 22:03 EDT, Jonathan Lim
dmidecode output for flipper (24.12 KB, text/plain)
2007-10-18 22:07 EDT, Jonathan Lim
dmidecode output for dopple (12.90 KB, text/plain)
2007-10-18 22:07 EDT, Jonathan Lim
HTS results for x86_64 with NFSroot and using iSCSI for swap (505.06 KB, application/octet-stream)
2007-11-16 16:40 EST, Jonathan Lim

Description George Beshers 2007-06-27 03:35:50 EDT
Description of problem:
  Carlsbad is the new SGI high density cluster product.
  The major new feature as far as hardware certification
  is concerned is that the nodes are diskless.

  This BZ is to track the effort to hwcert the diskless nodes
  in RHEL5.1.

  The nodes are x86_64 cpus on a SuperMicro motherboard---
  identical to the XE 310. 


Comment 1 George Beshers 2007-06-28 11:15:19 EDT
1. If there are no disks do the disk tests need to be run?

2. With NFSroot, how do we avoid cycling the network connection,
   thereby crashing the root file system?

Comment 2 George Beshers 2007-06-28 11:19:01 EDT
3. Will Xen certification be required on the diskless nodes of a cluster?
Comment 4 Jay Lan 2007-07-17 17:49:18 EDT
4. Hardware Test Suite seems to require SELinux to be set to "enforcing".
   With that policy, a diskless system with NFSroot does not allow remote
   login, because SELinux denies /usr/sbin/sshd "entrypoint" access to
   /bin/bash (nfs_t). Setting SELinux to "permissive" fixed that problem.
Comment 5 Greg Nichols 2007-09-17 13:53:23 EDT
HTS 5.1 Release 1 includes support for NFSRoot systems.
Comment 6 Jonathan Lim 2007-10-18 22:01:41 EDT
Created attachment 231831 [details]
Panic when running HTS info test on flipper.

When running HTS with flipper as the client and dopple as the
NFSroot server, flipper panicked while undergoing the "info" test.

The same problem occurred when running just that test alone on
flipper without NFSroot.
Comment 7 Jonathan Lim 2007-10-18 22:03:45 EDT
Created attachment 231841 [details]
HTS passed on dopple.

There were no errors when running HTS on dopple (without NFSroot).
Comment 8 Jonathan Lim 2007-10-18 22:07:08 EDT
Created attachment 231861 [details]
dmidecode output for flipper
Comment 9 Jonathan Lim 2007-10-18 22:07:48 EDT
Created attachment 231871 [details]
dmidecode output for dopple
Comment 10 Rob Landry 2007-10-19 14:58:31 EDT
Can this be tried again with the -53 kernel?  -52 had a bug where cat'ing
/proc/scsi could cause the system to panic; this was addressed in the -53
kernel.  hts doesn't itself cat that, but sysreport/sos do, and hts calls
them in the info test.  Supposedly that bug should have only impacted the
megaraid-sas driver.

- [scsi] megaraid_sas: kabi fix for /proc entries (Chip Coldwell ) [323231]

...in either case, nothing in hts should be capable of causing a kernel panic, I
wouldn't think.
Comment 11 Jonathan Lim 2007-10-19 15:08:20 EDT
> Can this be tried again with the -53 kernel.

The machine I borrowed to do the test on has been returned.  I'll have to check
if I can borrow it again, but it won't be so soon.
Comment 12 Jonathan Lim 2007-10-25 20:21:45 EDT
> Can this be tried again with the -53 kernel.

I've run HTS with the -53 kernel (info test only) and it passes.

However, I ran into a new problem. The NFS root has the following entry in
/etc/fstab:

  /dev/VolGroup00/LogVol01 swap swap defaults 0 0

but that swap space doesn't show up when the system is up.  Subsequently,
the threaded memory test fails and the system hangs.  Also, the swapon man
page has a note saying that swap over NFS may not work.  So how do I get
around this?
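
A minimal sketch, in Python, of one way to confirm the symptom above: compare
the swap entries in /etc/fstab against what /proc/swaps reports as active. The
function names are hypothetical and not part of HTS. The comparison is by
device name only, which is an assumption; the kernel may list an LVM volume
under a different path (for example a /dev/mapper name) than fstab uses.

```python
def fstab_swap_entries(fstab_text):
    """Return the device field of every swap entry in fstab-style text."""
    entries = []
    for line in fstab_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "swap":
            entries.append(fields[0])
    return entries


def active_swap_devices(proc_swaps_text):
    """Return the names listed in /proc/swaps-style text (header skipped)."""
    lines = proc_swaps_text.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


def missing_swap(fstab_text, proc_swaps_text):
    """Return fstab swap devices that are not active (naive name match)."""
    active = set(active_swap_devices(proc_swaps_text))
    return [dev for dev in fstab_swap_entries(fstab_text) if dev not in active]


if __name__ == "__main__":
    with open("/etc/fstab") as f:
        fstab = f.read()
    with open("/proc/swaps") as f:
        swaps = f.read()
    for dev in missing_swap(fstab, swaps):
        print("configured but not active:", dev)
```

Run on the NFSroot client, this would be expected to flag the LogVol01 entry
if the volume group is not present on that machine.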
Comment 13 Jonathan Lim 2007-11-16 16:40:37 EST
Created attachment 261851 [details]
HTS results for x86_64 with NFSroot and using iSCSI for swap

Attached are the results from running HTS on an x86_64 system
with NFSroot and using iSCSI for swap:

  1. NFSroot and swap (512MB) are provided from another x86_64
     system.  Both are running 2.6.18-53.el5 (RHEL5.1-GA).

  2. The network tests for eth0 and eth1 are disabled because
     any interruption to NFS causes the system to hang.

  3. After running HTS, rpm fails to run with the following
     error:

       rpmdb: PANIC: fatal region error detected; run recovery
       error: db4 error(-30977) from dbenv->open: DB_RUNRECOVERY: Fatal error, run database recovery
       error: cannot open Packages index using db3 -  (-30977)
       error: cannot open Packages database in /var/lib/rpm
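
Regarding item 2, a minimal Python sketch of how a script could detect that
the root file system is on NFS before deciding to skip tests that interrupt
the network. This is an illustration under stated assumptions, not how HTS
does it; the function names are hypothetical, and the sketch only inspects
/proc/mounts-style text.

```python
def root_mount(proc_mounts_text):
    """Return (source, fstype) of the last entry mounted on '/', or None."""
    found = None
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: source, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[1] == "/":
            # Later entries override earlier ones such as "rootfs".
            found = (fields[0], fields[2])
    return found


def is_nfs_root(proc_mounts_text):
    """True if the root file system type is nfs or nfs4."""
    mount = root_mount(proc_mounts_text)
    return mount is not None and mount[1].startswith("nfs")


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        text = f.read()
    if is_nfs_root(text):
        print("root is on NFS:", root_mount(text)[0])
    else:
        print("root is not on NFS")
```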
Comment 14 Greg Nichols 2008-03-06 21:07:36 EST
Created bug 436419 for the above issue.  Closing this FEAT bug as it has been incorporated.
