Bug 296931
Summary: | IBM QS21 system with no swap fails memory test
---|---
Product: | [Retired] Red Hat Hardware Certification Program
Component: | Test Suite (tests)
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | medium
Version: | 5
Hardware: | ppc64
OS: | Linux
URL: | http://www-03.ibm.com/systems/bladecenter/qs21/
Whiteboard: | IBM
Fixed In Version: | 5.1-11
Doc Type: | Bug Fix
Last Closed: | 2007-11-06 21:55:28 UTC
Reporter: | Monza Lui <mlui>
Assignee: | Greg Nichols <gnichols>
QA Contact: | Joseph Kachuck <jkachuck>
CC: | bxu, dmfaria, hannsj_uhl, rlandry

Description
Monza Lui
2007-09-19 20:15:53 UTC
Created attachment 200101 [details]
Result with "tempered hts

Created attachment 200121 [details]
Result with "tempered hts

Due to a limitation of NFS root, we ran the hts test suite:
1) with the part that resets the network commented out (see the diff below for how we edited the code),
2) with SELinux disabled,
3) and the memory test failed because no swap partition is available.

diff /usr/share/hts/tests/network/network.py
<< self.downAllInterfaces()
<< self.setSignalHandler(self.restoreAllInterfaces)
<< if not self.bringUpInterface():
>> #self.downAllInterfaces()
>> #self.setSignalHandler(self.restoreAllInterfaces)
>> if not 1: #self.bringUpInterface():

Monza, RHEL 5.1 is going to be launched in Q4 2007. The current version is Alpha or Beta, and we don't accept test results from Alpha or Beta versions. Would you please run the tests and upload the RPMs after we launch the GA version? Thanks,

Hi Xu, yes, we are planning to upload a successful run after GA. However, what we need right now is an hts that is adapted to the diskless environment.

Ron, we did not get the hts for the diskless environment last week. Will we have it this week? Thank you.

Sorry, I meant Rob.

Hi Monza, I think you can contact your TAM about getting the test suite and related software. Regards -YK

We got the new hts and ran the test again. This time we are able to control the test so that it does not restart the network, which could cause problems with our diskless environment on Cell. However, we still have one old and one new problem with the test suite:

1) Swap space requirement (old problem)
The test failed all memory tests because our QS21 is diskless and therefore does not have swap space.

2) Unmount USB drives (new problem)
In the QS21 environment, three USB devices always show up because they are present in the chassis. However, they cannot be unmounted, and hts now always complains that it is unable to umount them and fails.

[root@cell21c ~]# lsusb
Bus 003 Device 001: ID 0000:0000
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 001: ID 0000:0000

Your prompt response is highly appreciated as our GA dates are drawing really close. Thank you.

Created attachment 237751 [details]
hts results from three runs with an NFS-aware test suite
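
For the swap space problem reported above, a quick way to confirm that a system really has no swap is to look at the SwapTotal field in /proc/meminfo. The following is only a minimal sketch in Python, not part of hts; the helper names and the sample text are made up for illustration, and the sample numbers are not taken from the QS21 in this report.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def has_swap(meminfo):
    """Return True if any swap space is configured."""
    return meminfo.get("SwapTotal", 0) > 0


# Illustrative sample only (hypothetical values for a swapless machine).
SAMPLE = """MemTotal:  2051072 kB
MemFree:   1826816 kB
SwapTotal:       0 kB
SwapFree:        0 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    print("swap configured:", has_swap(parse_meminfo(text)))
```

On a diskless system like the one described here, this would be expected to report that no swap is configured.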
From the results attached above, all three test runs on USB have the following log:

USB test:
USB Hub Interface appears to be plugged into bus 1 port 0
USB Hub Interface appears to be plugged into bus 2 port 0
USB Hub Interface appears to be plugged into bus 3 port 0
How many unused USB sockets are there?
response: 0
No USB sockets to test
...finished running ./usb.py, exit code=0

These tests PASS. What indication do you have that the test fails?

Also, please run the threaded memory test directly, like this:

$ cd /usr/share/hts/tests/memory
$ make
$ ./threaded_memtest -qpv -m100% -t10

And attach the output. Thanks!

USB test: I answered '0' to "How many unused USB sockets are there?" because if I answer 3, it asks me to unplug the device. As Monza said above, it is not possible to unmount these USB devices, so the test repeats indefinitely until I answer 'no':

----
USB test:
USB Hub Interface appears to be plugged into bus 1 port 0
USB Hub Interface appears to be plugged into bus 2 port 0
USB Hub Interface appears to be plugged into bus 3 port 0
How many unused USB sockets are there? 3
response: 3
testing socket 1 of 3...
Please plug in a USB device - continue? (yes|no) yes
response: yes
found device at /sys/devices/pci0005:00/0005:00:01.0/usb1/1-0:1.0
Please unplug the device and hit the Enter key:
Did not confirm the device - repeating test.
testing socket 1 of 3...
Please plug in a USB device - continue? (yes|no) no
response: no
...finished running ./usb.py, exit code=1
recovered exit code=1
hts-report-result /HTS/hts/usb FAIL /var/log/hts/runs/1/usb/output.log
----

###########################################################################

Memory test - We do not have any swap space enabled because we are working over NFS.
1 - I am posting the make result as tmp.a25339
2 - The test result is:

----
[root@cell21c memory]# ./threaded_memtest -qpv -m100% -t10
Warning: memsize > free_mem. You will probably hit swap.
Detected 4 processors.
RAM: 89.1% free (1784M/2003M)
Testing 2003M RAM for 10 seconds using 8 threads:
thread 0: mapping 250M RAM
thread 1: mapping 250M RAM
thread 2: mapping 250M RAM
thread 3: mapping 250M RAM
thread 4: mapping 250M RAM
thread 5: mapping 250M RAM
thread 6: mapping 250M RAM
thread 7: mapping 250M RAM
thread 1: mapping complete
thread 0: mapping complete
thread 5: mapping complete
thread 4: mapping complete
thread 2: mapping complete
Killed

Created attachment 239061 [details]
tmp.a25339 - memory test make log
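
The numbers in the log above suggest why the run ended in "Killed": the test asked for all 2003M of RAM while only 1784M was free and there was no swap to absorb the difference. The arithmetic can be sketched as follows. The function name is hypothetical, and the thread does not say whether a percentage given to -m is taken relative to total or free memory, so both readings are shown for a 90% request.

```python
def memtest_shortfall_mb(requested_mb, free_mb, swap_mb=0):
    """Return how many MB the request exceeds free RAM plus swap (0 if it fits)."""
    return max(0, requested_mb - (free_mb + swap_mb))


# Values from the log line "RAM: 89.1% free (1784M/2003M)" and "using 8 threads".
TOTAL_MB = 2003
FREE_MB = 1784
THREADS = 8

# -m100%: the log shows the full 2003M being tested, split across 8 threads.
per_thread_mb = TOTAL_MB // THREADS
shortfall_100 = memtest_shortfall_mb(TOTAL_MB, FREE_MB)

# 90% under two possible readings (assumption: the thread does not settle this).
shortfall_90_of_total = memtest_shortfall_mb(TOTAL_MB * 90 // 100, FREE_MB)
shortfall_90_of_free = memtest_shortfall_mb(FREE_MB * 90 // 100, FREE_MB)

print(per_thread_mb)          # 250, matching "mapping 250M RAM"
print(shortfall_100)          # 219
print(shortfall_90_of_total)  # 18
print(shortfall_90_of_free)   # 0
```

With no swap, any nonzero shortfall has to come out of memory that is already in use, which is consistent with the process being killed partway through mapping.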
USB test: if there are no unused sockets (meaning physical sockets available for plugging in physical USB devices), then 0 is the correct answer, and the USB test passes. So the memory issue is all that remains open.

Please supply /var/hts/results.xml

Also, please try running threaded_memtest at 90% of free memory, via:

./threaded_memtest -qpv -m90% -t10

Thanks!

Created attachment 242061 [details]
Results of: ./threaded_memtest -qpv -m90% -t10
There is no /var/hts/results.xml. The test is attached.
[root@cell21c memory]# ls /var/hts
config.xml hts-IBM-Cell_QS21-Tikanga_ppc64_results-1.noarch.rpm plan.xml
Please try the revised test suite:

http://people.redhat.com/rlandry/hts/hts-5.1-8.el5.noarch.rpm

Note that you can run the memory test via:

hts certify --test memory

Hi, the memory test now passes when running hts certify --test memory. But the test suite brings the network down without asking me, so the whole system stops, because it is NFS based. The previous version I used (hts-5.1-3.el5.noarch_10_18.rpm) does not have this issue.

Can the logs for this failed run be attached? Presumably they're at least partially complete up until the network interface went down. Also, was this done as a clean run or with an existing plan after an hts upgrade?

Rob, I'm not quite sure about your second question. What do you mean by "with existing plan after an hts upgrade"? Thanks.

To run hts you have to have a plan file. What gets planned can change between releases; however, I believe hts will attempt to use an existing plan file if it finds one. So if hts was just updated and a new plan wasn't created, it's possible that the old plan is the cause of the network shutdown. Unfortunately I'm not the developer on this, so I can't say for certain that's the likely cause, but given the timing I figured I'd at least offer the suggestion to see if it makes any difference, since it should be pretty quick to tell.

"hts clean" will remove all old test runs, and even the old test plan. I don't think that's the issue here, but it's worth a try. The results containing the network failure would be a help, as well as the output of the "mount" command.

Sorry, my mistake: I removed the previous version of hts, installed the new one, and ran the test without creating a new plan. I created a plan and now it asks before turning the network down.

Another issue: I started the server on a QS21 (NFS environment) by running hts server start, but hts cannot mount the supposedly exported directory. Do you know if I can run the server on a QS21?

[root@cell21c ~]# mount -o rw,intr,rsize=12288,wsize=12288,udp cell21e.ltc.austin.ibm.com:/var/hts/export /tmp/mounttest/
mount: cell21e.ltc.austin.ibm.com:/var/hts/export failed, reason given by server: Permission denied
[root@cell21c ~]# mount cell21e.ltc.austin.ibm.com:/var/hts/export /tmp/mounttest/
mount: cell21e.ltc.austin.ibm.com:/var/hts/export failed, reason given by server: Permission denied

No, you can't run the hts server on an NFS root system.

Moving this to MODIFIED, as the memory fix will be released with R10.

Greg, QS21 is an NFS root (diskless) system. We have requested that this be supported in hts so that we can run hardware certification. Please check with Rob. Reopening the bug.

As I understand it, your question in comment #25 is whether the HTS server can be run on the System Under Test. The answer is no. The HTS server must be run on a separate system so that the network interfaces may be tested. For the related question, whether the HTS server can be an NFS root system, the answer is also no. Do I understand the situation correctly?

Greg, thank you for the clarification :) We are testing the latest hts with Greg's suggestions. Will post the results ASAP.

Created attachment 249341 [details]
HTS results
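
Much of the trouble in this thread comes from the root filesystem living on NFS, so taking the network down stops the whole machine. One way a script could detect that situation ahead of time is to check the filesystem type of "/" in /proc/mounts. This is only a sketch under that assumption and is not how hts itself decides anything; the function names and the sample mount entries (including the server name) are hypothetical.

```python
def root_fs_type(mounts_text):
    """Return the filesystem type of the last entry mounted on "/", or None."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == "/":
            fstype = fields[2]  # later entries override earlier ones (e.g. rootfs)
    return fstype


def is_nfs_root(mounts_text):
    """Return True if "/" is mounted over NFS (nfs, nfs4, ...)."""
    fstype = root_fs_type(mounts_text)
    return fstype is not None and fstype.startswith("nfs")


# Hypothetical sample entries for illustration only.
NFS_SAMPLE = """rootfs / rootfs rw 0 0
nfs.example.com:/export/qs21 / nfs rw,vers=3,udp 0 0
proc /proc proc rw 0 0
"""
LOCAL_SAMPLE = """rootfs / rootfs rw 0 0
/dev/sda1 / ext3 rw 0 0
"""

if __name__ == "__main__":
    print(is_nfs_root(NFS_SAMPLE))    # True
    print(is_nfs_root(LOCAL_SAMPLE))  # False
```

A test that is about to bring interfaces down could use a check like this to ask for confirmation first, which is the behaviour the newer plan provided in the run described above.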
All tests passed:

[root@cell21c daniel]# hts print
loaded configuration /var/hts/config.xml
loaded plan /var/hts/plan.xml
loaded results /var/hts/results.xml
Red Hat Hardware Certification test
--------------------------------------------
Test Suite: 5.1
Release: 8
Plan Created: 2007-10-25 20:25:26
Test Server: cell8.ltc.austin.ibm.com
--------------------------------------------
Run: 1 on 2007-10-26 13:26:51
--------------------------------------------
Tests: 5 planned, 5 run, 5 passed, 0 failed
--------------------------------------------
Test Run 1
----------------------------------------------------------------
usb - PASS
network eth1 net_00_1a_64_0e_03_03 - PASS
memory - PASS
core - PASS
info - PASS
Combined Results for 1 Runs:
--------------------------------------------
5 tests planned
5 tests run
0 tests always failed
5 tests always passed

Hi Greg, is there anything else we need to do to have QS21 on the hardware list (https://hardware.redhat.com/)?

Monza, to have the hardware listed, you'll need to open a certification request on hardware.redhat.com, providing the specs for the model separately from this bug. From the above it looks like this bz is resolved by the latest hts package, so I'm going to go ahead and close this bug as well. Presuming the package numbers line up, you may re-use the same results from above as part of the hwcert. -Rob

Thank you for the pointer, Rob :) Opened the following cert request - https://hardware.redhat.com/show.cgi?id=369641
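
For anyone scripting around results like the hts print output earlier in this report, the "Tests: ..." summary line can be picked apart with a regular expression. This is a minimal sketch that assumes the line format shown in that single output; other hts releases may print it differently, and the function names are made up for illustration.

```python
import re

# Pattern modelled on the one summary line in this report (assumed format).
SUMMARY_RE = re.compile(
    r"Tests:\s*(\d+) planned,\s*(\d+) run,\s*(\d+) passed,\s*(\d+) failed"
)


def parse_summary(output):
    """Return the counts from the "Tests:" line as a dict, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    planned, run, passed, failed = (int(g) for g in match.groups())
    return {"planned": planned, "run": run, "passed": passed, "failed": failed}


def all_passed(summary):
    """True only if a summary was found and every planned test ran and passed."""
    return (
        summary is not None
        and summary["failed"] == 0
        and summary["run"] == summary["planned"]
        and summary["passed"] == summary["planned"]
    )


if __name__ == "__main__":
    line = "Tests: 5 planned, 5 run, 5 passed, 0 failed"
    summary = parse_summary(line)
    print(summary)
    print(all_passed(summary))  # True
```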