Bug 1031456

Summary: info test always fails and causes machine halt
Product: [Retired] Red Hat Hardware Certification Program
Reporter: chengjianjun <chengjianj>
Component: Test Suite (tests)
Assignee: Greg Nichols <gnichols>
Status: CLOSED CANTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 1.6.4
CC: chengjianj, gnichols, qcai, rlandry
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-12-02 03:14:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description chengjianjun 2013-11-18 02:57:07 UTC
Description of problem:

The info test cannot be completed, no matter which test I run.
When the screen says "Running plugins. Please wait ...", the system halts.
I have to force a shutdown of the machine.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info: When I ran hwcert-backend plan for the first time on this machine, it reported a different machine type, NF5280, and I had to change it to the machine currently under test, SA5212H2.

Comment 1 chengjianjun 2013-11-18 03:44:43 UTC
I have reinstalled RHEL 6.4 and the test suite, but the info test still does not pass and the system still halts.

Comment 2 Greg Nichols 2013-11-18 16:24:16 UTC
What versions of hwcert-client and sos are installed?

Comment 3 chengjianjun 2013-11-19 02:13:24 UTC
(In reply to Greg Nichols from comment #2)
> What versions of hwcert-client and sos are installed?

hwcert-client version is 1.6.4-57.el6 
sos version?

I installed these packages:

dt-15.14-2.EL6.x86_64 
hwcert-client-1.6.4-57.el6.noarch 
hwcert-client-info-1.6.4-57.el6.noarch 
kernel-debuginfo-2.6.32-358.el6.x86_64 
kernel-debuginfo-common-x86_64-2.6.32-358.el6.x86_64 
lmbench-3.0a7-7a.EL6.x86_64 
stress-0.18.8-1.3.EL6.x86_64

just as I do on other machines.

Comment 4 chengjianjun 2013-11-19 06:04:51 UTC
The sosreport version is 2.2

Comment 5 chengjianjun 2013-11-20 05:28:46 UTC
(In reply to Greg Nichols from comment #2)

Does the test suite read machine information such as vendor, make, and model from the BIOS?

I suspect that the BIOS version is not appropriate.

Comment 6 Rob Landry 2013-11-20 22:19:12 UTC
Does sosreport run correctly on this box when called outside of the testsuite or is the halt reproducible there as well?

Comment 7 chengjianjun 2013-11-25 00:36:25 UTC
(In reply to Rob Landry from comment #6)
> Does sosreport run correctly on this box when called outside of the
> testsuite or is the halt reproducible there as well?

The halt is reproducible as well.

Comment 8 chengjianjun 2013-11-25 00:56:04 UTC
(In reply to Rob Landry from comment #6)
> Does sosreport run correctly on this box when called outside of the
> testsuite or is the halt reproducible there as well?

When I ran sosreport on its own, without the hwcert client, the halt occurred.

Screen said

"Running plugins. Please wait ...

completed  [19/72] ..."

then halted...

Comment 9 Rob Landry 2013-11-25 18:26:40 UTC
So the good and the bad news is you're not fighting with the hwcert test suite as it's reproducible with sosreport alone.  This means the halt is caused by something called inside of sosreport.  

Sosreport is a requirement of hwcert as it is used by RH support to understand the customer environment, and it is a certification blocker as it would be a bad customer experience if their call to support about one issue led to a system halt.

The next steps are to figure out what caused the halt to determine a plan from there.  Unfortunately [19/72] isn't specific enough as it depends on which of the overall available possible plugins was 19.

Using the -v option with sosreport should provide additional context for the sos run and hopefully help identify which plugin ran last.  Using the -n option to disable a suspected plugin and checking whether sosreport then completes, followed by the -o option to enable only the suspected plugin and reproduce the halt, should give you the tools to narrow down which plugin is causing the issue.

Once we know which plugin, that plugin can be inspected to see if we can determine a specific cause and a resolution plan from there.  Most likely a BIOS and/or kernel change would be required.
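
The isolation steps described in this comment can be written down as three sosreport invocations. The following is only a minimal sketch in Python that builds the command lines without running them, since running them on an affected machine may halt it. The function name sos_isolation_commands and the step labels are hypothetical; the only sosreport options used are the -v, -n and -o options mentioned above.

```python
def sos_isolation_commands(suspect):
    """Build (label, argv) pairs for the isolation steps; nothing is executed.

    `suspect` is the name of the plugin suspected of causing the halt.
    The labels are hypothetical names chosen for this sketch.
    """
    return [
        # Step 1: verbose run, to see which plugin was running last.
        ("verbose", ["sosreport", "-v"]),
        # Step 2: disable the suspected plugin; the run should now complete.
        ("skip_suspect", ["sosreport", "-n", suspect]),
        # Step 3: enable only the suspected plugin; the halt should reproduce.
        ("only_suspect", ["sosreport", "-o", suspect]),
    ]


if __name__ == "__main__":
    # Print the command lines so they can be run by hand, one at a time.
    for label, argv in sos_isolation_commands("hardware"):
        print(label + ": " + " ".join(argv))
```

Printing rather than executing is deliberate: step 3 is expected to reproduce the halt, so it is better run by hand once logs from steps 1 and 2 have been saved.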

Comment 10 chengjianjun 2013-11-26 11:32:47 UTC
I ran 'sosreport -l' and found 72 enabled plugins. The 19th plugin is hardware.

Then I ran 'sosreport -n hardware', and it completed.

I will test my machine without the hardware plugin to see if the results are acceptable to the RH cert team.

Otherwise, I will have to change the BIOS version or the kernel, as you said.

I'll report the result later.

Thanks

Comment 11 chengjianjun 2013-11-26 12:06:38 UTC
(In reply to chengjianjun from comment #10)

> I will test my machine without the hardware plugin to see if the results are
> acceptable by RH cert team.


In fact, the options for the sosreport run that starts right after the video test passes do not seem to be changeable.

I can't just skip the hardware plugin while the test is running.

Can the test pause at the start of sosreport so that I can change the options, or can an option be added before running hwcert-backend? I ask because there is a notice saying "Usage: sosreport [options]".

Comment 12 chengjianjun 2013-12-02 03:14:30 UTC
I have solved this problem.
I just deleted the hardware plugin from the directory /usr/lib/python2.6/site-packages/sos/plugins/
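
A reversible variant of this workaround is to move the plugin file out of the plugin directory instead of deleting it, so that it can be restored later. The following is only a sketch under stated assumptions: the helper name set_plugin_aside and the backup directory are hypothetical, and the plugin's file name (for example hardware.py) is an assumption, since the comment above gives only the directory. Byte-compiled companion files, if any, are not handled here.

```python
import os
import shutil


def set_plugin_aside(plugin_dir, plugin_file, backup_dir):
    """Move one plugin file from plugin_dir into backup_dir.

    Returns the new path of the file. Raises IOError if the file is not
    present, so a wrong file name is noticed instead of silently ignored.
    """
    src = os.path.join(plugin_dir, plugin_file)
    if not os.path.isfile(src):
        raise IOError("plugin file not found: " + src)
    # Create the backup directory on first use.
    if not os.path.isdir(backup_dir):
        os.makedirs(backup_dir)
    dst = os.path.join(backup_dir, plugin_file)
    shutil.move(src, dst)
    return dst
```

On the machine from this report, a call might look like set_plugin_aside("/usr/lib/python2.6/site-packages/sos/plugins", "hardware.py", "/root/sos-plugins-disabled"), where both the file name and the backup path are assumptions. Moving the file back restores the original behavior.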

Comment 13 Red Hat Bugzilla 2023-09-14 01:53:50 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days