Bug 787126 - The BIOS has corrupted hw-PMU resources (.............) ERST: Can not request iomem region............RST
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-03 09:00 UTC by Max Novaha
Modified: 2019-10-10 09:02 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-03-19 12:55:29 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot of RBSU (19.35 KB, image/png), 2012-03-19 14:12 UTC, Tony Camuso

Description Max Novaha 2012-02-03 09:00:47 UTC
Description of problem:

After rebooting the system on my HP DL380 G6, I got this error:
The BIOS has corrupted hw-PMU resources (.............) ERST: Can not request iomem region............RST

Version-Release number of selected component (if applicable):
2.6.32-220.4.1.el6.i686

How reproducible:
unknown

Steps to Reproduce:
1. Install CentOS 6.0
2. Update it to CentOS 6.2
3. Reboot the system
  
Actual results:

The BIOS has corrupted hw-PMU resources (.............) ERST: Can not request iomem region............RST

Expected results:

unknown

Comment 2 Prarit Bhargava 2012-02-03 14:09:16 UTC
Hello, this is a BIOS issue.  Please contact your hardware vendor for assistance and/or a BIOS update.

P.

Comment 3 Max Novaha 2012-02-03 18:43:06 UTC
Funny. I wrote to HP technical support. They told me that it is a software problem. I use the latest version of the BIOS. What can I do now? Where can I go?

P.S.: The CentOS community tells me that my problem is like this one: https://bugzilla.redhat.com/show_bug.cgi?id=688547

Comment 4 Max Novaha 2012-02-03 18:58:40 UTC
Funny. I wrote to HP technical support. They told me that it is a software problem. I use the latest version of the BIOS. What can I do now? Where can I go?

P.S.: The CentOS community tells me that my problem is like this one: https://bugzilla.redhat.com/show_bug.cgi?id=688547

Comment 5 Don Zickus 2012-02-06 15:13:27 UTC
(In reply to comment #4)
> Funny. I wrote to HP technical support. They told me that it is a software
> problem. I use the latest version of the BIOS. What can I do now? Where can
> I go?
> 
> P.S.: The CentOS community tells me that my problem is like this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=688547
Hi Max,

Please attach your complete dmesg output.  It is hard to see what the error is referring to based on the small pieces you provided in the description.

Cheers,
Don

Comment 6 Max Novaha 2012-02-06 19:36:41 UTC
Sorry, but I went back to kernel 2.6.23-71. The same problem I have is discussed here:

1. http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02911009
2. https://bugzilla.redhat.com/show_bug.cgi?id=694913
3. https://www.redhat.com/wapps/sso/login.html redirect=https://access.redhat.com/kb/docs/DOC-55869

But HP forgot the Gen 6 and Gen 7 servers, and I don't understand why. I like Red Hat and CentOS, but I can't use the latest version of this OS because everyone forgot about us.

Comment 7 Prarit Bhargava 2012-02-06 19:59:50 UTC
Max, can you attach the output of dmesg to this BZ?  At least we can take a look at your log messages and get an idea of what went wrong.

Thanks,

P.

Comment 8 Tony Camuso 2012-02-07 12:21:53 UTC
The BIOS is using performance counters and is in conflict with perf.

There should be an updated BIOS that fixes the problem.

I am looking into this now.

Comment 10 George Rafaelov 2012-03-18 20:48:22 UTC
Hi guys, I have the same problem on ProLiant DL360 G7. I think I have the latest firmware. 

[root@beekas-03 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
[root@beekas-03 ~]# uname -a
Linux beekas-03.vimpelcom.ru 2.6.32-220.2.1.el6.x86_64 #1 SMP Tue Dec 13 16:21:34 EST 2011 x86_64 x86_64 x86_64 GNU/Linux


Here is a little piece of my dmesg:

Setting APIC routing to physical flat
  alloc irq_desc for 48 on node 0
  alloc kstat_irqs on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
alloc irq_2_iommu on node 0
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Xeon(R) CPU           E5606  @ 2.13GHz stepping 02
Performance Events: PEBS fmt1+, Westmere events, Broken BIOS detected, complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Intel PMU driver.
... version:                3
... bit width:              48
... generic registers:      4
... value mask:             0000ffffffffffff
... max period:             000000007fffffff
... fixed-purpose events:   3
... event mask:             000000070000000f
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node   1, Processors  #1 Ok.
Booting Node   0, Processors  #2 Ok.
Booting Node   1, Processors  #3 Ok.
Booting Node   0, Processors  #4 Ok.
Booting Node   1, Processors  #5 Ok.
Booting Node   0, Processors  #6 Ok.
Booting Node   1, Processors  #7
Brought up 8 CPUs
Total of 8 processors activated (34133.19 BogoMIPS).


How to solve this problem????

Comment 11 Prarit Bhargava 2012-03-19 12:55:29 UTC
(In reply to comment #10)
> Hi guys, I have the same problem on ProLiant DL360 G7. I think I have the
> latest firmware. 
> 
> Performance Events: PEBS fmt1+, Westmere events, Broken BIOS detected, complain
> to your hardware vendor.
> [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)

George, as the message says, this is a HW problem and the fix is to update the BIOS.

I am closing this issue as NOTABUG.

P.
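
For anyone who wants to check a saved dmesg log for this message, here is a minimal Python sketch. It assumes the message format quoted above and that both numbers are hexadecimal; the function name `find_pmu_firmware_bugs` and the sample text are illustrative, not part of any Red Hat tool.

```python
import re

# Matches the kernel message quoted in this bug, for example:
# "[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)"
# Assumption: both numbers are printed in hexadecimal without a 0x prefix.
FIRMWARE_BUG_RE = re.compile(
    r"the BIOS has corrupted hw-PMU resources "
    r"\(MSR ([0-9a-fA-F]+) is ([0-9a-fA-F]+)\)"
)


def find_pmu_firmware_bugs(dmesg_text):
    """Return a list of (msr_address, value) integer pairs found in dmesg text."""
    hits = []
    for line in dmesg_text.splitlines():
        match = FIRMWARE_BUG_RE.search(line)
        if match:
            hits.append((int(match.group(1), 16), int(match.group(2), 16)))
    return hits


# Sample lines copied from comment 10 of this bug.
sample = """\
Performance Events: PEBS fmt1+, Westmere events, Broken BIOS detected, complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Intel PMU driver.
"""

for msr, value in find_pmu_firmware_bugs(sample):
    print(f"MSR {msr:#x} holds {value:#x}")
```

To use it on a real system, save the output of `dmesg` to a file and pass the file contents to the function instead of `sample`.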

Comment 12 Tony Camuso 2012-03-19 14:12:00 UTC
Created attachment 571134 [details]
Screenshot of RBSU

For those of you still encountering this problem even after the latest BIOS update ...

. Go into RBSU. Hit "control A" for the Service Options to appear (see attached screenshot)

. Set the "Processor Power and Utilization Monitoring" EV to Disabled. Reboot the server. This time the BIOS will no longer reserve any performance counters. The "perf" subsystem will not complain about the BIOS in "dmesg".

Comment 13 Don Zickus 2012-03-19 14:43:57 UTC
Hi,

The alternative is to dig through the BIOS options and disable something like CPU monitoring.  HP has an online document that details the steps to have the BIOS disable the use of perf counters.  Unfortunately, I do not know where it is off the top of my head.

HP buries the use of the perf counter to some obscure register, so unless you are performing complex perf analysis you most likely will not run into any data corruption issues with the retrieved data.

Cheers,
Don

Comment 14 George Rafaelov 2012-03-19 15:29:25 UTC
(In reply to comment #12)
> Created attachment 571134 [details]
> Screenshot of RBSU
> 
> For those of you still encountering this problem even after the latest BIOS
> update ...
> 
> . Go into RBSU. Hit "control A" for the Service Options to appear (see attached
> screenshot)
> 
> . Set the "Processor Power and Utilization Monitoring" EV to Disabled. Reboot
> the server. This time the BIOS will no longer reserve any performance counters.
> The "perf" subsystem will not complain about the BIOS in "dmesg".

Tony,
I have tried to disable "Processor Power and Utilization Monitoring" and have the same error ....

Comment 15 Tony Camuso 2012-03-19 16:00:34 UTC
George, 

On some platforms, it is possible that system BIOS does not release the performance counters when you perform this work-around. 

There is a Maintenance BIOS release (Date TBD) that will correct this problem. 

Meanwhile, as Don and Prarit pointed out, it is safe to ignore this message in most applications.

Comment 16 Jim 2014-11-22 22:17:18 UTC
This bug should be reopened and actually given the attention it deserves. I have seen no actual troubleshooting on Red Hat's part to understand this problem.

I am seeing the same issue on my ProLiant DL380 G7 production server, which has been functioning 100% correctly for years. Red Hat recommended I update the kernel to 358 from 220 and now I am seeing this issue, but only after upgrading the kernel.

I have already spoken to HP who simply said to upgrade the BIOS, Firmware, and then talk to Red Hat about the kernel.

1. The BIOS is the latest version.
2. The Firmware is the latest version.

Now what? We cannot simply accept the fact that this is a BIOS/Firmware issue regardless what the error message indicates. It is clearly a problem with the kernel, as the issue only began after upgrading the kernel and we have the latest version of BIOS/Firmware.

So, it's not a bug? At least put in a little effort before closing a case that you do not understand.

Comment 17 Don Zickus 2014-11-24 15:57:23 UTC
Hi Jim,

Red Hat is firmly aware of this problem and so is HP.

The problem is HP (and every other OEM on the market) is 'secretly' using hardware performance counters to monitor hardware so their firmware can adapt their settings dynamically based on the results.

Now, the question is how can an OS reliably use that same hardware if the BIOS is 'secretly' using it?  It can't.  And we have had our performance benchmarks tainted because of this usage.  In addition, broken BIOSes neglect to properly clean up the performance counter registers upon returning from an SMI, leaving the OS with false data to act upon.  Again, very bad.

We developed a test to detect this usage and report it.

HP (and the OEMs) acknowledged this usage years ago and have agreed to create BIOS knobs to disable it.

You need to contact HP for the correct documentation to disable dynamic power control.

We have even contacted Intel for guidance on hardware resource sharing and they do not have a good solution other than standardizing our detection algorithm and use that as indication of hardware sharing.  But they will not release a spec reference to address this (because the recommendation is too hokey).

The end result is that this comes down to the OEM BIOSes.  The OS cannot do anything without a formally communicated spec that details how an OS and a BIOS can share the same exact hardware registers during active use.

Red Hat recommends disabling the BIOS feature and, I believe, so does HP now.

Would closing this as a 'CANT_FIX' instead of 'NOTABUG' be better for you?

Cheers,
Don

Comment 18 Jim 2014-11-24 16:38:51 UTC
Morning Don,

Thank you for your response.

The HP advisory relating to this issue is located at the link below. The instructions to disable "Processor Power and Performance Monitoring" are not a solution, as it clearly states the errors are "avoided." Now, if these errors are causing your system to panic and reboot, avoiding the errors does not work. I have attempted this "solution" anyway, but as expected, it does nothing to solve the problem.

Closing the bug as 'CANT_FIX' would be more honest to customers than 'NOTABUG' as clearly HP and Red Hat are kicking this bug back and forth and neither would like to take responsibility to get it fixed or addressed appropriately.

My question is this:

Do the hardware vendors adhere to the specifications of the OS creators or do the OS creators adhere to the specifications of the hardware manufacturers? 

Given the nature of Linux, it would seem to me that Red Hat would come out with a solution to get around this, even if they have to disable portions of HP hardware. While my experience of this issue points to the OS, since we have only seen this issue since upgrading the kernel, the fact is this is occurring on all the HP systems listed below and only Red Hat Linux and SUSE Linux are the affected operating systems.

Looking at the list below, this indicates that Red Hat 6 does not support the below hardware platforms. So, if closing the bug as 'CANTFIX' is Red Hat's version of adequately addressing this bug, then we will continue to see folks kicking comments back and forth to HP and Red Hat until one of the companies does its job.


Hardware Platforms Affected: HP ProLiant BL280c G6 Server Blade, HP ProLiant BL2x220c G6 Server Blade, HP ProLiant BL2x220c G7 Server Blade, HP ProLiant BL460c G6 Server Blade, HP ProLiant BL460c G7 Server Blade, HP ProLiant BL460c Gen8 Server Blade, HP ProLiant BL465c G6 Server Blade, HP ProLiant BL465c G7 Server Blade, HP ProLiant BL465c Gen8 Server Blade, HP ProLiant BL490c G6 Server Blade, HP ProLiant BL490c G7 Server Blade, HP ProLiant BL495c G6 Server Blade, HP ProLiant BL620c G7 Server Blade, HP ProLiant BL680c G7 Server Blade, HP ProLiant BL685c G6 Server Blade, HP ProLiant BL685c G7 Server Blade, HP ProLiant DL160 G6 Server, HP ProLiant DL160 Gen8 Server, HP ProLiant DL160se G6 Server, HP ProLiant DL165 G7 Server, HP ProLiant DL170e G6 Server, HP ProLiant DL170h G6 Server, HP ProLiant DL180 G6 Server, HP ProLiant DL320 G6 Server, HP ProLiant DL320e Gen8 v2 Server, HP ProLiant DL360 G6 Server, HP ProLiant DL360 G7 Server, HP ProLiant DL360p Gen8 Server, HP ProLiant DL380 G6 Server, HP ProLiant DL380 G7 Server, HP ProLiant DL380p Gen8 Server, HP ProLiant DL385 G6 Server, HP ProLiant DL385 G7 Server, HP ProLiant DL388p Gen8 Server, HP ProLiant DL560 Gen8 Server, HP ProLiant DL580 G7 Server, HP ProLiant DL585 G6 Server, HP ProLiant DL585 G7 Server, HP ProLiant DL785 G6 Server, HP ProLiant DL980 G7 Server, HP ProLiant DL985 G7 Server, HP ProLiant ML150 G6 Server, HP ProLiant ML310e Gen8 v2 Server, HP ProLiant ML350p Gen8 Server, HP ProLiant MicroServer Gen8, HP ProLiant SL160s Gen8 Server, HP ProLiant SL230s Gen8 Server, HP ProLiant SL250s Gen8 Server, HP ProLiant SL270s Gen8 Server
Operating Systems Affected: Red Hat Enterprise Linux (Itanium), Red Hat Enterprise Linux 6 (x86), SUSE Linux Enterprise Server 11 (x86), SUSE Linux Enterprise Server 11 (x86-64)

HP Advisory:
http://h20565.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=4268686&docId=emr_na-c03265132&lang=en&cc=us

Comment 19 Don Zickus 2014-11-24 19:02:21 UTC
(In reply to Jim from comment #18)
> Morning Don,
> 
> Thank you for your response.
> 
> The HP advisory relating to this issue is located at the link below. The
> instructions to disable "Processor Power and Performance Monitoring" are not
> a solution, as it clearly states the errors are "avoided." Now, if these
> errors are causing your system to panic and reboot, avoiding the errors does
> not work. I have attempted this "solution" anyway, but as expected, it does
> nothing to solve the problem.

Hi Jim,

'Avoided' is probably a poor word choice.  Basically the fix is to tell the BIOS to let the OS do power management control.  This stops the BIOS from 'secretly' touching those registers.  That is what those instructions are trying to do.

As for it not working, I am sorry to hear that.  All I can say is Red Hat's kernel is attempting to detect if those performance counters are enabled or not.  If so, that clearly indicates the BIOS is using those registers.

Therefore if the OS detects the registers are enabled, then there isn't anything the OS can do.  We can manually disable them, but a timer interrupt to fire an SMI just re-enables them.  So we aren't accomplishing anything there.
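
As a rough illustration of the kind of check described above, here is a Python sketch. It is not the kernel's code: the function `counters_in_use` and the `msr_values` dictionary (a stand-in for values read from the hardware) are hypothetical. The register numbers and bit positions follow the Intel architectural performance monitoring layout, and the counter counts (4 generic, 3 fixed) are taken from the dmesg output in comment 10.

```python
# Register layout from Intel architectural performance monitoring.
EVENTSEL0 = 0x186            # IA32_PERFEVTSEL0; selector i lives at EVENTSEL0 + i
EVENTSEL_ENABLE = 1 << 22    # EN bit of an event-select register
FIXED_CTR_CTRL = 0x38D       # IA32_FIXED_CTR_CTRL, 4 control bits per fixed counter


def counters_in_use(msr_values, num_generic=4, num_fixed=3):
    """List counters that look enabled before the OS has programmed them.

    msr_values is a hypothetical {msr_address: value} mapping that stands in
    for reading the registers at boot.
    """
    in_use = []
    for i in range(num_generic):
        if msr_values.get(EVENTSEL0 + i, 0) & EVENTSEL_ENABLE:
            in_use.append(f"generic{i}")
    fixed = msr_values.get(FIXED_CTR_CTRL, 0)
    for i in range(num_fixed):
        # The low two bits of each 4-bit field enable the counter.
        if (fixed >> (4 * i)) & 0x3:
            in_use.append(f"fixed{i}")
    return in_use


# The value reported in comment 10: "MSR 38d is 330".
print(counters_in_use({FIXED_CTR_CTRL: 0x330}))
```

Under these assumptions, the value 330 from comment 10 decodes to fixed counters 1 and 2 being enabled, which is the kind of state that makes the kernel print the firmware bug message.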

This is really an HP problem.  HP knows this.  All the folks I have dealt with at HP recognize this. 


> 
> Closing the bug as 'CANT_FIX' would be more honest to customers than
> 'NOTABUG' as clearly HP and Red Hat are kicking this bug back and forth and
> neither would like to take responsibility to get it fixed or addressed
> appropriately.

Fair enough.


> 
> My question is this:
> 
> Do the hardware vendors adhere to the specifications of the OS creators or
> do the OS creators adhere to the specifications of the hardware
> manufacturers? 

We software folks are at the whim of the hardware vendors.  Programming software would be soooo much easier if we could force them to fix their bugs and design things how we think they should be designed. :-)  But we can't.  We have to write quirks and such.


> 
> Given the nature of Linux, it would seem to me that Red Hat would come out
> with a solution to get around this, even if they have to disable portions of
> HP hardware. While, my experience of this issue points to the OS, since we
> have only seen this issue since upgrading the kernel; the fact is this is
> occurring on all the HP systems listed below and only RedHat Linux and SUSE
> Linux are the affected operating systems.

We can't.  We literally can't.  Trust me, the holy grail of open source would be to disable SMIs as they cause all sorts of hair pulling issues when returning back to the OS.  We can mask the 'error', but then we are stuck explaining to customers why their performance data is inconsistent or why their system is randomly panic'ing.

All we can do is ask the OEM vendors to provide a mechanism and document to disable the SMIs on their system that is using these registers in their BIOS options.


> 
> Looking at the list below, this indicates that Red Hat 6 does not support
> the below hardware platforms. So, if closing the bug as 'CANTFIX' is Red
> Hats version of adequately addressing this BUG, then we will continue to see
> folks kicking comments back and forth to HP and RedHat until one of the
> companies does their job.


We can fully support these systems.  It just has some limitations.  But AFAIK HP is fully aware of this and fully endorses disabling their BIOS features to run RHEL.  After dealing with this for years, I have heard only a few customers complain about this.  So the assumption was the document is good enough or HP is shipping systems with this disabled by default.

Honestly, you are the first person I have heard raise this problem in over two years.

Cheers,
Don

Comment 20 Jim 2014-11-26 16:11:16 UTC
Happy Thanksgiving weekend all.


Thank you again for your response, as your argument emits sympathy for the software folks. Although, why doesn't any other operating system have this problem? This issue has not been reported on Windows, HP-UX, Solaris, or any other Unix variant. Only RHEL and SUSE. 

Also, a quick internet search will show numerous results on personnel having this same error in recent times. It is located on a CentOS 7 forum even which provides nervousness for future Linux compatibility prospects. 

I have been in constant contact with HP and they are aware of our situation, as the error is also being seen on a CentOS 6.5 platform running Asterisk. 

Hopefully, a solution can be found by HP to better remove utilization of these resources, but the fact still remains that the bug only exists on Linux server variants (RHEL, SUSE, CentOS).

Thank you for the information and honestly, I really appreciate it and have learned a lot.

Enjoy the holiday weekend.


Best Regards,
Jim Carpenter, MISM
System Administrator

Comment 21 Valent Turkovic 2015-06-05 12:54:14 UTC
I can confirm that this issue is still present on both latest CentOS 7 and Fedora 22.

After disabling "Processor Power and Utilization Monitoring." in BIOS system boots.

Interesting thing is that this issue doesn't prevent Fedora 22 Live image from booting, only once it is installed to hard drive this becomes an issue.

Comment 22 Don Zickus 2015-06-05 13:33:09 UTC
Hi,

Can you re-explain what the problem is?  This bz was originally opened based on
some warnings about performance counters.  Those counters have zero effect on 
a running system and only come into play if you are relying on the hardlockup
detector to monitor your system for lockups or if you are using the perf tool
to analyze performance of your system.

Outside of those scenarios, there is no impact on the running OS.  So if you are
seeing something strange, you will have to describe it again for me and provide the
output of a dmesg log.

Thanks!

Cheers,
Don
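
To see whether the hard lockup detector mentioned above (the NMI watchdog, which the dmesg in comment 10 reports as taking one hw-pmu counter) is active on a given system, a small Python sketch like the following may help. It assumes the standard `/proc/sys/kernel/nmi_watchdog` sysctl file is present; the helper names are illustrative.

```python
from pathlib import Path

# Standard sysctl file for the NMI watchdog; contains "1" when enabled.
NMI_WATCHDOG_PATH = Path("/proc/sys/kernel/nmi_watchdog")


def parse_nmi_watchdog(text):
    """Interpret the sysctl file contents: True if the watchdog is enabled."""
    return text.strip() == "1"


def nmi_watchdog_enabled(path=NMI_WATCHDOG_PATH):
    """Return True or False, or None if the file cannot be read."""
    try:
        return parse_nmi_watchdog(path.read_text())
    except OSError:
        return None


state = nmi_watchdog_enabled()
print({True: "enabled", False: "disabled", None: "unknown"}[state])
```

If this reports "disabled" and you are not running the perf tool, the shared counters described in this bug should not matter for your workload.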

Comment 23 greg 2015-06-24 07:25:42 UTC
I have done some troubleshooting. The issue seems to be caused by the HP bios performance counters, but exist in redhat after that.

I have an install of Redhat 7 on a SEAGATE 1TB USB HDD on a HP microserver gen8. The issue occurred after a number of reboots. I plugged it in a Dell server and HP desktop afterwards, and the issue stays!


********************************************************
Server: HP Proliant Microserver gen8
OS: Redhat 7.

HP advisory: 
http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03265132

redhat:
https://bugzilla.redhat.com/show_bug.cgi?id=787126

 
situation before the issue occurred:
- the server came through post without issues. 
- The grub bootloader worked correctly
- during init, the message: "[Firmware Bug]: The BIOS Has Corrupted Hw-PMU Resources"
was quickly displayed. The OS loaded successfully

situation when the issue occurs:
- the server comes through post without issues.
- the grub bootloader worked correctly
- during init, the message "[Firmware Bug]: The BIOS Has Corrupted Hw-PMU Resources" is displayed. The operating system hangs.
- when I press CTRL ALT F2/3/.. I can change the console. I can then safely login on the command line. But console 1 (runlevel 5) keeps hanging.
- systemctl did not start the necessary services. There is networking. I am not able to change runlevel because redhat 7 works differently than the previous releases.

troubleshooting done:
- checked in bios for: "Processor Power and Utilization Monitoring", this seems not to be present. 
- reseated CMOS battery (clearing NVRAM was not really possible as I had to take out the systemboard to do this. Microservers are quite small)
- downgraded bios. issue stays.
- used backup rom. issue stays.
- the disk that is used is USB 1TB SEAGATE. 
* tried the disk in a Dell server. The same issue occurs (the message is somewhat different but the same situation occurs)
* tried the disk in an HP desktop. The same issue occurs (the message is somewhat different but the same situation occurs)

I will further troubleshoot the OS to see how this can be solved. I will flash/upgrade/downgrade the kernel, perform updates and test.


The issue seems to be caused by the HP bios performance counters, but is then an issue in redhat. The same issue occurs when connecting the same drive to other devices.

The issue can be prevented by following the HP advisory, but cannot be solved with these actions once it has occurred.
http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03265132

Comment 24 greg 2015-06-25 07:30:13 UTC
Some very interesting updates that I would not expect.

- I downgraded the kernel from: 
3.10.0-229.7.2.el7 to 3.10.0-229.4.2.el7 and to 3.10.0-229.1.2.el7
* the issue occurred at all times.

- I passed init=/sysroot/bin/sh to the grub bootloader, the issue did not occur.

- When the issue occurs and I see: [Firmware Bug]: The BIOS Has Corrupted Hw-PMU Resources. 
* I press ctrl alt F2 to change console
* I type: startx and the GUI loads.
* I find many errors about the GUI ( I know I should give you the exact errors, but I didn't copy them ).
* Graphical.mode is loaded. but almost no services are started. There is networking but that is basically it. 
* firewall is inactive: panic mode error when loading this from the GUI as root.
* necessary configured ports in systemctl not open.
* server not reachable through SSH.

- I disabled SElinux: same situation occurs.

- I tried to uninstall GNOME. 
* yum groupremove "GNOME" failed
* yum groupremove @gnome failed. 
* yum groupremove "X Window System" failed.
* yum groupremove "Server with GUI" failed.
* yum group remove "GNOME Desktop" went successful.
* yum groupremove "Server with GUI" failed. 
- GNOME was still installed. after a reboot GNOME loaded successfully.

Any advice on how to reinstall this? 

- I installed KDE after and can manually remove everything related to GDM. I can then set the display manager to KDE.

Anyway I set systemctl panic.mode and tried to boot to runlevel one via a reboot. Now there is a very strange situation:

- from grub I come in the 
[Firmware Bug]: The BIOS Has Corrupted Hw-PMU Resources error. 
- after a few seconds the GUI starts to load.
- after I come in the single user mode. Press ctrl -D to continue. (cli bash usually starts after that)
- but when I press ctrl-D, the Graphical loads again and stays loading. It will never pass this. I see the Graphical screen with the flashing small circle that indicates that the GUI is loading.


I was wondering if:
- I just have 2 issues at the exact same time??? Since the corrupted bios fw error occurred, I have this GNOME issue.
- Are these 2 symptoms related? If not, it is very strange that they only occur together and not at the same time. I did not do anything to GNOME previously and it worked just perfect.
- Why can't I uninstall GNOME?

- I am determined to run more tests and finally find out. (if I do not mess up my system before that :)

Comment 25 Don Zickus 2015-06-26 13:41:55 UTC
Hi,

The BIOS corrupt issue has no impact on your ability to run a normal system.
So the GNOME issues you are seeing are unrelated.

As for the HP advisory, when in the BIOS menu, if you press 'Control-A'  does
power utilization option suddenly appear for you?  It did for me on a similar
HP machine and I could continue to follow the instructions and get the
BIOS corrupt message to disappear.  But, yes, without hitting 'Control-A', I
was stumped too.

As for the Dell box and other HP box, all OEMs use the performance counters and
generate the BIOS corrupt error message.  However, each OEM has a different set of
instructions to disable it.   You have found HP's, I do not know off the top
of my head what Dell's is.

This has been a problem for 5 or more years now.  Most OEMs have either
recommended the BIOS option to turn off power utilization when running RHEL
or switched to using a different performance counter, such that the error
message stays around but rarely impacts the hardlockup counter and perf tool.

Again the BIOS corrupt message describes rarely used hardware registers as
being shared by the BIOS (which it is not supposed to be).  It has nothing
to do with any desktop related application (or any kernel drivers other than
oprofile).

Cheers,
Don

Comment 26 greg 2015-06-29 08:47:46 UTC
yes you are right. After tests I could see that it was unrelated. It was very coincident that both issues appeared at the same time. 

But I think that we could draw the conclusion that this is an actual issue in redhat (it might be caused by the hp bios).

When the issue occurs on my HP server, and the issue stays when plugging it in to other devices, the issue should be with the OS. Only a reinstall solved the problem.

The issue is that I see the firmware bug errors. The OS does not fully boot. I can only come into the OS by changing console (ctrl-alt-F1) and login while many services are not loaded. 

What is your reaction on the fact that the issue should be with the OS as it occurs in a Dell server as well (while a clean install would not give the error in a Dell server). The issue travels with the disk to multiple devices.

Comment 27 Don Zickus 2015-06-29 15:53:31 UTC
Hi Greg,

I am a little confused by your reply.  I am focusing on the BIOS/Firmware corrupt problem displayed by the console log.  The other issues seem to be application specific which is outside the kernel domain, you would have to file a separate bug for those issues.

We can continue talking about the Firmware bug, but again there is not much Red Hat can do.  The HP and Dell BIOSes silently use hardware without telling the OS and modify registers without consent from the OS.  

Is it OK for a good friend of yours to break into your house, take stuff, and use the excuse (after getting caught) that because you are friends he can take whatever he wants?  I don't think so.  And neither does the OS.  There has to be communication and there is none right now.

Regarding the fact that you can re-install and the issue goes away.  What issue?  The application problem or the firmware corrupt issue?

Cheers,
Don

Comment 28 greg 2015-06-29 16:38:57 UTC
not sure who is the one that is causing confusion.

I will make it clear

I am talking about the error that the redhat OS displays after boot.

- the server comes through post without issues.
- the grub bootloader worked correctly
- during init, the message "[Firmware Bug]: The BIOS Has Corrupted Hw-PMU Resources" is displayed. The operating system hangs.
- when I press CTRL ALT F2/3/.. I can change the console. I can then safely login on the command line. But console 1 (runlevel 5) keeps hanging.


************************
- issue occurred in HP Microserver gen8
- I move the disk to Poweredge R420. The issue occurs but the message is somewhat different.
- I move the disk to HP desktop, the issue occurs but there is no message (the exact same situation occurs).

*********************
Please explain why this is not an issue with the redhat OS. How could you still sell this to customers? 

Please explain whether you troubleshot it this way. What were the results?

Comment 29 Don Zickus 2015-06-30 14:33:28 UTC
Hi,

This bug has been focused on the BIOS message.  What you are seeing is an
application hang in runlevel 5.  That is outside the scope of this bugzilla.

Please file another bz and assign it to the desktop team.

The reason the BIOS message changes when moving from box to box, is the
firmware is configured differently on different platforms.  The Dell box
is probably using an Intel cpu which causes the message to be slightly
different but has the same meaning.  The other HP server probably has
the firmware configured to allow the OS power control, hence you do not
see this message.

Cheers,
Don

Comment 30 greg 2015-06-30 16:32:06 UTC
Thanks for your reply. 

What I tried to point out was that the BIOS message occurs on different machines with different BIOSes; that was why I expected an issue with the OS and not with the BIOS.

I have reinstalled the OS now, so it is not possible to troubleshoot further.

It didn't occur when passing init=/sysroot/bin/sh to the bootloader.

Thanks for your help and your advice.

