Bug 516652 - need script to set kernel parameters that fix clock drift
Summary: need script to set kernel parameters that fix clock drift
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: ktune
Version: 5.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Thomas Woerner
QA Contact: BaseOS QE
URL:
Whiteboard:
Depends On:
Blocks: 5.4, TechnicalNotes 517365 518039
Reported: 2009-08-10 21:53 UTC by Alan Zarembok
Modified: 2023-09-14 01:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Customers running RHEL5.4 as a kvm guest need to run the script we are creating here to correctly set kernel parameters, in order to avoid issues with clock skew.
Clone Of:
: 517365 (view as bug list)
Environment:
Last Closed: 2010-03-30 08:23:39 UTC
Target Upstream Version:
Embargoed:


Attachments
Script to set kernel parameters that fix clock drift. (334 bytes, application/x-shellscript)
2009-08-13 13:44 UTC, Thomas Woerner
Script to set kernel parameters that fix clock drift. (346 bytes, application/x-shellscript)
2009-08-13 14:05 UTC, Thomas Woerner
New version of script to fix clock drift. (1.83 KB, application/x-shellscript)
2009-08-18 13:35 UTC, Thomas Woerner
Script to set kernel parameters that fix clock drift. (1.77 KB, application/x-shellscript)
2009-08-18 14:41 UTC, Thomas Woerner
Script to set kernel parameters that fix clock drift. (1.95 KB, application/x-shellscript)
2009-08-18 15:19 UTC, Thomas Woerner
Script to set kernel parameters that fix clock drift. (2.25 KB, application/x-shellscript)
2009-08-19 15:49 UTC, Thomas Woerner
Latest fix_clock_drift.sh version (2.27 KB, application/x-shellscript)
2009-12-15 13:07 UTC, Thomas Woerner


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2010:0238 0 normal SHIPPED_LIVE ktune bug fix and enhancement update 2010-03-29 12:40:21 UTC

Description Alan Zarembok 2009-08-10 21:53:51 UTC
Description of problem:

RHEL guests running on KVM in RHEL5.4 or RHEV are experiencing issues that cause their clocks to drift (e.g. BZ 507834).  A fix is being worked on (BZ 476075), but in the interim there are some kernel parameters that can be set to work around the problem.  We would like to provide customers with one script that they can run to set all recommended parameters.

The exact parameters and values are being tested by engineering and QA at this time, and will be provided here as soon as they are available.



Comment 1 Alan Zarembok 2009-08-11 17:29:08 UTC
The kernel parameters to be used for RHEL5 guests are as follows:

64-bit guests:

divider=10 notsc

32-bit guests:

divider=10 clocksource=acpi_pm

We are also discussing the use of one other parameter, lpj=n, but since n is going to be different on every system it is unlikely we could automate this.
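
A minimal sketch of how a script might pick between the two parameter sets listed above. The function name `clock_params_for_arch` is hypothetical and is not taken from the attached scripts; the sketch assumes the architecture string comes from `uname -m`.

```shell
#!/bin/sh
# Hypothetical helper: map a machine architecture to the workaround
# parameters listed in this comment. lpj=n is left out on purpose,
# because n differs from system to system.
clock_params_for_arch() {
    case "$1" in
        x86_64)
            echo "divider=10 notsc" ;;
        i386|i486|i586|i686)
            echo "divider=10 clocksource=acpi_pm" ;;
        *)
            # No recommendation in this bug for other architectures.
            return 1 ;;
    esac
}

# Example: print the parameters for the running system.
clock_params_for_arch "$(uname -m)" || echo "unsupported architecture" >&2
```

Keeping the mapping in one function means the recommended values only need to change in one place if engineering revises them.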

Comment 2 Thomas Woerner 2009-08-12 14:15:51 UTC
Are these parameters to be set for every RHEL-5 guest or are these to be enabled by the user/admin? Is there a way to determine if the system is a RHEL-5 guest?

BTW: ktune is an optional package and it is not turned on by default.

Comment 3 Alan Zarembok 2009-08-12 15:32:47 UTC
It is not required to automatically run this on a guest--the intent was to direct customers to run the script themselves.

That said it would be good to have a couple of basic sanity checks in place, namely:

1) That the system is actually a kvm guest.
2) That they don't have the pv clock driver installed (which would eliminate the need for these parameters).

On 5.4, you can tell that you have a kvm guest by looking at certain information that is passed to the guest's smbios, and is viewable using dmidecode.  Under "System Information", the manufacturer will be set to "Red Hat".
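
A rough sketch of the dmidecode check described above, assuming `dmidecode` is installed and the script runs as root. The helper name `is_redhat_vendor` is hypothetical; `dmidecode -s system-manufacturer` prints the manufacturer string from the "System Information" section.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given SMBIOS manufacturer string
# starts with "Red Hat", the value described above for KVM guests.
is_redhat_vendor() {
    case "$1" in
        "Red Hat"*) return 0 ;;
        *)          return 1 ;;
    esac
}

# dmidecode needs root and may not be installed, so guard the call.
if command -v dmidecode >/dev/null 2>&1; then
    vendor=$(dmidecode -s system-manufacturer 2>/dev/null)
    if is_redhat_vendor "$vendor"; then
        echo "SMBIOS manufacturer suggests a KVM guest"
    else
        echo "SMBIOS manufacturer is '${vendor}', probably not a KVM guest"
    fi
fi
```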

Glauber...is there a way to tell if we're on a system that has the pv clock installed?

Comment 4 Alan Zarembok 2009-08-12 18:20:12 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Customers running RHEL5.4 as a kvm guest need to run the script we are creating here to correctly set kernel parameters, in order to avoid issues with clock skew.

Comment 5 Thomas Woerner 2009-08-13 13:44:07 UTC
Created attachment 357321 [details]
Script to set kernel parameters that fix clock drift.

Proposed shell script.

Comment 6 Thomas Woerner 2009-08-13 14:05:11 UTC
Created attachment 357325 [details]
Script to set kernel parameters that fix clock drift.

Proposed shell script with small enhancement to only fix kernel command lines and no other lines containing the word kernel.

Are the architectures o.k.?

Comment 7 Alan Zarembok 2009-08-13 14:35:22 UTC
I believe the arch could only be x86_64 for a 64-bit rhel guest.  Dor, could you confirm, and have a look at the attached script and see if it looks ok?

Comment 8 Alan Zarembok 2009-08-13 14:38:06 UTC
Also can we do the verification mentioned in comment #3?

Comment 9 Thomas Woerner 2009-08-13 14:45:42 UTC
dmidecode is a default package in the system-tools group in RHEL-5 and therefore not part of a default installation. Using it is therefore not possible without adding a new requirement.

Please let me know if there is a way to determine if the system is a kvm guest in a default installation.

Comment 10 Andrew Cathrow 2009-08-13 16:30:56 UTC
You could look at parsing the output of hal from hal-device or using smbios via python, have a look at hardware.py in /usr/share/rhn/up2date_client


And re:#7 we should only be looking for x86_64 not the other archs

Comment 11 Andrew Cathrow 2009-08-13 16:36:16 UTC
My initial thoughts are 

1. Ensure that we're running on a KVM guest
(use dmidecode/smbios/hal)

2. Make sure that the parameters aren't already set in grub.conf
(might want to look at augeas's grub lens?)

3. Make sure that we don't set this option on Xen kernels - only bare metal kernels

4. Verify that we've not got the kvm PV clock.
Might want to check with Glauber about best way but you can look for the pv clock in
/sys/devices/system/clocksource/clocksource0/available_clocksource

5. Only support x86 and x86_64

6. Back up the old grub entry so we don't kill the box if there's a problem; I suspect augeas is the best approach here

I'm sure there'll be more ;-)

Comment 12 Alan Zarembok 2009-08-13 17:38:57 UTC
(In reply to comment #9)
> dmidecode is a default package in the system-tools group in RHEL-5 and
> therefore not part of a default installation. Using it is therefore not
> possible without adding a new requirement.
> 
> Please let me know if there is a way to determine if the system is a kvm guest
> in a default installation.  

Can you look at how rhn_register is doing it?  They are parsing smbios somehow,
because they are using these same fields to verify floating guest entitlements.
 If you need more information on this, I can put you in touch with the people
who did the rhn implementation, just let me know.

Comment 13 Thomas Woerner 2009-08-14 11:35:18 UTC
Andrew:

1) Please see comments for Alan below.

2) Might it be enough to check for the divider kernel parameter? augeas is not part of the RHEL-5 distribution, therefore not usable.

3) See 1) and what is a "bare metal kernel"?

4) I just installed RHEL-5-Server-x86_64 as a KVM guest on my F-11 machine:
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource
  jiffies

5) OK

6) A backup for every grub entry in grub.conf could bloat the file.
   For example: On my EL-5 server are currently 6 boot entries, because of 
   installing updates.
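
One possible alternative to duplicating every boot entry is a file-level backup taken before grub.conf is edited. This is only a sketch; the helper name `backup_file` and the timestamped `.bak` naming are assumptions, not something the attached scripts are known to do.

```shell
#!/bin/sh
# Hypothetical helper: copy a file to <name>.<timestamp>.bak, preserving
# mode and timestamps, and print the backup path on success.
backup_file() {
    src=$1
    dst="${src}.$(date +%Y%m%d%H%M%S).bak"
    cp -p "$src" "$dst" && echo "$dst"
}

# Intended use (not executed here):
#   backup_file /boot/grub/grub.conf
```

This keeps grub.conf itself the same size, at the cost of requiring a rescue boot to restore the copy if the edited file turns out to be unbootable.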

Comment 14 Thomas Woerner 2009-08-14 11:35:44 UTC
Alan:

I have had a look at rhn* packages in current RHEL-5. There are only checks for xen, but none for KVM/QEMU.

Checking the lshal output in the RHEL-5 KVM guest shows these QEMU specific settings:

  system.firmware.version = 'QEMU'
  system.firmware.vendor = 'QEMU'
  system.bios.version = 'QEMU'
  system.bios.vendor = 'QEMU'

Which one is ok to be used with all RHEL-5 versions and will not change? Are these the same for QEMU without KVM?
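
For illustration, a check against the lshal output could look like the sketch below. It assumes lshal prints the keys in the form quoted above, and the helper name `hal_reports_qemu` is hypothetical. Whether the vendor keys are stable across RHEL-5 versions is exactly the open question, so this only shows the mechanics.

```shell
#!/bin/sh
# Hypothetical helper: read lshal-style output on stdin and succeed if
# the firmware or BIOS vendor is reported as 'QEMU'.
hal_reports_qemu() {
    grep -q -E "^[[:space:]]*system\.(firmware|bios)\.vendor = 'QEMU'"
}

# lshal may be missing or HAL may not be running, so guard the call.
if command -v lshal >/dev/null 2>&1; then
    if lshal 2>/dev/null | hal_reports_qemu; then
        echo "HAL reports QEMU firmware"
    fi
fi
```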

Comment 15 Alan Zarembok 2009-08-14 19:10:28 UTC
Thomas,

System.vendor should appear in lshal too, correct?  This should be set to "Red Hat" on a kvm guest.

James, can you tell us where to find the RHN code in RHEL5.4 that checks to see if we're on a KVM guest?

Comment 16 Dor Laor 2009-08-16 12:22:08 UTC
(In reply to comment #7)
> I believe the arch could only be x86_64 for a 64-bit rhel guest.  Dor, could
> you confirm, and have a look at the attached script and see if looks ok?  

The current additions are good. Since kvm supports only x86/i386, it is enough to test these architectures.

There is a guest kernel command line missing - lpj=n.
It is not trivial to put it, although a script can take it from the host: 
http://cleo.tlv.redhat.com/qumrawiki/KVM/TimeKeeping

The lpj parameter is less critical for time keeping and more for safe boot.

Comment 18 Thomas Woerner 2009-08-17 14:42:54 UTC
(In reply to comment #16)

How can a script running in the guest take the lpj=n parameter from the host?

Comment 23 Andrew Cathrow 2009-08-18 13:07:27 UTC
Glauber, what's the best way to know we've got the new pvclock and it's being used - will it show up as the clocksource under system/devices?

Comment 24 Thomas Woerner 2009-08-18 13:35:05 UTC
Created attachment 357798 [details]
 New version of script to fix clock drift.

Comment 25 Glauber Costa 2009-08-18 14:11:49 UTC
(In reply to comment #23)
> Glauber, what's the best way to know we've got the new pvclock and it's being
> used - will it show up as the clocksource under system/devices?  

For i386, yes. cat /sys/devices/system/clocksource/clocksource0/current_clocksource will show you.

In x86_64, this file exists, but it is bogus, probably just a placeholder from generic code. The way to be sure we are using it is to check dmesg for a message that looks like this:

  time.c: Using <yyy> MHz WALL KVM GTOD KVM timer

Comment 26 Thomas Woerner 2009-08-18 14:41:41 UTC
Created attachment 357808 [details]
Script to set kernel parameters that fix clock drift.

New check for KVM timer.
Better check for clock drift fixed in the kernel command line.
Moved checks below message and question.

Comment 29 Thomas Woerner 2009-08-18 15:19:41 UTC
Created attachment 357819 [details]
Script to set kernel parameters that fix clock drift.

Final version with text changes and a fix for vendor detection.

Comment 30 Glauber Costa 2009-08-19 10:57:14 UTC
if dmesg | grep -q "time\.c:.*MHz WALL KVM GTOD KVM timer"; then
    echo "ERROR: KVM timer already in use, exiting."
    exit 1
fi

Please note this will only work in RHEL5 x86_64. In i386, we have to proceed with another check.

Comment 31 Thomas Woerner 2009-08-19 12:42:14 UTC
Which check is needed for i686?

Comment 32 Glauber Costa 2009-08-19 12:47:35 UTC
See comment #25:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource must return kvm-clock

Unfortunately, at the time we froze the RHEL5 kernel, i386 already had the clocksource infrastructure in place, while x86_64 did not.
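
Putting the two checks from this thread together, a per-architecture test might be sketched as follows. The function name `kvm_clock_in_use` and its argument order are hypothetical; taking the dmesg output and the clocksource path as file arguments is only a design choice so the logic can be exercised outside a guest.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the KVM pv clock appears to be in use.
#   $1  architecture as printed by uname -m
#   $2  file holding dmesg output (used on x86_64)
#   $3  path to current_clocksource (used on i386)
kvm_clock_in_use() {
    arch=$1
    dmesg_file=$2
    clocksource_file=$3
    case "$arch" in
        x86_64)
            # x86_64: the sysfs file is not reliable, look in dmesg.
            grep -q "time\.c:.*MHz WALL KVM GTOD KVM timer" "$dmesg_file" ;;
        i?86)
            # i386: current_clocksource must report kvm-clock.
            [ -r "$clocksource_file" ] &&
                [ "$(cat "$clocksource_file")" = "kvm-clock" ] ;;
        *)
            return 1 ;;
    esac
}

# Intended use inside a guest (not executed here):
#   dmesg > /tmp/dmesg.txt
#   kvm_clock_in_use "$(uname -m)" /tmp/dmesg.txt \
#       /sys/devices/system/clocksource/clocksource0/current_clocksource
```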

Comment 33 Thomas Woerner 2009-08-19 14:22:15 UTC
Do we need a fix for i686 then at all?

Comment 34 Glauber Costa 2009-08-19 14:42:28 UTC
I don't see your script checking for it anywhere. So unless it is done somewhere else, I'd say yes.

Comment 35 Thomas Woerner 2009-08-19 15:49:27 UTC
Created attachment 357956 [details]
Script to set kernel parameters that fix clock drift.

Added i686 specific clocksource test for kvm-clock.

Comment 36 Andrew Cathrow 2009-09-08 14:58:46 UTC
Is the checking for an existing command line correct?

grep -q -E "^[[:space:]]+kernel.*(lpj=)|(divider=)|(notsc)" "${GRUB_CONF}"; then

- Don't we also need to check for clocksource= 
- Also this looks for any entry not just the default/active one - not sure if that's correct.


I've not tested this, but looking at the script, won't the sed line actually edit any kernel line, including Xen entries, and make them unusable?


The text "ERROR: This is no kvm guest, exiting." should be changed to "ERROR: This is not a kvm guest, exiting."


When we push out the new kernel with pv clock (bz 520685), how will we handle updating the configuration to support the pvclock?
We'd at least need to remove the new entries, e.g. notsc.
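
To make the two concerns above concrete: in an extended regular expression, alternation has the lowest precedence, so the quoted pattern matches any line containing `divider=` or `notsc`, not only kernel lines. The sketch below groups the alternatives, adds `clocksource=`, and skips kernel lines that mention xen when appending parameters. Both function names are hypothetical, and the sketch assumes Xen entries can be recognised by "xen" appearing on the `kernel` line, which may not hold for every grub.conf. It still considers every non-Xen entry, not just the default one.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any kernel line in the given grub.conf
# already carries one of the clock-related parameters.
has_clock_params() {
    grep -q -E "^[[:space:]]*kernel[[:space:]].*(lpj=|divider=|notsc|clocksource=)" "$1"
}

# Hypothetical helper: print the file with the parameters in $2 appended
# to every kernel line that does not mention xen. The input file is not
# modified, and $2 must not contain a slash.
add_clock_params() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/{
/xen/!s/[[:space:]]*$/ '"$2"'/
}' "$1"
}

# Intended use (not executed here):
#   has_clock_params /boot/grub/grub.conf ||
#       add_clock_params /boot/grub/grub.conf "divider=10 notsc"
```

Writing the result to standard output rather than editing in place leaves it to the caller to review the change before replacing grub.conf.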

Comment 37 Thomas Woerner 2009-09-30 08:40:34 UTC
Yes, the sed line will edit every kernel line. There is no grub.conf parser, which can be used by a simple bash script.

Comment 38 Dor Laor 2009-09-30 09:19:32 UTC
The script needs to be enhanced now since pvclock is supported only on stable_tsc hosts. You can detect it by reading the /proc/cpuinfo flags. More details on https://bugzilla.redhat.com/show_bug.cgi?id=520685#c26

Comment 39 Thomas Woerner 2009-10-22 14:57:20 UTC
Please explain in detail what needs to be changed in the fix_clock_drift.sh script.

Some questions:

- What does "all new host, > AMD rev-F" stand for? How to detect it?
  Is there stable_tsc or constant_tsc in /proc/cpuinfo?
- You really want to modify /etc/rc.sysinit?
- What has to be done and under which circumstances?

Comment 40 Dor Laor 2009-12-14 22:17:45 UTC
(In reply to comment #39)
> Please explain in detail what needs to be changed in the fix_clock_drift.sh
> script.

It looks good.


> 
> Some questions:
> 
> - What does "all new host, > AMD rev-F" stand for? How to detect it?
>   Is there stable_tsc or constant_tsc in /proc/cpuinfo?

only constant_tsc in /proc/cpuinfo.
The problem is that even if the host supports constant_tsc, we do not expose it to the guest, so ktune won't see it.
According to Glauber, it should work also there, it is just not well tested.
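
For reference, reading a flag from /proc/cpuinfo could be sketched as below. The helper name `has_cpu_flag` is hypothetical. As noted above, the flag may be absent inside a guest even when the host has it, so a negative result here says nothing about the host.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named CPU flag appears on a "flags"
# line of a cpuinfo-style file (defaults to /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    grep -q -E "^flags[[:space:]]*:(.*[[:space:]])?${flag}([[:space:]]|\$)" "$file"
}

if [ -r /proc/cpuinfo ]; then
    if has_cpu_flag constant_tsc; then
        echo "constant_tsc is visible on this system"
    else
        echo "constant_tsc is not visible on this system"
    fi
fi
```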

> - You really want to modify /etc/rc.sysinit?

No, it is minor, we can skip it

> - What has to be done and under which circumstances?  

none.

Comment 41 Thomas Woerner 2009-12-15 13:07:12 UTC
Created attachment 378498 [details]
Latest fix_clock_drift.sh version

- Added clocksource= to kernel command line test as mentioned in comment #36
- Changed VENDOR=redhat test error string according to comment #36

Comment 45 errata-xmlrpc 2010-03-30 08:23:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0238.html

Comment 46 Red Hat Bugzilla 2023-09-14 01:17:39 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

