Description of problem: RHEL guests running on KVM in RHEL5.4 or RHEV are experiencing issues that cause their clocks to drift (e.g. BZ 507834). A fix is being worked on (BZ 476075), but in the interim there are some kernel parameters that can be set to work around the problem. We would like to provide customers with one script that they can run to set all recommended parameters. The exact parameters and values are being tested by engineering and QA at this time, and will be provided here as soon as they are available.
The kernel parameters to be used for RHEL5 guests are as follows:

64-bit guests: divider=10 notsc
32-bit guests: divider=10 clocksource=acpi_pm

We are also discussing the use of one other parameter, lpj=n, but since n is going to be different on every system it is unlikely we could automate this.
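For illustration, the per-architecture choice above could be expressed as a small shell function. This is only a sketch: the function name clock_params_for_arch is hypothetical, and it assumes the architecture string comes from `uname -m`.

```shell
# Map a guest architecture (as printed by `uname -m`) to the workaround
# parameters listed in this comment. Prints nothing and returns 1 for
# any other architecture.
clock_params_for_arch() {
    case "$1" in
        x86_64)
            echo "divider=10 notsc" ;;
        i386|i486|i586|i686)
            echo "divider=10 clocksource=acpi_pm" ;;
        *)
            return 1 ;;
    esac
}

# Example use inside a script:
#   params=$(clock_params_for_arch "$(uname -m)") || exit 1
```

The lpj=n parameter is left out on purpose, since its value differs per system.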
Are these parameters to be set for every RHEL-5 guest or are these to be enabled by the user/admin? Is there a way to determine if the system is a RHEL-5 guest? BTW: ktune is an optional package and is not turned on by default.
It is not required to automatically run this on a guest; the intent was to direct customers to run the script themselves. That said, it would be good to have a couple of basic sanity checks in place, namely:

1) That the system is actually a kvm guest.
2) That they don't have the pv clock driver installed (which would eliminate the need for these parameters).

On 5.4, you can tell that you have a kvm guest by looking at certain information that is passed to the guest's smbios and is viewable using dmidecode. Under "System Information", the manufacturer will be set to "Red Hat".

Glauber, is there a way to tell if we're on a system that has the pv clock installed?
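A minimal sketch of the first sanity check, assuming dmidecode is installed and the script runs as root. The function names is_kvm_vendor and running_on_kvm are hypothetical; `dmidecode -s system-manufacturer` prints the manufacturer field from "System Information".

```shell
# Return 0 when the given SMBIOS manufacturer string is "Red Hat",
# the value this comment says a RHEL 5.4 KVM host reports to its guests.
is_kvm_vendor() {
    [ "$1" = "Red Hat" ]
}

# Query SMBIOS and test the result. Fails if dmidecode is missing
# or cannot read the tables (for example when not run as root).
running_on_kvm() {
    local vendor
    vendor=$(dmidecode -s system-manufacturer 2>/dev/null) || return 1
    is_kvm_vendor "$vendor"
}

# Example use inside a script:
#   running_on_kvm || { echo "ERROR: This is not a kvm guest, exiting."; exit 1; }
```

Keeping the string comparison separate from the dmidecode call makes the check easy to exercise without real hardware data.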
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Customers running RHEL5.4 as a kvm guest need to run the script we are creating here to correctly set kernel parameters, in order to avoid issues with clock skew.
Created attachment 357321 [details] Script to set kernel parameters that fix clock drift. Proposed shell script.
Created attachment 357325 [details] Script to set kernel parameters that fix clock drift. Proposed shell script with small enhancement to only fix kernel command lines and no other lines containing the word kernel. Are the architectures o.k.?
I believe the arch could only be x86_64 for a 64-bit rhel guest. Dor, could you confirm, and have a look at the attached script and see if it looks ok?
Also can we do the verification mentioned in comment #3?
dmidecode is a default package in the system-tools group in RHEL-5 and therefore not part of a default installation. Using it is therefore not possible without adding a new requirement. Please let me know if there is a way to determine if the system is a kvm guest in a default installation.
You could look at parsing the output of hal from hal-device, or at using smbios via python; have a look at hardware.py in /usr/share/rhn/up2date_client. And re: #7, we should only be looking for x86_64, not the other archs.
My initial thoughts are:

1. Ensure that we're running on a KVM guest (use dmidecode/smbios/hal).
2. Make sure that the parameters aren't already set in grub.conf (might want to look at augeas's grub lens?).
3. Make sure that we don't set this option on Xen kernels - only bare metal kernels.
4. Verify that we've not got the kvm PV clock. Might want to check with Glauber about the best way, but you can look for the pv clock in /sys/devices/system/clocksource/clocksource0/available_clocksource
5. Only support x86 and x86_64.
6. Back up the old grub entry so we don't kill the box if there's a problem. I suspect augeas is the best approach here.

I'm sure there'll be more ;-)
(In reply to comment #9)
> dmidecode is a default package in the system-tools group in RHEL-5 and
> therefore not part of a default installation. Using it is therefore not
> possible without adding a new requirement.
>
> Please let me know if there is a way to determine if the system is a kvm guest
> in a default installation.

Can you look at how rhn_register is doing it? They are parsing smbios somehow, because they are using these same fields to verify floating guest entitlements. If you need more information on this, I can put you in touch with the people who did the rhn implementation, just let me know.
Andrew:

1) Please see comments for Alan below.
2) Might it be enough to check for the divider kernel parameter? augeas is not part of the RHEL-5 distribution and is therefore not usable.
3) See 1), and what is a "bare metal kernel"?
4) I just installed RHEL-5-Server-x86_64 as a KVM guest on my F-11 machine:
   cat /sys/devices/system/clocksource/clocksource0/available_clocksource
   jiffies
5) OK
6) A backup for every grub entry in grub.conf could bloat the file. For example: on my EL-5 server there are currently 6 boot entries, because of installing updates.
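One way to address the backup concern without duplicating boot entries is to copy the whole file before editing it. The following is only a sketch: the function name backup_file and the ".bak.<timestamp>" suffix are assumptions, not something taken from the attached script.

```shell
# Copy a file to "<name>.bak.<timestamp>" next to the original, preserving
# mode and ownership, and print the name of the backup on success.
backup_file() {
    local src=$1
    local dest
    dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example use inside a script:
#   backup=$(backup_file /boot/grub/grub.conf) || exit 1
```

A single file copy leaves grub.conf itself unchanged in size, at the cost of requiring a manual restore from a rescue environment if the edited file does not boot.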
Alan: I have had a look at rhn* packages in current RHEL-5. There are only checks for xen, but none for KVM/QEMU. Checking the lshal output in the RHEL-5 KVM guest shows these QEMU specific settings:

  system.firmware.version = 'QEMU'
  system.firmware.vendor = 'QEMU'
  system.bios.version = 'QEMU'
  system.bios.vendor = 'QEMU'

Which one is ok to be used with all RHEL-5 versions and will not change? Are these the same for QEMU without KVM?
Thomas, System.vendor should appear in lshal too, correct? This should be set to "Red Hat" on a kvm guest. James, can you tell us where to find the RHN code in RHEL5.4 that checks to see if we're on a KVM guest?
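As a sketch of how a script could read such a key without dmidecode, the function below extracts one value from lshal-style "key = 'value'" lines. The function name hal_value is hypothetical, and the exact key to trust (system.vendor here) is still the open question in this thread.

```shell
# Print the value of one key from lshal-style lines on stdin, e.g.
#   system.vendor = 'Red Hat'  (string)
# Dots in the key are escaped so they match literally. Only the first
# match is printed.
hal_value() {
    local key=${1//./\\.}
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*'\([^']*\)'.*\$/\1/p" | head -n 1
}

# Example use inside a script:
#   vendor=$(lshal | hal_value system.vendor)
```

Reading from stdin keeps the parsing independent of whether the data comes from lshal or from a saved copy of its output.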
(In reply to comment #7)
> I believe the arch could only be x86_64 for a 64-bit rhel guest. Dor, could
> you confirm, and have a look at the attached script and see if it looks ok?

The current additions are good. Since kvm supports only x86/i386, it is enough to test these architectures. There is one guest kernel command line parameter missing: lpj=n. It is not trivial to add, although a script can take the value from the host: http://cleo.tlv.redhat.com/qumrawiki/KVM/TimeKeeping
The lpj parameter is less critical for time keeping and more for safe boot.
(In reply to comment #16) How can a script running in the guest take the lpj=n parameter from the host?
Glauber, what's the best way to know we've got the new pvclock and it's being used - will it show up as the clocksource under system/devices?
Created attachment 357798 [details] New version of script to fix clock drift.
(In reply to comment #23)
> Glauber, what's the best way to know we've got the new pvclock and it's being
> used - will it show up as the clocksource under system/devices?

For i386, yes.

  cat /sys/devices/system/clocksource/clocksource0/current_clocksource

will show you. On x86_64, this file exists, but it is bogus, probably just a placeholder from generic code. The way to be sure we are using it is to check dmesg for a message that looks like this:

  time.c: Using <yyy> MHz WALL KVM GTOD KVM timer
Created attachment 357808 [details] Script to set kernel parameters that fix clock drift. New check for KVM timer. Better check for clock drift fixed in the kernel command line. Moved checks below message and question.
Created attachment 357819 [details] Script to set kernel parameters that fix clock drift. Final version with text changes and a fix for vendor detection.
  if dmesg | grep -q "time\.c:.*MHz WALL KVM GTOD KVM timer"; then
      echo "ERROR: KVM timer already in use, exiting."
      exit 1
  fi

Please note this will only work in RHEL5 x86_64. In i386, we have to proceed with another check.
Which check is needed for i686?
See comment #25:

  cat /sys/devices/system/clocksource/clocksource0/current_clocksource

must return kvm-clock.

Unfortunately, at the time we froze the RHEL5 kernel, i386 already had the clocksource infrastructure in place, while x86_64 did not.
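The two architecture-specific checks discussed so far could be combined roughly as follows. This is a sketch only: the function name pvclock_in_use and its argument layout are hypothetical, and it takes the kernel log and clocksource contents as arguments so the caller decides where they come from.

```shell
# Decide whether the KVM paravirtual clock is already in use.
#   $1  architecture as printed by `uname -m`
#   $2  kernel log text (output of dmesg)
#   $3  contents of .../clocksource0/current_clocksource
# On x86_64 only the dmesg message is trusted; on i386-family kernels the
# current clocksource must be kvm-clock.
pvclock_in_use() {
    local arch=$1 log=$2 current=$3
    case "$arch" in
        x86_64)
            printf '%s\n' "$log" |
                grep -q 'time\.c:.*MHz WALL KVM GTOD KVM timer' ;;
        i?86)
            [ "$current" = "kvm-clock" ] ;;
        *)
            return 1 ;;
    esac
}

# Example use inside a script:
#   cs=/sys/devices/system/clocksource/clocksource0/current_clocksource
#   if pvclock_in_use "$(uname -m)" "$(dmesg)" "$(cat "$cs" 2>/dev/null)"; then
#       echo "ERROR: KVM timer already in use, exiting."
#       exit 1
#   fi
```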
Do we need a fix for i686 then at all?
I don't see your script checking for it anywhere. So unless it is done somewhere else, I'd say yes.
Created attachment 357956 [details] Script to set kernel parameters that fix clock drift. Added i686 specific clocksource test for kvm-clock.
Is the check for an existing command line correct?

  grep -q -E "^[[:space:]]+kernel.*(lpj=)|(divider=)|(notsc)" "${GRUB_CONF}"; then

- Don't we also need to check for clocksource=?
- Also, this looks for any entry, not just the default/active one. I'm not sure if that's correct.

I've not tested this, but looking at the script, won't the sed line actually edit any kernel line, including Xen entries, and make them unusable?

The text "ERROR: This is no kvm guest, exiting." should be changed to "ERROR: This is not a kvm guest, exiting."

When we push out the new kernel with pv clock (bz 520685), how will we handle updating the configuration to support the pvclock? We'd at least need to remove the new entries, e.g. notsc.
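One further detail about the quoted pattern: in an extended regular expression the `|` operator binds most loosely, so the `^[[:space:]]+kernel.*` prefix applies only to the `(lpj=)` alternative, while `(divider=)` and `(notsc)` match anywhere in the file, including comments. A sketch of a grouped variant follows; the function name has_clock_params is hypothetical, and like the original it still looks at every kernel line rather than only the default entry.

```shell
# Return 0 if any indented grub "kernel" line in file $1 already carries one
# of the clock workaround parameters. All alternatives sit inside a single
# group, so the "kernel" prefix applies to each of them.
has_clock_params() {
    grep -q -E \
        '^[[:space:]]+kernel.*[[:space:]](lpj=|divider=|notsc|clocksource=)' \
        "$1"
}

# Example use inside a script:
#   if has_clock_params "${GRUB_CONF}"; then
#       echo "Clock drift parameters already present."
#   fi
```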
Yes, the sed line will edit every kernel line. There is no grub.conf parser that can be used by a simple bash script.
The script needs to be enhanced now, since pvclock is supported only on stable_tsc hosts. You can detect it by reading the /proc/cpuinfo flags. More details on https://bugzilla.redhat.com/show_bug.cgi?id=520685#c26
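Even without a parser, a sed expression can at least skip the lines that load the Xen hypervisor. The sketch below assumes those lines can be recognised by the string "xen.gz" (as in "kernel /xen.gz-..."); the function name append_kernel_params is hypothetical, and this is not the sed line from the attached script.

```shell
# Append parameters to grub "kernel" lines read from stdin and write the
# result to stdout. Kernel lines containing "xen.gz" are left untouched,
# as are all non-kernel lines. The parameters must not contain "/".
append_kernel_params() {
    local params=$1
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/xen\.gz/!s/[[:space:]]*$/ '"$params"'/' \
        -e '}'
}

# Example use inside a script:
#   append_kernel_params "divider=10 notsc" < grub.conf > grub.conf.new
```

The sketch does not check whether the parameters are already present, so it should only run after that test has passed.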
Please explain in detail what needs to be changed in the fix_clock_drift.sh script.

Some questions:

- What does "all new host, > AMD rev-F" stand for? How to detect it? Is there stable_tsc or constant_tsc in /proc/cpuinfo?
- You really want to modify /etc/rc.sysinit?
- What has to be done and under which circumstances?
(In reply to comment #39)
> Please explain in detail what needs to be changed in the fix_clock_drift.sh
> script.

It looks good.

> Some questions:
>
> - What does "all new host, > AMD rev-F" stand for? How to detect it?
> Is there stable_tsc or constant_tsc in /proc/cpuinfo?

Only constant_tsc in /proc/cpuinfo. The problem is that even if the host supports constant_tsc, we do not expose it to the guest, so ktune won't see it. According to Glauber, it should work also there, it is just not well tested.

> - You really want to modify /etc/rc.sysinit?

No, it is minor, we can skip it.

> - What has to be done and under which circumstances?

None.
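For reference, a flag test against /proc/cpuinfo could look like the sketch below. The function name has_cpu_flag is hypothetical. As noted above, the flag is not exposed to the guest, so a check like this would only be meaningful when run on the host.

```shell
# Return 0 if the given CPU flag appears as a whole word on a "flags" line.
#   $1  flag name, e.g. constant_tsc
#   $2  file to read (defaults to /proc/cpuinfo)
has_cpu_flag() {
    local flag=$1 cpuinfo=${2:-/proc/cpuinfo}
    grep -E '^flags[[:space:]]*:' "$cpuinfo" | grep -qw -- "$flag"
}

# Example use on the host:
#   has_cpu_flag constant_tsc && echo "constant_tsc present"
```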
Created attachment 378498 [details] Latest fix_clock_drift.sh version

- Added clocksource= to kernel command line test as mentioned in comment #36
- Changed VENDOR=redhat test error string according to comment #36
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2010-0238.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days