Bug 807215
Summary: | after host S4 the guest can not work normally | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | langfang <flang>
Component: | kernel | Assignee: | Frank Arnold <farnold>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 6.3 | CC: | acathrow, andreas.herrmann3, areis, bsarathy, chayang, djasa, dyasny, farnold, gleb, joerg.roedel, juzhang, michen, mjenner, mkenneth, mtosatti, peterm, qzhang, shuang, shu, sluo, tburke, virt-maint, wdai, xwei
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | kernel-2.6.32-266.el6 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-06-20 08:44:09 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
langfang
2012-03-27 10:10:35 UTC
(In reply to comment #0)
> Steps to Reproduce:
> 1. Boot a VM on the host.
> 2. Suspend the host to disk: echo disk > /sys/power/state
> 3. Before the suspend finishes, perform some operation in the guest (for
> example, type characters in a terminal).
> 4. After the suspend has finished:
> (qemu) info status

I don't get it - was the host resumed? How can you type in qemu when the host is suspended?

> VM status: running
> 5. The guest does not work properly.

What issue does the VM have? And please always post your command line.

(In reply to comment #2)
> I don't get it - was the host resumed? How can you type in qemu when the host is suspended?

Reply: Yes. Once the host has finished resuming, commands such as "info status" can be typed in the qemu monitor.

> What issue does the VM have?

Reply: The VM cannot be used; no operation can be executed inside it. It can still be controlled through the qemu monitor (for example, to shut the VM down).

Additional info: with the following steps the VM works well:
1. Boot a VM on the host.
2. Suspend the host to disk: echo disk > /sys/power/state
3. After the host resumes:
(qemu) info status
VM status: running
4. The VM works well.

My CLI:
/usr/libexec/qemu-kvm -m 2G -smp 1 -cpu cpu64-rhel6,+x2apic -usbdevice tablet \
    -drive file=/root/RHEL-Server-6.3-64-virtio.qcow2,format=qcow2,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none \
    -device virtio-blk-pci,scsi=off,drive=drive-ide0-0-0,id=ide0-0-0 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,mac=00:10:20:2d:31:21,bus=pci.0,addr=0x4,id=net0 \
    -boot order=cdn,once=n,menu=on \
    -uuid 3290efd3-7c9e-44f9-b5f7-af0f3a1b3066 \
    -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection \
    -monitor stdio -name rhel6.1 \
    -spice port=1000,disable-ticketing -vga qxl \
    -device virtio-balloon-pci,bus=pci.0,id=balloon0 \
    -drive file=/root/RHEL6.3-20120313.2-Server-x86_64-DVD1.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0

Additional info:
Steps:
1. Boot a VM on the host.
2. Suspend the host to disk: echo disk > /sys/power/state
3. Before the host suspend finishes, perform some operation in the guest (for example, type characters in a terminal).
4. After the host has suspended and resumed:
(qemu) info status
VM status: running
5. The guest does not work properly.

After step 5, run dmesg on the host. Part of the output:
....
usb 7-1.1: reset low speed USB device number 3 using uhci_hcd
sd 0:0:0:0: [sda] Starting disk
Restarting tasks ... done.
------------[ cut here ]------------
WARNING: at arch/x86/kvm/x86.c:1840 kvm_arch_vcpu_load+0x103/0x150 [kvm]() (Tainted: G W --------------- )
Hardware name: ThinkCentre M8000T
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bridge stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm microcode sg serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000e ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci video output pata_acpi ata_generic radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 2313, comm: qemu-kvm Tainted: G W --------------- 2.6.32-252.el6.x86_64 #1
Call Trace:
[<ffffffff8106a017>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff8106a06a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffffa036e6d3>] ? kvm_arch_vcpu_load+0x103/0x150 [kvm]
[<ffffffffa0364db5>] ? vcpu_load+0x55/0x80 [kvm]
[<ffffffffa0377cb4>] ? kvm_arch_vcpu_ioctl_run+0x24/0x1000 [kvm]
[<ffffffff81216cf1>] ? avc_has_perm+0x71/0x90
[<ffffffffa0361322>] ? kvm_vcpu_ioctl+0x522/0x670 [kvm]
[<ffffffff810a4ba0>] ? do_futex+0x100/0xb00
[<ffffffff81042ed4>] ? __do_page_fault+0x1e4/0x480
[<ffffffff8118bf92>] ? vfs_ioctl+0x22/0xa0
[<ffffffff8118c45a>] ? do_vfs_ioctl+0x3aa/0x580
[<ffffffff8118c6b1>] ? sys_ioctl+0x81/0xa0
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
---[ end trace 4549a2482b57c439 ]---

Try adding -kvmclock to the -cpu flag (like this: -cpu cpu64-rhel6,+x2apic,-kvmclock) and reproduce.

(In reply to comment #7)
> Try to add -kvmclock to -cpu flag (like that: -cpu
> cpu64-rhel6,+x2apic,-kvmclock)
> and reproduce.
I tried with -kvmclock (-cpu cpu64-rhel6,+x2apic,-kvmclock). The VM still has the problem, thanks.

(In reply to comment #8)
> I tried with -kvmclock (-cpu cpu64-rhel6,+x2apic,-kvmclock). The VM still has
> the problem, thanks.

When you run it with -kvmclock, what do you see when you do "cat /sys/devices/system/clocksource/clocksource0/current_clocksource" in the guest?

(In reply to comment #9)
> When you run it with -kvmclock, what do you see when you do "cat
> /sys/devices/system/clocksource/clocksource0/current_clocksource" in the guest?

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

(In reply to comment #10)
> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc

This is even worse than kvmclock. Run "echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource" or boot the kernel with clocksource=acpi_pm, make sure that /sys/devices/system/clocksource/clocksource0/current_clocksource shows it after boot, and reproduce again please.
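The clocksource check and switch requested above can be scripted inside the guest. The following is a minimal sketch, not part of the original report: the helper names `current_clocksource` and `set_clocksource` are hypothetical, and the sketch assumes the standard sysfs files `current_clocksource` and `available_clocksource` under `clocksource0`. Writing to sysfs requires root.

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls (run inside the guest).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def current_clocksource(base: Path = CLOCKSOURCE_DIR) -> str:
    """Return the name of the clocksource currently in use."""
    return (base / "current_clocksource").read_text().strip()


def set_clocksource(name: str, base: Path = CLOCKSOURCE_DIR) -> str:
    """Switch to `name` and return the clocksource in use afterwards.

    Raises ValueError if `name` is not listed in available_clocksource.
    """
    available = (base / "available_clocksource").read_text().split()
    if name not in available:
        raise ValueError(f"{name!r} not available; choices: {available}")
    (base / "current_clocksource").write_text(name + "\n")
    # Read the value back so the caller can confirm the switch took effect.
    return current_clocksource(base)


if __name__ == "__main__":
    print("before:", current_clocksource())
    print("after: ", set_clocksource("acpi_pm"))
```

The `base` parameter exists only so the helpers can be pointed at a scratch directory for testing; on a real guest the default path is used.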
The problem is that the adjustment of guest TSC offsets is not working properly, caused by

commit 9588b955928949563bdc48662d735ebda7867a58
Author: Frank Arnold <farnold>
Date:   Tue Jan 17 16:32:39 2012 -0500

    [virt] x86: Make tsc_delta calculation a function of guest tsc

which changes the adjustments during host suspend/resume to be calculated from the guest's TSC instead of the host's TSC (which also opens a window for guests to cause the host to consider its TSC unstable).

+       u64 delta_cyc;

        list_for_each_entry(kvm, &vm_list, vm_list)
+               max_tsc = delta_cyc = 0;
+               kvm_for_each_vcpu(i, vcpu, kvm) {
+                       if (max_tsc > vcpu->arch.last_host_tsc) {

But max_tsc was zeroed 2 lines above, so the condition is never true and it never adjusts the offset.

(In reply to comment #11)
> This is even worse than kvmclock. Do "echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource" or boot
> kernel with clocksource=acpi_pm and make sure that
> /sys/devices/system/clocksource/clocksource0/current_clocksource has it after
> boot and reproduce again please.
Tested this in two scenarios.

host:
# uname -r
2.6.32-252.el6.x86_64
# rpm -qa | grep qemu-kvm
qemu-kvm-0.12.1.2-2.265.el6.x86_64
guest:
# uname -r
2.6.32-251.el6.x86_64

Scenario 1:
1) Boot the guest with -kvmclock.
2) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   tsc
3) echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
4) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   acpi_pm
5) Reboot the guest.
6) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   tsc

Scenario 2:
1) Boot the guest with -kvmclock.
2) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   tsc
3) Modify the guest kernel boot options to include clocksource=acpi_pm.
4) Reboot the guest.
5) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   acpi_pm

My question: why do the two different ways of setting clocksource=acpi_pm give different results? Do you think this is a problem?

The following problems are present with commit 9588b955928949563bdc48662d735ebda7867a58:

1) It breaks restoration of TSC offsets without TSC scaling.
2) It allows a guest to trigger the host kernel's WARN().

There is no need to adjust the offsets using the guest TSC values. The host TSC values can be used and converted to guest units before placement in the TSC_OFFSET register.

The patch should be reverted in RHEL6, and the fix should aim at upstream (which also does not handle TSC scaling when resuming) and be backported later.

Please try this kernel RPM on your host, once the build is finished:

https://brewweb.devel.redhat.com/taskinfo?taskID=4241060

Thanks

(In reply to comment #11)
> This is even worse than kvmclock. Do "echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource" or boot
> kernel with clocksource=acpi_pm and make sure that
> /sys/devices/system/clocksource/clocksource0/current_clocksource has it after
> boot and reproduce again please.

Hi Gleb Natapov, please ignore comment 13.
I tested with clocksource=acpi_pm and still have the problem, thanks.

Steps:
1) Boot the guest with -kvmclock.
2) echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
3) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   acpi_pm
4) Suspend the host to disk: echo disk > /sys/power/state
5) Before resume, perform some operation in the guest (for example, type characters in a terminal).
6) After the host resumes:
   (qemu) info status
   VM status: running
7) The VM cannot be used; no operation can be executed in it.

(In reply to comment #16)
> Hi Gleb Natapov, please ignore comment 13.
> I tested with clocksource=acpi_pm and still have the problem, thanks.

I verified it myself, and acpi_pm in a guest definitely solves the problem.

> Steps:
> 1) Boot the guest with -kvmclock.
> 2) echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
> 3) # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> acpi_pm

I hope you did that in a _guest_.

(In reply to comment #17)
> I verified it myself, and acpi_pm in a guest definitely solves the problem.

I tested again and still have the problem. You can try running the test twice (host S4 twice).

> I hope you did that in a _guest_.

Yes, I did that in a guest.

By the way, QE will be on holiday from April 2nd to April 4th, and I will be back on April 6th. Thanks.

Please ignore comment 19.

(In reply to comment #17)
> I verified it myself, and acpi_pm in a guest definitely solves the problem.

I tested again and still have the problem. You can try running the test twice (host S4 twice).

> I hope you did that in a _guest_.

Yes, I did that in a guest.

By the way, QE will be on holiday from April 2nd to April 4th, and I will be back on April 6th. Thanks.

(In reply to comment #19)
> I tested again and still have the problem. You can try running the test twice
> (host S4 twice).

I did much more than twice. What host CPU do you have?

(In reply to comment #21)
> I did much more than twice. What host CPU do you have?

My CPU: Intel(R) Core(TM)2 Quad CPU Q9500 @ 2.83GHz

(In reply to comment #22)
> My CPU: Intel(R) Core(TM)2 Quad CPU Q9500 @ 2.83GHz

Hi,

Can you please test the kernel in comment #14?

(In reply to comment #23)
> Can you please test the kernel in comment #14?

Hi Marcelo Tosatti, I cannot download the kernel package. Can you help me? Thanks.

(In reply to comment #24)
> Hi Marcelo Tosatti, I cannot download the kernel package. Can you help me?

Please try

https://brewweb.devel.redhat.com/taskinfo?taskID=4271960

(In reply to comment #25)
> Please try
>
> https://brewweb.devel.redhat.com/taskinfo?taskID=4271960

Hi, I think the problem is still there. The steps were as follows:

1) Replace the guest and host kernels with the build from https://brewweb.devel.redhat.com/taskinfo?taskID=4271960
2) In the guest:
   # echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
3) In the guest:
   # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
   acpi_pm
4) S4 the host.
5) Perform some operation in the guest (type some characters in a terminal).
6) After the host resumes, check the VM status.

Results:

(qemu) info status
running

But the guest does not work and gives no response. This is the same problem as described above.
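For reference, the effect of the `max_tsc` comparison quoted earlier in this bug (from commit 9588b955928949563bdc48662d735ebda7867a58) can be modeled outside the kernel. This is a minimal Python sketch, not kernel code: the function names are hypothetical, the inputs stand in for the per-vCPU `last_host_tsc` values, and `largest_saved_tsc` is only an illustrative contrast, not the change made in RHEL (the thread proposes reverting the commit).

```python
from typing import Iterable


def buggy_adjustment_triggered(last_host_tscs: Iterable[int]) -> bool:
    """Model the quoted condition, with max_tsc reset to 0 before the loop."""
    max_tsc = 0  # corresponds to "max_tsc = delta_cyc = 0;" in the quoted diff
    for last_host_tsc in last_host_tscs:
        # The values are unsigned (u64 in the kernel), so they are never
        # negative and "0 > last_host_tsc" can never be true.
        if max_tsc > last_host_tsc:
            return True
    return False


def largest_saved_tsc(last_host_tscs: Iterable[int]) -> int:
    """Illustrative contrast (an assumption): track the largest saved value."""
    max_tsc = 0
    for last_host_tsc in last_host_tscs:
        if last_host_tsc > max_tsc:
            max_tsc = last_host_tsc
    return max_tsc


if __name__ == "__main__":
    saved = [10, 2**40, 0]  # made-up sample values
    print(buggy_adjustment_triggered(saved))
    print(largest_saved_tsc(saved))
```

With any non-negative inputs the first function never reports an adjustment, which matches the observation in the thread that the offset is never adjusted.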
(In reply to comment #26)
> Results:
>
> (qemu) info status
> running
>
> But the guest does not work and gives no response. This is the same problem
> as described above.

Do you still see the warning message such as

WARNING: at arch/x86/kvm/x86.c:1840 kvm_arch_vcpu_load+0x103/0x150 [kvm]()

?

If so, please paste full logs.

Also, is the guest consuming CPU (CPU field of the "top" utility on the host)?
Can you ping it?
What does "info registers" on the qemu command line show?

(In reply to comment #27)
> Do you still see the warning message such as
>
> WARNING: at arch/x86/kvm/x86.c:1840 kvm_arch_vcpu_load+0x103/0x150 [kvm]()
>
> ?
>
> If so, please paste full logs.
>
> Also, is the guest consuming CPU (CPU field of "top" utility on host)?
> Can you ping it?
> What "info registers" on qemu command line shows?

After step 6, on the host:

# dmesg
device tap0 entered promiscuous mode
switch: port 2(tap0) entering forwarding state
tap0: no IPv6 routers present
kvm: emulating exchange as write
switch: port 2(tap0) entering disabled state
switch: port 1(eth0) entering disabled state
lo: Disabled Privacy Extensions
switch: port 2(tap0) entering forwarding state
switch: port 1(eth0) entering forwarding state
switch: no IPv6 routers present
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
PM: Preallocating image memory... done (allocated 1769750 pages)
PM: Allocated 7079000 kbytes in 14.32 seconds (494.34 MB/s)
Suspending console(s) (use no_console_suspend to debug)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
serial 00:08: disabled
snd_hda_intel 0000:01:00.1: PCI INT B disabled
ACPI handle has no context!
snd_hda_intel 0000:00:1b.0: PCI INT A disabled
ACPI handle has no context!
switch: port 1(eth0) entering disabled state
e1000e 0000:00:19.0: PCI INT A disabled
e1000e 0000:00:19.0: PME# enabled
e1000e 0000:00:19.0: wake-up capability enabled by ACPI
ACPI: Preparing to enter system sleep state S4
PM: Saving platform NVS memory
Disabling non-boot CPUs ...
Broke affinity for irq 25
kvm: disabling virtualization on CPU1
CPU 1 is now offline
Broke affinity for irq 26
kvm: disabling virtualization on CPU2
CPU 2 is now offline
Broke affinity for irq 27
Broke affinity for irq 29
Broke affinity for irq 30
kvm: disabling virtualization on CPU3
CPU 3 is now offline
SMP alternatives: switching to UP code
Extended CMOS year: 2000
PM: Creating hibernation image:
PM: Need to copy 166026 pages
PM: Restoring platform NVS memory
microcode: CPU0 updated to revision 0xa0b, date = 2010-09-28
CPU0: Thermal monitoring handled by SMI
Extended CMOS year: 2000
Enabling non-boot CPUs ...
SMP alternatives: switching to SMP code
Booting Node 0 Processor 1 APIC 0x1
CPU1: Thermal monitoring handled by SMI
kvm: enabling virtualization on CPU1
microcode: CPU1 updated to revision 0xa0b, date = 2010-09-28
hpet: hpet3 irq 25 for MSI
CPU1 is up
Booting Node 0 Processor 2 APIC 0x2
CPU2: Thermal monitoring handled by SMI
kvm: enabling virtualization on CPU2
microcode: CPU2 updated to revision 0xa0b, date = 2010-09-28
hpet: hpet4 irq 26 for MSI
CPU2 is up
Booting Node 0 Processor 3 APIC 0x3
CPU3: Thermal monitoring handled by SMI
kvm: enabling virtualization on CPU3
microcode: CPU3 updated to revision 0xa0b, date = 2010-09-28
hpet: hpet5 irq 27 for MSI
CPU3 is up
ACPI: Waking up from system sleep state S4
snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100106, writing 0x100102)
ahci 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00403, writing 0x2b00407)
i801_smbus 0000:00:1f.3: restoring config space at offset 0x1 (was 0x2800001, writing 0x2800003)
snd_hda_intel 0000:01:00.1: restoring config space at offset 0x1 (was 0x40100107, writing 0x40100103)
e1000e 0000:00:19.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: wake-up capability disabled by ACPI
e1000e 0000:00:19.0: PME# disabled
e1000e 0000:00:19.0: irq 31 for MSI/MSI-X
e1000e 0000:00:19.0: eth0: MAC Wakeup cause - Unicast Packet
uhci_hcd 0000:00:1a.0: setting latency timer to 64
usb usb3: root hub lost power or was reset
uhci_hcd 0000:00:1a.1: setting latency timer to 64
usb usb4: root hub lost power or was reset
uhci_hcd 0000:00:1a.2: setting latency timer to 64
usb usb5: root hub lost power or was reset
ehci_hcd 0000:00:1a.7: setting latency timer to 64
usb usb1: root hub lost power or was reset
ehci_hcd 0000:00:1a.7: cache line size of 64 is not supported
snd_hda_intel 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
snd_hda_intel 0000:00:1b.0: setting latency timer to 64
snd_hda_intel 0000:00:1b.0: irq 32 for MSI/MSI-X
uhci_hcd 0000:00:1d.0: setting latency timer to 64
usb usb6: root hub lost power or was reset
uhci_hcd 0000:00:1d.1: setting latency timer to 64
usb usb7: root hub lost power or was reset
uhci_hcd 0000:00:1d.2: setting latency timer to 64
usb usb8: root hub lost power or was reset
ehci_hcd 0000:00:1d.7: setting latency timer to 64
usb usb2: root hub lost power or was reset
ehci_hcd 0000:00:1d.7: cache line size of 64 is not supported
pci 0000:00:1e.0: setting latency timer to 64
ahci 0000:00:1f.2: setting latency timer to 64
radeon 0000:01:00.0: setting latency timer to 64
[drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
radeon 0000:01:00.0: WB enabled
[drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr 0xffff8801f34a3c00
[drm] ring test on 0 succeeded in 1 usecs
[drm] ib test on ring 0 succeeded in 0 usecs
ata4: SATA link down (SStatus 0 SControl 300)
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/100
ata3.00: configured for UDMA/33
snd_hda_intel 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
snd_hda_intel 0000:01:00.1: setting latency timer to 64
snd_hda_intel 0000:01:00.1: irq 33 for MSI/MSI-X
serial 00:08: activated
e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
e1000e 0000:00:19.0: eth0: 10/100 speed: disabling TSO
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
switch: port 1(eth0) entering forwarding state
usb 7-1: reset full speed USB device number 2 using uhci_hcd
usb 7-1.1: reset low speed USB device number 3 using uhci_hcd
sd 0:0:0:0: [sda] Starting disk
Restarting tasks ... done.
eth0: no IPv6 routers present

# top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12466 root 20 0 2809m 257m 3256 S 81.1 3.5 4:37.79 qemu-kvm
...

(qemu) info registers
RAX=00000000000698ee RBX=000000002bf742ec RCX=0000000000000000 RDX=000000000000b008
RSI=ffff880002211960 RDI=ffffffff81b0c4c0 RBP=ffff880002203dc8 RSP=ffff880002203dc8
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000033 R11=00007f35f492df80
R12=000000000000029a R13=0000000000127be2 R14=ffffffff81b0c4c0 R15=ffff88007c028aa0
RIP=ffffffff81409c90 RFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00000000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00000000
DS =0000 0000000000000000 ffffffff 00000000
FS =0000 00007f35f7830940 ffffffff 00000000
GS =0000 ffff880002200000 ffffffff 00000000
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 ffff880002214200 00002087 00008b00 DPL=0 TSS64-busy
GDT= ffff880002204000 0000007f
IDT= ffffffff81dd7000 00000fff
CR0=80050033 CR2=00007f35e5498008 CR3=000000007b25a000 CR4=000006f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
FCW=037f FSW=0120 [ST=0] FTW=00 MXCSR=00001fa2
FPR0=000000000000000f ffff FPR1=0000000000000031 ffff
FPR2=0014000000000000 ffff FPR3=000000000000000a ffff
FPR4=0000000000000000 ffff FPR5=fa37696850000000 400c
FPR6=fa37696850000000 400c FPR7=a8c0000000000000 400f
XMM00=00000000000000000000000000000000 XMM01=ffffffffffff0000ffff000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=0000000029373231342e34352c393232 XMM05=00000000000000000000000000000000
XMM06=00000000000000003ff0000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000003ff0000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000003fe0000000000000 XMM13=000000000000000039dfdc0a740850d4
XMM14=00000000000000003febb67ae8584caa XMM15=00000000000000003f252382d7366000
(qemu)

# ping guest
ping 10.66.65.76
PING 10.66.65.76 (10.66.65.76) 56(84) bytes of data. 64 bytes from 10.66.65.76: icmp_seq=1 ttl=64 time=28.4 ms 64 bytes from 10.66.65.76: icmp_seq=2 ttl=64 time=0.135 ms 64 bytes from 10.66.65.76: icmp_seq=3 ttl=64 time=0.234 ms 64 bytes from 10.66.65.76: icmp_seq=4 ttl=64 time=0.228 ms ..... Created attachment 576766 [details]
Patch to fix the bug
Looks like the relation sign is just wrong in the TSC scaling patch. Here is a patch that fixes the sign so that max_tsc is calculated correctly.
(In reply to comment #29)
> Created attachment 576766 [details]
> Patch to fix the bug
>
> Looks like the relation sign is just wrong in the TSC scaling patch. Here is a
> patch that fixes the sign so that max_tsc is calculated correctly.

Joerg,

It might be ok to adjust the offset using guest tsc units, but it is not entirely clear. This can be discussed upstream. Can you please fix it there?

Note there is no problem to remove this patch because tsc trapping is never enabled in practice in RHEL6.

> > Also, is the guest consuming CPU (CPU field of "top" utility on host)?
> > Can you ping it?
> > What "info registers" on qemu command line shows?
>
> after step6:
> in host:
> #dmesg
> device tap0 entered promiscuous mode
> switch: port 2(tap0) entering forwarding state
> tap0: no IPv6 routers present
> kvm: emulating exchange as write
> switch: port 2(tap0) entering disabled state
> switch: port 1(eth0) entering disabled state
> lo: Disabled Privacy Extensions
> switch: port 2(tap0) entering forwarding state
> switch: port 1(eth0) entering forwarding state
> switch: no IPv6 routers present
> PM: Syncing filesystems ... done.
> Freezing user space processes ... (elapsed 0.00 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
> PM: Preallocating image memory... done (allocated 1769750 pages)
> PM: Allocated 7079000 kbytes in 14.32 seconds (494.34 MB/s)
> Suspending console(s) (use no_console_suspend to debug)
> sd 0:0:0:0: [sda] Synchronizing SCSI cache
> serial 00:08: disabled
> snd_hda_intel 0000:01:00.1: PCI INT B disabled
> ACPI handle has no context!
> snd_hda_intel 0000:00:1b.0: PCI INT A disabled
> ACPI handle has no context!
> switch: port 1(eth0) entering disabled state
> e1000e 0000:00:19.0: PCI INT A disabled
> e1000e 0000:00:19.0: PME# enabled
> e1000e 0000:00:19.0: wake-up capability enabled by ACPI
> ACPI: Preparing to enter system sleep state S4
> PM: Saving platform NVS memory
> Disabling non-boot CPUs ...
> Broke affinity for irq 25
> kvm: disabling virtualization on CPU1
> CPU 1 is now offline
> Broke affinity for irq 26
> kvm: disabling virtualization on CPU2
> CPU 2 is now offline
> Broke affinity for irq 27
> Broke affinity for irq 29
> Broke affinity for irq 30
> kvm: disabling virtualization on CPU3
> CPU 3 is now offline
> SMP alternatives: switching to UP code
> Extended CMOS year: 2000
> PM: Creating hibernation image:
> PM: Need to copy 166026 pages
> PM: Restoring platform NVS memory
> microcode: CPU0 updated to revision 0xa0b, date = 2010-09-28
> CPU0: Thermal monitoring handled by SMI
> Extended CMOS year: 2000
> Enabling non-boot CPUs ...
> SMP alternatives: switching to SMP code
> Booting Node 0 Processor 1 APIC 0x1
> CPU1: Thermal monitoring handled by SMI
> kvm: enabling virtualization on CPU1
> microcode: CPU1 updated to revision 0xa0b, date = 2010-09-28
> hpet: hpet3 irq 25 for MSI
> CPU1 is up
> Booting Node 0 Processor 2 APIC 0x2
> CPU2: Thermal monitoring handled by SMI
> kvm: enabling virtualization on CPU2
> microcode: CPU2 updated to revision 0xa0b, date = 2010-09-28
> hpet: hpet4 irq 26 for MSI
> CPU2 is up
> Booting Node 0 Processor 3 APIC 0x3
> CPU3: Thermal monitoring handled by SMI
> kvm: enabling virtualization on CPU3
> microcode: CPU3 updated to revision 0xa0b, date = 2010-09-28
> hpet: hpet5 irq 27 for MSI
> CPU3 is up
> ACPI: Waking up from system sleep state S4
> snd_hda_intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100106,
> writing 0x100102)
> ahci 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00403, writing
> 0x2b00407)
> i801_smbus 0000:00:1f.3: restoring config space at offset 0x1 (was 0x2800001,
> writing 0x2800003)
> snd_hda_intel 0000:01:00.1: restoring config space at offset 0x1 (was
> 0x40100107, writing 0x40100103)
> e1000e 0000:00:19.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> e1000e 0000:00:19.0: setting latency timer to 64
> e1000e 0000:00:19.0: wake-up capability disabled by ACPI
> e1000e 0000:00:19.0: PME# disabled
> e1000e 0000:00:19.0: irq 31 for MSI/MSI-X
> e1000e 0000:00:19.0: eth0: MAC Wakeup cause - Unicast Packet
> uhci_hcd 0000:00:1a.0: setting latency timer to 64
> usb usb3: root hub lost power or was reset
> uhci_hcd 0000:00:1a.1: setting latency timer to 64
> usb usb4: root hub lost power or was reset
> uhci_hcd 0000:00:1a.2: setting latency timer to 64
> usb usb5: root hub lost power or was reset
> ehci_hcd 0000:00:1a.7: setting latency timer to 64
> usb usb1: root hub lost power or was reset
> ehci_hcd 0000:00:1a.7: cache line size of 64 is not supported
> snd_hda_intel 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> snd_hda_intel 0000:00:1b.0: setting latency timer to 64
> snd_hda_intel 0000:00:1b.0: irq 32 for MSI/MSI-X
> uhci_hcd 0000:00:1d.0: setting latency timer to 64
> usb usb6: root hub lost power or was reset
> uhci_hcd 0000:00:1d.1: setting latency timer to 64
> usb usb7: root hub lost power or was reset
> uhci_hcd 0000:00:1d.2: setting latency timer to 64
> usb usb8: root hub lost power or was reset
> ehci_hcd 0000:00:1d.7: setting latency timer to 64
> usb usb2: root hub lost power or was reset
> ehci_hcd 0000:00:1d.7: cache line size of 64 is not supported
> pci 0000:00:1e.0: setting latency timer to 64
> ahci 0000:00:1f.2: setting latency timer to 64
> radeon 0000:01:00.0: setting latency timer to 64
> [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> radeon 0000:01:00.0: WB enabled
> [drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr
> 0xffff8801f34a3c00
> [drm] ring test on 0 succeeded in 1 usecs
> [drm] ib test on ring 0 succeeded in 0 usecs
> ata4: SATA link down (SStatus 0 SControl 300)
> ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata1.00: configured for UDMA/100
> ata3.00: configured for UDMA/33
> snd_hda_intel 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
> snd_hda_intel 0000:01:00.1: setting latency timer to 64
> snd_hda_intel 0000:01:00.1: irq 33 for MSI/MSI-X
> serial 00:08: activated
> e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
> e1000e 0000:00:19.0: eth0: 10/100 speed: disabling TSO
> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> switch: port 1(eth0) entering forwarding state
> usb 7-1: reset full speed USB device number 2 using uhci_hcd
> usb 7-1.1: reset low speed USB device number 3 using uhci_hcd
> sd 0:0:0:0: [sda] Starting disk
> Restarting tasks ... done.
> eth0: no IPv6 routers present
>
> >
> #top
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 12466 root 20 0 2809m 257m 3256 S 81.1 3.5 4:37.79 qemu-kvm
> ...
> >
> (qemu) info registers
> RAX=00000000000698ee RBX=000000002bf742ec RCX=0000000000000000
> RDX=000000000000b008
> RSI=ffff880002211960 RDI=ffffffff81b0c4c0 RBP=ffff880002203dc8
> RSP=ffff880002203dc8
> R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000033
> R11=00007f35f492df80
> R12=000000000000029a R13=0000000000127be2 R14=ffffffff81b0c4c0
> R15=ffff88007c028aa0
> RIP=ffffffff81409c90 RFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 0000000000000000 ffffffff 00000000
> CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
> SS =0000 0000000000000000 ffffffff 00000000
> DS =0000 0000000000000000 ffffffff 00000000
> FS =0000 00007f35f7830940 ffffffff 00000000
> GS =0000 ffff880002200000 ffffffff 00000000
> LDT=0000 0000000000000000 ffffffff 00000000
> TR =0040 ffff880002214200 00002087 00008b00 DPL=0 TSS64-busy
> GDT= ffff880002204000 0000007f
> IDT= ffffffff81dd7000 00000fff
> CR0=80050033 CR2=00007f35e5498008 CR3=000000007b25a000 CR4=000006f0
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
> DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> FCW=037f FSW=0120 [ST=0] FTW=00 MXCSR=00001fa2
> FPR0=000000000000000f ffff FPR1=0000000000000031 ffff
> FPR2=0014000000000000 ffff FPR3=000000000000000a ffff
> FPR4=0000000000000000 ffff FPR5=fa37696850000000 400c
> FPR6=fa37696850000000 400c FPR7=a8c0000000000000 400f
> XMM00=00000000000000000000000000000000 XMM01=ffffffffffff0000ffff000000000000
> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
> XMM04=0000000029373231342e34352c393232 XMM05=00000000000000000000000000000000
> XMM06=00000000000000003ff0000000000000 XMM07=00000000000000000000000000000000
> XMM08=00000000000000003ff0000000000000 XMM09=00000000000000000000000000000000
> XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
> XMM12=00000000000000003fe0000000000000 XMM13=000000000000000039dfdc0a740850d4
> XMM14=00000000000000003febb67ae8584caa XMM15=00000000000000003f252382d7366000
> (qemu)
>
> >
> #ping guest
> ping 10.66.65.76
> PING 10.66.65.76 (10.66.65.76) 56(84) bytes of data.
> 64 bytes from 10.66.65.76: icmp_seq=1 ttl=64 time=28.4 ms
> 64 bytes from 10.66.65.76: icmp_seq=2 ttl=64 time=0.135 ms
> 64 bytes from 10.66.65.76: icmp_seq=3 ttl=64 time=0.234 ms
> 64 bytes from 10.66.65.76: icmp_seq=4 ttl=64 time=0.228 ms
> .....
Ok, from the data you provide the guest appears operational. Can you ssh into it?
(In reply to comment #30)
> (In reply to comment #29)
> > Created attachment 576766 [details]
> > Patch to fix the bug
> >
> > Looks like the relation sign is just wrong in the TSC scaling patch. Here is a
> > patch that fixes the sign so that max_tsc is calculated correctly.
>
> Joerg,
>
> It might be ok to adjust the offset using guest tsc units, but it is not
> entirely clear. This can be discussed upstream. Can you please fix it there?
>
> Note there is no problem to remove this patch because tsc trapping is never
> enabled in practice in RHEL6.

Upstream we have a version of adjust_tsc_offset which works in host or guest-tsc units. So upstream the delta can be in host-tsc-units. But this does not work on the RHEL6 kernel where adjust_tsc_offset always uses guest-tsc units, no?

(In reply to comment #32)
> (In reply to comment #30)
> > (In reply to comment #29)
> > > Created attachment 576766 [details]
> > > Patch to fix the bug
> > >
> > > Looks like the relation sign is just wrong in the TSC scaling patch. Here is a
> > > patch that fixes the sign so that max_tsc is calculated correctly.
> >
> > Joerg,
> >
> > It might be ok to adjust the offset using guest tsc units, but it is not
> > entirely clear. This can be discussed upstream. Can you please fix it there?
> >
> > Note there is no problem to remove this patch because tsc trapping is never
> > enabled in practice in RHEL6.
>
> Upstream we have a version of adjust_tsc_offset which works in host or
> guest-tsc units. So upstream the delta can be in host-tsc-units. But this does
> not work on the RHEL6 kernel where adjust_tsc_offset always uses guest-tsc
> units, no?

I looked at upstream code, it uses Host TSC offset which is fine, because there is a adjust_tsc_offset_host function which is used. So how about dropping this patch and backport the adjust_tsc_offset_host patch instead?

(In reply to comment #33)
> (In reply to comment #32)
> > (In reply to comment #30)
> > > (In reply to comment #29)
> > > > Created attachment 576766 [details]
> > > > Patch to fix the bug
> > > >
> > > > Looks like the relation sign is just wrong in the TSC scaling patch. Here is a
> > > > patch that fixes the sign so that max_tsc is calculated correctly.
> > >
> > > Joerg,
> > >
> > > It might be ok to adjust the offset using guest tsc units, but it is not
> > > entirely clear. This can be discussed upstream. Can you please fix it there?
> > >
> > > Note there is no problem to remove this patch because tsc trapping is never
> > > enabled in practice in RHEL6.
> >
> > Upstream we have a version of adjust_tsc_offset which works in host or
> > guest-tsc units. So upstream the delta can be in host-tsc-units. But this does
> > not work on the RHEL6 kernel where adjust_tsc_offset always uses guest-tsc
> > units, no?
>
> I looked at upstream code, it uses Host TSC offset which is fine, because there
> is a adjust_tsc_offset_host function which is used. So how about dropping this
> patch and backport the adjust_tsc_offset_host patch instead?

Sure, can you do it or should I?

Hi Marcelo Tosatti, after the host resumes, I can ping the guest and ssh into it. Thanks.

Created attachment 577072 [details]
Patches to revert the buggy code and backport upstream adjust_tsc_offset changes
Okay, I reverted Patch 3 from the TSC-Scaling patch-set in RHEL6 and backported upstream commit f1e2b26003c41e581243c09ceed7567677449468 to have separate adjust_tsc_offset_host() and adjust_tsc_offset_guest() functions. Please have a look if the backport is OK.
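The difference between host-unit and guest-unit adjustments discussed above can be sketched in a few lines. This is an illustrative Python model, not the kernel code from the attachment: the `Vcpu` class, the `scale_tsc` helper, and the 32-bit fixed-point ratio format are assumptions made for the example. Only the names adjust_tsc_offset_host and adjust_tsc_offset_guest come from the thread.

```python
# Illustrative model only; the real functions live in the KVM kernel code.
TSC_RATIO_SHIFT = 32                      # assumed fixed-point format
TSC_RATIO_DEFAULT = 1 << TSC_RATIO_SHIFT  # ratio 1.0: guest TSC runs at host rate


def scale_tsc(ratio, host_cycles):
    """Convert a host-TSC cycle count into guest-TSC cycles (hypothetical helper)."""
    return (host_cycles * ratio) >> TSC_RATIO_SHIFT


class Vcpu:
    """Hypothetical stand-in for the per-vCPU state that holds the TSC offset."""

    def __init__(self, tsc_ratio=TSC_RATIO_DEFAULT):
        self.tsc_ratio = tsc_ratio
        self.tsc_offset = 0  # kept in guest-TSC units


def adjust_tsc_offset(vcpu, adjustment, host):
    # The offset is stored in guest units, so a host-unit delta must be
    # scaled before it is applied.
    if host:
        adjustment = scale_tsc(vcpu.tsc_ratio, adjustment)
    vcpu.tsc_offset += adjustment


def adjust_tsc_offset_guest(vcpu, adjustment):
    adjust_tsc_offset(vcpu, adjustment, host=False)


def adjust_tsc_offset_host(vcpu, adjustment):
    adjust_tsc_offset(vcpu, adjustment, host=True)
```

With a ratio of 0.5 (a guest TSC running at half the host rate), a delta of 1000 host cycles moves the offset by 500, while the same number passed as guest cycles moves it by 1000. Passing a host-unit delta to a function that expects guest units is the kind of mismatch the comments above are concerned with.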
(In reply to comment #36)
> Created attachment 577072 [details]
> Patches to revert the buggy code and backport upstream adjust_tsc_offset
> changes
>
> Okay, I reverted Patch 3 from the TSC-Scaling patch-set in RHEL6 and backported
> upstream commit f1e2b26003c41e581243c09ceed7567677449468 to have separate
> adjust_tsc_offset_host() and adjust_tsc_offset_guest() functions. Please have a
> look if the backport is OK.

Patches are OK. I suppose Frank Arnold should submit them. Thanks.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

I'm getting the same traces in different situation - when I suspend my corporate laptop to S3 with a VM running. Upon resume, I get three traces like this in dmesg (see below). Do you think it is the same bug?

(also note that while first trace says the kernel is not tainted, second one and any subsequent says that kernel got tainted - funny.)

cut from dmesg:
Restarting tasks ...
------------[ cut here ]------------
WARNING: at arch/x86/kvm/x86.c:1838 kvm_arch_vcpu_load+0x103/0x150 [kvm]() (Not tainted)
Hardware name: 4384AT6
Modules linked in: hidp fuse ebtable_nat ebtables rfcomm sco bnep l2cap autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 ipt_REJECT xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables fpu aesni_intel cryptd aes_x86_64 aes_generic xts gf128mul dm_crypt vhost_net macvtap macvlan tun kvm_intel kvm uinput btusb bluetooth thinkpad_acpi arc4 iwlwifi mac80211 cfg80211 rfkill sg uvcvideo videodev v4l2_compat_ioctl32 microcode intel_ips i2c_i801 iTCO_wdt iTCO_vendor_support shpchp snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000e ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t sdhci_pci sdhci mmc_core ahci wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 3271, comm: qemu-kvm Not tainted 2.6.32-262.el6.x86_64 #1
Call Trace:
[<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff8106b65a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffffa0454703>] ? kvm_arch_vcpu_load+0x103/0x150 [kvm]
[<ffffffffa044adb5>] ? vcpu_load+0x55/0x80 [kvm]
[<ffffffffa045dcb4>] ? kvm_arch_vcpu_ioctl_run+0x24/0x1000 [kvm]
[<ffffffff81219511>] ? avc_has_perm+0x71/0x90
[<ffffffffa0447322>] ? kvm_vcpu_ioctl+0x522/0x670 [kvm]
[<ffffffff8118d742>] ? vfs_ioctl+0x22/0xa0
[<ffffffff8118dc0a>] ? do_vfs_ioctl+0x3aa/0x580
[<ffffffff8118de61>] ? sys_ioctl+0x81/0xa0
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
---[ end trace 2e360406a0e72392 ]---
done.
video LNXVIDEO:00: Restoring backlight state
------------[ cut here ]------------
WARNING: at arch/x86/kvm/x86.c:1838 kvm_arch_vcpu_load+0x103/0x150 [kvm]() (Tainted: G W --------------- )

Patch(es) available on kernel-2.6.32-266.el6

(In reply to comment #39)
> I'm getting the same traces in different situation - when I suspend my
> corporate laptop to S3 with a VM running. Upon resume, I get three traces like
> this in dmesg (see below). Do you think it is the same bug?

Yes, it is the same bug. And I'm pretty sure your VM has three VCPUs assigned to it. Thus you are getting three traces, one for each VCPU.

> (also note that while first trace says the kernel is not tainted, second one
> and any subsequent says that kernel got tainted - funny.)

That's ok. The first kernel warning taints the kernel. All subsequent traces will report the kernel as tainted (taint flags: G => No proprietary modules loaded, W => tainted due to a prior WARNING).

HTH,
Frank

(In reply to comment #42)
> (In reply to comment #39)
> > I'm getting the same traces in different situation - when I suspend my
> > corporate laptop to S3 with a VM running. Upon resume, I get three traces like
> > this in dmesg (see below). Do you think it is the same bug?
>
> Yes, it is the same bug.

Confirming, in -268, it's gone.

> And I'm pretty sure your VM has three VCPUs assigned
> to it. Thus you are getting three traces, one for each VCPU.
>

Yes, that's the case.

> > (also note that while first trace says the kernel is not tainted, second one
> > and any subsequent says that kernel got tainted - funny.)
>
> That's ok. The first kernel warning taints the kernel. All subsequent traces
> will report the kernel as tainted (taint flags: G => No proprietary modules
> loaded, W => tainted due to a prior WARNING).
>
> HTH,
> Frank

Thanks for explanation.

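The taint behaviour explained above can be checked mechanically on a saved dmesg. The following is a small Python sketch, not an existing tool: the function names are made up for this example, and the flag table only covers the two flags explained in this thread (G and W).

```python
import re

# Only the two flags explained in this thread; other taint letters exist.
TAINT_FLAGS = {
    "G": "no proprietary modules loaded",
    "W": "tainted due to a prior WARNING",
}

TAINT_RE = re.compile(r"\((Not tainted|Tainted: ([^)]*))\)")


def parse_taint(warning_line):
    """Return the taint letters on a WARNING line, [] if not tainted,
    or None if the line carries no taint marker at all."""
    match = TAINT_RE.search(warning_line)
    if match is None:
        return None
    if match.group(1) == "Not tainted":
        return []
    return [ch for ch in match.group(2) if ch.isalpha()]


def count_vcpu_load_warnings(dmesg_text):
    """Count WARNING lines raised from kvm_arch_vcpu_load."""
    return sum(
        1
        for line in dmesg_text.splitlines()
        if line.startswith("WARNING:") and "kvm_arch_vcpu_load" in line
    )
```

Run against the excerpt in this thread, the first WARNING line yields an empty list and the second yields `['G', 'W']`, matching the explanation that the first warning is what taints the kernel. Per the reply above, the warning count on an affected kernel would be expected to equal the number of VCPUs assigned to the VM.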
Verified this issue with the following two scenarios:

version
host:
#uname -r
2.6.32-269.el6.x86_64
guest:
#uname -r
2.6.32-269.el6.x86_64

1) boot guest
/usr/libexec/qemu-kvm -cpu Penryn,+x2apic -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -M rhel6.3.0 -enable-kvm -name rhel6.3 -smp 4,cores=2,threads=1,sockets=2 -m 4G -uuid a3d13230-f1c1-4dc9-95de-bb92b2017674 -boot menu=on -drive file=/home/tracing-run-rhel6.3-copy1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native,media=disk,werror=stop,rerror=stop -device virtio-scsi-pci,id=bus1 -device scsi-hd,bus=bus1.0,drive=drive-virtio-disk0,id=virtio-scsi-pci0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=44:37:E6:97:58:89 -spice port=9000,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -usb -device usb-tablet,id=input1 -drive file=/home/RHEL6.3-20120426.2-Server-x86_64-DVD1.iso,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native,media=cdrom -device virtio-scsi-pci,id=bus2 -device scsi-cd,bus=bus2.0,drive=drive-virtio-disk1,id=virtio-scsi-pci1,bootindex=1

2)
scenario 1)
#cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm
scenario 2)
#cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

results:
The guest works well, so this issue has been fixed.

Verified this issue with the following two scenarios:

version
host:
#uname -r
2.6.32-269.el6.x86_64
guest:
#uname -r
2.6.32-269.el6.x86_64

1) boot guest
/usr/libexec/qemu-kvm -cpu Penryn,+x2apic -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -M rhel6.3.0 -enable-kvm -name rhel6.3 -smp 4,cores=2,threads=1,sockets=2 -m 4G -uuid a3d13230-f1c1-4dc9-95de-bb92b2017674 -boot menu=on -drive file=/home/tracing-run-rhel6.3-copy1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native,media=disk,werror=stop,rerror=stop -device virtio-scsi-pci,id=bus1 -device scsi-hd,bus=bus1.0,drive=drive-virtio-disk0,id=virtio-scsi-pci0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=44:37:E6:97:58:89 -spice port=9000,disable-ticketing -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -usb -device usb-tablet,id=input1 -drive file=/home/RHEL6.3-20120426.2-Server-x86_64-DVD1.iso,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native,media=cdrom -device virtio-scsi-pci,id=bus2 -device scsi-cd,bus=bus2.0,drive=drive-virtio-disk1,id=virtio-scsi-pci1,bootindex=1

2)
scenario 1)
#cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm
scenario 2)
#cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

3) on host
echo disk >/sys/power/state

After the host resumes from S4, the results: the guest works well, so this issue has been fixed.

According to comment 47, setting this issue as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html
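For anyone repeating the verification, the clocksource check from step 2 can be scripted inside the guest. This is a minimal Python sketch under stated assumptions: the helper names are hypothetical, and the scenario numbering simply mirrors the two clocksources listed in the verification comment (acpi_pm and kvm-clock).

```python
from pathlib import Path

# sysfs path taken from the verification steps in this report.
CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)

# Scenario numbers as used in the verification comment.
SCENARIOS = {"acpi_pm": 1, "kvm-clock": 2}


def current_clocksource(path=CLOCKSOURCE_PATH):
    """Return the clocksource name stored at the given sysfs path."""
    return Path(path).read_text().strip()


def scenario_for(clocksource):
    """Map a clocksource name to its scenario number, or None if it is
    not one of the two clocksources covered by the verification."""
    return SCENARIOS.get(clocksource)
```

Calling `scenario_for(current_clocksource())` in the guest before suspending the host records which of the two scenarios a given run covers. The path argument is there so the function can be pointed at a saved copy of the file.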