Bug 1054706

Summary: boot seabios more than 1 minute when boot qemu-kvm with cpus more than 30
Product: Red Hat Enterprise Linux 7 Reporter: Jun Li <juli>
Component: qemu-kvmAssignee: Gerd Hoffmann <kraxel>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: high    
Version: 7.0CC: acathrow, drjones, flang, hhuang, juli, juzhang, kraxel, michen, pbonzini, rhod, scrandall, virt-maint, xfu
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-03-17 11:58:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
perf kvm report none

Description Jun Li 2014-01-17 10:19:17 UTC
Description of problem:
when boot qemu-kvm with cpus more than 30, it will take more than 1 minute to entry the seabios GUI.

command line just like:
# /usr/libexec/qemu-kvm -smp 30 -vnc :1 -monitor stdio
---

Version-Release number of selected component (if applicable):
3.10.0-65.el7.x86_64
qemu-kvm-1.5.3-37.el7.x86_64
Host cpu:
AMD Athlon(tm) Dual Core Processor 5400B

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:
when set "-smp" with more than 30 cpus, entry to seabios GUI too slow.

Expected results:
entry to seabios GUI as normal.

Additional info:
BTW, most of other machines not hit this issue.
I also find another machine has this issue.
host cpu: 
Intel(R) Xeon(R) CPU E7- 8870  @ 2.40GHz

Comment 2 Gerd Hoffmann 2014-01-17 15:36:37 UTC
Doesn't reproduce locally (not surprising).

Please try this:

/usr/libexec/qemu-kvm -enable-kvm -smp 128 -vnc :1 \
  -chardev stdio,id=log -device isa-debugcon,iobase=0x402,chardev=log

Does it hang before or after seabios boot messages start showing up?
In case of the latter:  Where in the message log does it hang?

Comment 3 Jun Li 2014-01-20 04:32:53 UTC
(In reply to Gerd Hoffmann from comment #2)
> Doesn't reproduce locally (not surprising).
> 
> Please try this:
> 
> /usr/libexec/qemu-kvm -enable-kvm -smp 128 -vnc :1 \
>   -chardev stdio,id=log -device isa-debugcon,iobase=0x402,chardev=log
> 
> Does it hang before or after seabios boot messages start showing up?
> In case of the latter:  Where in the message log does it hang?

# /usr/libexec/qemu-kvm -enable-kvm -smp 128 -vnc :1 \
-chardev stdio,id=log -device isa-debugcon,iobase=0x402,chardev=log

Start bios (version seabios-1.7.2.2-10.el7)

------
Try as above, it hangs at "Start bios (version seabios-1.7.2.2-10.el7)".

Comment 11 Gerd Hoffmann 2014-02-03 13:36:18 UTC
 39,99%       qemu-kvm  [kernel.kallsyms]        [k] _raw_spin_lock                                ◆
  4,42%       qemu-kvm  [kernel.kallsyms]        [k] native_read_msr_safe                          ▒
  2,50%       qemu-kvm  [kernel.kallsyms]        [k] dequeue_task_fair                             ▒
  2,46%       qemu-kvm  [kernel.kallsyms]        [k] select_task_rq_fair                           ▒
  2,41%       qemu-kvm  [kernel.kallsyms]        [k] update_curr                                   ▒
  2,23%       qemu-kvm  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave                        ▒
  2,16%       qemu-kvm  [kernel.kallsyms]        [k] native_write_msr_safe                         ▒
  2,12%       qemu-kvm  [kernel.kallsyms]        [k] __schedule                                    ▒
  1,81%       qemu-kvm  [kvm]                    [k] kvm_arch_vcpu_load                            ▒
  1,78%       qemu-kvm  [kernel.kallsyms]        [k] enqueue_entity                                ▒
  1,73%       qemu-kvm  [kernel.kallsyms]        [k] perf_event_task_sched_out                     ▒
  1,72%       qemu-kvm  [kernel.kallsyms]        [k] enqueue_task_fair                             ▒
  1,53%       qemu-kvm  [kvm_amd]                [k] svm_interrupt_allowed                         ▒
  1,44%       qemu-kvm  [kvm]                    [k] kvm_sched_in                                  ▒
  1,43%       qemu-kvm  [kernel.kallsyms]        [k] dequeue_entity                                ▒
  1,34%       qemu-kvm  [kernel.kallsyms]        [k] try_to_wake_up                                ▒
  1,17%       qemu-kvm  [kernel.kallsyms]        [k] update_min_vruntime                           ▒
  1,15%       qemu-kvm  [kernel.kallsyms]        [k] sched_clock_local                             ▒
  1,14%       qemu-kvm  [kernel.kallsyms]        [k] task_waking_fair                              ▒
  1,07%       qemu-kvm  [kernel.kallsyms]        [k] schedule                                      ▒

Comment 12 Gerd Hoffmann 2014-02-03 13:40:09 UTC
So, it looks like lock contention on the host.
Can you repeat the test with call graphs (-g) enabled please, i.e. this:

perf kvm --host --guest  record -g -p `pidof qemu-kvm`

Comment 13 langfang 2014-02-07 03:06:33 UTC
(In reply to Gerd Hoffmann from comment #12)
> So, it looks like lock contention on the host.
> Can you repeat the test with call graphs (-g) enabled please, i.e. this:
> 
> perf kvm --host --guest  record -g -p `pidof qemu-kvm`

Hi,Gerd
   I can't download the build "http://brewweb.devel.redhat.com/brew/taskinfo?taskID=6895910 ", now the status is closed. could you help me ?

thanks

fang lang

Comment 14 Gerd Hoffmann 2014-02-17 08:16:00 UTC
kicked new scratch build:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7063650

Comment 15 Jun Li 2014-02-18 06:57:47 UTC
(In reply to Gerd Hoffmann from comment #14)
> kicked new scratch build:
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7063650

I am waiting for reserving that machine. thx.

Comment 16 Jun Li 2014-02-26 10:21:05 UTC
(In reply to Gerd Hoffmann from comment #14)
> kicked new scratch build:
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7063650

Hi Gerd,
    Can not download the package from "http://brewweb.devel.redhat.com/brew/taskinfo?taskID=7063650".
    Could you help to check this url. thx.

Best Regards,
juli

Comment 17 Gerd Hoffmann 2014-02-26 10:56:21 UTC
kicked new scratch build:
https://brewweb.devel.redhat.com/taskinfo?taskID=7109560

Comment 18 Jun Li 2014-02-27 02:28:07 UTC
(In reply to Gerd Hoffmann from comment #17)
> kicked new scratch build:
> https://brewweb.devel.redhat.com/taskinfo?taskID=7109560

Retest with url:
https://brewweb.devel.redhat.com/taskinfo?taskID=7109560
-----
# /usr/libexec/qemu-kvm -enable-kvm -smp 128 -vnc :1 -chardev stdio,id=log -device isa-debugcon,iobase=0x402,chardev=log
Start bios (version seabios-1.7.2.2-10.el7.bz1054706.1)
handle_post:399
handle_post:403
handle_post:407
handle_post:411
...
------
# perf kvm --host --guest  record -g -p `pidof qemu-kvm`
-----------
The result file is:
perf.data.kvm-juli-20140227

You could download from url:
wget 10.66.7.190:8000/perf.data.kvm-juli-20140227

Comment 19 Gerd Hoffmann 2014-02-28 10:07:13 UTC
Created attachment 868940 [details]
perf kvm report

Hmm.  Alot of scheduling activity.  No obvious reason for that.

Quite some PMU activity, but I suspect this simply shows up here due to watching the qemu-kvm process with perf, not due to PMU emulation/passthrough.

Comment 20 Gerd Hoffmann 2014-02-28 10:27:03 UTC
Paolo, any clue what this might be?

Note this depends on host hardware somehow, shows on a few machines only,
but from both amd and intel camp ...

Comment 21 Andrew Jones 2014-02-28 10:55:21 UTC
(In reply to Gerd Hoffmann from comment #20)
> Paolo, any clue what this might be?
> 
> Note this depends on host hardware somehow, shows on a few machines only,
> but from both amd and intel camp ...

Does the host's dmesg say that the tsc has been marked as unstable?

Comment 22 Gerd Hoffmann 2014-02-28 12:38:04 UTC
Forwarding question to QE.  Jun, see comment 21, can you check the machines showing / not showing the issue?

Comment 23 Jun Li 2014-03-03 05:10:10 UTC
(In reply to Gerd Hoffmann from comment #22)
> Forwarding question to QE.  Jun, see comment 21, can you check the machines
> showing / not showing the issue?

Yes, dmesg say that the tsc has been marked as unstable.
# dmesg |grep -i tsc
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2805.936 MHz processor
[    0.000000] tsc: Marking TSC unstable due to TSCs unsynchronized
-------------
# dmesg |grep -i tsc -n8
175-[    0.000000] NR_IRQS:327936 nr_irqs:712 16
176-[    0.000000] spurious 8259A interrupt: IRQ7.
177-[    0.000000] Console: colour VGA+ 80x25
178-[    0.000000] console [tty0] enabled
179-[    0.000000] console [ttyS0] enabled
180-[    0.000000] allocated 16777216 bytes of page_cgroup
181-[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
182-[    0.000000] hpet clockevent registered
183:[    0.000000] tsc: Fast TSC calibration using PIT
184:[    0.000000] tsc: Detected 2805.936 MHz processor
185:[    0.000000] tsc: Marking TSC unstable due to TSCs unsynchronized
186-[    0.001005] Calibrating delay loop (skipped), value calculated using timer frequency.. 5611.87 BogoMIPS (lpj=2805936)
187-[    0.001009] pid_max: default: 32768 minimum: 301
188-[    0.001040] Security Framework initialized
189-[    0.001052] SELinux:  Initializing.
190-[    0.001062] SELinux:  Starting in permissive mode
191-[    0.002057] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
192-[    0.004323] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
193-[    0.005447] Mount-cache hash table entries: 256

Comment 24 Andrew Jones 2014-03-03 10:12:07 UTC
This is likely a dup of bug 1036457. I'm brewing a kernel that can be used to test a fix for that bug here

https://brewweb.devel.redhat.com/taskinfo?taskID=7131223

(assuming the build completes) then we can try it on these hosts to prove that this bug is a dup.

Comment 25 juzhang 2014-03-17 10:04:49 UTC
Also tested by using 160 vCPUs with fixed kernel.
'-smp 160,sockets=8,cores=10,threads=2'

Result:
Booting guest took about 30 sec.
# ./boot.sh 
start boot at:	03/17/14 05:46:00
finish boot at:	03/17/14 05:46:33

Viewed debuginfo from seabios console, shows below:
# nc -U /tmp/seabios 
Start bios (version seabios-1.7.2.2-12.el7)
Ram Size=0xc0000000 (0x0000000040000000 high)
Relocating low data from 0x000e3ec0 to 0x000ef780 (size 2161)
Relocating init from 0x000e4731 to 0xbffe2ce0 (size 53759)
CPU Mhz=1998
=== PCI bus & bridge init ===
PCI: pci_bios_init_bus_rec bus = 0x0
=== PCI device probing ===
Found 7 PCI devices (max PCI bus is 00)
=== PCI new allocation pass #1 ===
PCI: check devices
=== PCI new allocation pass #2 ===
PCI: IO: c000 - c08f
PCI: 32: 00000000c0000000 - 00000000fec00000
PCI: map device bdf=00:03.0  bar 0, addr 0000c000, size 00000040 [io]
PCI: map device bdf=00:02.0  bar 3, addr 0000c040, size 00000020 [io]
PCI: map device bdf=00:04.0  bar 0, addr 0000c060, size 00000020 [io]
PCI: map device bdf=00:01.1  bar 4, addr 0000c080, size 00000010 [io]
PCI: map device bdf=00:02.0  bar 0, addr f4000000, size 04000000 [mem]
PCI: map device bdf=00:02.0  bar 1, addr f8000000, size 04000000 [mem]
PCI: map device bdf=00:02.0  bar 6, addr fc000000, size 00010000 [mem]
PCI: map device bdf=00:02.0  bar 2, addr fc010000, size 00002000 [mem]
PCI: map device bdf=00:03.0  bar 1, addr fc012000, size 00001000 [mem]
PCI: map device bdf=00:04.0  bar 1, addr fc013000, size 00001000 [mem]
PCI: init bdf=00:00.0 id=8086:1237
PCI: init bdf=00:01.0 id=8086:7000
PIIX3/PIIX4 init: elcr=00 0c
PCI: init bdf=00:01.1 id=8086:7010
PCI: init bdf=00:01.3 id=8086:7113
Using pmtimer, ioport 0xb008, freq 3579 kHz
PCI: init bdf=00:02.0 id=1b36:0100
PCI: init bdf=00:03.0 id=1af4:1001
PCI: init bdf=00:04.0 id=1af4:1003
Found 160 cpu(s) max supported 244 cpu(s)
MP table addr=0x000fdec0 MPC table addr=0x000fded0 size=372
SMBIOS ptr=0x000fdea0 table=0xbfffd3d0 size=11297
Scan for VGA option rom
Running option rom at c000:0003
Start SeaVGABIOS (version seabios-1.7.2.2-12.el7)
enter vga_post:
   a=00000010  b=0000ffff  c=00000000  d=0000ffff ds=0000 es=f000 ss=0000
  si=00000000 di=0000d150 bp=00000000 sp=00006de2 cs=f000 ip=c4de  f=0000
VBE DISPI: bdf 00:02.0, bar 0
VBE DISPI: lfb_addr=f4000000, size 64 MB
Turning on vga text mode console
set VGA mode 3
SeaBIOS (version seabios-1.7.2.2-12.el7)
Machine UUID cca1433d-5bac-490f-a097-c5c80c1a083f
Found 0 lpt ports
Found 0 serial ports
ATA controller 1 at 1f0/3f4/0 (irq 14 dev 9)
ATA controller 2 at 170/374/0 (irq 15 dev 9)
found virtio-blk at 0:3
Searching bootorder for: /pci@i0cf8/*@3
PS2 keyboard initialized
All threads complete.
Scan for option roms
Searching bootorder for: /rom@genroms/kvmvapic.bin
Searching bootorder for: HALT
drive 0x000fde20: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=83886080
Running option rom at c900:0003
Space available for UMB: 000cb800-000ed800
Returned 20480 bytes of ZoneHigh
e820 map has 8 items:
  0: 0000000000000000 - 000000000009fc00 = 1 RAM
  1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
  2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
  3: 0000000000100000 - 00000000bfff5000 = 1 RAM
  4: 00000000bfff5000 - 00000000c0000000 = 2 RESERVED
  5: 00000000feffc000 - 00000000ff000000 = 2 RESERVED
  6: 00000000fffc0000 - 0000000100000000 = 2 RESERVED
  7: 0000000100000000 - 0000000140000000 = 1 RAM
enter handle_19:
  NULL
Booting from Hard Disk...
Booting from 0000:7c00
VBE mode info request: 100
VBE mode info request: 101
VBE mode info request: 102
VBE mode info request: 103
VBE mode info request: 104
VBE mode info request: 105
VBE mode info request: 106
VBE mode info request: 107
VBE mode info request: 10d
VBE mode info request: 10e
VBE mode info request: 10f
VBE mode info request: 110
VBE mode info request: 111
VBE mode info request: 112
VBE mode info request: 113
VBE mode info request: 114
VBE mode info request: 115
VBE mode info request: 116
VBE mode info request: 117
VBE mode info request: 118
VBE mode info request: 119
VBE mode info request: 11a
VBE mode info request: 11b
VBE mode info request: 11c
VBE mode info request: 11d
VBE mode info request: 11e
VBE mode info request: 11f
VBE mode info request: 140
VBE mode info request: 141
VBE mode info request: 142
VBE mode info request: 143
VBE mode info request: 144
VBE mode info request: 145
VBE mode info request: 146
VBE mode info request: 147
VBE mode info request: 148
VBE mode info request: 149
VBE mode info request: 14a
VBE mode info request: 14b
VBE mode info request: 14c
VBE mode info request: 178
VBE mode info request: 179
VBE mode info request: 17a
VBE mode info request: 17b
VBE mode info request: 17c
VBE mode info request: 17d
VBE mode info request: 17e
VBE mode info request: 17f
VBE mode info request: 180
VBE mode info request: 181
VBE mode info request: 182
VBE mode info request: 183
VBE mode info request: 184
VBE mode info request: 185
VBE mode info request: 186
VBE mode info request: 187
VBE mode info request: 188
VBE mode info request: 189
VBE mode info request: 18a
VBE mode info request: 18b
VBE mode info request: 18c
VBE mode info request: 0
VBE mode info request: 1
VBE mode info request: 2
VBE mode info request: 3
VBE mode info request: 4
VBE mode info request: 5
VBE mode info request: 6
VBE mode info request: 7
VBE mode info request: d
VBE mode info request: e
VBE mode info request: f
VBE mode info request: 10
VBE mode info request: 11
VBE mode info request: 12
VBE mode info request: 13
VBE mode info request: 6a
set VGA mode 3

Comment 26 Andrew Jones 2014-03-17 11:58:58 UTC
re: bug 1036457 comment 39, closing as dup of 1036457

*** This bug has been marked as a duplicate of bug 1036457 ***