Bug 1035099 - "KVM internal error. Suberror: 3" when boot rhel6.5 guest with more than 42(7 AHCI controller) AHCI disks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: seabios
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Gerd Hoffmann
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1101500
Blocks: 1113520
 
Reported: 2013-11-27 05:24 UTC by Sibiao Luo
Modified: 2015-03-05 08:14 UTC

Fixed In Version: seabios-1.7.5-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-05 08:14:58 UTC
Target Upstream Version:
Embargoed:


Attachments
ahci-multi-disks-cli.sh (8.12 KB, text/plain)
2013-11-27 05:26 UTC, Sibiao Luo
Screenshot for AHCI guest with Probing EDD (edd=off to disable)... (12.56 KB, image/png)
2013-11-27 05:28 UTC, Sibiao Luo
command line(hang issue) (7.47 KB, application/x-shellscript)
2014-01-08 03:32 UTC, Xu Han


Links
System: Red Hat Product Errata
ID: RHBA-2015:0345
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: seabios bug fix and enhancement update
Last Updated: 2015-03-05 12:27:47 UTC

Description Sibiao Luo 2013-11-27 05:24:56 UTC
Description of problem:
"KVM internal error. Suberror: 3" is reported and the rhel6.5 guest fails to boot with 48 AHCI disks (8 AHCI controllers). With 42 AHCI disks or fewer (7 AHCI controllers), the guest boots successfully and works well.
BTW, I also tried a rhel7.0 guest, which did *not* reproduce this issue.

Version-Release number of selected component (if applicable):
host info:
3.10.0-55.el7.x86_64
qemu-kvm-1.5.3-19.el7.x86_64
seabios-bin-1.7.2.2-4.el7.noarch
guest info:
rhel6.5-64bit
2.6.32-425.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the qemu-kvm command line that I attached to this bug as an attachment.
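
The attached script is not reproduced in this report, so the following is only a hypothetical sketch of how a command line with many AHCI disks could be generated. It assumes 6 ports per AHCI controller (inferred from 42 disks on 7 controllers and 48 disks on 8 controllers in the description). The image file names, the `ahciN` / `driveN` ids and the qcow2 format are illustrative assumptions, not values taken from the attachment.

```python
def ahci_args(controllers, ports_per_controller=6, image_pattern="disk{n}.qcow2"):
    """Build qemu-kvm arguments for `controllers` AHCI controllers,
    each fully populated with `ports_per_controller` disks.

    All ids and file names are hypothetical placeholders.
    """
    args = []
    n = 0  # running disk index across all controllers
    for c in range(controllers):
        # One AHCI controller; its ports are addressed as ahci<c>.<port>.
        args += ["-device", f"ahci,id=ahci{c}"]
        for p in range(ports_per_controller):
            image = image_pattern.format(n=n)
            args += [
                "-drive", f"file={image},if=none,id=drive{n},format=qcow2",
                "-device", f"ide-hd,drive=drive{n},bus=ahci{c}.{p}",
            ]
            n += 1
    return args


if __name__ == "__main__":
    # 7 controllers -> 42 disks (boots), 8 controllers -> 48 disks (fails here).
    for count in (7, 8):
        disks = ahci_args(count).count("-drive")
        print(f"{count} controllers -> {disks} disks")
```

The generated list would still need the usual machine, memory and boot-disk options in front of it before being passed to qemu-kvm.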

Actual results:
After step 1, "KVM internal error. Suberror: 3" is reported and the rhel6.5 guest fails to boot with 48 AHCI disks (8 AHCI controllers). I attached a guest screenshot to this bug.
# sh ahci-multi-disks-cli.sh 
QEMU 1.5.3 monitor - type 'help' for more information
(qemu) KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
EAX=00000500 EBX=000fe000 ECX=000062d6 EDX=00000000
ESI=0000fff0 EDI=00009000 EBP=0000feff ESP=00000001
EIP=0000004c EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =e000 000e0000 ffffffff 00809300
CS =0000 00000000 ffffffff 00809b00
SS =9000 00090000 ffffffff 00809300
DS =9000 00090000 ffffffff 00809300
FS =9900 00099000 ffffffff 00809300
GS =9000 00090000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00009180 00000027
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 00 00 00 00 00 00 00 06 0a 00 c9 4d f8 00 f0 41 f8 00 f0 <fe> e3 00 f0 9f 07 00 c9 59 f8 00 f0 f7 07 00 c9 d2 ef 00 f0 7b c7 00 f0 f2 e6 00 f0 6e fe

(qemu) 
(qemu) info status 
VM status: paused (internal-error)
(qemu) cont
Resetting the Virtual Machine is required

Expected results:
The guest boots and all disks are detected successfully.

Additional info:

Comment 1 Sibiao Luo 2013-11-27 05:26:04 UTC
Created attachment 829572 [details]
ahci-multi-disks-cli.sh

Comment 2 Sibiao Luo 2013-11-27 05:28:19 UTC
Created attachment 829573 [details]
Screenshot for AHCI guest with Probing EDD (edd=off to disable)...

Comment 3 Sibiao Luo 2013-11-27 05:36:03 UTC
My host cpu info:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping	: 7
microcode	: 0x29
cpu MHz		: 1598.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 6782.70
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 4 Paolo Bonzini 2013-11-27 11:04:31 UTC
The CS:IP points to the address of the vector for INT 13h (the BIOS disk I/O services).  Note how the hex dump includes addresses in the ROM area:

0000:0040  (INT 10h)    06 0a 00 c9        c900:0a06 (in sgabios)
0000:0044  (INT 11h)    4d f8 00 f0        f000:f84d (in SeaBIOS)
0000:0048  (INT 12h)    41 f8 00 f0        f000:f841 (in SeaBIOS)
0000:004C  (INT 13h)   <fe> e3 00 f0       f000:e3fe (in SeaBIOS)
0000:0050  (INT 14h)    9f 07 00 c9        c900:079f (in sgabios)
0000:0054  (INT 15h)    59 f8 00 f0        f000:f859 (in SeaBIOS)
0000:0058  (INT 16h)    f7 07 00 c9        c900:07f7 (in sgabios)
0000:005C  (INT 17h)    d2 ef 00 f0        f000:efd2 (in SeaBIOS)
0000:0060  (INT 18h)    7b c7 00 f0        f000:c77b (in SeaBIOS)
0000:0064  (INT 19h)    f2 e6 00 f0        f000:e6f2 (in SeaBIOS)

So KVM is really executing data, and the internal error is justified.  Changing component to seabios.
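
The table above can be reproduced mechanically from the "Code=" line of the register dump. The sketch below is one way to do that, under two assumptions: the byte shown in angle brackets is the byte at CS base + EIP (which matches the table), and the CPU is in real mode with CS base 0, so each 4-byte little-endian entry is an offset followed by a segment. The names `CODE`, `EIP` and `decode_ivt` are made up for this illustration.

```python
# "Code=" bytes and EIP copied from the dump in comment 0.
CODE = ("00 00 00 00 00 00 00 00 06 0a 00 c9 4d f8 00 f0 41 f8 00 f0 "
        "<fe> e3 00 f0 9f 07 00 c9 59 f8 00 f0 f7 07 00 c9 d2 ef 00 f0 "
        "7b c7 00 f0 f2 e6 00 f0 6e fe")
EIP = 0x4C


def decode_ivt(code, eip):
    """Map interrupt vector number -> (segment, offset) for every complete
    4-byte interrupt vector table entry contained in the dump."""
    tokens = code.split()
    # Index of the byte marked <..>; assumed to sit at linear address `eip`.
    marked = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    data = bytes(int(t.strip("<>"), 16) for t in tokens)
    start = eip - marked  # linear address of the first dumped byte
    entries = {}
    # Begin at the first 4-byte aligned address and skip a trailing partial entry.
    for off in range((-start) % 4, len(data) - 3, 4):
        addr = start + off
        ip = int.from_bytes(data[off:off + 2], "little")
        cs = int.from_bytes(data[off + 2:off + 4], "little")
        entries[addr // 4] = (cs, ip)
    return entries


if __name__ == "__main__":
    for vec, (cs, ip) in decode_ivt(CODE, EIP).items():
        print(f"0000:{vec * 4:04X}  (INT {vec:02X}h)  {cs:04x}:{ip:04x}")
```

Because EIP 0x4C divided by 4 is 0x13, the marked byte is the first byte of the INT 13h vector, which is table data rather than code.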

Comment 5 Xu Han 2014-01-08 03:32:52 UTC
Created attachment 846919 [details]
command line(hang issue)

Tested this issue with the components below:
qemu-kvm-1.5.3-31.el7.x86_64
kernel-debug-3.10.0-65.el7.x86_64
seabios-1.7.2.2-7.el7.x86_64

Guests:
RHEL7
Win2012R2

Steps:
1. Boot the guest with the qemu-kvm command line attached to this comment.

Results:
While the guest was booting, I did not see the "KVM internal error. Suberror: 3" from comment 0. However, the guest hung during kernel loading. Both the RHEL7 and Win2012R2 guests hit this issue.

If I remove the last AHCI controller and disk, the guest boots successfully.

Comment 6 Gerd Hoffmann 2014-01-15 15:49:44 UTC
Any change when booting the guest kernel with "edd=off" ?

Comment 7 Sibiao Luo 2014-01-16 05:58:11 UTC
(In reply to Gerd Hoffmann from comment #6)
> Any change when booting the guest kernel with "edd=off" ?
Whether or not "edd=off" is appended to the rhel6.5 guest kernel line, the issue is hit with the same qemu-kvm command line (attachment 829572 [details]).

host info:
# uname -r && rpm -q qemu-kvm && rpm -qa | grep seabios
3.10.0-66.el7.x86_64.debug
qemu-kvm-1.5.3-31.el7.x86_64
seabios-1.7.2.2-7.el7.x86_64
seabios-bin-1.7.2.2-7.el7.x86_64
guest info:
rhel6.5_64bit
kernel-2.6.32-424.el6.x86_64

# sh ahci-multi-disks-cli.sh 
QEMU 1.5.3 monitor - type 'help' for more information
(qemu) 
(qemu) c
(qemu) KVM internal error. Suberror: 3
extra data[0]: 80000306
extra data[1]: 31
EAX=00000500 EBX=000fe000 ECX=000062d6 EDX=00000000
ESI=0000fff0 EDI=00009000 EBP=0000feff ESP=00000001
EIP=0000004c EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =e000 000e0000 ffffffff 00809300
CS =0000 00000000 ffffffff 00809b00
SS =9000 00090000 ffffffff 00809300
DS =9000 00090000 ffffffff 00809300
FS =9900 00099000 ffffffff 00809300
GS =9000 00090000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00009180 00000027
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 00 00 00 00 00 00 00 06 0a 00 c9 4d f8 00 f0 41 f8 00 f0 <fe> e3 00 f0 9f 07 00 c9 59 f8 00 f0 f7 07 00 c9 d2 ef 00 f0 7b c7 00 f0 f2 e6 00 f0 6e fe

(qemu) info status 
VM status: paused (internal-error)
(qemu) c
Resetting the Virtual Machine is required
(qemu)

Comment 8 Gerd Hoffmann 2014-01-16 10:50:52 UTC
Do you get more guest kernel messages when booting without "quiet"?

Comment 9 Sibiao Luo 2014-01-17 02:45:52 UTC
(In reply to Gerd Hoffmann from comment #8)
> Do you get more guest kernel messages when booting without "quiet"?
No guest kernel messages are displayed at all; it did not get to read the seabios before QEMU quit.

Comment 10 Gerd Hoffmann 2014-01-17 12:51:38 UTC
Something touches seabios data structures (ahci driver structures to be exact).
They get filled with zeros, which can have -- depending on the exact memory layout -- all sorts of funky effects.  Jumping to address zero (as seen in this report) by following a cleared function pointer certainly is in the cards. On my machine seabios just hangs.  A cleared memory pointer makes seabios and ahci disagree where the cmd block is, therefore ahci never sees the command seabios intended to submit ...
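
As an illustration of why following a zeroed pointer ends in an internal error: in real mode a far pointer segment:offset resolves to linear address segment * 16 + offset, so a cleared pointer 0000:0000 lands in the interrupt vector table, which holds data, not code. The sketch below uses a simplified conventional PC memory layout; the region boundaries and the names `linear` and `region` are assumptions for this example, not something taken from the seabios sources.

```python
def linear(seg, off):
    """Real-mode linear address of the far pointer seg:off."""
    return (seg << 4) + off


def region(addr):
    """Rough classification of a linear address in the first megabyte
    (simplified conventional layout, assumed for illustration)."""
    if addr < 0x400:
        return "interrupt vector table (data)"
    if 0xC0000 <= addr < 0xF0000:
        return "option ROM area"
    if 0xF0000 <= addr <= 0xFFFFF:
        return "system BIOS"
    return "other"


if __name__ == "__main__":
    pointers = {
        "cleared function pointer": (0x0000, 0x0000),
        "CS:EIP from the dump": (0x0000, 0x004C),
        "INT 13h handler": (0xF000, 0xE3FE),
        "INT 10h handler": (0xC900, 0x0A06),
    }
    for name, (seg, off) in pointers.items():
        addr = linear(seg, off)
        print(f"{name}: {seg:04x}:{off:04x} -> {addr:#07x} ({region(addr)})")
```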

Comment 11 Gerd Hoffmann 2014-01-17 13:18:42 UTC
Tried to boot rhel7 kernel on the rhel6.5 guest.  Hangs too.
Given that a rhel7 guest boots fine (see initial report) this
points to the rhel6 grub as most likely culprit for the memory
corruption.

Comment 13 juzhang 2014-01-20 03:41:30 UTC
Hi Gerd,

According to https://bugzilla.redhat.com/show_bug.cgi?id=1035099#c5, rhel7.0 and Win2012R2 hit this issue as well. Do you mean it's a different bz? If yes, QE will open a new one. Feel free to add your suggestions.

Best Regards,
Junyi

Comment 14 Gerd Hoffmann 2014-01-20 08:38:55 UTC
(In reply to juzhang from comment #13)
> Hi Gerd,
> 
> According to https://bugzilla.redhat.com/show_bug.cgi?id=1035099#c5, rhel7.0
> and Win2012R2 hit this issue as well. Do you mean it's a different bz? If yes,
> QE will open a new one. Feel free to add your suggestions.

Oops, I hadn't read comment #5 carefully enough.  So, the initial comment and #5 disagree on whether rhel7 works or not.  Hard to say whether that is a different issue.  Certainly could be the same root cause, but maybe not.  Windows being affected too pretty much rules out bootloader / kernel though.

Comment 18 Gerd Hoffmann 2014-05-27 14:40:50 UTC
Can you retest with the 1.7.5 rebase builds please?
http://people.redhat.com/ghoffman/bz1101500/

Comment 19 Gerd Hoffmann 2014-07-03 08:40:27 UTC
(In reply to Gerd Hoffmann from comment #18)
> Can you retest with the 1.7.5 rebase builds please?
> http://people.redhat.com/ghoffman/bz1101500/

Ping

Comment 20 Sibiao Luo 2014-07-03 09:16:46 UTC
(In reply to Gerd Hoffmann from comment #19)
> (In reply to Gerd Hoffmann from comment #18)
> > Can you retest with the 1.7.5 rebase builds please?
> > http://people.redhat.com/ghoffman/bz1101500/
> 
Retried it with this private build, which did not hit the issue any more.

host info:
3.10.0-128.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
seabios-1.7.5-1.el7_0.bz1101500.3.x86_64
guest info:
2.6.32-452.el6.x86_64

Steps:
The same as in comment #0.

Results:
QEMU and the KVM guest work well without quitting, all the disks are detected correctly in the guest, and there are no errors in the guest dmesg.
# ls /dev/sd* | wc -l
43

Best Regards,
sluo

Comment 21 Gerd Hoffmann 2014-07-03 09:51:39 UTC
(In reply to Sibiao Luo from comment #20)
> (In reply to Gerd Hoffmann from comment #19)
> > (In reply to Gerd Hoffmann from comment #18)
> > > Can you retest with the 1.7.5 rebase builds please?
> > > http://people.redhat.com/ghoffman/bz1101500/
> > 
> Retried it with this private build, which did not hit the issue any more.

Cool.

Comment 22 FuXiangChun 2014-08-04 06:21:31 UTC
Reproduced this bug with seabios-1.7.2.2-10.el7.x86_64 & qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64 & RHEL6.5 guest.

(qemu) KVM internal error. Suberror: 1
emulation failure
EAX=00000500 EBX=000fe000 ECX=000062d6 EDX=00000000
ESI=0000fff0 EDI=00009000 EBP=0000feff ESP=00000001
EIP=0000004c EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =e000 000e0000 ffffffff 00809300
CS =0000 00000000 ffffffff 00809b00
SS =9000 00090000 ffffffff 00809300
DS =9000 00090000 ffffffff 00809300
FS =9900 00099000 ffffffff 00809300
GS =9000 00090000 ffffffff 00809300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00009180 00000027
IDT=     00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 00 00 00 00 00 00 00 06 0a 00 c9 4d f8 00 f0 41 f8 00 f0 <fe> e3 00 f0 9f 07 00 c9 59 f8 00 f0 f7 07 00 c9 d2 ef 00 f0 7b c7 00 f0 f2 e6 00 f0 6e fe


Verified this bug with seabios-1.7.5-1.el7 & qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64.

For RHEL6.5, RHEL7.0 and win2012r2-64 guests, 48 AHCI disks are detected inside each guest, and the guest and all disks work well.

Comment 26 errata-xmlrpc 2015-03-05 08:14:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0345.html

