Bug 1219197
Summary: | Xen BUG at page_alloc.c:1738
---|---
Product: | [Fedora] Fedora
Reporter: | Major Hayden 🤠 <mhayden>
Component: | gcc
Assignee: | Jakub Jelinek <jakub>
Status: | CLOSED ERRATA
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high
Priority: | unspecified
Version: | 22
CC: | antony, davejohansen, gansalmon, itamar, jakub, jforbes, jonathan, jwakely, kernel-maint, ketuzsezr, kraxel, lantw44, law, madhu.chinakonda, m.a.young, mchehab, mhayden, mpolacek, pbrobinson, steven, virt-maint
Hardware: | x86_64
OS: | Linux
Fixed In Version: | gcc-5.1.1-3.fc22
Doc Type: | Bug Fix
Last Closed: | 2015-06-18 14:19:00 UTC
Type: | Bug
Description
Major Hayden 🤠
2015-05-06 19:48:07 UTC
FWIW, the error is identical with kernel-4.0.0-0.rc5.git4.1.fc22.x86_64.

The output is from Xen, so we'll start there.

The same error appears when using these kernels as well:
* kernel-3.19.5-200.fc21.x86_64
* kernel-3.18.8-201.fc21.x86_64
* kernel-3.17.8-300.fc21.x86_64

The crash occurs at the line
    BUG_ON((pg[i].u.inuse.type_info & PGT_count_mask) != 0);
in xen/common/page_alloc.c.

Jan suggested on xen-devel that gcc 5.0.1 might be to blame[1]. Is Xen 4.5 working for anyone else on Fedora 22's latest package/kernel set?

[1] http://lists.xen.org/archives/html/xen-devel/2015-05/msg02604.html

Yes, it looks like gcc (or something else in the build chain). My newly updated F22 system won't boot in xen (4.5.0-8 or 4.5.1-rc1) but will boot with the 4.5.1-rc1 xen.gz file built on F21.

From the thread http://marc.info/?l=xen-devel&m=143292326301633&w=2 on the xen-devel list:

GCC 5 is indeed miscompiling the code. Comparing the fc21 vs fc22 builds:

The C snippet from mmio_ro_do_page_fault():

    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if ( owner )
        put_page(page);

In fc21 is:

    movabs $0xffff82e000000000,%rbp
    shr    %cl,%rax
    or     %rdx,%rax
    shl    $0x5,%rax
    add    %rax,%rbp
    mov    %rbp,%rdi
    callq  ffff82d080186900 <page_get_owner_and_reference>
    test   %rax,%rax
    mov    %rax,%r12
    je     ffff82d080189c4e <mmio_ro_do_page_fault+0x11e>
    mov    %rbp,%rdi
    callq  ffff82d080188ec0 <put_page>

and in fc22 is:

    movabs $0xffff82e000000000,%r8
    shr    %cl,%rax
    or     %rdx,%rax
    shl    $0x5,%rax
    lea    (%r8,%rax,1),%rdi
    callq  ffff82d0801874f0 <page_get_owner_and_reference>
    test   %rax,%rax
    mov    %rax,%rbp
    je     ffff82d08018ca14 <mmio_ro_do_page_fault+0x114>
    mov    %r8,%rdi
    callq  ffff82d080189a90 <put_page>

"lea (%r8,%rax,1),%rdi" in FC22 is slightly shorter than "add %rax,%rbp; mov %rbp,%rdi" in FC21. In both cases %rdi is now 'page' from the C snippet. In FC21, the result is stored in %rbp, then reloaded from %rbp into %rdi for call to put_page().

However, in FC22, the result of the calculation is only held in %rdi, and clobbered by the call to page_get_owner_and_reference(). When it comes to call put_page(), %r8 is reloaded, which is still a pointer to the base of the frametable, not the page we actually took a reference on.

FC22 is miscompiling the C to:

    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if ( owner )
        put_page(mfn_to_page(0));

which is wrong, and why free_domheap_pages() does legitimately complain about the wonky refcount.

Further testing links this to the -fcaller-saves option as if the file is built with -fno-caller-saves on F22 then the code snippet goes back to the F21 version. Possibly the mov %r8,%rdi line is incorrect.

Please attach preprocessed source in which this happens and provide full gcc command line used to compile this file.

Created attachment 1035629: preprocessed source
The full compile line (with some duplications removed) is
gcc -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fomit-frame-pointer -fno-strict-aliasing -std=gnu99 -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -DNDEBUG -I/home/michael/rpmbuild/BUILD/xen-4.5.0/xen/include -I/home/michael/rpmbuild/BUILD/xen-4.5.0/xen/include/asm-x86/mach-generic -I/home/michael/rpmbuild/BUILD/xen-4.5.0/xen/include/asm-x86/mach-default -msoft-float -fno-stack-protector -fno-exceptions -Wnested-externs -DHAVE_GAS_VMX -DHAVE_GAS_EPT -DHAVE_GAS_FSGSBASE -mno-red-zone -mno-sse -fpic -fno-asynchronous-unwind-tables -DGCC_HAS_VISIBILITY_ATTRIBUTE -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -pipe -D__XEN__ -include /home/michael/rpmbuild/BUILD/xen-4.5.0/xen/include/xen/config.h -nostdinc -DXSM_ENABLE -DFLASK_ENABLE -DHAS_ACPI -DHAS_GDBSX -DHAS_PASSTHROUGH -DHAS_MEM_ACCESS -DHAS_MEM_PAGING -DHAS_MEM_SHARING -DHAS_PCI -DHAS_IOPORTS -DHAS_PDX -MMD -MF .xen.d -MF .built_in.o.d -MF .mm.o.d -c mm.c -o mm.o
Thanks, filed upstream: PR66444.

It looks like the patch made it into upstream GCC if I am reading this ticket correctly:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66444#c12

Then it is already in the gcc-5.1.1-3.fc22 errata.
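
For anyone reading this later, the effect of the miscompilation analysed in this bug can be modelled in plain C. This is only a rough sketch, not Xen code: the struct layouts are simplified stand-ins, and frame_table, NR_PAGES, dom0, reset_frame_table(), page_count() and the two touch_page_*() functions are hypothetical names made up for the illustration. It assumes each page starts with one existing reference. The point is that putting the reference on mfn_to_page(0) instead of the page that was actually referenced leaves two counts wrong at once, which is the kind of imbalance the BUG_ON in page_alloc.c later trips over.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's types; NOT the real definitions. */
struct domain { int id; };

struct page_info {
    unsigned long count;    /* models the page reference count */
    struct domain *owner;   /* NULL when the page has no owner */
};

#define NR_PAGES 4          /* hypothetical tiny frame table */
static struct domain dom0 = { 0 };
static struct page_info frame_table[NR_PAGES];

static struct page_info *mfn_to_page(unsigned long mfn)
{
    return &frame_table[mfn];
}

/* Take a reference if the page has an owner, and return that owner. */
static struct domain *page_get_owner_and_reference(struct page_info *page)
{
    if (page->owner == NULL)
        return NULL;
    page->count++;
    return page->owner;
}

/* Drop one reference. */
static void put_page(struct page_info *page)
{
    page->count--;
}

/* Assumption for the model: every page starts with one reference held. */
static void reset_frame_table(void)
{
    for (size_t i = 0; i < NR_PAGES; i++) {
        frame_table[i].count = 1;
        frame_table[i].owner = &dom0;
    }
}

static unsigned long page_count(unsigned long mfn)
{
    return frame_table[mfn].count;
}

/* What the C source asks for: get and put on the same page. */
static void touch_page_correct(unsigned long mfn)
{
    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if (owner)
        put_page(page);
}

/* What the FC22 object code effectively did: put on frame table entry 0. */
static void touch_page_miscompiled(unsigned long mfn)
{
    struct page_info *page = mfn_to_page(mfn);
    struct domain *owner = page_get_owner_and_reference(page);
    if (owner)
        put_page(mfn_to_page(0));
}
```

Calling reset_frame_table() and then touch_page_correct(2) leaves every count unchanged. Calling touch_page_miscompiled(2) instead leaves page 2 with a leaked reference and page 0 with one reference fewer than it should have.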