Bug 1312738

Summary: crash: zero-size memory allocation (aarch64)
Product: Red Hat Enterprise Linux 7 Reporter: XiaoNi <xni>
Component: crash Assignee: Dave Anderson <anderson>
Status: CLOSED ERRATA QA Contact: Emma Wu <xiawu>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3 CC: anderson, dwysocha, jbastian, qzhao, ruyang, xiawu
Target Milestone: rc   
Target Release: ---   
Hardware: aarch64   
OS: Linux   
Whiteboard:
Fixed In Version: crash-7.1.5-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 03:46:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 2 Dave Anderson 2016-02-29 14:02:28 UTC
> crash: zero-size memory allocation
...
> The password for hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com is redhat

Are you sure?

#  uname -rn
hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com 4.5.0-0.rc5.29.el7.aarch64
# crash

crash 7.1.2-2.el7
Copyright (C) 2002-2014  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...


crash: invalid structure member offset: module_num_symtab
       FILE: kernel.c  LINE: 3242  FUNCTION: module_init()

[/usr/bin/crash] error trace: 64da8c => 462508 => 4cfea4 => 50bc3c

  50bc3c: OFFSET_verify+132
  4cfea4: module_init+1496
  462508: main_loop+208
  64da8c: current_interp_command_loop+24

#

Anyway, there are fixes upstream that address the zero-size allocation
issue and the one above, both of which are caused by Linux 4.5 kernel
modifications.

Comment 3 Dave Anderson 2016-02-29 14:07:15 UTC
> Are you sure?

Ah, I see that in your example you used a vmcore file, so that is presumably
the reason for the type of failure seen.  I don't know where the vmcore file
is located on hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com.

Comment 4 XiaoNi 2016-03-07 05:46:22 UTC
(In reply to Dave Anderson from comment #3)
> > Are you sure?
> 
> Ah, I see that in your example you used a vmcore file, so that is presumably
> the reason for the type of failure seen.  I don't know where the vmcore file
> is located on hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com.

Hi Dave

Sorry for the late response. I didn't see the emails. The vmcore is under 
/var/crash/hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com-yizhan-1239074/10.16.184.139-2016-02-27-14:41:34

Thanks
Xiao

Comment 5 Dave Anderson 2016-03-08 15:52:07 UTC
hp-moonshot-02-c11.khw.lab.eng.bos.redhat.com did not respond to a ping,
so I tried accessing it via the console server.  It looks like it's 
doing a complete system reinstallation.

Comment 7 Dave Anderson 2016-03-09 14:11:02 UTC
OK thanks.  As a sanity check, I tested the upstream git repo version
of the crash utility here:

# ./crash /usr/lib/debug/lib/modules/4.5.0-0.rc5.29.el7.aarch64/vmlinux /var/crash/amd-seattle-06.khw.lab.eng.bos.redhat.com-yizhan-1255595/10.16.184.109-2016-03-09-02:14:43/vmcore

crash 7.1.5rc17
Copyright (C) 2002-2016  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/4.5.0-0.rc5.29.el7.aarch64/vmlinux
    DUMPFILE: /var/crash/amd-seattle-06.khw.lab.eng.bos.redhat.com-yizhan-1255595/10.16.184.109-2016-03-09-02:14:43/vmcore  [PARTIAL DUMP]
        CPUS: 8 [OFFLINE: 7]
        DATE: Tue Mar  8 21:14:03 2016
      UPTIME: 00:20:46
LOAD AVERAGE: 5.93, 5.12, 4.06
       TASKS: 249
    NODENAME: amd-seattle-06.khw.lab.eng.bos.redhat.com
     RELEASE: 4.5.0-0.rc5.29.el7.aarch64
     VERSION: #1 SMP Mon Feb 22 13:41:33 EST 2016
     MACHINE: aarch64  (unknown Mhz)
      MEMORY: 16 GB
       PANIC: "Unable to handle kernel NULL pointer dereference at virtual address 000002a8"
         PID: 8213
     COMMAND: "loop0"
        TASK: fffffe0000b0cb00  [THREAD_INFO: fffffe0000010000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

crash>

Comment 8 XiaoNi 2016-03-10 03:05:02 UTC
Hi Dave

Thanks ;)
I can use it now. 

Another question
[ 1248.843392] PC is at super_written+0x34/0x94
[ 1248.847655] LR is at bio_endio+0x90/0xc4

What is the meaning of PC and LR? Is there a document where I can learn how to analyze a vmcore?

Thanks
Xiao

Comment 9 Dave Anderson 2016-03-10 14:01:39 UTC
The PC is the "program counter" register, which contains the address of the
instruction that was executing when the panic occurred.  So in the example
above, while executing within the kernel's super_written() function, something
occurred at super_written+0x34 that caused the system to panic.  The LR is
the "link register", which contains the return address, i.e., the location in
the function that called super_written() (here bio_endio+0x90).
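
As a rough illustration of how to read those two log lines: the kernel prints
each location as symbol+0xOFFSET/0xSIZE, where OFFSET is the distance into the
function and SIZE is the function's total length, both in hexadecimal.  The
helper below is only a sketch; the name parse_oops_location and the regular
expression are assumptions made for this example and are not part of the
crash utility or the kernel.

```python
import re

# Hypothetical helper (not part of crash): matches lines such as
# "[ 1248.843392] PC is at super_written+0x34/0x94".
LOCATION = re.compile(
    r"(?P<reg>PC|LR) is at (?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_location(line):
    """Return the register, symbol, offset and function size, or None."""
    match = LOCATION.search(line)
    if match is None:
        return None
    return {
        "reg": match.group("reg"),
        "symbol": match.group("symbol"),
        # Both numbers are hexadecimal byte counts.
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
    }


pc = parse_oops_location("[ 1248.843392] PC is at super_written+0x34/0x94")
lr = parse_oops_location("[ 1248.847655] LR is at bio_endio+0x90/0xc4")
print(pc)
print(lr)
```

For the two lines quoted in comment 8, this reports the faulting instruction
as 0x34 bytes into super_written() and the return address as 0x90 bytes into
bio_endio().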

If you google for arm64 or aarch64, there are some documents that might be
helpful.  If you try amazon.com, though, there's pretty much nothing.  It's
frustrating because, for example, the i386, x86_64, ia64, and ppc64 arches
all have full sets of paperback manuals.

Comment 10 XiaoNi 2016-03-11 08:29:04 UTC
Thanks for the explanation.

Comment 15 Dave Anderson 2016-04-09 16:18:54 UTC
*** Bug 1325374 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2016-11-04 03:46:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2325.html