Bug 1219445

Summary: 'numastat qemu-kvm' aborted (core dumped)
Product: Red Hat Enterprise Linux 7
Reporter: Gu Nini <ngu>
Component: numactl
Assignee: Petr Oros <poros>
Status: CLOSED ERRATA
QA Contact: Daniel Rusek <drusek>
Severity: high
Docs Contact:
Priority: unspecified
Version: 7.2
CC: bgray, bugproxy, dhoward, drusek, dzheng, gsun, hannsj_uhl, mdeng, michen, skozina, xuhan, ypu
Target Milestone: rc
Target Release: 7.5
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: numactl-2.0.9-7.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 10:06:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1399177, 1438583, 1472889, 1476742, 1502782, 1522983
Attachments:
Debug info of core file (flags: none)

Description Gu Nini 2015-05-07 11:42:31 UTC
Created attachment 1023083 [details]
Debug info of core file

Description of problem:
When executing the command 'numastat qemu-kvm', it fails with 'Aborted (core dumped)'.

Version-Release number of selected component (if applicable):
Host kernel: 3.10.0-244.ael7b.ppc64le
qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-1.ael7b.ppc64le
Numa:
numactl-2.0.9-4.ael7b.ppc64le
numad-0.5-14.20140620git.ael7b.ppc64le


How reproducible:
100% on one particular ppc host, but not reproducible on another

Steps to Reproduce:
1. Check the host numa status:
[root@ibm-p8-kvm-01-qe tests]# numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 8 16
node 0 size: 65536 MB
node 0 free: 3631 MB
node 1 cpus: 48 56 64 72
node 1 size: 0 MB
node 1 free: 0 MB
node 16 cpus: 80 88 96 104 112
node 16 size: 65536 MB
node 16 free: 6495 MB
node 17 cpus: 120 128 136 144 152
node 17 size: 0 MB
node 17 free: 0 MB
node distances:
node   0   1  16  17 
  0:  10  20  40  40 
  1:  20  10  40  40 
 16:  40  40  10  20 
 17:  40  40  20  10 
[root@ibm-p8-kvm-01-qe tests]# numastat
                           node0           node1          node16          node17
numa_hit              4393471118               0     13383898505               0
numa_miss              122155280               0        95469613               0
numa_foreign            95469613               0       122155280               0
interleave_hit              2967               0            3052               0
local_node            1496527084               0      8429767811               0
other_node            3019099314               0      5049600307               0

2. Start a guest in qemu:
/usr/libexec/qemu-kvm -name spaprqcow-0507 -S -machine pseries-rhel7.1.0,accel=kvm,usb=off -m 129708 -realtime mlock=off -smp 4,sockets=1,cores=4,threads=1 -numa node,nodeid=0,cpus=0,mem=32427 -numa node,nodeid=1,cpus=1,mem=32427 -numa node,nodeid=2,cpus=2,mem=32427 -numa node,nodeid=3,cpus=3,mem=32427 ...

3. Check the qemu-kvm-related NUMA node status with 'numastat qemu-kvm' or 'numastat -c qemu-kvm':
[root@ibm-p8-kvm-02-qe ngu]# numastat -c qemu-kvm

Per-node process memory usage (in MBs) for PID 161374 (qemu-kvm)
Can't read /proc/161374/numa_maps: Bad address
         Node 0 Node 1 Node 16 Node 17 Total
         ------ ------ ------- ------- -----
Huge          0      0       0       0     0
Heap          0     96       0       0   193
Stack         0      0       0       0     0
Private       4     15       0       0    64
-------  ------ ------ ------- ------- -----
Total         4    111       0       0   235
*** Error in `numastat': double free or corruption (!prev): 0x0000010006a501e0 ***
======= Backtrace: =========
/lib64/power8/libc.so.6(+0x8fe94)[0x3fff94adfe94]
numastat[0x100025d0]
numastat[0x1000446c]
numastat[0x100014c4]
/lib64/power8/libc.so.6(+0x24580)[0x3fff94a74580]
/lib64/power8/libc.so.6(__libc_start_main+0xc4)[0x3fff94a74774]
======= Memory map: ========
10000000-10010000 r-xp 00000000 fd:00 136539775                          /usr/bin/numastat
10010000-10020000 r--p 00000000 fd:00 136539775                          /usr/bin/numastat
10020000-10030000 rw-p 00010000 fd:00 136539775                          /usr/bin/numastat
10006a50000-10006a80000 rw-p 00000000 00:00 0                            [heap]
3fff94a30000-3fff94a50000 rw-p 00000000 00:00 0 
3fff94a50000-3fff94c10000 r-xp 00000000 fd:00 67133008                   /usr/lib64/power8/libc-2.17.so
3fff94c10000-3fff94c20000 r--p 001b0000 fd:00 67133008                   /usr/lib64/power8/libc-2.17.so
3fff94c20000-3fff94c30000 rw-p 001c0000 fd:00 67133008                   /usr/lib64/power8/libc-2.17.so
3fff94c30000-3fff94c40000 rw-p 00000000 00:00 0 
3fff94c40000-3fff94c60000 r-xp 00000000 00:00 0                          [vdso]
3fff94c60000-3fff94c90000 r-xp 00000000 fd:00 201328841                  /usr/lib64/ld-2.17.so
3fff94c90000-3fff94ca0000 r--p 00020000 fd:00 201328841                  /usr/lib64/ld-2.17.so
3fff94ca0000-3fff94cb0000 rw-p 00030000 fd:00 201328841                  /usr/lib64/ld-2.17.so
3fffe3420000-3fffe3450000 rw-p 00000000 00:00 0                          [stack]
Aborted (core dumped)

Actual results:
Aborted (core dumped) as above.

Expected results:
The command executes without any problem.

Additional info:
The same problem was previously encountered when the same host was installed with a big-endian (BE) OS and the following software versions:
Host kernel: 3.10.0-234.el7.ppc64
Qemu-kvm-rhev: qemu-kvm-rhev-2.2.0-8.el7.ppc64
Numa:
numad-0.5-14.20140620git.el7.ppc64
numactl-libs-2.0.9-4.el7.ppc64
numactl-2.0.9-4.el7.ppc64

Comment 3 Min Deng 2017-04-19 07:04:54 UTC
The bug can still be reproduced with the following builds:
numactl-2.0.9-6.el7_2.ppc64le
kernel-3.10.0-654.el7.ppc64le
qemu-kvm-rhev-2.9.0-0.el7.patchwork201703291116.ppc64le

[root@ibm-p8-09 home]# numastat -p 99060

Per-node process memory usage (in MBs) for PID 99060 (qemu-kvm)
Can't read /proc/99060/numa_maps: Bad address
                           Node 0          Node 1         Node 16
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.00            0.00            0.00
Stack                        0.00            0.00            0.00
Private                      3.06            3.25            0.00
----------------  --------------- --------------- ---------------
Total                        3.06            3.25            0.00

                          Node 17           Total
                  --------------- ---------------
Huge                         0.00            0.00
Heap                         0.00          336.12
Stack                        0.00            0.00
Private                      0.00           15.56
----------------  --------------- ---------------
Total                        0.00          351.69
*** Error in `numastat': double free or corruption (!prev): 0x000001001c0101e0 ***
======= Backtrace: =========
/lib64/libc.so.6(cfree+0x4ac)[0x3fff8257693c]
numastat[0x100025b0]
numastat[0x1000444c]
numastat[0x100014c4]
/lib64/libc.so.6(+0x24980)[0x3fff82504980]
/lib64/libc.so.6(__libc_start_main+0xc4)[0x3fff82504b74]
======= Memory map: ========
10000000-10010000 r-xp 00000000 fd:00 1197213                            /usr/bin/numastat
10010000-10020000 r--p 00000000 fd:00 1197213                            /usr/bin/numastat
10020000-10030000 rw-p 00010000 fd:00 1197213                            /usr/bin/numastat
1001c010000-1001c040000 rw-p 00000000 00:00 0                            [heap]
3fff824c0000-3fff824e0000 rw-p 00000000 00:00 0 
3fff824e0000-3fff826b0000 r-xp 00000000 fd:00 33599141                   /usr/lib64/libc-2.17.so
3fff826b0000-3fff826c0000 r--p 001c0000 fd:00 33599141                   /usr/lib64/libc-2.17.so
3fff826c0000-3fff826d0000 rw-p 001d0000 fd:00 33599141                   /usr/lib64/libc-2.17.so
3fff826d0000-3fff826e0000 rw-p 00000000 00:00 0 
3fff826e0000-3fff82700000 r-xp 00000000 00:00 0                          [vdso]
3fff82700000-3fff82730000 r-xp 00000000 fd:00 33599134                   /usr/lib64/ld-2.17.so
3fff82730000-3fff82740000 r--p 00020000 fd:00 33599134                   /usr/lib64/ld-2.17.so
3fff82740000-3fff82750000 rw-p 00030000 fd:00 33599134                   /usr/lib64/ld-2.17.so
3fffde140000-3fffde170000 rw-p 00000000 00:00 0                          [stack]
Aborted (core dumped)

Comment 4 Petr Oros 2017-05-25 11:46:02 UTC
*** Bug 1220761 has been marked as a duplicate of this bug. ***

Comment 6 Petr Oros 2017-08-08 09:22:12 UTC
*** Bug 1463618 has been marked as a duplicate of this bug. ***

Comment 9 Hanns-Joachim Uhl 2017-10-17 11:40:24 UTC
*** Bug 1502782 has been marked as a duplicate of this bug. ***

Comment 11 Petr Oros 2017-12-04 08:17:22 UTC
*** Bug 1520238 has been marked as a duplicate of this bug. ***

Comment 12 IBM Bug Proxy 2017-12-13 13:20:38 UTC
------- Comment From nasastry.com 2017-12-13 08:12 EDT-------
Not seeing the reported bug.

# numastat 9018

Per-node process memory usage (in MBs) for PID 9018 (qemu-kvm)
Node 0           Total
Heap                        32.00           32.00
Stack                        0.06            0.06
Private                   1009.44         1009.44
Total                     1041.50         1041.50

numactl-2.0.9-7.el7

This bugzilla can be closed.

Comment 13 Hanns-Joachim Uhl 2017-12-13 13:46:19 UTC
oops ... the previous comment has to read:
"
Not seeing the reported bug.

# numastat 9018

Per-node process memory usage (in MBs) for PID 9018 (qemu-kvm)
                           Node 0           Total
                  --------------- ---------------
Huge                         0.00            0.00
Heap                        32.00           32.00
Stack                        0.06            0.06
Private                   1009.44         1009.44
----------------  --------------- ---------------
Total                     1041.50         1041.50


numactl-2.0.9-7.el7

This bugzilla can be closed.
"
...

Comment 16 errata-xmlrpc 2018-04-10 10:06:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0691