Bug 668340 - NUMA is not recognized for nec-em25.rhts.eng.bos.redhat.com
Summary: NUMA is not recognized for nec-em25.rhts.eng.bos.redhat.com
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Cong Wang
QA Contact: Zhang Kexin
URL:
Whiteboard:
Depends On:
Blocks: 668681
Reported: 2011-01-10 06:14 UTC by Zhang Kexin
Modified: 2018-11-14 14:44 UTC (History)

Fixed In Version: kernel-2.6.32-112.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-19 12:54:31 UTC
Target Upstream Version:
Embargoed:


Attachments
kernel message for rhel6 (143.34 KB, application/octet-stream), 2011-01-10 06:14 UTC, Zhang Kexin
kernel message for rhel5.6 (195.85 KB, application/x-troff-man), 2011-01-10 06:14 UTC, Zhang Kexin
Untested patch (2.86 KB, patch), 2011-01-10 11:04 UTC, Cong Wang
Upstream patch (5.26 KB, patch), 2011-01-10 14:08 UTC, Cong Wang


Links
System: Red Hat Product Errata
ID: RHSA-2011:0542
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update
Last Updated: 2011-05-19 11:58:07 UTC

Description Zhang Kexin 2011-01-10 06:14:00 UTC
Created attachment 472516 [details]
kernel message for rhel6

Description of problem:
NUMA is not recognized on nec-em25.rhts.eng.bos.redhat.com when rhel6 or rhel6.1 x86_64 is installed on it, whereas rhel5 recognizes NUMA on the same machine.

Version-Release number of selected component (if applicable):
rhel6(2.6.32-71) and rhel6.1(2.6.32-94) x86_64. 

How reproducible:
always.

Steps to Reproduce:
1. Install rhel6 x86_64 on nec-em25.rhts.eng.bos.redhat.com.
2. Run: ls /sys/devices/system/node/
  
Actual results:
only node1 is listed.

Expected results:
should have 8 nodes.

Additional info:

on rhel6:
[root@nec-em25 ~]# numactl  --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 

on rhel5.6
[root@nec-em25 ~]#  numactl  --hardware
available: 8 nodes (0-7)
node 0 size: 16045 MB
node 0 free: 15346 MB
node 1 size: 16160 MB
node 1 free: 16114 MB
node 2 size: 16160 MB
node 2 free: 16127 MB
node 3 size: 16160 MB
node 3 free: 16134 MB
node 4 size: 16160 MB
node 4 free: 15973 MB
node 5 size: 16160 MB
node 5 free: 16134 MB
node 6 size: 16160 MB
node 6 free: 16133 MB
node 7 size: 16096 MB
node 7 free: 16058 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  15  15  20  20  15  20  20 
  1:  15  10  20  15  20  20  20  20 
  2:  15  20  10  15  20  20  20  15 
  3:  20  15  15  10  15  20  20  20 
  4:  20  20  20  15  10  15  15  20 
  5:  15  20  20  20  15  10  20  15 
  6:  20  20  20  20  15  20  10  15 
  7:  20  20  15  20  20  15  15  10 

/var/log/messages for rhel5.6 and rhel6 will be attached.

Comment 2 Zhang Kexin 2011-01-10 06:14:37 UTC
Created attachment 472517 [details]
kernel message for rhel5.6

Comment 3 Zhang Kexin 2011-01-10 06:17:52 UTC
Sorry, the complete numactl output for rhel6 is:

[root@nec-em25 ~]# numactl  --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 130927 MB
node 0 free: 126026 MB
node distances:
node   0 
  0:  10

Comment 5 Cong Wang 2011-01-10 06:25:15 UTC
This looks like a regression. RHEL5 detects the nodes correctly, with differing distances that should reflect the physical topology. RHEL6, however, puts all memory into a single node, which could hurt performance because the access distance differs from one memory region to another.

Comment 6 Cong Wang 2011-01-10 09:12:53 UTC
RHEL5 and RHEL6 detect the same SRAT, but on RHEL6 we get:

SRAT: PXMs only cover 98159MB of your 130927MB e820 RAM. Not used.

Something seems to be wrong with __absent_pages_in_range(): its calculation misses 32768 MB!
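
The 32768 MB figure follows directly from the two numbers in the SRAT message. A minimal sketch of that arithmetic in Python, for illustration only; the 4 KiB page size is an assumption (the usual x86_64 default) and the variable names are made up here:

```python
# Figures copied from the "SRAT: PXMs only cover ..." kernel message.
e820_ram_mb = 130927      # RAM reported by the e820 map
srat_covered_mb = 98159   # RAM the kernel believes the SRAT PXMs cover

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Difference between what e820 reports and what the SRAT check accepted.
gap_mb = e820_ram_mb - srat_covered_mb
gap_pages = gap_mb * 1024 * 1024 // PAGE_SIZE

print(f"missing: {gap_mb} MB = {gap_mb // 1024} GiB = {gap_pages:#x} pages")
```

The gap is exactly 32 GiB, a suspiciously round number, which suggests an accounting mismatch rather than memory that is really absent.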

Comment 7 Cong Wang 2011-01-10 09:52:04 UTC
Booted with "loglevel=8 mminit_loglevel=4":

Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x10, 0x9b) 0 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x100, 0x7702b) 1 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x7724c, 0x7724e) 2 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x773e7, 0x7b206) 3 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x7b409, 0x7b600) 4 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x100000, 0x280000) 5 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x280000, 0x380000) 6 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x380000, 0x480000) 6 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(2, 0x880000, 0x980000) 6 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(2, 0x980000, 0xa80000) 7 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(2, 0xa80000, 0xb80000) 7 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(2, 0xb80000, 0xc80000) 7 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(1, 0x480000, 0x580000) 7 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(1, 0x580000, 0x680000) 8 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(1, 0x680000, 0x780000) 8 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(1, 0x780000, 0x880000) 8 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(3, 0xc80000, 0xd80000) 8 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(3, 0xd80000, 0xe80000) 9 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(3, 0xe80000, 0xf80000) 9 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(3, 0xf80000, 0x1080000) 9 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(4, 0x1080000, 0x1180000) 9 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(4, 0x1180000, 0x1280000) 10 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(4, 0x1280000, 0x1380000) 10 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(4, 0x1380000, 0x1480000) 10 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(6, 0x1880000, 0x1980000) 10 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(6, 0x1980000, 0x1a80000) 11 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(6, 0x1a80000, 0x1b80000) 11 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(6, 0x1b80000, 0x1c80000) 11 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(5, 0x1480000, 0x1580000) 11 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(5, 0x1580000, 0x1680000) 12 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(5, 0x1680000, 0x1780000) 12 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(5, 0x1780000, 0x1880000) 12 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(7, 0x1c80000, 0x1d80000) 12 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(7, 0x1d80000, 0x1e80000) 13 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(7, 0x1e80000, 0x1f80000) 13 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(7, 0x1f80000, 0x207c000) 13 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x10, 0x9b) 0 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x100, 0x7702b) 1 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x7724c, 0x7724e) 2 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x773e7, 0x7b206) 3 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x7b409, 0x7b600) 4 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::memory_register Entering add_active_range(0, 0x100000, 0x207c000) 5 entries of 25600 used
Jan 10 04:36:10 nec-em25 kernel: mminit::pageflags_layout_widths Section 0 Node 9 Zone 2 Flags 26
Jan 10 04:36:10 nec-em25 kernel: mminit::pageflags_layout_shifts Section 19 Node 9 Zone 2
Jan 10 04:36:10 nec-em25 kernel: mminit::pageflags_layout_offsets Section 0 Node 55 Zone 53
Jan 10 04:36:10 nec-em25 kernel: mminit::pageflags_layout_zoneid Zone ID: 53 -> 64
Jan 10 04:36:10 nec-em25 kernel: mminit::pageflags_layout_usage location: 64 -> 53 unused 53 -> 26 flags 26 -> 0
Jan 10 04:36:10 nec-em25 kernel: mminit::memmap_init Initialising map node 0 zone 0 pfns 16 -> 4096
Jan 10 04:36:10 nec-em25 kernel: mminit::memmap_init Initialising map node 0 zone 1 pfns 4096 -> 1048576
Jan 10 04:36:10 nec-em25 kernel: mminit::memmap_init Initialising map node 0 zone 2 pfns 1048576 -> 34062336
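
To see how much memory each node was registered with, the add_active_range() lines can be summed per node. The following is a rough Python sketch, not a tool used in this bug: the helper names are hypothetical, 4 KiB pages are assumed, and the sample input is just the four node 7 lines from the log above with the syslog prefix dropped.

```python
import re
from collections import defaultdict

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Sample input: the node 7 lines from the first registration pass.
LOG = """\
mminit::memory_register Entering add_active_range(7, 0x1c80000, 0x1d80000) 12 entries of 25600 used
mminit::memory_register Entering add_active_range(7, 0x1d80000, 0x1e80000) 13 entries of 25600 used
mminit::memory_register Entering add_active_range(7, 0x1e80000, 0x1f80000) 13 entries of 25600 used
mminit::memory_register Entering add_active_range(7, 0x1f80000, 0x207c000) 13 entries of 25600 used
"""

# Captures node id, start pfn, end pfn.
RANGE_RE = re.compile(r"add_active_range\((\d+), (0x[0-9a-f]+), (0x[0-9a-f]+)\)")


def pages_per_node(log_text):
    """Sum the page count (end pfn - start pfn) of every range, per node."""
    totals = defaultdict(int)
    for node, start, end in RANGE_RE.findall(log_text):
        totals[int(node)] += int(end, 16) - int(start, 16)
    return dict(totals)


def pages_to_mb(pages):
    """Convert a page count to whole megabytes."""
    return pages * PAGE_SIZE >> 20


node_mb = {node: pages_to_mb(p) for node, p in pages_per_node(LOG).items()}
print(node_mb)
```

Note that the log contains two registration passes: the per-node one, and a second one that puts everything on node 0. Feeding both passes to this sketch would count the memory twice, so only the first pass should be used.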

Comment 8 Cong Wang 2011-01-10 10:20:58 UTC
Jan 10 04:36:10 nec-em25 kernel: BIOS-provided physical RAM map:
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000000000000 - 000000000009b400 (usable)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000000009b400 - 00000000000a0000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000000100000 - 000000007702b000 (usable)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007702b000 - 00000000770ec000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 00000000770ec000 - 000000007714e000 (ACPI data)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007714e000 - 0000000077167000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077167000 - 0000000077168000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077168000 - 0000000077179000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077179000 - 000000007717c000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007717c000 - 0000000077230000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077230000 - 0000000077239000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077239000 - 000000007724c000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007724c000 - 000000007724e000 (usable)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007724e000 - 0000000077250000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077250000 - 0000000077285000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077285000 - 0000000077399000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000077399000 - 00000000773e7000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 00000000773e7000 - 000000007b206000 (usable)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007b206000 - 000000007b409000 (ACPI NVS)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007b409000 - 000000007b600000 (usable)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 000000007b600000 - 000000007b800000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000080000000 - 0000000091000000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 00000000fed1c000 - 00000000fed20000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
Jan 10 04:36:10 nec-em25 kernel: BIOS-e820: 0000000100000000 - 000000207c000000 (usable)
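
As a cross-check, the "130927MB e820 RAM" figure from the SRAT message can be reproduced by summing the usable ranges of this map. A small Python sketch under stated assumptions: the function name is hypothetical, the input keeps only the usable entries plus one reserved entry to exercise the filter, and the sum is done in bytes, whereas the kernel accounts in whole pages.

```python
import re

# Usable entries copied from the map above (syslog prefix dropped),
# plus one reserved entry that the filter below must skip.
E820 = """\
BIOS-e820: 0000000000000000 - 000000000009b400 (usable)
BIOS-e820: 000000000009b400 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 000000007702b000 (usable)
BIOS-e820: 000000007724c000 - 000000007724e000 (usable)
BIOS-e820: 00000000773e7000 - 000000007b206000 (usable)
BIOS-e820: 000000007b409000 - 000000007b600000 (usable)
BIOS-e820: 0000000100000000 - 000000207c000000 (usable)
"""

# Captures start address, end address, and the range type.
ENTRY_RE = re.compile(r"BIOS-e820: ([0-9a-f]{16}) - ([0-9a-f]{16}) \((.+)\)")


def usable_bytes(text):
    """Add up the size of every range marked (usable)."""
    total = 0
    for start, end, kind in ENTRY_RE.findall(text):
        if kind == "usable":
            total += int(end, 16) - int(start, 16)
    return total


usable_mb = usable_bytes(E820) >> 20
print(usable_mb)
```

The result matches the e820 RAM size in the kernel message, so the e820 side of the comparison looks consistent with the map.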

Comment 9 Cong Wang 2011-01-10 10:22:35 UTC
__absent_pages_in_range() does not count e820 reserved ranges, whereas e820_hole_size() does; this mismatch causes the bug.

Comment 10 Cong Wang 2011-01-10 11:04:42 UTC
Created attachment 472564 [details]
Untested patch

Will test it now...

Comment 11 Cong Wang 2011-01-10 14:08:33 UTC
Created attachment 472596 [details]
Upstream patch

Upstream has already fixed this, so this should be the right patch...

Comment 12 Cong Wang 2011-01-11 09:04:02 UTC
With that patch applied, NUMA is detected again:

[root@nec-em25 ~]# numactl --hardware
available: 6 nodes (1-3,5-7)
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 1 size: 16303 MB
node 1 free: 15616 MB
node 2 cpus: 16 17 18 19 20 21 22 23 80 81 82 83 84 85 86 87
node 2 size: 16384 MB
node 2 free: 15821 MB
node 3 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103
node 3 size: 16384 MB
node 3 free: 15619 MB
node 5 cpus: 40 41 42 43 44 45 46 47 104 105 106 107 108 109 110 111
node 5 size: 16384 MB
node 5 free: 15784 MB
node 6 cpus: 48 49 50 51 52 53 54 55 112 113 114 115 116 117 118 119
node 6 size: 16384 MB
node 6 free: 15668 MB
node 7 cpus: 56 57 58 59 60 61 62 63 120 121 122 123 124 125 126 127
node 7 size: 16320 MB
node 7 free: 15755 MB
No distance information available.

But there is still a problem, which I reported as Bug 668681.

Comment 13 RHEL Program Management 2011-01-11 09:10:51 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 14 Aristeu Rozanski 2011-02-03 15:39:17 UTC
Patch(es) available on kernel-2.6.32-112.el6

Comment 20 errata-xmlrpc 2011-05-19 12:54:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

