1253842 – [aarch64] lscpu shows 4 sockets on a single socket SoC

Summary: [aarch64] lscpu shows 4 sockets on a single socket SoC

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	aarch64
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jon Masters
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	ARMTracker

Reported:	2015-08-14 21:22 UTC by Richard W.M. Jones
Modified:	2019-03-05 14:01 UTC (History)
CC List:	11 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2019-03-05 14:01:52 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
dmesg (33.76 KB, text/plain) 2015-08-14 21:27 UTC, Richard W.M. Jones
proc.tar.gz (938.41 KB, application/x-gzip) 2015-09-01 05:48 UTC, Richard W.M. Jones
sys.tar.gz (1.57 MB, application/x-gzip) 2015-09-01 05:48 UTC, Richard W.M. Jones
mustang.tar.gz (3.43 KB, application/x-gzip) 2015-09-01 05:50 UTC, Richard W.M. Jones

Description Richard W.M. Jones 2015-08-14 21:22:25 UTC

Description of problem:

This is on Fedora Rawhide aarch64.  The hardware is the APM Mustang.

# lscpu
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             4

This system is definitely not 4 sockets :-)  It has a single
system on chip (SoC), i.e. 1 socket, with 8 real cores (no threads).

Version-Release number of selected component (if applicable):

kernel 4.0.7-300.fc22.aarch64
util-linux-2.26.2-3.fc24.aarch64

Comment 1 Richard W.M. Jones 2015-08-14 21:27:28 UTC

With kernel 4.2.0-0.rc2.git0.1.fc23.aarch64 the output is the same:

# lscpu
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             4

I'm attaching the dmesg output in case that is useful.

Comment 2 Richard W.M. Jones 2015-08-14 21:27:55 UTC

Created attachment 1063171 [details]
dmesg

Comment 3 Don Dutile (Red Hat) 2015-08-14 23:30:13 UTC

Currently, the report by lscpu, which is derived from the cpu topology under sysfs, is built from two possible sources:
-- DT entries, which are often not filled out;
   -- but more of it is getting filled out for NUMA-based machines/SoCs,
      b/c correct cpu topology is critical for the kernel numa code to
      DTRT.
   -- note: ACPI-based NUMA specification, SRAT, solves most of the issue
      except for the next field & its correlation to SRAT-cpu-values.
-- MPIDR Aff<x> fields.
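
For anyone who wants to see what the kernel is exporting, here is a minimal Python sketch that reads the per-CPU topology files under sysfs and counts distinct package IDs. It is a simplification and not how lscpu itself is implemented; the helper names `read_topology` and `count_sockets` are made up for illustration, and the root directory is a parameter so the code can be pointed at an extracted /sys dump.

```python
from pathlib import Path


def read_topology(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_number: (physical_package_id, core_id)}.

    CPUs without readable topology files (e.g. offline ones) are skipped.
    """
    topo = {}
    for cpu_dir in Path(sysfs_cpu_root).glob("cpu[0-9]*"):
        tdir = cpu_dir / "topology"
        try:
            cpu = int(cpu_dir.name[3:])  # "cpu12" -> 12
            pkg = int((tdir / "physical_package_id").read_text())
            core = int((tdir / "core_id").read_text())
        except (OSError, ValueError):
            continue
        topo[cpu] = (pkg, core)
    return topo


def count_sockets(topo):
    """Number of distinct package IDs, i.e. what gets reported as sockets."""
    return len({pkg for pkg, _core in topo.values()})
```

Calling `count_sockets(read_topology())` on the affected machine would be expected to agree with the socket count lscpu prints, since both come from the same kernel-exported data.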

On ARMv8, unlike ARMv7, MPIDR contents are *implementation specific* -- an architected register w/un-architected contents. duh!
ARMv7 had a simple spec that the lower 8 bits (Aff0) were cores, the next 8 bits (Aff1) were 'clusters' that shared L2 caches; I think the Aff3 field was for shared L3. But since 99.999999% of ARMv7's are single socket, with relatively small core counts, it was easy to make the MPIDR register go from unspecified to 'we all agree to do it this way' ... b/c they happen to do it that way.

On ARMv8, the upstream topology code is interpreting Aff1 as the socket index,
when in fact, it is a 'cluster' index, indicating shared caches.
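
To make the misinterpretation concrete, here is a small Python sketch that splits an MPIDR value into its affinity fields (Aff0 in bits 7:0, Aff1 in bits 15:8, Aff2 in bits 23:16, Aff3 in bits 39:32 of MPIDR_EL1) and counts "sockets" the way the paragraph above describes. The MPIDR values in `EXAMPLE_MPIDRS` are assumed for illustration (4 clusters of 2 cores); they are not taken from the attached dumps.

```python
def decode_mpidr(mpidr):
    """Split an MPIDR_EL1 value into its four affinity fields."""
    return {
        "aff0": mpidr & 0xFF,          # bits 7:0
        "aff1": (mpidr >> 8) & 0xFF,   # bits 15:8
        "aff2": (mpidr >> 16) & 0xFF,  # bits 23:16
        "aff3": (mpidr >> 32) & 0xFF,  # bits 39:32
    }


# ASSUMPTION: illustrative values for 8 cores arranged as 4 clusters of 2.
EXAMPLE_MPIDRS = [0x000, 0x001, 0x100, 0x101, 0x200, 0x201, 0x300, 0x301]


def sockets_if_aff1_is_socket(mpidrs):
    """Socket count you get when Aff1 is (wrongly) treated as the socket."""
    return len({decode_mpidr(m)["aff1"] for m in mpidrs})


def cores_per_aff1_group(mpidrs):
    """Map each Aff1 value to the number of cores that carry it."""
    groups = {}
    for m in mpidrs:
        aff1 = decode_mpidr(m)["aff1"]
        groups[aff1] = groups.get(aff1, 0) + 1
    return groups
```

With the assumed values, treating Aff1 as the socket yields 4 groups of 2 cores each, which is the same shape as the "Core(s) per socket: 2, Socket(s): 4" output in the description.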

If you run the same command on a RHELSA kernel, you will find proper values, because RHELSA is carrying a patch that works for Xgene & Seattle. IIRC, that patch will fail for ThunderX, and will need further modification.

jcm has a to-do on this issue w/ARM, & to come up with an ACPI-based solution that will yield the equivalent of x86's cpu-id -based cpu topology information. [read the arch/x86/kernel/topology.c code to learn the details, and search the Intel tech pages for a paper on their topology architecture wrt cpu-id probing.] The DT-based solution is TBD, and the DT-huggers can resolve it with yet-another-specification-to-cpu-DT.

Until ARM & its partners determine/specify/require an architected cpu topology specification, this problem becomes a per-SoC 'quirk' to the arch/arm64/kernel/topology.c support.

Comment 4 Karel Zak 2015-08-17 10:23:27 UTC

So if I read Don's comment #3 correctly, the problem is mostly in what the kernel exports in /sys, and for patched kernels (e.g. RHELSA) lscpu generates valid output. Right?  ... then it's time to reassign to kernel :-)

Richard, it would be nice to have a /sys and /proc dump from the system. In the upstream tree we have a script

https://raw.githubusercontent.com/karelzak/util-linux/master/tests/ts/lscpu/mk-input.sh

to create a tarball. Please, send me (or add to BZ) the tarball.

Comment 5 Richard W.M. Jones 2015-09-01 05:48:00 UTC

Created attachment 1068857 [details]
proc.tar.gz

Comment 6 Richard W.M. Jones 2015-09-01 05:48:30 UTC

Created attachment 1068858 [details]
sys.tar.gz

Comment 7 Richard W.M. Jones 2015-09-01 05:50:59 UTC

Created attachment 1068859 [details]
mustang.tar.gz

I should have read the whole comment before posting those
previous attachments.  Attached is the output from the command

sudo ./mk-input.sh mustang

Comment 8 Jeremy Linton 2019-03-04 22:04:58 UTC

This is an old one, and while not fixed on the Mustang, it should be fixed on any aarch64/ACPI machine with a correct PPTT.
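
As a rough illustration of why a correct PPTT helps: each processor hierarchy node in the PPTT points at its parent and has a flags field whose bit 0 marks a physical package, so the package for a core can be found by walking up until that flag is set. The Python sketch below uses a simplified in-memory model (the `Node` class and the example tree are hypothetical; it does not parse a real ACPI table).

```python
from dataclasses import dataclass
from typing import Optional

PHYSICAL_PACKAGE = 1 << 0  # flags bit 0 of a PPTT processor hierarchy node


@dataclass
class Node:
    """Toy stand-in for a PPTT processor hierarchy node (hypothetical)."""
    name: str
    flags: int = 0
    parent: Optional["Node"] = None


def find_package(node):
    """Walk towards the root until a node flagged as a physical package."""
    while node is not None:
        if node.flags & PHYSICAL_PACKAGE:
            return node
        node = node.parent
    return None  # table did not mark any ancestor as a package


def build_example():
    """ASSUMED layout: one SoC package, 4 clusters, 2 cores per cluster."""
    soc = Node("soc0", flags=PHYSICAL_PACKAGE)
    cores = []
    for c in range(4):
        cluster = Node("cluster%d" % c, parent=soc)
        for k in range(2):
            cores.append(Node("core%d" % (2 * c + k), parent=cluster))
    return cores


def count_packages(cores):
    """Number of distinct packages reachable from the given cores."""
    found = (find_package(core) for core in cores)
    return len({pkg.name for pkg in found if pkg is not None})
```

In this toy layout the clusters are ordinary intermediate nodes, so all 8 cores resolve to a single package regardless of how the MPIDR affinity fields are laid out.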

It may be worth closing this.

Comment 9 Laura Abbott 2019-03-05 14:01:52 UTC

Thanks. I'm going to close this then. I think the best approach would be to reopen for specific machines that don't have a correct PPTT if problems still show up.
