Bug 1545539 - jemalloc(aarch64): munmap - Invalid Argument - build-time pagesize 4k != Fedora kernel (64k)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: jemalloc
Version: 28
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ingvar Hagelund
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1548039
Reported: 2018-02-15 09:16 UTC by Daniel Black
Modified: 2018-03-30 12:47 UTC

Fixed In Version: jemalloc-4.5.0-5.fc26 jemalloc-4.5.0-5.fc27 jemalloc-5.0.1-5.fc28
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-20 17:31:36 UTC
Type: Bug


Description Daniel Black 2018-02-15 09:16:28 UTC
Description of problem:

As per https://github.com/jemalloc/jemalloc/issues/467, jemalloc must be built with a page size equal to the largest page size the kernel may use, i.e. configured with --with-lg-page=16.

Without this option it uses the page size of the build host's kernel (4k on the Fedora builders), which results in "<jemalloc>: Error in munmap(): Invalid argument" at runtime, because you can't unmap at 4k granularity when the kernel page size is 64k.
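
A minimal sketch of why the call is rejected, assuming the simplified rule that munmap() returns EINVAL when the address is not a multiple of the kernel page size. The function name and the sample address are hypothetical and only illustrate the arithmetic; this is not jemalloc's code.

```python
KIB = 1024


def munmap_addr_ok(addr: int, kernel_page_size: int) -> bool:
    """Simplified model of the munmap(2) address check.

    The kernel rejects (EINVAL) an address that is not a multiple of
    its own page size.
    """
    return addr % kernel_page_size == 0


# Hypothetical address that an allocator built for 4k pages might try
# to unmap: aligned to 4 KiB, but not to 64 KiB.
addr = 0x7F0000001000

print("4k kernel accepts: ", munmap_addr_ok(addr, 4 * KIB))
print("64k kernel accepts:", munmap_addr_ok(addr, 64 * KIB))
```

The same address passes on a 4k kernel and fails on a 64k kernel, which matches the symptom reported here.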

Version-Release number of selected component (if applicable):

 jemalloc-5.0.1-3.fc28

How reproducible:

Run mariadb-10.1 test suite:

like:
https://koji.fedoraproject.org/koji/getfile?taskID=24980424&volume=DEFAULT&name=build.log


Actual results:

munmap() - Invalid argument - unhappiness

Expected results:

munmap() - valid argument - happiness


Additional info:

The build:
https://kojipkgs.fedoraproject.org/work/tasks/7158/25047158/build.log

+ echo 'What is the pagesize?'
+ getconf PAGESIZE
4096

Without a --with-lg-page=16 configure option this PAGESIZE was used in the build.
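
For reference, the value passed to --with-lg-page is the base-2 logarithm of the page size in bytes. A small sketch of that conversion follows; the helper name lg_page is made up for this illustration.

```python
def lg_page(page_size: int) -> int:
    """Return log2 of a page size given in bytes.

    Page sizes are powers of two, so anything else is rejected.
    """
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError(f"not a power of two: {page_size}")
    return page_size.bit_length() - 1


print(lg_page(4096))   # the builder's getconf PAGESIZE value
print(lg_page(65536))  # a 64k kernel page size
```

So the 4096 reported by the builder corresponds to lg-page 12, while a 64k kernel needs 16.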


https://bugzilla.redhat.com/show_bug.cgi?id=1405047
https://jira.mariadb.org/browse/MDEV-15303

Comment 1 Daniel Black 2018-02-15 09:22:10 UTC
Note: please also set --with-lg-page=16 for ppc64 and ppc64le, as their kernel page size can be 4k or 64k too. It just happens that Fedora's ppc64 builders currently run 64k kernels.

Comment 2 Daniel Black 2018-02-15 21:14:02 UTC
Probably a good idea to make build infrastructure use distro kernels too.

Comment 3 Michal Schorm 2018-02-18 15:55:09 UTC
Blocks MariaDB 10.1.31 update on F26.

For testing please use:
https://github.com/devexp-db/mariadb/tree/10.1.31

For questions about mariadb contact me whenever :)

Comment 4 Fedora End Of Life 2018-02-20 15:31:28 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 28 development cycle.
Changing version to '28'.

Comment 5 Ingvar Hagelund 2018-03-06 20:59:31 UTC
Looks good to me. Seems to solve https://github.com/varnishcache/varnish-cache/issues/2521 as well :-)

I'll push builds for fc28 tonight.

Ingvar

Comment 6 Michal Schorm 2018-03-06 21:03:13 UTC
That's great! :)

Please, take a look at F27 & F26 too.

Comment 7 Ingvar Hagelund 2018-03-06 21:39:11 UTC
Fedora 26 and 27 do not have jemalloc-5.0. They are still on 4.4.0 and 4.5.0. To be able to support the newer MariaDB on these Fedora releases, I would have to upgrade jemalloc to a new major release after a Fedora stable release. I'm not sure if I should do that.

At least, I would have to make all maintainers check that their packages work as expected with jemalloc-5.0

I can fix jemalloc itself and varnish. You may check mariadb. For the rest, that is blender, bro, neovim, and redis, I'm unsure on how to proceed.

f27$ dnf repoquery --whatrequires jemalloc
blender-1:2.79-1.fc27.x86_64
blenderplayer-1:2.79-1.fc27.x86_64
bro-0:2.4.1-3.fc25.x86_64
jemalloc-devel-0:4.5.0-4.fc27.i686
jemalloc-devel-0:4.5.0-4.fc27.x86_64
mariadb-server-3:10.2.9-3.fc27.x86_64
neovim-0:0.2.0-1.fc27.x86_64
neovim-0:0.2.2-1.fc27.x86_64
redis-0:3.2.11-1.fc27.x86_64
redis-0:4.0.8-1.fc27.x86_64
varnish-0:5.1.3-2.fc27.x86_64
varnish-0:5.1.3-4.fc27.x86_64


Ingvar

Comment 8 Michal Schorm 2018-03-06 21:58:58 UTC
(In reply to Ingvar Hagelund from comment #7)
> fedora 26,27 does not have jemalloc-5.0. They are still on 4.4.0 and 4.5.0.
> To be able to support the newer MariaDB on these fedora releases, that means
> I'll have to upgrade jemalloc to a new major release after a Fedora stable
> release. I'm not sure if I should do that.

No, please don't!
MariaDB (well, the TokuDB storage engine to be exact, originating from the Percona upstream) does not support jemalloc 5.

That's another issue I'm working on with the mentioned upstreams.

--

For now, I want 2 things:

1) Fix F26 Jemalloc
- there's mariadb 10.1, which has problems if jemalloc is not compiled with "--with-lg-page=16"

2) Fix the Jemalloc on all other Fedoras
- because the issue is IMHO general and because Fedora uses a 64k kernel page size, you should compile jemalloc with "--with-lg-page=16" so that it works correctly *with any software* on all Fedora versions

Comment 9 Ingvar Hagelund 2018-03-06 22:07:31 UTC
I'm not sure that would work across different pagesizes, which is the main culprit here.

https://github.com/jemalloc/jemalloc/pull/769/commits/e4827d5c16f1411729c6d426a4d5792a8d8e88aa

If I read that correctly, support for configuring a pagesize larger than the running system's actual pagesize is a feature that was added in jemalloc-5.0.

Ingvar

Comment 10 Ingvar Hagelund 2018-03-06 22:09:19 UTC
... and I don't think Fedora uses 64k kernel pagesize on all arches. For instance, ppc/64/64le have had special pagesizes for years.

Comment 11 Daniel Black 2018-03-07 03:30:25 UTC
> If I read that correctly, supporting configuring a pagesize larger than the 
> actual running system's pagesize, is a feature that was added to jemalloc-5.0.

Correct.

> ... and I don't think Fedora uses 64k kernel pagesize on all arches. For instance, ppc/64/64le have had special pagesizes for years.

Correct.

I think setting --with-lg-page=16 on arches whose default page size is 64k is the right solution; it is a compile-time hardening against building on a kernel running with a 4k page size. Running the build infrastructure on the default distro kernel is also recommended, so that package tests run under the same conditions as the distributed packages (which is why ppc64{,le} doesn't currently hit this bug).

While setting it to 4k explicitly on dual page size arches like ppc64/ppc64le/aarch64 will actually work whether the kernel page size is 4k or 64k, this does incur an added performance overhead in memory management, which is what jemalloc intends to solve.

There is rarely a reason to lower the kernel page size on these kernels. I've seen it once as a temporary solution for code, like jemalloc, that by assumption or design, failed to retrieve the pagesize at runtime (sysconf(_SC_PAGESIZE)).

I don't know if a pre-install check to see if the running size matches the configure size is something you wish to consider.
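
A rough sketch of what such a check could look like, assuming the configured lg-page value is known to the script. The function names are hypothetical, and nothing here is part of the jemalloc package; it only reads the running page size, the equivalent of sysconf(_SC_PAGESIZE), and compares it with the configured size.

```python
import os


def running_page_size() -> int:
    """Page size of the running kernel, as sysconf(_SC_PAGESIZE) reports it."""
    return os.sysconf("SC_PAGESIZE")


def matches_configured(lg_page: int, running: int) -> bool:
    """True when the configured page size (2**lg_page) equals the running one."""
    return (1 << lg_page) == running


if __name__ == "__main__":
    running = running_page_size()
    for lg in (12, 16):  # the two values discussed in this bug
        status = "matches" if matches_configured(lg, running) else "differs from"
        print(f"--with-lg-page={lg} {status} the running page size ({running})")
```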

Comment 12 Ingvar Hagelund 2018-03-07 07:02:16 UTC
As the default compile time options seem to work on the ppc arches (and s390x too), if I should change this, it should be on the arches that have this problem.

Also, if --with-lg-page is not set, jemalloc should check that the page size matches at build time, and this has not been an actual problem for a long time.

Michal, I therefore wonder if the problem you observe is bound to an external library which handles this incorrectly, like my problem with pcre with jit in varnish.

If that is the case, hardcoding the pagesize for jemalloc-4.x to 4k on x86_64, and to 64k on ppc64 et al, probably won't help, though you might be lucky.

Just out of curiosity, please try to build jemalloc-4.5.0 with --with-lg-page=12 and see if that helps on fedora 26 / x86_64.

Ingvar

Comment 13 Ingvar Hagelund 2018-03-07 11:15:42 UTC
(In reply to Daniel Black from comment #11)
> While setting it to 4k explicitly on dual pages size arches like
> ppc64/ppc64le/aarch64 will actually work whether the kernel page size is 4k
> or 64k, this does incur an added performance overhead in memory management
> which is what jemalloc intends to solve.

Daniel,

Since simply setting 64k everywhere and letting jemalloc handle the rest itself seems to work, thanks to the mentioned support added in jemalloc-5.0, I'll do that in fedora 28+. Does that sound all right?

Then I need a fix for fedora 26 and 27: I could build jemalloc with pagesizes according to a fixed scheme, instead of relying on the builder's pagesize, using simple %ifarch.

Then what are the correct pagesizes? Is this table correct?

i686:     4k
x86_64:   4k
armv7hl:  4k
s390x:    4k
aarch64: 64k
ppc64:   64k
ppc64le: 64k
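
Expressed as configure flags, the table above would look roughly like the sketch below. The dictionary simply copies the proposed values, which are still an open question at this point in the thread, and the helper name is made up.

```python
# Page sizes in KiB, copied from the proposed table (not verified here).
PROPOSED_PAGE_SIZE_KIB = {
    "i686": 4,
    "x86_64": 4,
    "armv7hl": 4,
    "s390x": 4,
    "aarch64": 64,
    "ppc64": 64,
    "ppc64le": 64,
}


def configure_flag(arch: str) -> str:
    """Build the --with-lg-page flag for an arch listed in the table."""
    size_bytes = PROPOSED_PAGE_SIZE_KIB[arch] * 1024
    # Page sizes are powers of two, so bit_length() - 1 is log2.
    return f"--with-lg-page={size_bytes.bit_length() - 1}"


for arch in PROPOSED_PAGE_SIZE_KIB:
    print(f"{arch:8} {configure_flag(arch)}")
```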


Ingvar

Comment 14 Ingvar Hagelund 2018-03-07 18:59:21 UTC
jemalloc-4.x won't build with a 64k pagesize on Red Hat's aarch64 builders, as they run 4k pagesize kernels, so for now:

%ifarch %ix86 %arm x86_64 s390x aarch64
%define lg_page --with-lg-page=12
%endif

%ifarch ppc64 ppc64le
%define lg_page --with-lg-page=16
%endif

Full f26 specfile: https://ingvar.fedorapeople.org/jemalloc/jemalloc.spec

Builds: https://koji.fedoraproject.org/koji/taskinfo?taskID=25543005 

Please try these builds for fedora 26, and report back.

Ingvar

Comment 15 Daniel Black 2018-03-07 23:05:24 UTC
Seems I brain farted with the below comment.

> While setting it to 4k explicitly on dual pages size arches like ppc64/ppc64le/aarch64 will actually work whether the kernel page size is 4k or 64k,

The larger jemalloc page size, 64k, will work on kernel page sizes of both 4k and 64k.

As such, aarch64 should be grouped with ppc64 and ppc64le in the spec above, at --with-lg-page=16.
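
The reasoning can be sketched as a divisibility rule: a jemalloc page size works when it is a whole multiple of the kernel page size. This is an illustrative model with a made-up function name, not jemalloc's actual logic, and it assumes a jemalloc version that accepts a configured page size larger than the running one.

```python
KIB = 1024


def works_on_kernel(jemalloc_page: int, kernel_page: int) -> bool:
    """Model: the allocator page must be a whole multiple of the kernel page."""
    return jemalloc_page % kernel_page == 0


for jemalloc_page in (4 * KIB, 64 * KIB):
    for kernel_page in (4 * KIB, 64 * KIB):
        ok = works_on_kernel(jemalloc_page, kernel_page)
        print(
            f"jemalloc {jemalloc_page // KIB}k on kernel "
            f"{kernel_page // KIB}k: {'ok' if ok else 'fails'}"
        )
```

Only the combination of a 4k jemalloc page on a 64k kernel fails under this model, which is the case this bug is about.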

I think the same arch dependent configure should apply to jemalloc-5.0.

Michal's build log (no missing) was mariadb using jemalloc directly. The varnish build/test log didn't show munmap syscalls failing so it looks like a different problem.

I'm slightly confused by the thp patch that seems to be applied in some cases while the THP tests still run successfully. I'm hoping the build kernel/OS configuration there matches the release kernels in this respect.

Comment 16 Ingvar Hagelund 2018-03-08 09:47:28 UTC
(In reply to Daniel Black from comment #15)
> Seems I brain farted with the below comment.
> 
> > While setting it to 4k explicitly on dual pages size arches like ppc64/ppc64le/aarch64 will actually work whether the kernel page size is 4k or 64k,
> 
> The larger jemalloc page size 64k, will work on kernel page size 4k and 64k.
>
> As such aarch64 should be with the ppc64 and ppc64le in the spec above at
> --with-lg-page=16.

With jemalloc-5.0, yes. With jemalloc-4.x, no. For all I know, it may work at runtime, but it won't build.

jemalloc-4.x does some sanity checks at build time, and won't play along with 64k pagesize on a system running with 4k pagesize. The Fedora aarch64 builders have 4k pagesize. So for now, I must go with 4k pagesize for jemalloc-4.x on aarch64.

Btw, this is exactly what https://github.com/jemalloc/jemalloc/pull/769/commits/e4827d5c16f1411729c6d426a4d5792a8d8e88aa is addressing, so this problem is fixed in jemalloc-5.0.

> I think the same arch dependent configure should apply to jemalloc-5.0.

Okay, I'll go with that on jemalloc-5.0, even with 64k on aarch64.
 
Ingvar

Comment 17 Fedora Update System 2018-03-08 14:42:22 UTC
jemalloc-4.5.0-5.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c0dcbdd82f

Comment 18 Fedora Update System 2018-03-08 14:42:31 UTC
jemalloc-5.0.1-5.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-0e404615ab

Comment 19 Fedora Update System 2018-03-08 14:42:38 UTC
jemalloc-4.5.0-5.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-62ddf35772

Comment 20 Fedora Update System 2018-03-08 16:24:56 UTC
jemalloc-4.5.0-5.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c0dcbdd82f

Comment 21 Fedora Update System 2018-03-08 16:28:42 UTC
jemalloc-4.5.0-5.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-62ddf35772

Comment 22 Fedora Update System 2018-03-09 23:25:50 UTC
jemalloc-5.0.1-5.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-0e404615ab

Comment 23 Fedora Update System 2018-03-20 17:31:36 UTC
jemalloc-4.5.0-5.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 24 Fedora Update System 2018-03-20 18:19:06 UTC
jemalloc-4.5.0-5.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Comment 25 Fedora Update System 2018-03-30 12:47:15 UTC
jemalloc-5.0.1-5.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

