Bug 1463241

Summary: rlimit_stack problems after update to 3.10.0-514.21.2.el7, and JVM Crash after updating to kernel-3.10.0-514.21.2.el7.x86_64
Product: Red Hat Enterprise Linux 7 Reporter: Markus Frosch <markus.frosch>
Component: kernel Assignee: Larry Woodman <lwoodman>
kernel sub component: Memory Management QA Contact: Li Wang <liwan>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: ajb, akkornel, Andreas.Lehr, anrussel, aogburn, avettath, azone, bhaubeck, bloch, brian.hoppus, carnil, ccheney, chorn, christian.dengler, chrlee, cisley, cye, deryni, dhoward, diego_lozano_a, egolov, fhirtz, fweimer, gangelop, gholms, hagberg, herrold, hmadhava, hmatsumo, hpham, jaroslaw.polok, jdatta, jos100, jscalf, jualvare, kbost, klaas, knweiss, kolshanov, kperrier, loberman, lwoodman, mdshaikh, michael.friedrich, mirco.santori, mkolbas, mmilgram, onatalen, onestero, pasik, pasteur, pbokoc, pchavan, phil, p.malishev, pmatouse, pragshar, qguo, rbeyel, rbost, rcyriac, rhbugs, rickatnight11, ripleymj, rratkiewicz, sfalzara, shuwang, smeyer, sreber, stanislav.moravec, taylor.gresser, thomas.oulevey, tlavigne, toracat, trond, vagrawal, vcojot, volodymyrgl, wmealing, Yannick.Charton
Version: 7.3 Keywords: Regression, ZStream
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-3.10.0-690.el7 Doc Type: Bug Fix
Doc Text:
Prior to this update, a bug in the kernel prevented executables from starting if the maximum process stack size (rlimit_stack) was set to a value below approximately 4 MB. This update fixes the search for unmapped address ranges (suitable gap) in unmapped_area() and unmapped_area_topdown() by ensuring that the gap_end is always larger than gap_start. As a result, executables can be started with a limited process stack size as expected.
Story Points: ---
Clone Of:
: 1466138 1466235 Environment:
Last Closed: 2017-08-02 07:43:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1463491    
Bug Blocks: 1461333, 1463688, 1464290, 1466138, 1466235, 1466921, 1466923, 1466925, 1466927, 1504288    

Description Markus Frosch 2017-06-20 12:20:53 UTC
After updating to kernel 3.10.0-514.21.2.el7, processes can no longer be started with a low rlimit_stack. This is probably related to the stack guard patches.

When the limit is set to a value lower than ~4.5 MB, the system can no longer start any executable.

How reproducible:

$ ulimit -s 1024
$ /bin/true
bash: /bin/true: Argument list too long

$ ulimit -s 4096
$ /bin/true
bash: /bin/true: Argument list too long

What works:

$ ulimit -s 4608
$ /bin/true
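
The same reproduction can be driven from C, which makes the error code visible: "Argument list too long" is the message for E2BIG. This is a minimal sketch and not part of the original report. The helper name exec_true_with_stack_limit() is hypothetical, and the sketch assumes that /bin/true exists and that the requested soft limit is within the current hard limit.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, lower the soft RLIMIT_STACK in the child to
 * limit_kib KiB (like "ulimit -s"), then exec /bin/true.
 *
 * Returns 0 if /bin/true ran and exited cleanly, the errno value of a
 * failed getrlimit()/setrlimit()/execv() in the child, or -1 if fork()
 * or waitpid() failed or the child was killed by a signal.
 */
int exec_true_with_stack_limit(unsigned long limit_kib)
{
    /* Guard against overflow when converting KiB to bytes. */
    assert(limit_kib <= ULONG_MAX / 1024);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit rl;
        char *const argv[] = { "/bin/true", NULL };

        if (getrlimit(RLIMIT_STACK, &rl) != 0)
            _exit(errno);
        rl.rlim_cur = (rlim_t)limit_kib * 1024; /* keep the hard limit */
        if (setrlimit(RLIMIT_STACK, &rl) != 0)
            _exit(errno);

        execv(argv[0], argv);
        _exit(errno); /* only reached when execv() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On an affected kernel, exec_true_with_stack_limit(1024) would be expected to return E2BIG and exec_true_with_stack_limit(4608) to return 0, matching the shell session above. On a fixed kernel both calls should return 0.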

Additional info:

This does not happen on latest patched versions of Debian jessie + stretch.

We haven't yet tested RHEL 6, or other distributions.

Similar problems occur with systemd, which then even has internal problems of its own when trying to start the daemon:


[Unit]
Description=Test problems with LimitSTACK

[Service]
Type=oneshot
ExecStart=/bin/true
#LimitSTACK=256K
LimitSTACK=4M

Comment 2 Markus Frosch 2017-06-20 13:45:55 UTC
RHEL 6 seems to be fine:

[root@rhel6-test ~]# uname -a
Linux rhel6-test.localdomain 2.6.32-696.3.2.el6.x86_64 #1 SMP Wed Jun 7 11:51:39 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@rhel6-test ~]# bash -c "ulimit -s 256; /bin/true; echo 'Works.'"
Works.

Comment 4 Michael Friedrich 2017-06-21 07:59:34 UTC
Hi,

I wasn't able to access this ticket yesterday, so I've opened a CentOS ticket (https://bugs.centos.org/view.php?id=13453). I'll add my findings over here too.

We have a workaround in place for Icinga; an advisory for our users is here: https://www.icinga.com/2017/06/20/advisory-for-latest-security-updates-on-rhel-7/

Still, the kernel update breaks icinga2 as well and requires a manual workaround patch.


### Tests 

4112 kbytes works.

# uname -a ; ulimit -s 4112 && /bin/true && echo "works"
Linux icinga2 3.10.0-514.21.2.el7.x86_64 #1 SMP Tue Jun 20 12:24:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
works

4111 kbytes does not (use a new shell)

# uname -a ; ulimit -s 4111 && /bin/true && echo "works"
Linux icinga2 3.10.0-514.21.2.el7.x86_64 #1 SMP Tue Jun 20 12:24:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
-bash: /bin/true: Argument list too long

### Code

I'm no kernel dev and don't know any specifics about the code, just know some C/C++. If my findings are wrong, just ignore them.

CentOS pushed their updates and kernel sources yesterday too, so I used that as reference. Markus looked into the RHEL sources yesterday afternoon, but we don't really know if differences matter or how the code works exactly.

```
wget http://vault.centos.org/7.3.1611/updates/Source/SPackages/kernel-3.10.0-514.21.1.el7.src.rpm
wget http://vault.centos.org/7.3.1611/updates/Source/SPackages/kernel-3.10.0-514.21.2.el7.src.rpm

mkdir 1 2

cd 1 && rpm2cpio ../kernel-3.10.0-514.21.1.el7.src.rpm | cpio -ivd && tar xf linux-3.10.0-514.21.1.el7.tar.xz && cd ..
cd 2 && rpm2cpio ../kernel-3.10.0-514.21.2.el7.src.rpm | cpio -ivd && tar xf linux-3.10.0-514.21.2.el7.tar.xz && cd ..

diff -ur 1/linux-3.10.0-514.21.1.el7/ 2/linux-3.10.0-514.21.2.el7/ > diff

```

Debian patches are located here: https://anonscm.debian.org/cgit/kernel/linux.git/log/?h=jessie-security

1)

The RHEL/CentOS patch to stack_guard_area() returns the result of a strict less-than comparison.

```
mm/mmap.c

+int stack_guard_area(struct vm_area_struct *vma, unsigned long address)
...
+ return vma->vm_end - address < stack_guard_gap;
```

The patch released in Debian jessie-security (https://anonscm.debian.org/cgit/kernel/linux.git/commit/?h=jessie-security&id=af5f37d1b8feebe4cf4976770a6c37f64de817c7) does a less-than-or-EQUAL comparison. The two can return different booleans only when vma->vm_end - address is exactly equal to stack_guard_gap.

```
++	return vma->vm_end - address <= stack_guard_gap;
```
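
To make the boundary case concrete, here is a small user-space sketch of the two quoted expressions. It is not kernel code: the function names are hypothetical, plain unsigned long values stand in for vma->vm_end and address, and the gap is passed as a parameter instead of being read from stack_guard_gap.

```c
#include <assert.h>
#include <stdbool.h>

/* Strict comparison, modelled on the line quoted from the RHEL/CentOS patch. */
bool guard_check_strict(unsigned long vm_end, unsigned long address,
                        unsigned long gap)
{
    assert(address <= vm_end); /* the subtraction must not wrap around */
    return vm_end - address < gap;
}

/* Inclusive comparison, modelled on the line quoted from the Debian patch. */
bool guard_check_inclusive(unsigned long vm_end, unsigned long address,
                           unsigned long gap)
{
    assert(address <= vm_end); /* the subtraction must not wrap around */
    return vm_end - address <= gap;
}
```

With a 1 MiB gap, the two functions disagree only for an address exactly 1 MiB below vm_end: the strict version returns false there and the inclusive version returns true. One byte closer to vm_end both return true, and one byte further away both return false.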



2) task_mmu.c differs too.


CentOS

```
diff -ur 1/linux-3.10.0-514.21.1.el7/fs/proc/task_mmu.c 2/linux-3.10.0-514.21.2.el7/fs/proc/task_mmu.c
--- 1/linux-3.10.0-514.21.1.el7/fs/proc/task_mmu.c 2017-04-22 06:17:16.000000000 +0000
+++ 2/linux-3.10.0-514.21.2.el7/fs/proc/task_mmu.c 2017-05-28 20:42:06.000000000 +0000
@@ -293,11 +293,13 @@

        /* We don't show the stack guard page in /proc/maps */
        start = vma->vm_start;
-       if (stack_guard_page_start(vma, start))
-               start += PAGE_SIZE;
        end = vma->vm_end;
-       if (stack_guard_page_end(vma, end))
-               end -= PAGE_SIZE;
+       if (stack_guard_area(vma, start)) {
+               if (vma->vm_flags & VM_GROWSDOWN)
+                       start += stack_guard_gap;
+               else
+                       end -= stack_guard_gap;
+       }
```

There's no explicit check for VM_GROWSUP.


Debian

```
+--- a/fs/proc/task_mmu.c
++++ b/fs/proc/task_mmu.c
+@@ -276,11 +276,14 @@ show_map_vma(struct seq_file *m, struct
+ 
+ 	/* We don't show the stack guard page in /proc/maps */
+ 	start = vma->vm_start;
+-	if (stack_guard_page_start(vma, start))
+-		start += PAGE_SIZE;
+ 	end = vma->vm_end;
+-	if (stack_guard_page_end(vma, end))
+-		end -= PAGE_SIZE;
++	if (vma->vm_flags & VM_GROWSDOWN) {
++		if (stack_guard_area(vma, start))
++			start += stack_guard_gap;
++	} else if (vma->vm_flags & VM_GROWSUP) {
++		if (stack_guard_area(vma, end))
++			end -= stack_guard_gap;
++	}
```


3) do_anonymous_page() differs too.


CentOS

```
@@ -2750,7 +2716,8 @@
                return VM_FAULT_SIGBUS;

        /* Check if we need to add a guard page to the stack */
-       if (check_stack_guard_page(vma, address) < 0)
+       if ((vma->vm_flags & (VM_GROWSDOWN|VM_GROWSUP)) &&
+           expand_stack(vma, address) < 0)
                return VM_FAULT_SIGBUS;
```

Debian

```
+@@ -2642,8 +2610,10 @@ static int do_anonymous_page(struct mm_s
+ 		return VM_FAULT_SIGBUS;
+ 
+ 	/* Check if we need to add a guard page to the stack */
+-	if (check_stack_guard_page(vma, address) < 0)
+-		return VM_FAULT_SIGSEGV;
++	if (stack_guard_area(vma, address)) {
++		if (expand_stack(vma, address) < 0)
++			return VM_FAULT_SIGSEGV;
++	}
```


Debian returns VM_FAULT_SIGSEGV instead of VM_FAULT_SIGBUS, but I think the problem could also be related to CentOS checking only vm_flags rather than calling stack_guard_area() as Debian does.

Another major difference is that stack_guard_area() checks against VM_GROWSUP only. I'm not sure why the CentOS patch differs so much (perhaps the kernel base version behaves differently?).


Kind regards,
Michael

Comment 6 Kurt Seifried 2017-06-21 16:07:22 UTC
Making this public as per discussion with fweimer@

Comment 9 Larry Woodman 2017-06-22 00:08:25 UTC
Sorry Anthony, I think it did get purged.  I just kicked off a new
build that should be done in a couple of hours.  I'll post it here once
it's complete.

Larry

Comment 12 Larry Woodman 2017-06-22 01:43:43 UTC
Yes, everything is complete now:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13501436

Comment 16 Larry Woodman 2017-06-22 10:00:54 UTC
Yes, please share this with anyone as necessary.

Larry

Comment 17 Markus Frosch 2017-06-22 10:37:56 UTC
I tested the build in a test VM and it works well so far; nothing out of the ordinary.

Are we allowed to share this build with a mutual customer, or do they have to contact support for a build?

Comment 18 Volodymyr G. Lukiianyk 2017-06-22 11:28:04 UTC
Comparing the src.rpms of the latest releases, it seems that the el6 update (696.3.2 vs 696.3.1) contains an additional change in the get_arg_page() function that is absent from the el7 update (514.21.2 vs 514.21.1):

> @@ -206,6 +206,12 @@ struct page *get_arg_page(struct linux_b
>                 unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
>                 struct rlimit *rlim;
>  
> +               /*
> +                * GROWSUP doesn't really have any gap at this stage because we grow
> +                * the stack down now. See the expand_downwards above.
> +                */
> +               if (!IS_ENABLED(CONFIG_STACK_GROWSUP))
> +                       size -= stack_guard_gap;
>                 acct_arg_size(bprm, size / PAGE_SIZE);
>  
>                 /*

This may explain why the problem with executing binaries under a smaller stack size limit is not present on the latest el6 kernel. Further down in the function there is a check that compares this "size" variable to one fourth of the corresponding RLIMIT_STACK:

>                 /*
>                  * Limit to 1/4-th the stack size for the argv+env strings.
>                  * This ensures that:
>                  *  - the remaining binfmt code will not run out of stack space,
>                  *  - the program will have a reasonable amount of stack left
>                  *    to work from.
>                  */
>                 rlim = current->signal->rlim;
>                 if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
>                         put_page(page);
>                         return NULL;
>                 }

This may also explain why the problem appears with a stack limit of ~4 MiB and smaller: 4 MiB / 4 is compared to the uncompensated "size", which should be a bit larger than the "stack_guard_gap" variable (1 MiB).
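
The arithmetic can be checked against the thresholds measured in comment 4 (4112 KiB works, 4111 KiB does not). The sketch below is a user-space model of the quoted comparison, not kernel code. The function names are hypothetical, and it assumes that the uncompensated "size" is stack_guard_gap (1 MiB) plus a single 4 KiB page, i.e. 1028 KiB. That assumption is inferred from the measured threshold and has not been verified against the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

#define KIB 1024UL

/*
 * Model of the quoted check in get_arg_page(): the page is rejected when
 * "size" exceeds one quarter of the soft RLIMIT_STACK (both in bytes).
 */
bool arg_page_rejected(unsigned long size_bytes, unsigned long rlim_cur_bytes)
{
    return size_bytes > rlim_cur_bytes / 4;
}

/*
 * Smallest "ulimit -s" value (in KiB) that passes the check for a given
 * size. With integer division, rlim / 4 >= size holds exactly when
 * rlim >= 4 * size, so the result is 4 * size rounded up to whole KiB.
 */
unsigned long min_passing_limit_kib(unsigned long size_bytes)
{
    unsigned long min_bytes = size_bytes * 4;

    return (min_bytes + KIB - 1) / KIB;
}
```

Under that assumption, min_passing_limit_kib(1028 * KIB) evaluates to 4112, arg_page_rejected(1028 * KIB, 4111 * KIB) is true and arg_page_rejected(1028 * KIB, 4112 * KIB) is false, which is consistent with the measurements. If the gap is subtracted first, as in the el6 change, only the 4 KiB page remains and even a 256 KiB limit passes.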

Comment 19 Michael Friedrich 2017-06-22 13:56:59 UTC
I've tested the same test build as Markus, and the issue seems gone. Thanks a lot for your efforts and openness to our questions! :)

I'm hoping that an upstream release for all affected users (including the CentOS team) finds its way to the official channels soon :)

### Icinga 2 specific test

#### Problem

[root@icinga2 ~]# uname -a
Linux icinga2 3.10.0-514.21.2.el7.x86_64 #1 SMP Tue Jun 20 12:24:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@icinga2 ~]# icinga2 daemon -C
execvp: Argument list too long
[root@icinga2 ~]# echo $?
1

#### Fixed

[root@icinga2 ~]# uname -a
Linux icinga2 3.10.0-514.el7.CVE7.3.z.x86_64 #1 SMP Wed Jun 21 20:13:13 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

[root@icinga2 ~]# icinga2 daemon -C
information/cli: Icinga application loader (version: v2.6.3-378-g55a057c)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: icinga2
warning/ApplyRule: Apply rule 'satellite-host' (in /etc/icinga2/conf.d/satellite.conf: 29:1-29:41) for type 'Dependency' does not match anywhere!
information/ConfigItem: Instantiated 4 ApiUsers.
information/ConfigItem: Instantiated 1 ApiListener.
information/ConfigItem: Instantiated 3 Zones.
information/ConfigItem: Instantiated 1 FileLogger.
information/ConfigItem: Instantiated 1 Endpoint.
information/ConfigItem: Instantiated 1 UserGroup.
information/ConfigItem: Instantiated 28 Notifications.
information/ConfigItem: Instantiated 2 NotificationCommands.
information/ConfigItem: Instantiated 177 CheckCommands.
information/ConfigItem: Instantiated 1 Downtime.
information/ConfigItem: Instantiated 4 HostGroups.
information/ConfigItem: Instantiated 1 IcingaApplication.
information/ConfigItem: Instantiated 157 Hosts.
information/ConfigItem: Instantiated 318 Comments.
information/ConfigItem: Instantiated 1 User.
information/ConfigItem: Instantiated 3 TimePeriods.
information/ConfigItem: Instantiated 161 Services.
information/ConfigItem: Instantiated 3 ServiceGroups.
information/ConfigItem: Instantiated 1 ScheduledDowntime.
information/ConfigItem: Instantiated 1 IdoMysqlConnection.
information/ConfigItem: Instantiated 1 NotificationComponent.
information/ConfigItem: Instantiated 1 GraphiteWriter.
information/ConfigItem: Instantiated 1 CheckerComponent.
information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
information/cli: Finished validating the configuration file(s).
[root@icinga2 ~]# echo $?
0

Comment 20 Michael Friedrich 2017-06-22 15:32:42 UTC
Kurt pointed me to an ongoing discussion on oss-sec, thanks.

Possible problems: http://seclists.org/oss-sec/2017/q2/562

SuSE seems affected too: http://seclists.org/oss-sec/2017/q2/563 & http://seclists.org/oss-sec/2017/q2/567

Could be related: http://seclists.org/oss-sec/2017/q2/566 -> https://patchwork.kernel.org/patch/9802797/

On Tuesday I read a German article about the issue here: https://www.heise.de/security/meldung/Stack-Clash-Schwachstelle-fuehrt-zu-Rechteausweitung-auf-Linux-und-BSD-Systemen-3748070.html
which links to a currently offline host: https://lkml.org/lkml/2017/6/19/1515 (IIRC it was a discussion between Linus and Hugh about the cleanup and possible issues with the current patch).

Reference to this issue: http://seclists.org/oss-sec/2017/q2/542

Comment 29 Vishal Agrawal 2017-06-26 18:42:30 UTC
*** Bug 1465111 has been marked as a duplicate of this bug. ***

Comment 30 Linda Wang 2017-06-27 12:25:21 UTC
Note: SRT, QE, zstream maintainers:

Because Larry has already posted the revert of v6 of his patchset
from bug 1452733 (CVE-2017-1000364) and backported the upstream fix
for the security issue, we plan to use this BZ for the revert and for
the new upstream patch in the 7.4 RC build.

Move the BZ to POST.

Thanks!

Comment 33 Rafael Aquini 2017-06-29 03:05:34 UTC
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 38 Rafael Aquini 2017-06-29 12:34:23 UTC
Patch(es) available on kernel-3.10.0-690.el7

Comment 46 Diego Aguilar 2017-07-03 16:54:34 UTC
(In reply to Rafael Aquini from comment #33)
> Patch(es) committed on kernel repository and an interim kernel build is
> undergoing testing

Hi Rafael, could you please tell me where I can download this kernel version? I looked for it in the RHEL downloads section and in several other repositories, but without success. Thanks in advance.

Comment 57 errata-xmlrpc 2017-08-02 07:43:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842