Bug 1443872 - glibc: Terminate process on invalid netlink response from kernel
Summary: glibc: Terminate process on invalid netlink response from kernel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 7.5
Assignee: Florian Weimer
QA Contact: qe-baseos-tools
URL:
Whiteboard:
Depends On:
Blocks: 1655768
Reported: 2017-04-20 07:53 UTC by Masaki MAENO
Modified: 2019-08-06 12:49 UTC (History)

Fixed In Version: glibc-2.17-276.el7
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-06 12:48:58 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2118 None None None 2019-08-06 12:49:27 UTC
Sourceware 12926 None None None 2019-06-17 12:43:54 UTC

Description Masaki MAENO 2017-04-20 07:53:39 UTC
Description of problem:

The recvmsg system calls for netlink sockets have been particularly 
prone to picking up unrelated data after a file descriptor race 
(where the descriptor is closed and reopened concurrently in a 
multi-threaded process, as the result of a file descriptor management 
issue elsewhere).

https://sourceware.org/bugzilla/show_bug.cgi?id=12926
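The descriptor reuse behind such a race can be illustrated with a minimal single-threaded sketch. It only shows that a closed descriptor number is handed out again by the next open call, so a thread still holding the old number would silently operate on an unrelated object. The function name descriptor_number_is_reused is hypothetical and chosen for illustration; it is not part of glibc.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a newly opened descriptor reuses the number of one that
   was just closed, 0 if it does not, and -1 if /dev/null cannot be
   opened.  Illustration only: in a real race, the close and the reopen
   happen in different threads.  */
int descriptor_number_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);   /* The number stored in 'first' is now stale.  */

    /* POSIX hands out the lowest available descriptor number, so in a
       single-threaded process this open reuses the number just freed.  */
    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    int reused = (second == first);
    close(second);
    return reused;
}
```

Any code that kept the stale number and later called recvmsg on it would be reading from whatever object now owns that number.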

Version-Release number of selected component (if applicable):

RHEL6.0 - RHEL6.10 Errata Latest (glibc-2.12-1.209.el6) 

How reproducible:

Steps to Reproduce:
1. Run a multi-threaded Java application that creates many SNMP4J instances.
2. Observe that a thread spins forever.
Example backtrace:
#0  0xf7717430 in __kernel_vsyscall ()
#1  0xf757e448 in recvmsg () at ../sysdeps/unix/sysv/linux/i386/socket.S:97
#2  0xf76797a6 in make_request (fd=1590, pid=<value optimized out>, seen_ipv4=0x7dceea7b, seen_ipv6=0x7dceea7a, in6ai=0x7dceea70, in6ailen=0x7dceea6c) at ../sysdeps/unix/sysv/linux/check_pf.c:123
#3  0xf7679bd4 in __check_pf (seen_ipv4=0x7dceea7b, seen_ipv6=0x7dceea7a, in6ai=0x7dceea70, in6ailen=0x7dceea6c) at ../sysdeps/unix/sysv/linux/check_pf.c:275
#4  0xf761ff4c in getaddrinfo (name=0x988f330 "<a-valid-host-name>", service=0x0, hints=0x7dceeaf8, pai=0x7dceeb18) at ../sysdeps/posix/getaddrinfo.c:2109

Actual results:

A thread is spinning forever.

Expected results:

No thread spins forever.

Community Patch:
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=history;f=sysdeps/unix/sysv/linux/check_pf.c;hb=2eecc8afd02d8c65cf098cbae4de87f332dc21bd
(2016-02-19 glibc-2.23 release)
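
One way to notice that a descriptor no longer refers to a netlink socket is to query its address family with getsockname. The sketch below is illustrative only: the helper name fd_is_netlink_socket is hypothetical, and the code is not claimed to match the changes in the upstream history linked above.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Returns 1 if fd currently refers to an AF_NETLINK socket, and 0 if it
   refers to anything else or is not a valid socket at all.  */
int fd_is_netlink_socket(int fd)
{
    struct sockaddr_storage ss = {0};
    socklen_t len = sizeof(ss);

    /* getsockname fails with EBADF or ENOTSOCK if the descriptor was
       closed or now refers to a non-socket object.  */
    if (getsockname(fd, (struct sockaddr *) &ss, &len) != 0)
        return 0;

    return ss.ss_family == AF_NETLINK;
}
```

A caller that opened a netlink socket and later sees this return 0 for the same descriptor number knows that the descriptor was closed or replaced behind its back.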

Comment 2 Florian Weimer 2017-04-20 08:28:58 UTC
I'm afraid this kind of change is not appropriate during Production Phase 2 of Red Hat Enterprise Linux 6.  We will consider it for Red Hat Enterprise Linux 7.

Comment 4 Masaki MAENO 2017-04-20 23:54:21 UTC
(In reply to Florian Weimer from comment #2)
> I'm afraid this kind of change is not appropriate during Production
> Phase 2 of Red Hat Enterprise Linux 6.  We will consider it for Red Hat
> Enterprise Linux 7.

I hope the Errata package will be released as soon as possible.

I understand that the glibc problem will be fixed in RHEL 7.5 at the earliest,
and that RHEL 6 will not receive the fix because it is in Production Phase 2.

Comment 5 Masaki MAENO 2017-04-20 23:55:18 UTC
However, I still hope for an errata package for RHEL 6.x.

Comment 6 Florian Weimer 2017-04-21 07:08:20 UTC
(In reply to Masaki MAENO from comment #5)
> However, I still hope for an errata package for RHEL 6.x.

But other users of Red Hat Enterprise Linux 6 expect that their buggy applications continue to run, even if that means burning CPU cycles in an infinite loop.

The situation is different for Red Hat Enterprise Linux 7 because the product is still relatively early in its life-cycle.

Comment 7 Masaki MAENO 2017-04-24 00:22:35 UTC
(In reply to Florian Weimer from comment #6)
I understand that, on RHEL 6, there is no choice but to avoid the problem by serializing getaddrinfo() calls in the multi-threaded application.

Thank you.

Comment 8 Florian Weimer 2017-04-24 08:57:51 UTC
(In reply to Masaki MAENO from comment #7)
> I understand that, on RHEL 6, there is no choice but to avoid the problem by
> serializing getaddrinfo() calls in the multi-threaded application.

The libresolv bug which could trigger incorrect file descriptor reuse in getaddrinfo was addressed in this erratum:

https://rhn.redhat.com/errata/RHSA-2015-0863.html

If the application has a different descriptor reuse issue, serializing calls to getaddrinfo will probably be insufficient to avoid the bug.
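
The serialization workaround discussed in comments 7 and 8 could look roughly like the sketch below, which funnels every lookup through one process-wide mutex. The wrapper name serialized_getaddrinfo and the lock name gai_lock are hypothetical. As noted above, this only prevents concurrent getaddrinfo calls; it does nothing about descriptor reuse caused by other parts of the application.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <pthread.h>

/* One process-wide lock shared by all callers of the wrapper.  */
static pthread_mutex_t gai_lock = PTHREAD_MUTEX_INITIALIZER;

/* Same arguments and return value as getaddrinfo, but at most one call
   runs at a time.  Results must still be released with freeaddrinfo.  */
int serialized_getaddrinfo(const char *node, const char *service,
                           const struct addrinfo *hints,
                           struct addrinfo **res)
{
    pthread_mutex_lock(&gai_lock);
    int rc = getaddrinfo(node, service, hints, res);
    pthread_mutex_unlock(&gai_lock);
    return rc;
}
```

The wrapper is only effective if every lookup in the process goes through it, including lookups made by libraries or, for a Java application, by the runtime itself, which is rarely under the application's control.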

Comment 9 Masaki MAENO 2017-04-25 00:56:46 UTC
(In reply to Florian Weimer from comment #8)
Thank you for the errata information.
However, we use glibc-2.12-1.209.el6, which is newer than glibc-2.12-1.149.el6_6.7,
so I think the getaddrinfo() hang is still present.

Comment 10 Florian Weimer 2017-04-25 06:56:42 UTC
(In reply to Masaki MAENO from comment #9)
> Thank you for the errata information.
> However, we use glibc-2.12-1.209.el6, which is newer than
> glibc-2.12-1.149.el6_6.7,
> so I think the getaddrinfo() hang is still present.

In this case, please open a support case:

  https://access.redhat.com/support/cases/#/case/new

Note that this could also be an issue with your application.  File descriptor races are somewhat common, and our erratum only fixed an issue within glibc itself, which is why it does not help to deal with application bugs.

Comment 11 Masaki MAENO 2017-10-02 08:16:10 UTC
I think this bug will be fixed in RHEL 7.5.
Is my understanding correct?

Comment 12 Florian Weimer 2017-10-02 09:08:59 UTC
(In reply to Masaki MAENO from comment #11)
> I think this bug will be fixed in RHEL 7.5.
> Is my understanding correct?

Have you opened a support case (see comment 10)?  Please keep in mind that the change discussed in this bug is merely a diagnostic aid for broken applications.  If an application triggers the diagnostic, it will still have to be fixed.
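
To make the idea of a diagnostic aid concrete, here is a rough sketch of how a receive result on a supposed netlink socket might be classified. The enum, the function name classify_netlink_result, and the exact set of errno values treated as fatal are assumptions made for illustration; they are not taken from the shipped glibc packages.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical verdicts for a recvmsg result on a netlink socket.  */
enum nlchk_verdict {
    NLCHK_OK,            /* Plausible netlink response, keep parsing.  */
    NLCHK_REPORT_ERROR,  /* Ordinary failure, return it to the caller.  */
    NLCHK_TERMINATE      /* Descriptor is evidently not our socket.  */
};

/* 'result' is the return value of recvmsg, 'error_code' the errno value
   saved right after the call (ignored when result >= 0).  */
enum nlchk_verdict classify_netlink_result(ssize_t result, int error_code)
{
    if (result < 0) {
        switch (error_code) {
        case EBADF:         /* Descriptor was closed.  */
        case ENOTSOCK:      /* Descriptor now refers to a non-socket.  */
        case ENOTCONN:      /* Descriptor is some other kind of socket.  */
        case ECONNREFUSED:
            return NLCHK_TERMINATE;
        default:
            return NLCHK_REPORT_ERROR;
        }
    }

    /* A response shorter than a netlink header cannot be valid.  */
    if ((size_t) result < sizeof(struct nlmsghdr))
        return NLCHK_TERMINATE;

    return NLCHK_OK;
}
```

On NLCHK_TERMINATE, a caller would print a message and call abort() instead of retrying. That turns a silent infinite loop into a crash that points at the descriptor misuse, but the application bug itself remains.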

Comment 13 Masaki MAENO 2017-10-02 09:29:38 UTC
I see.
I asked the customer to open a support case.

Comment 17 Florian Weimer 2019-03-01 12:32:23 UTC
Core fix is in glibc-2.17-276.el7.  glibc-2.17-279.el7 brings a minor improvement in error message formatting.

Comment 21 Sergey Kolosov 2019-06-19 11:06:08 UTC
Verified, SanityOnly, the patch was successfully applied

Comment 23 errata-xmlrpc 2019-08-06 12:48:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2118

