Bug 1209433 - Deadlock in __check_pf()
Summary: Deadlock in __check_pf()
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 22
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-04-07 11:03 UTC by David Woodhouse
Modified: 2017-10-24 11:39 UTC
13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-03 11:36:03 UTC
Type: Bug
Embargoed:


Attachments
full backtraces (94.62 KB, text/plain), 2015-04-07 11:03 UTC, David Woodhouse
check_pf.stp (2.57 KB, text/plain), 2017-10-24 11:37 UTC, Paulo Andrade
check_pf.c (4.39 KB, text/x-csrc), 2017-10-24 11:39 UTC, Paulo Andrade


Links
Linux Kernel bug 99671

Description David Woodhouse 2015-04-07 11:03:54 UTC
Created attachment 1011691 [details]
full backtraces

Thread 12 (Thread 0x7f5ef8b88700 (LWP 14353)):
#0  0x0000003841302edd in recvmsg () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000384131f8ee in __check_pf (pid=-75530, fd=52) at ../sysdeps/unix/sysv/linux/check_pf.c:166
#2  0x000000384131f8ee in __check_pf (seen_ipv4=seen_ipv4@entry=0x7f5ef8b87a32, seen_ipv6=seen_ipv6@entry=0x7f5ef8b87a33, in6ai=in6ai@entry=0x7f5ef8b87a40, in6ailen=in6ailen@entry=0x7f5ef8b87a48) at ../sysdeps/unix/sysv/linux/check_pf.c:324
#3  0x00000038412eada1 in __GI_getaddrinfo (name=<optimized out>, service=0x0, hints=0x33b4d7cae0, pai=0x7f5ef8b87b70) at ../sysdeps/posix/getaddrinfo.c:2366
#4  0x00000033b4a88280 in do_lookup_by_name () at /lib64/libgio-2.0.so.0
#5  0x00000033b4a85855 in g_task_thread_pool_thread () at /lib64/libgio-2.0.so.0
#6  0x00000033b1471cce in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#7  0x00000033b1471335 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#8  0x0000003841a07555 in start_thread (arg=0x7f5ef8b88700) at pthread_create.c:333
#9  0x0000003841301f3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

...


Thread 9 (Thread 0x7f5e92ffd700 (LWP 14441)):
#0  0x000000384130f99c in __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x000000384131f5a6 in __check_pf (seen_ipv4=0x38415be910 <lock>, seen_ipv6=0x7f5e92ffca33, in6ai=0x7f5e92ffca40, in6ailen=0x384130f99c <__lll_lock_wait_private+28>) at ../sysdeps/unix/sysv/linux/check_pf.c:302
#2  0x0000000000000000 in  ()

Thread 8 (Thread 0x7f5ef0ff9700 (LWP 14465)):
#0  0x000000384130f99c in __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x000000384131f5a6 in __check_pf (seen_ipv4=0x38415be910 <lock>, seen_ipv6=0x7f5ef0ff8a33, in6ai=0x7f5ef0ff8a40, in6ailen=0x384130f99c <__lll_lock_wait_private+28>) at ../sysdeps/unix/sysv/linux/check_pf.c:302
#2  0x0000000000000000 in  ()

Thread 7 (Thread 0x7f5eb37fe700 (LWP 14466)):
#0  0x000000384130f99c in __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x000000384131f5a6 in __check_pf (seen_ipv4=0x38415be910 <lock>, seen_ipv6=0x7f5eb37fda33, in6ai=0x7f5eb37fda40, in6ailen=0x384130f99c <__lll_lock_wait_private+28>) at ../sysdeps/unix/sysv/linux/check_pf.c:302
#2  0x0000000000000000 in  ()

Thread 6 (Thread 0x7f5ef2ffd700 (LWP 14522)):
#0  0x000000384130f99c in __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x000000384131f5a6 in __check_pf (seen_ipv4=0x38415be910 <lock>, seen_ipv6=0x7f5ef2ffca33, in6ai=0x7f5ef2ffca40, in6ailen=0x384130f99c <__lll_lock_wait_private+28>) at ../sysdeps/unix/sysv/linux/check_pf.c:302
#2  0x0000000000000000 in  ()

Thread 5 (Thread 0x7f5e93fff700 (LWP 14530)):
#0  0x000000384130f99c in __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x000000384131f5a6 in __check_pf (seen_ipv4=0x38415be910 <lock>, seen_ipv6=0x7f5e93ffea33, in6ai=0x7f5e93ffea40, in6ailen=0x384130f99c <__lll_lock_wait_private+28>) at ../sysdeps/unix/sysv/linux/check_pf.c:302
#2  0x0000000000000000 in  ()

No idea why that recvmsg() never returned; is netlink supposed to be reliable?

Comment 1 David Woodhouse 2015-04-08 11:05:37 UTC
Netlink isn't supposed to be reliable, but AFAICT we *are* supposed to get an ENOBUFS error when we call recvmsg(). Reassigning to kernel.
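
For reference, here is a minimal sketch (illustrative only, not glibc's actual check_pf code) of the contract assumed above: a kernel-side drop should surface as ENOBUFS from recvmsg() on the netlink socket, which the caller can turn into a retry instead of blocking forever.

/* Illustrative sketch, not glibc code: receive loop on a NETLINK_ROUTE
   socket where a dropped reply is expected to show up as ENOBUFS.  */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

ssize_t recv_rtnetlink(int fd)
{
    char buf[4096];
    struct iovec iov = { buf, sizeof(buf) };
    struct sockaddr_nl nladdr;
    struct msghdr msg = {
        .msg_name = &nladdr,
        .msg_namelen = sizeof(nladdr),
        .msg_iov = &iov,
        .msg_iovlen = 1,
    };

    for (;;) {
        ssize_t len = recvmsg(fd, &msg, 0);
        if (len >= 0)
            return len;   /* caller parses nlmsghdr records up to NLMSG_DONE */
        if (errno == EINTR)
            continue;
        if (errno == ENOBUFS) {
            /* The kernel dropped a message: restart the whole request
               instead of waiting for a reply that will never come.  */
            fprintf(stderr, "netlink overrun: %s\n", strerror(errno));
            return -1;
        }
        return -1;
    }
}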

Comment 2 David Woodhouse 2015-04-23 08:28:52 UTC
I went back to the 3.19 kernel for a while and this went away. Updating to 4.0.0-1.fc22 made it start happening again.

Comment 3 Josh Boyer 2015-04-23 12:09:52 UTC
If you haven't already, would you please report this to the upstream maintainers?  It will be a bit before we can get to this.

Comment 5 David Woodhouse 2015-06-08 21:14:31 UTC
Also seen here: https://github.com/nahi/httpclient/issues/232

Comment 6 David Woodhouse 2015-06-09 07:37:46 UTC
Further investigation shows that after sending the RTM_GETADDR request, we don't get *any* responses back.

If a *response* message is dropped, we get ENOBUFS. But what about when a *request* is dropped? Could it be silently dropped, leaving us waiting for a reply that never comes?
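
A hypothetical stand-alone diagnostic (a sketch, not the glibc code path) can make that visible: send the same kind of RTM_GETADDR dump request and poll() with a timeout, so a silently dropped request shows up as "no reply" instead of a hung recvmsg().

/* Hypothetical diagnostic sketch: does the kernel answer an RTM_GETADDR
   dump at all?  Uses poll() with a timeout instead of blocking forever.  */
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) { perror("socket"); return 1; }

    struct {
        struct nlmsghdr nlh;
        struct rtgenmsg g;
    } req;
    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req.g));
    req.nlh.nlmsg_type = RTM_GETADDR;
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nlh.nlmsg_seq = 1;
    req.g.rtgen_family = AF_UNSPEC;

    if (send(fd, &req, req.nlh.nlmsg_len, 0) < 0) { perror("send"); return 1; }

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, 5000);   /* 5 seconds is plenty for a local dump */
    if (ready == 0)
        fprintf(stderr, "no RTM_GETADDR reply within 5s - request dropped?\n");
    else if (ready > 0) {
        char buf[8192];
        printf("got %zd bytes of reply\n", recv(fd, buf, sizeof(buf), 0));
    } else
        perror("poll");
    close(fd);
    return ready > 0 ? 0 : 1;
}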

Comment 7 David Woodhouse 2015-06-10 22:13:20 UTC
(In reply to David Woodhouse from comment #6)
>  But what about when a *request* is dropped? Could it be silently dropped
> leaving us waiting for a reply that never comes?

Apparently so, due to a locking problem. It's been suggested that this might fix it: http://patchwork.ozlabs.org/patch/473041/

Test kernel build at http://koji.fedoraproject.org/koji/taskinfo?taskID=10007738

Comment 8 Josh Boyer 2015-06-15 14:21:43 UTC
Did your test kernel work for you? AFAIK, you're the only person to report this issue, so I'm not sure how widely it will be tested otherwise.

Comment 9 David Woodhouse 2015-06-15 14:31:03 UTC
No, it didn't work. And I'm not the only person reporting the issue. 
http://comments.gmane.org/gmane.linux.network/363085

Comment 10 Josh Boyer 2015-06-15 17:16:46 UTC
(In reply to David Woodhouse from comment #9)
> No, it didn't work. And I'm not the only person reporting the issue. 
> http://comments.gmane.org/gmane.linux.network/363085

I meant in the context of Fedora.

Comment 11 David Woodhouse 2015-07-03 11:36:03 UTC
Fixed in 4.0.6.

Comment 12 Paulo Andrade 2017-10-04 12:14:58 UTC
(In reply to David Woodhouse from comment #11)
> Fixed in 4.0.6.

  Any hint on what commit fixed this issue?

Comment 13 David Woodhouse 2017-10-04 12:37:19 UTC
I have absolutely no idea. Someone with my name made this comment though...
https://bugzilla.kernel.org/show_bug.cgi?id=99671#c4

Comment 14 Paulo Andrade 2017-10-04 13:05:43 UTC
Thanks. I see this backtrace pattern on RHEL 7.
I am still asking the user for a core image and/or a chance to see it live.

One thread is apparently blocked, but it is not yet clear whether it is spinning in glibc's
make_request():

#0  0x00007fa57c7346ad in recvmsg () from /lib64/libc.so.6
#1  0x00007fa57c756f7d in make_request () from /lib64/libc.so.6
#2  0x00007fa57c757474 in __check_pf () from /lib64/libc.so.6
#3  0x00007fa57c71cfc1 in getaddrinfo () from /lib64/libc.so.6
...

and 50+ threads blocked:

#0  0x00007fa57c7410fc in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007fa57c75756d in _L_lock_746 () from /lib64/libc.so.6
#2  0x00007fa57c757325 in __check_pf () from /lib64/libc.so.6
#3  0x00007fa57c71cfc1 in getaddrinfo () from /lib64/libc.so.6
...

The issue appears to happen when a user reaches the max open files limit.
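
If the open-files limit really is the trigger, one way to exercise that condition is to lower RLIMIT_NOFILE, burn the remaining descriptors and then call getaddrinfo(), so a failure of the internal netlink socket() becomes visible. This is only an illustrative sketch, not a confirmed reproducer for this bug:

/* Illustrative sketch only (not a confirmed reproducer): exhaust the
   descriptor limit and observe how getaddrinfo() fails when __check_pf
   cannot open its internal netlink socket.  */
#include <fcntl.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/socket.h>

int main(void)
{
    /* Cap the process at a small number of descriptors.  */
    struct rlimit rl = { .rlim_cur = 32, .rlim_max = 32 };
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");

    /* Burn descriptors until open() starts failing with EMFILE.  */
    while (open("/dev/null", O_RDONLY) >= 0)
        ;

    struct addrinfo hints, *res = NULL;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;

    int rc = getaddrinfo("localhost", NULL, &hints, &res);
    printf("getaddrinfo: %s\n", rc ? gai_strerror(rc) : "success");
    if (rc == 0)
        freeaddrinfo(res);
    return 0;
}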

Comment 15 Carlos O'Donell 2017-10-05 05:07:52 UTC
(In reply to Paulo Andrade from comment #14)
> Thanks. I see this backtrace pattern on rhel7.
> Still asking user for a core image and/or see it live.
> 
> One thread apparently blocking, but still do not know if spinning in glibc
> make request():
> 
> #0  0x00007fa57c7346ad in recvmsg () from /lib64/libc.so.6
> #1  0x00007fa57c756f7d in make_request () from /lib64/libc.so.6
> #2  0x00007fa57c757474 in __check_pf () from /lib64/libc.so.6
> #3  0x00007fa57c71cfc1 in getaddrinfo () from /lib64/libc.so.6
> ...
> 
> and 50+ threads blocked:
> 
> #0  0x00007fa57c7410fc in __lll_lock_wait_private () from /lib64/libc.so.6
> #1  0x00007fa57c75756d in _L_lock_746 () from /lib64/libc.so.6
> #2  0x00007fa57c757325 in __check_pf () from /lib64/libc.so.6
> #3  0x00007fa57c71cfc1 in getaddrinfo () from /lib64/libc.so.6
> ...
> 
> The issue appears to happen when a user reaches the max open files limit.

Most of the time this is a user application bug in which a file descriptor is reused by accident, and that reuse breaks the netlink handling.
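
The kind of application bug being described usually looks like the pattern below: one thread double-closes a descriptor while another thread is inside getaddrinfo(), so the number gets recycled for the resolver's internal netlink socket and the second close() hits glibc's descriptor. Purely illustrative, with invented function names (build with gcc -pthread):

/* Purely illustrative, invented names: a double close racing with a
   concurrent getaddrinfo() can hit the netlink descriptor that
   __check_pf opened internally.  */
#include <netdb.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Thread A: application code with a double close.  */
void thread_a_buggy(int sock)
{
    close(sock);
    /* ... time passes, the number is reused by another thread ... */
    close(sock);   /* now closes somebody else's descriptor */
}

/* Thread B: concurrently resolving a name.  */
void *thread_b(void *arg)
{
    (void) arg;
    struct addrinfo *res = NULL;
    /* getaddrinfo() opens a netlink socket inside __check_pf; if thread A's
       second close() hits that descriptor, the reply stream is lost or
       corrupted and the resolver can hang or read unrelated data.  */
    if (getaddrinfo("example.com", NULL, NULL, &res) == 0)
        freeaddrinfo(res);
    return NULL;
}

int main(void)
{
    int fd = dup(0);          /* stand-in for the application's descriptor */
    pthread_t t;
    pthread_create(&t, NULL, thread_b, NULL);
    thread_a_buggy(fd);
    pthread_join(t, NULL);
    return 0;
}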

We did two things to harden this:

commit fda389c8f0311dd5786be91a7b54b9f935fcafa1
Author: Siddhesh Poyarekar <siddhesh>
Date:   Tue Oct 14 21:05:33 2014 +0530

    Fix infinite loop in check_pf (BZ #12926)
    
    The recvmsg could return 0 under some conditions and cause the
    make_request function to be stuck in an infinite loop.
    
    Thank you Jim King <jim.king> for posting Paul's patch
    on the list.

commit 2eecc8afd02d8c65cf098cbae4de87f332dc21bd
Author: Florian Weimer <fweimer>
Date:   Mon Nov 9 12:48:41 2015 +0100

    Terminate process on invalid netlink response from kernel [BZ #12926]
    
    The recvmsg system calls for netlink sockets have been particularly
    prone to picking up unrelated data after a file descriptor race
    (where the descriptor is closed and reopened concurrently in a
    multi-threaded process, as the result of a file descriptor
    management issue elsewhere).  This commit adds additional error
    checking and aborts the process if a datagram of unexpected length
    (without the netlink header) is received, or an error code which
    cannot happen due to the way the netlink socket is used.
    
            [BZ #12926]
            Terminate process on invalid netlink response.
            * sysdeps/unix/sysv/linux/netlinkaccess.h
            (__netlink_assert_response): Declare.
            * sysdeps/unix/sysv/linux/netlink_assert_response.c: New file.
            * sysdeps/unix/sysv/linux/Makefile [$(subdir) == inet]
            (sysdep_routines): Add netlink_assert_response.
            * sysdeps/unix/sysv/linux/check_native.c (__check_native): Call
            __netlink_assert_response.
            * sysdeps/unix/sysv/linux/check_pf.c (make_request): Likewise.
            * sysdeps/unix/sysv/linux/ifaddrs.c (__netlink_request): Likewise.
            * sysdeps/unix/sysv/linux/Versions (GLIBC_PRIVATE): Add
            __netlink_assert_response.

I don't believe we have any of these in rhel-7 right now.
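
Conceptually (this is only a sketch of the check described in the second commit message, not the actual __netlink_assert_response implementation), the hardening boils down to treating a recvmsg() result that cannot legitimately come from the netlink socket we opened as descriptor corruption, and aborting rather than looping:

/* Conceptual sketch only, not glibc's __netlink_assert_response: abort on
   recvmsg() results that cannot come from the netlink socket we opened.  */
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <linux/netlink.h>

void assert_netlink_response(ssize_t len)
{
    if (len < 0) {
        /* EINTR/ENOBUFS are handled by the caller; ENOTSOCK or EBADF mean
           the descriptor is no longer the socket we created.  */
        if (errno == ENOTSOCK || errno == EBADF)
            abort();
        return;
    }
    /* A datagram shorter than a netlink header cannot be a valid reply.  */
    if ((size_t) len < sizeof(struct nlmsghdr))
        abort();
}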

Comment 16 Paulo Andrade 2017-10-24 11:37:48 UTC
Created attachment 1342696 [details]
check_pf.stp

This SystemTap script was used to help track down a file descriptor race condition.

Comment 17 Paulo Andrade 2017-10-24 11:39:40 UTC
Created attachment 1342697 [details]
check_pf.c

An LD_PRELOAD-able version of the systemtap script.
Either tool might need some tailoring, but both should be useful for debugging incorrect use of the file descriptor that glibc opens internally during the getaddrinfo call.
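
The attached check_pf.c is not reproduced here; as a rough illustration of the general LD_PRELOAD approach (assuming that interposing close() is enough to catch the offender and that logging a backtrace on every close is acceptable), an interposer might look like this:

/* Generic sketch of the LD_PRELOAD idea, not the attached check_pf.c:
   log every close() with a backtrace so a stray or double close() racing
   with getaddrinfo()'s internal netlink socket can be spotted.
   Build:  gcc -shared -fPIC -o close_trace.so close_trace.c -ldl
   Run:    LD_PRELOAD=./close_trace.so ./your_app                      */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

int close(int fd)
{
    static int (*real_close)(int);
    if (real_close == NULL)
        real_close = (int (*)(int)) dlsym(RTLD_NEXT, "close");

    void *frames[16];
    int depth = backtrace(frames, 16);
    fprintf(stderr, "close(%d) called from:\n", fd);
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);

    return real_close(fd);
}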

