Bug 1456405 - Brick Multiplexing:dmesg shows request_sock_TCP: Possible SYN flooding on port 49152 and memory related backtraces
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: protocol
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard: brick-multiplexing
Depends On: 1453145
Blocks: 1463517
 
Reported: 2017-05-29 10:39 UTC by Mohammed Rafi KC
Modified: 2017-09-05 17:32 UTC (History)
CC: 7 users

Fixed In Version: glusterfs-3.12.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1453145
Clones: 1463517
Environment:
Last Closed: 2017-09-05 17:32:34 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Mohammed Rafi KC 2017-05-29 10:41:07 UTC
Description of problem:
========================
In my brick-multiplexing setup, which hosts 40+ 1x3 volumes over the same brick processes,
I ran I/O for about a day on 10 of the volumes from 10 dedicated clients.

On one of the servers, I noticed that dmesg contained the backtrace below:

[Mon May 22 10:32:47 2017] Kprobes globally unoptimized
[Mon May 22 10:32:47 2017] stap_f2381505060ae6310863f02aed9a39ef_22947: loading out-of-tree module taints kernel.
[Mon May 22 10:32:47 2017] stap_f2381505060ae6310863f02aed9a39ef_22947: module verification failed: signature and/or required key missing - tainting kernel
[Mon May 22 10:32:48 2017] stap_f2381505060ae6310863f02aed9a39ef_22947: systemtap: 3.0/0.166, base: ffffffffa0608000, memory: 144data/64text/83ctx/2058net/50alloc kb, probes: 1
[Mon May 22 10:34:15 2017] stap_daf3ce6b146bb54d562d096209111fa0_23295: systemtap: 3.0/0.166, base: ffffffffa063e000, memory: 145data/68text/83ctx/2058net/50alloc kb, probes: 1
[Mon May 22 10:35:06 2017] stap_283ca8e5669d13a9ef14e3bb2b601f89_23574: systemtap: 3.0/0.166, base: ffffffffa0675000, memory: 145data/68text/86ctx/2058net/50alloc kb, probes: 1
[Mon May 22 10:40:56 2017] staprun: page allocation failure: order:4, mode:0x10c0d0
[Mon May 22 10:40:56 2017] CPU: 5 PID: 24031 Comm: staprun Tainted: G           OE  ------------   3.10.0-514.10.2.el7.x86_64 #1
[Mon May 22 10:40:56 2017] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[Wed May 17 18:39:41 2017] intel_powerclamp: No package C-state available
[Mon May 22 10:40:56 2017]  000000000010c0d0
[Mon May 22 10:40:56 2017]  00000000eb8721f0 ffff88012cc37988 ffffffff816863ef
[Mon May 22 10:40:56 2017]  ffff88012cc37a18 ffffffff81186ba0 0000000000000000 00000000ffffffff
[Mon May 22 10:40:56 2017]  fffffffffffffff0 0010c0d000000000 ffff88012cc379e8 00000000eb8721f0
[Mon May 22 10:40:56 2017] Call Trace:
[Mon May 22 10:40:56 2017]  [<ffffffff816863ef>] dump_stack+0x19/0x1b
[Mon May 22 10:40:56 2017]  [<ffffffff81186ba0>] warn_alloc_failed+0x110/0x180
[Mon May 22 10:40:56 2017]  [<ffffffff81681f83>] __alloc_pages_slowpath+0x6b7/0x725
[Mon May 22 10:40:56 2017]  [<ffffffff8118b155>] __alloc_pages_nodemask+0x405/0x420
[Mon May 22 10:40:56 2017]  [<ffffffff811cf30a>] alloc_pages_current+0xaa/0x170
[Mon May 22 10:40:56 2017]  [<ffffffff81185a7e>] __get_free_pages+0xe/0x50
[Mon May 22 10:40:56 2017]  [<ffffffff811dabae>] kmalloc_order_trace+0x2e/0xa0
[Mon May 22 10:40:56 2017]  [<ffffffff8113c723>] relay_open+0x63/0x2c0
[Mon May 22 10:40:56 2017]  [<ffffffffa0612840>] ? _stp_linenumber_lookup+0x20/0x20 [stap_e6774b0dd49a63e06112bfeaa9167888_24031]
[Mon May 22 10:40:56 2017]  [<ffffffffa0609b57>] _stp_transport_data_fs_init+0xc7/0x180 [stap_e6774b0dd49a63e06112bfeaa9167888_24031]
[Mon May 22 10:40:56 2017]  [<ffffffff812156d0>] ? dput+0x20/0xd0
[Mon May 22 10:40:56 2017]  [<ffffffff8121d48a>] ? __get_fs_type+0x3a/0x60
[Mon May 22 10:40:56 2017]  [<ffffffffa0612969>] init_module+0x129/0x950 [stap_e6774b0dd49a63e06112bfeaa9167888_24031]
[Mon May 22 10:40:56 2017]  [<ffffffffa0612840>] ? _stp_linenumber_lookup+0x20/0x20 [stap_e6774b0dd49a63e06112bfeaa9167888_24031]
[Mon May 22 10:40:56 2017]  [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
[Mon May 22 10:40:56 2017]  [<ffffffff81100488>] load_module+0x22c8/0x2930
[Mon May 22 10:40:56 2017]  [<ffffffff8133d6e0>] ? ddebug_proc_write+0xf0/0xf0
[Mon May 22 10:40:56 2017]  [<ffffffff81100bb5>] SyS_init_module+0xc5/0x110
[Mon May 22 10:40:56 2017]  [<ffffffff81696a09>] system_call_fastpath+0x16/0x1b
[Mon May 22 10:40:56 2017] Mem-Info:
[Mon May 22 10:40:56 2017] active_anon:537547 inactive_anon:6033 isolated_anon:0
 active_file:329546 inactive_file:332986 isolated_file:0
 unevictable:6524 dirty:355 writeback:0 unstable:0
 slab_reclaimable:333111 slab_unreclaimable:119891
 mapped:9475 shmem:11538 pagetables:4152 bounce:0
 free:67311 free_pcp:0 free_cma:0
[Mon May 22 10:40:56 2017] Node 0 DMA free:15908kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Mon May 22 10:40:56 2017] lowmem_reserve[]: 0 2816 7813 7813
[Mon May 22 10:40:56 2017] Node 0 DMA32 free:89160kB min:24308kB low:30384kB high:36460kB active_anon:1059552kB inactive_anon:3796kB active_file:413284kB inactive_file:409652kB unevictable:8908kB isolated(anon):0kB isolated(file):0kB present:3129336kB managed:2884596kB mlocked:8908kB dirty:312kB writeback:0kB mapped:9644kB shmem:7988kB slab_reclaimable:639436kB slab_unreclaimable:103088kB kernel_stack:10096kB pagetables:4284kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Mon May 22 10:40:56 2017] lowmem_reserve[]: 0 0 4996 4996
[Mon May 22 10:40:56 2017] Node 0 Normal free:164176kB min:43136kB low:53920kB high:64704kB active_anon:1090636kB inactive_anon:20336kB active_file:904900kB inactive_file:922292kB unevictable:17188kB isolated(anon):0kB isolated(file):0kB present:5251072kB managed:5115920kB mlocked:17188kB dirty:1108kB writeback:0kB mapped:28256kB shmem:38164kB slab_reclaimable:693008kB slab_unreclaimable:376476kB kernel_stack:19152kB pagetables:12324kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Mon May 22 10:40:56 2017] lowmem_reserve[]: 0 0 0 0
[Mon May 22 10:40:56 2017] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Mon May 22 10:40:56 2017] Node 0 DMA32: 9232*4kB (UEM) 3811*8kB (UEM) 1341*16kB (UEM) 20*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 89512kB
[Mon May 22 10:40:56 2017] Node 0 Normal: 24334*4kB (UEM) 3413*8kB (UEM) 1857*16kB (UEM) 292*32kB (UEM) 16*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 164720kB
[Mon May 22 10:40:56 2017] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Mon May 22 10:40:56 2017] 676567 total pagecache pages
[Mon May 22 10:40:56 2017] 0 pages in swap cache
[Mon May 22 10:40:56 2017] Swap cache stats: add 0, delete 0, find 0/0
[Mon May 22 10:40:56 2017] Free swap  = 0kB
[Mon May 22 10:40:56 2017] Total swap = 0kB
[Mon May 22 10:40:56 2017] 2099100 pages RAM
[Mon May 22 10:40:56 2017] 0 pages HighMem/MovableOnly
[Mon May 22 10:40:56 2017] 94994 pages reserved
[Mon May 22 10:41:18 2017] stap_ca16476c9707735de6b0448a9e43e59_24277: systemtap: 3.0/0.166, base: ffffffffa0641000, memory: 139data/48text/37ctx/2058net/50alloc kb, probes: 1
[Mon May 22 10:44:43 2017] stap_283ca8e5669d13a9ef14e3bb2b601f89_24407: systemtap: 3.0/0.166, base: ffffffffa0608000, memory: 145data/68text/86ctx/2058net/50alloc kb, probes: 1
[Mon May 22 10:45:28 2017] stap_ca16476c9707735de6b0448a9e43e59_24448: systemtap: 3.0/0.166, base: ffffffffa063f000, memory: 139data/48text/37ctx/2058net/50alloc kb, probes: 1
[Mon May 22 11:02:00 2017] TCP: request_sock_TCP: Possible SYN flooding on port 49152. Sending cookies.  Check SNMP counters.

Version-Release number of selected component (if applicable):
===========
3.8.4-25

Steps to Reproduce:
1. Create 40+ volumes (all 1x3) on a brick-multiplexing setup.
2. Mount 10 of the volumes on 10 different clients.
3. Perform I/O for about a day.

Comment 2 Worker Ant 2017-05-29 10:41:36 UTC
REVIEW: https://review.gluster.org/17411 (protocol/server: make listen backlog value as configurable) posted (#1) for review on master by mohammed rafi  kc (rkavunga)

Comment 3 Worker Ant 2017-05-29 10:51:59 UTC
REVIEW: https://review.gluster.org/17412 (socket/reconfigure: reconfigure should be done on new dict) posted (#1) for review on master by mohammed rafi  kc (rkavunga)

Comment 4 Worker Ant 2017-05-29 16:51:54 UTC
COMMIT: https://review.gluster.org/17412 committed in master by Raghavendra G (rgowdapp) 
------
commit 2168729011a0e835a8bdbde9cae7aaddd78a414d
Author: Mohammed Rafi KC <rkavunga>
Date:   Mon May 29 16:19:40 2017 +0530

    socket/reconfigure: reconfigure should be done on new dict
    
    In socket reconfigure, reconfiguration is done with the old
    dict values; it should use the newly supplied dict values.
    
    Change-Id: Iac5ad4382fe630806af14c99bb7950a288756a87
    BUG: 1456405
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: https://review.gluster.org/17412
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>

Comment 5 Worker Ant 2017-05-30 12:11:06 UTC
REVIEW: https://review.gluster.org/17411 (protocol/server: make listen backlog value as configurable) posted (#2) for review on master by mohammed rafi  kc (rkavunga)

Comment 6 Worker Ant 2017-06-05 14:27:22 UTC
REVIEW: https://review.gluster.org/17411 (protocol/server: make listen backlog value as configurable) posted (#3) for review on master by mohammed rafi  kc (rkavunga)

Comment 7 Worker Ant 2017-06-06 07:14:05 UTC
REVIEW: https://review.gluster.org/17411 (protocol/server: make listen backlog value as configurable) posted (#4) for review on master by mohammed rafi  kc (rkavunga)

Comment 8 Worker Ant 2017-06-08 15:32:39 UTC
COMMIT: https://review.gluster.org/17411 committed in master by Jeff Darcy (jeff.us) 
------
commit 0a20e56d07de3e467e09da885a6b71cdc165de17
Author: Mohammed Rafi KC <rkavunga>
Date:   Mon May 29 16:00:24 2017 +0530

    protocol/server: make listen backlog value as configurable
    
    problem:
    
    When we call listen from protocol/server, we use a hard-coded
    value of 10 if one is not set manually. With multiplexing,
    especially when glusterd restarts, all clients may try to
    connect to the server at the same time, which can overflow the
    listen queue, and the kernel will complain about it.
    
    Solution:
    
    This patch introduces a volume set command to make the backlog
    value configurable. It also changes the default backlog from
    10 to 128. This change applies only to sockets listening from
    protocol/server.
    
    Example:
    
    gluster volume set <volname> transport.listen-backlog 1024
    
    Note: 1. The brick has to be restarted for this value to take
             effect.
          2. This change is not reflected in glusterd or other
             xlators that call listen. If you need it there, you
             have to add this option to the volfile.
    
    Change-Id: I0c5a2bbf28b5db612f9979e7560e05dd82b41477
    BUG: 1456405
    Signed-off-by: Mohammed Rafi KC <rkavunga>
    Reviewed-on: https://review.gluster.org/17411
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Raghavendra Talur <rtalur>
    Reviewed-by: Atin Mukherjee <amukherj>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: Jeff Darcy <jeff.us>

Comment 9 Shyamsundar 2017-09-05 17:32:34 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/

