Bug 1846809 - kernel 5.6.16-200 nfq fails with suricata-4.1.6-1.fc31 and with snort-2.9.16-1.fc31
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-14 16:17 UTC by rce-dev
Modified: 2020-11-24 17:23 UTC
CC: 19 users

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-24 17:23:18 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
dmesg showing RIP: 0010:nf_conntrack_update (87.20 KB, text/plain)
2020-06-16 11:45 UTC, rce-dev
dmesg 5.6.18 kernel (87.66 KB, text/plain)
2020-06-18 12:21 UTC, rce-dev
messages 5.6.18 kernel (27.78 KB, text/plain)
2020-06-18 12:23 UTC, rce-dev
kernel 5.6.19 reporter-print (1) output (99.93 KB, text/plain)
2020-06-24 14:58 UTC, rce-dev

Description rce-dev 2020-06-14 16:17:35 UTC
User-Agent:       Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0
Build Identifier: 

Starting suricata fails with the log entry: 
[ERRCODE: SC_ERR_NFQ_CREATE_QUEUE(72)] - nfq_create_queue failed
14/6/2020 -- 09:06:14 - <Error> - [ERRCODE: SC_ERR_NFQ_THREAD_INIT(78)] - nfq thread failed to initialize

snort fails with:
FATAL ERROR:  Can't initialize DAQ nfq (-1) - nfq_daq_initialize: nf queue creation failed


Reproducible: Always

Steps to Reproduce:
1. systemctl start suricata (or systemctl start snortd)



NFQ functionality worked under 5.6.15-200.fc31.x86_64 but fails under kernel-5.6.16-200.fc31.x86_64

I run suricata on one system and snortd on two others.

suricata-4.1.6-1.fc31.x86_64 uses nftables-0.9.1-3.fc31.x86_64
example:
   chain input {
                type filter hook input priority filter; policy drop;
                iifname "lo" counter packets 22486 bytes 4101987 queue num 1-3 fanout
   .
   .
   .
   }

snort-2.9.16-1.fc31.x86_64 uses iptables-1.8.3-7.fc31.x86_64
example:
   iptables -A OUTPUT -s 127.0.0.1/32 -j NFQUEUE --queue-num 1
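For anyone triaging a similar failure: the kernel exposes the current NFQUEUE bindings in procfs, so you can check whether a userspace listener actually holds a queue before blaming the ruleset. A minimal check, assuming a kernel built with nfnetlink_queue support:

```shell
# Show active NFQUEUE bindings. Each line lists: queue number, peer
# portid, backlog length, copy mode, copy range, drop counters, id
# sequence. An empty file means no userspace program holds any queue.
cat /proc/net/netfilter/nfnetlink_queue 2>/dev/null \
  || echo "nfnetlink_queue not available on this kernel"
```

If suricata or snort is running correctly, one line per bound queue (1, 2, 3 in the configuration above) should appear here.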


I rate this Medium severity since I can stay on kernel 5.6.15-200 until this is fixed.

Comment 1 rce-dev 2020-06-16 11:45:16 UTC
Created attachment 1697602 [details]
dmesg showing RIP: 0010:nf_conntrack_update

nf_conntrack issues begin showing up at [  602.248777]

Suricata is running in inline IPS mode.  From suricata.log:

14/6/2020 -- 09:05:54 - <Info> - CPUs/cores online: 4
14/6/2020 -- 09:05:54 - <Info> - NFQ running in standard ACCEPT/DROP mode
14/6/2020 -- 09:05:54 - <Info> - fast output device (regular) initialized: fast.log
14/6/2020 -- 09:05:54 - <Info> - stats output device (regular) initialized: stats.log
14/6/2020 -- 09:05:54 - <Info> - drop output device (regular) initialized: drop.log
14/6/2020 -- 09:05:54 - <Info> - Running in live mode, activating unix socket
14/6/2020 -- 09:05:58 - <Info> - 1 rule files processed. 20254 rules successfully loaded, 0 rules failed
14/6/2020 -- 09:05:59 - <Info> - Threshold config parsed: 0 rule(s) found
14/6/2020 -- 09:05:59 - <Info> - 20257 signatures processed. 1085 are IP-only rules, 3970 are inspecting packet payload, 16139 inspect application layer, 103 are decoder event only
14/6/2020 -- 09:06:14 - <Info> - binding this thread 0 to queue '1'
14/6/2020 -- 09:06:14 - <Error> - [ERRCODE: SC_ERR_NFQ_CREATE_QUEUE(72)] - nfq_create_queue failed
14/6/2020 -- 09:06:14 - <Error> - [ERRCODE: SC_ERR_NFQ_THREAD_INIT(78)] - nfq thread failed to initialize

Comment 2 rce-dev 2020-06-18 12:21:07 UTC
Created attachment 1697955 [details]
dmesg 5.6.18 kernel

dmesg showing failure under kernel 5.6.18-200.fc31.x86_64

Comment 3 rce-dev 2020-06-18 12:23:20 UTC
Created attachment 1697956 [details]
messages 5.6.18 kernel

partial messages file from kernel-5.6.18-200.fc31.x86_64 showing failure

Comment 4 rce-dev 2020-06-18 12:26:31 UTC
The issue still exists in kernel-5.6.18-200.fc31.x86_64.

The failure occurs ~1-2 minutes after boot completes.

The messages file shows that an attempt to restart suricata after the failure also failed.

Suricata command line:
/sbin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid -v -D -q 1 -q 2 -q 3
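For context, the three repeated -q flags bind one suricata verdict thread per queue (cf. the "binding this thread 0 to queue '1'" log line above). On the iptables side, a single rule can spread flows across that same queue range with --queue-balance instead of three separate --queue-num rules. A hypothetical sketch, not taken from the reporter's actual ruleset:

```shell
# Hypothetical example: distribute flows across NFQUEUEs 1-3 so that
# each suricata thread started with -q 1 -q 2 -q 3 receives a share
# of the traffic. --queue-balance hashes per-flow, keeping a given
# connection pinned to one queue.
iptables -A OUTPUT -s 127.0.0.1/32 -j NFQUEUE --queue-balance 1:3
```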

Let me know if I can provide additional information.

Comment 5 rce-dev 2020-06-24 14:58:23 UTC
Created attachment 1698618 [details]
kernel 5.6.19 reporter-print (1) output

The bug is still present in 5.6.19.
The IPS is run with:
/sbin/suricata -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid -v -D -q 1 -q 2 -q 3

The IPS passes small packets (e.g. ICMP echo/echo-reply), but a kernel oops occurs under increased network activity such as opening a web page.

An oops appears to occur the first time the IPS uses each of NFQUEUEs 1-3. Once an oops has occurred, all IPS traffic is blocked, rendering the IPS useless.

Restarting the IPS fails to open the previously used queues:
<Error> - [ERRCODE: SC_ERR_NFQ_CREATE_QUEUE(72)] - nfq_create_queue failed

An IPS process can open previously unused queues (e.g. queue 4), but with the same ultimate result.


The most recently attached file is the third of three oops events corresponding to an attempt to open a web page. These events blocked all subsequent traffic through the IPS process.

Note that each oops references a very short-lived tainted process that I have been unable to identify even with `ps -e` run at a `sleep 1e-03` interval.
1st oops:
CPU: 1 PID: 14850 Comm: TX#01 Not tainted 5.6.19-200.fc31.x86_64 #1
[  109.483740] CPU: 1 PID: 14850 Comm: TX#01 Not tainted 5.6.19-200.fc31.x86_64 #1
[  110.064602] CPU: 3 PID: 14851 Comm: TX#02 Tainted: G      D           5.6.19-200.fc31.x86_64 #1
2nd oops:
kernel_tainted_long: D - Kernel has oopsed before
 3 PID: 14851 Comm: TX#02 Tainted: G      D           5.6.19-200.fc31.x86_64 #1
[  109.483740] CPU: 1 PID: 14850 Comm: TX#01 Not tainted 5.6.19-200.fc31.x86_64 #1
[  110.064602] CPU: 3 PID: 14851 Comm: TX#02 Tainted: G      D            5.6.19-200.fc31.x86_64 #1
3rd oops:
kernel_tainted_long: D - Kernel has oopsed before
/var/tmp/ProblemReport-C-5.6.19-200.fc31.txt::CPU: 3 PID: 14849 Comm: TX#00 Tainted: G      D           5.6.19-200.fc31.x86_64 #1
[  109.483740] CPU: 1 PID: 14850 Comm: TX#01 Not tainted 5.6.19-200.fc31.x86_64 #1
[  110.064602] CPU: 3 PID: 14851 Comm: TX#02 Tainted: G      D           5.6.19-200.fc31.x86_64 #1
[  124.498896] CPU: 3 PID: 14849 Comm: TX#00 Tainted: G      D           5.6.19-200.fc31.x86_64 #1

Comment 6 rce-dev 2020-07-06 14:51:34 UTC
Fixed in 5.7.7-100.fc31.x86_64.

Thanks to Pablo Neira Ayuso
https://bugzilla.netfilter.org/show_bug.cgi?id=1436#c2

Thanks, All

Comment 7 Ben Cotton 2020-11-03 17:10:26 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Ben Cotton 2020-11-24 17:23:18 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

