Description of problem:
Using nftrace, one can mark more or less specific traffic to be traced and reported via the nft monitor command. The output prints *all* packet traces, without a way to filter by which particular trace rule enabled the tracing. Each event in a single trace is printed on a separate line, which makes the trace very difficult to read, especially when traces overlap and when more than one rule enables or disables tracing.

Scenario: there is traffic to two services coming from two hosts. I can set a trace for all incoming traffic, for traffic coming from one of the hosts, or for traffic towards one or both of the services. One instance of nft monitor is used to get some of the traffic, while another instance gets other parts of the traffic. This could help a lot with aggregating data from the same time window for immediate or later use.

The RFE is for the ability to filter traffic by a key or some part of the context the trace contains, such as: ip header, nft objects, tcp payload, packet marks or a specific nftrace mark key.

e.g.
nft monitor trace ip saddr ::1
nft monitor trace ip6 filter INPUT
nft monitor trace tcp dport 25
nft monitor trace mark mouse
nft monitor trace trace_key_str # see note below

* this would move specification of the filter to the rule itself rather than to the monitor, but would probably also require a netfilter update

Version-Release number of selected component (if applicable):
nftables-0.9.3-4.el8.x86_64

How reproducible:
always

Steps to Reproduce:
nft -f - <<EOF
table inet trace_t {
    chain ch {
        type filter hook input priority filter; policy accept;
        tcp dport 25 meta nftrace set 1
    }
}
EOF
nft monitor trace &
nc localhost 25

Actual results:
A verbose output of trace events linked by an id in the 3rd column. Without a parser or low traffic, this is rather difficult to use.

>example output:
trace id bb59767d inet trace_t inch packet: iif "lo" @ll,0,112 34525 ip6 saddr ::1 ip6 daddr ::1 ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 43505 ip6 nexthdr tcp ip6 length 40 tcp sport 46356 tcp dport 25 tcp flags == syn tcp window 43690
trace id bb59767d inet trace_t inch rule tcp dport 25 meta nftrace set 1 (verdict continue)
trace id bb59767d inet trace_t inch verdict continue
trace id bb59767d inet trace_t inch policy accept
trace id bb59767d ip6 filter INPUT verdict continue
trace id bb59767d ip6 filter INPUT policy accept
trace id bb59767d inet firewalld filter_INPUT packet: iif "lo" @ll,0,112 34525 ip6 saddr ::1 ip6 daddr ::1 ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 43505 ip6 nexthdr tcp ip6 length 40 tcp sport 46356 tcp dport 25 tcp flags == syn tcp window 43690
trace id bb59767d inet firewalld filter_INPUT rule iifname "lo" accept (verdict accept)
trace id bb59767d ip6 security INPUT verdict continue
trace id bb59767d ip6 security INPUT policy accept
trace id 820a27e1 inet trace_t inch packet: iif "lo" @ll,0,112 2048 ip saddr 127.0.0.1 ip daddr 127.0.0.1 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 46216 ip protocol tcp ip length 60 tcp sport 60948 tcp dport 25 tcp flags == syn tcp window 43690
trace id 820a27e1 inet trace_t inch rule tcp dport 25 meta nftrace set 1 (verdict continue)
trace id 820a27e1 inet trace_t inch verdict continue
trace id 820a27e1 inet trace_t inch policy accept
trace id 820a27e1 ip filter INPUT verdict continue
trace id 820a27e1 ip filter INPUT policy accept
trace id 820a27e1 inet firewalld filter_INPUT packet: iif "lo" @ll,0,112 2048 ip saddr 127.0.0.1 ip daddr 127.0.0.1 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 46216 ip protocol tcp ip length 60 tcp sport 60948 tcp dport 25 tcp flags == syn tcp window 43690
trace id 820a27e1 inet firewalld filter_INPUT rule iifname "lo" accept (verdict accept)
trace id 820a27e1 ip security INPUT verdict continue
trace id 820a27e1 ip security INPUT policy accept

Expected results:
Ability to filter traffic by a key or some part of the context the trace contains, such as: ip header, nft objects, tcp payload, packet marks or other.

Additional info:
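Until something like this exists in nft itself, the events can be grouped by trace ID outside of nft. Below is a minimal sketch in Python, assuming the line format shown in the example output ("trace id <hex> <event>"). The helper names group_traces and filter_traces are made up for illustration and are not part of nftables; the sample lines are copied from the output above.

```python
import re
from collections import OrderedDict

# Matches the "trace id <hex> " prefix seen in the example output.
TRACE_RE = re.compile(r"^trace id ([0-9a-f]+) (.*)$")


def group_traces(lines):
    """Group trace event lines by trace id, preserving arrival order."""
    traces = OrderedDict()
    for line in lines:
        m = TRACE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a trace event
        trace_id, event = m.groups()
        traces.setdefault(trace_id, []).append(event)
    return traces


def filter_traces(traces, needle):
    """Keep only traces in which at least one event contains `needle`."""
    return {
        tid: events
        for tid, events in traces.items()
        if any(needle in event for event in events)
    }


# A few lines copied from the example output in this report.
sample = [
    "trace id bb59767d inet trace_t inch rule tcp dport 25 meta nftrace set 1 (verdict continue)",
    "trace id bb59767d ip6 filter INPUT verdict continue",
    "trace id 820a27e1 inet trace_t inch rule tcp dport 25 meta nftrace set 1 (verdict continue)",
    "trace id 820a27e1 ip filter INPUT verdict continue",
]

grouped = group_traces(sample)
selected = filter_traces(grouped, "ip6 filter INPUT")
for tid, events in selected.items():
    for event in events:
        print(tid, event)
```

This only matches plain substrings of the printed events, so it is no replacement for the requested filtering; it cannot tell which rule enabled the trace.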
Hi Tomas,

I like the idea, but I guess filtering in nft client is hard to do as we would basically start duplicating kernel nftables VM in userspace. Instead, I would like to limit efforts to implementing support for a user-defined trace ID like so:

| meta nftrace set 1 meta nftrace_id set 0xdeadbeef

The idea is to have trace output later print 'trace id deadbeef ...'. The downside is obviously that you can't distinguish between two different packets anymore. Maybe an additional "user ID" would be better, which generates output like 'trace uid deadbeef id 12345abc ...'.

With this in place, you would just specify your filter rules in kernel ruleset, so instead of:

| nft monitor trace ip saddr ::1

you add a rule:

| ip saddr ::1 meta nftrace_id set 0x123

The reason why I prefer an ID over a "key string" is the stronger definition: A 32bit ID is fixed length and clear in values, strings are somewhat undefined.

Also, for starters (or while being at it) description of 'nft monitor' command in man page could use a review - 'nft monitor trace' e.g. isn't described at all. :(

Cheers, Phil
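To make the "user ID" variant concrete, here is a minimal sketch in Python of how monitor output could be split per user ID if it were printed as 'trace uid deadbeef id 12345abc ...'. That line format is only the proposal from this comment; nft does not print it, and the helper name events_for_uid and the sample lines are hypothetical.

```python
import re

# Hypothetical line format taken from the proposal above; nft does not print this.
UID_RE = re.compile(r"^trace uid ([0-9a-fA-F]+) id ([0-9a-fA-F]+) (.*)$")


def events_for_uid(lines, uid):
    """Return (packet_id, event) pairs whose user ID equals `uid` (an int)."""
    matched = []
    for line in lines:
        m = UID_RE.match(line.strip())
        # Compare numerically so that "123" and "00000123" are treated alike.
        if m and int(m.group(1), 16) == uid:
            matched.append((m.group(2), m.group(3)))
    return matched


# Made-up lines in the proposed format, for illustration only.
sample = [
    "trace uid deadbeef id 12345abc inet trace_t inch verdict continue",
    "trace uid 123 id 0badcafe inet trace_t inch verdict continue",
    "trace uid deadbeef id 12345abc inet trace_t inch policy accept",
]

matched = events_for_uid(sample, 0xDEADBEEF)
for packet_id, event in matched:
    print(packet_id, event)
```

Because the per-packet ID stays in the line, events of different packets that share a user ID can still be told apart.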
Hello Phil,

(In reply to Phil Sutter from comment #1)
> I like the idea, but I guess filtering in nft client is hard to do as we
> would basically start duplicating kernel nftables VM in userspace. Instead,
> I would like to limit efforts to implementing support for a user-defined
> trace ID like

That's pretty acceptable from a usability point of view, given the absence of event filtering.

> so:
>
> | meta nftrace set 1 meta nftrace_id set 0xdeadbeef
>
> The idea is to have trace output later print 'trace id deadbeef ...'. The
> downside is obviously that you can't distinguish between two different
> packets anymore.

Yes, that's why some way to only filter the output, without destroying inner references, should be used. The point is to ease the processing of the output (automated or manual).

> Maybe an additional "user ID" would be better, which generates output
> like 'trace uid deadbeef id 12345abc ...'.
>
> With this in place, you would just specify your filter rules in kernel
> ruleset, so instead of:
>
> | nft monitor trace ip saddr ::1
>
> you add a rule:
>
> | ip saddr ::1 meta nftrace_id set 0x123

Yes, as suggested in comment 0:
>> * this would move specification of the filter to the rule itself rather than to the monitor (...)

Actually, it would be pretty much the same as what one can do with a log statement. Also, the context for the packet is known from the given base chain.

> The reason why I prefer an ID over a "key string" is the stronger
> definition: A 32bit ID is fixed length and clear in values, strings are
> somewhat undefined.

Ack, that still fulfils the RFE and makes sense.

> Also, for starters (or while being at it) description of 'nft monitor'
> command in man page could use a review - 'nft monitor trace' e.g. isn't
> described at all. :(

I will create a rhbz for that separately. There would be more to enhance than just 'monitor', e.g. the 'describe' command, which does not really tell what kind of expression can be described.

Thanks, Tomas
Sadly, implementing this user-specified trace ID is non-trivial: for standard trace functionality, the kernel skbuff has a field 'nf_trace', which is merely a single bit indicating whether tracing is enabled or disabled. The trace ID is generated by hashing stable packet fields. So in order to introduce a user-defined ID, a new field has to be added to skbuff, which is always a problematic task and probably not feasible for "mere packet tracing". Maybe I could introduce an skb_ext for it; these seem well suited to that task.
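To illustrate why an ID derived this way needs no storage in the packet, here is a toy sketch in Python. It is not the kernel's algorithm: the hash function, the choice of fields and the name toy_trace_id are assumptions made for illustration. The point is only that hashing the same stable fields in every chain yields the same ID each time, whereas a user-chosen value would have to be carried along with the packet.

```python
import hashlib


def toy_trace_id(saddr, daddr, sport, dport):
    """Derive a 32-bit ID from packet fields that do not change between chains.

    Illustration only: the kernel's real input fields and hash differ.
    """
    data = f"{saddr}|{daddr}|{sport}|{dport}".encode()
    # digest_size=4 yields 4 bytes, printed as 8 hex digits like a trace id.
    return hashlib.blake2b(data, digest_size=4).hexdigest()


# Field values taken from the two packets in the example output of this report.
id_in_first_chain = toy_trace_id("::1", "::1", 46356, 25)
id_in_later_chain = toy_trace_id("::1", "::1", 46356, 25)
id_other_packet = toy_trace_id("127.0.0.1", "127.0.0.1", 60948, 25)

# The same packet hashes to the same ID wherever the ID is recomputed.
print(id_in_first_chain == id_in_later_chain)
print(id_in_first_chain, id_other_packet)
```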
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.