Bug 1035748

Summary: seccomp_rule_add: rules with action equal to default action are ignored
Product: Red Hat Enterprise Linux 7
Reporter: Jiri Jaburek <jjaburek>
Component: libseccomp
Assignee: Paul Moore <pmoore>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 7.0
CC: juzhang, michen, xuhan, yunzheng
Target Milestone: rc
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2013-12-02 15:21:56 UTC
Type: Bug

Description Jiri Jaburek 2013-11-28 11:29:38 UTC

Description of problem:

I'm not sure if libseccomp guarantees deterministic rule ordering; if it doesn't, then this bug may not be valid.

libseccomp, unlike e.g. xtables, seems to arrange rules in inverted order. For example,

    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                     SCMP_A0(SCMP_CMP_EQ, 1));
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                     SCMP_A0(SCMP_CMP_EQ, 2));

seems to produce

  if ($syscall == 1)
    if ($a0.hi32 == 0)
      if ($a0.lo32 == 2)
        action ALLOW;
      if ($a0.lo32 == 1)
        action ALLOW;

- this itself is not a problem, just a design decision (I presume).

Keeping the inverted ordering in mind, I could theoretically create an "allowed range" of file descriptors for the write syscall without having to add them one by one. Let's allow (again) fds 1-2 and use TRAP instead of KILL for the second rule (the two seem to behave the same if I don't install a custom SIGSYS handler):

    ctx = seccomp_init(SCMP_ACT_KILL);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                     SCMP_A0(SCMP_CMP_GT, 0));
    seccomp_rule_add(ctx, SCMP_ACT_TRAP, SCMP_SYS(write), 1,
                     SCMP_A0(SCMP_CMP_GT, 2));

which produces

  # filter for syscall "write" (1) [priority: 65532]
  if ($syscall == 1)
    if ($a0.hi32 >= 0)
      if ($a0.lo32 > 2)
        action TRAP;
      if ($a0.lo32 > 0)
        action ALLOW;
  # default action
  action KILL;

- I'm quite unsure how libseccomp uses BPF registers; ($a0.hi32 >= 0) where ($a0.hi32 == 0) was earlier doesn't seem quite right, but is understandable (e.g. the parser could have looked at just one rule (fd > 0) and adjusted the hi32 condition based on that). I wonder if someone could exploit that on architectures with 64-bit integers. Otherwise, the ruleset looks correct and correctly blocks fds <= 0 and >= 3.
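For context on the hi32/lo32 split in the listing above: seccomp filters are classic BPF programs that load 32-bit words, so a 64-bit syscall argument has to be compared as two halves. The following is a minimal sketch of an unsigned 64-bit "greater than" built from such halves. It is a generic illustration with hypothetical names (`arg_halves`, `split_arg`, `gt64_from_halves`), not the code libseccomp emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the two 32-bit halves of one syscall argument. */
struct arg_halves {
    uint32_t hi32;
    uint32_t lo32;
};

/* Split a 64-bit argument into its high and low 32-bit words. */
static struct arg_halves split_arg(uint64_t arg)
{
    struct arg_halves h = { (uint32_t)(arg >> 32), (uint32_t)arg };
    return h;
}

/* Unsigned 64-bit "arg > limit" using only 32-bit comparisons. */
static bool gt64_from_halves(uint64_t arg, uint64_t limit)
{
    struct arg_halves a = split_arg(arg);
    struct arg_halves l = split_arg(limit);

    if (a.hi32 != l.hi32)
        return a.hi32 > l.hi32;   /* high words differ: they decide */
    return a.lo32 > l.lo32;       /* high words equal: low words decide */
}
```

With this comparison, a value such as 0x100000001 counts as greater than 2 because its high word is non-zero, which is the case a check on lo32 alone would misjudge.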

However, let's see what happens when we change TRAP to KILL in the second rule and leave the default action at KILL:

  # filter for syscall "write" (1) [priority: 65533]
  if ($syscall == 1)
    if ($a0.hi32 >= 0)
      if ($a0.lo32 > 0)
        action ALLOW;
  # default action
  action KILL;

- the second rule (lo32 > 2) is missing and write(fd, ...) now succeeds for fd >= 3. If we change the default action to TRAP, the ruleset becomes

  # filter for syscall "write" (1) [priority: 65532]
  if ($syscall == 1)
    if ($a0.hi32 >= 0)
      if ($a0.lo32 > 2)
        action KILL;
      if ($a0.lo32 > 0)
        action ALLOW;
  # default action
  action TRAP;

and the range works correctly again.
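The three listings above can be compared with a small model. This is only a sketch: it treats the PFC output as a first-match list over the low 32 bits of the fd, exactly as printed, and it does not use libseccomp. The `rule` type, the `evaluate` function, and the array names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum action { ALLOW, TRAP, KILL };

/* One "if ($a0.lo32 > bound) action X;" line from a PFC listing. */
struct rule {
    uint32_t gt_bound;
    enum action act;
};

/* Return the action of the first matching rule, else the default action. */
static enum action evaluate(const struct rule *rules, size_t n,
                            enum action default_action, uint32_t fd)
{
    for (size_t i = 0; i < n; i++) {
        if (fd > rules[i].gt_bound)
            return rules[i].act;
    }
    return default_action;
}

/* Default KILL, second rule TRAP: both rules appear in the listing. */
static const struct rule trap_rules[] = { { 2, TRAP }, { 0, ALLOW } };

/* Default KILL, second rule KILL: only the ALLOW rule appears. */
static const struct rule kill_rules[] = { { 0, ALLOW } };

/* Default TRAP, second rule KILL: both rules appear again. */
static const struct rule swapped_rules[] = { { 2, KILL }, { 0, ALLOW } };
```

Evaluating fd 3 gives TRAP for the first filter, ALLOW for the second, and KILL for the third, while fds 1 and 2 are allowed by all three. Only the second filter lets fds >= 3 through.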

This led me to believe that there's something going on with rules which have action == the default action.


Version-Release number of selected component (if applicable):
libseccomp-2.1.1-0.el7.x86_64
kernel-3.10.0-57.el7.x86_64
gcc-4.8.2-3.el7.x86_64

How reproducible:
always

Actual results:
rules with action equal to the default action seem to be ignored

Expected results:
rules with action equal to the default action are added to the ruleset

Additional info:

Comment 2 Paul Moore 2013-12-02 15:21:56 UTC

(In reply to Jiri Jaburek from comment #0)
> Description of problem:
> 
> I'm not sure if libseccomp guarantees deterministic rule ordering; if it
> doesn't, then this bug may not be valid.

No, libseccomp does not guarantee rule ordering; it focuses on generating the smallest possible filter with a preference for the smallest/quickest rules at the front of the filter.  If ordering is important, you can provide priority hints via the seccomp_syscall_priority() API which the library will use when generating the filter.

> I'm quite unsure how libseccomp uses BPF registers; ($a0.hi32 >= 0) where
> ($a0.hi32 == 0) was earlier doesn't seem quite right, but is understandable
> (e.g. the parser could have looked at just one rule (fd > 0) and adjusted the
> hi32 condition based on that). I wonder if someone could exploit that on
> architectures with 64-bit integers. Otherwise, the ruleset looks correct and
> correctly blocks fds <= 0 and >= 3.

You are looking at the PFC output and not the actual disassembled BPF output.  The goal of the PFC output is to be a middle ground between the rule-based API and the often obfuscated BPF output, with a definite focus on human readability.

If you are interested, you can grab the libseccomp sources, and in the tools/ directory you'll find a simple BPF disassembler which you can use to peek at the generated BPF. 

> ... This led me to believe that there's something going on with rules which 
> have action == the default action.

If you step back for a minute, think about the problem again for a minute: what is the point of a rule where the desired action is the same as the default?  Nothing, it serves no purpose.  If a rule is added where the action is the same as the default action we actually fail and return an error to the caller (I'm guessing you weren't checking return codes).
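A minimal sketch of the return-code check mentioned above, assuming the default-KILL filter and the two rules from the report. It only builds the filter in memory and never loads it, so the process itself is not restricted. The exact error value is not assumed here, only that a failure is reported as a negative number.

```c
#include <stdio.h>
#include <string.h>
#include <seccomp.h>

int main(void)
{
    /* Default action KILL, as in the report. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
    if (ctx == NULL) {
        fprintf(stderr, "seccomp_init failed\n");
        return 1;
    }

    /* First rule: a different action than the default. */
    int rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                              SCMP_A0(SCMP_CMP_GT, 0));
    if (rc < 0)
        fprintf(stderr, "ALLOW rule rejected: %s\n", strerror(-rc));

    /* Second rule: same action as the default, so a failure is expected. */
    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(write), 1,
                          SCMP_A0(SCMP_CMP_GT, 2));
    if (rc < 0)
        fprintf(stderr, "KILL rule rejected: %s\n", strerror(-rc));

    seccomp_release(ctx);

    /* Exit status 2 signals that the second rule was not added. */
    return rc < 0 ? 2 : 0;
}
```

The program needs the libseccomp headers and must be linked with `-lseccomp`. Checking `rc` after every seccomp_rule_add() call makes a rejected rule visible right away instead of leaving it to be discovered in the PFC output.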

Comment 3 Jiri Jaburek 2013-12-02 17:10:15 UTC

(In reply to Paul Moore from comment #2)
> (In reply to Jiri Jaburek from comment #0)
> > ... This led me to believe that there's something going on with rules which 
> > have action == the default action.
> 
> If you step back for a minute, think about the problem again for a minute:
> what is the point of a rule where the desired action is the same as the
> default?  Nothing, it serves no purpose.  If a rule is added where the
> action is the same as the default action we actually fail and return an
> error to the caller (I'm guessing you weren't checking return codes).

In a system with orderless rules, such a rule would indeed serve no purpose. However, since BPF respects rule order and supports conditional jumps, one can create a scenario where a rule with an action equal to the default action does matter, as I did.

An analogy - iptables:

-P INPUT DROP
-A INPUT -p tcp -m tcp --dport 22 -j DROP
-A INPUT -m conntrack --ctstate NEW -j ACCEPT
...

-- you can't ignore the --dport 22 rule just because its target is equal to the chain policy: if you ignore it, the second rule matches and lets the packet in.

It wouldn't matter as much here if libseccomp didn't care about rule ordering, but - from my observations - it does provide deterministic rule ordering via the priority value.

(In reply to Paul Moore from comment #2)
> No, libseccomp does not guarantee rule ordering; it focuses on generating
> the smallest possible filter with a preference for the smallest/quickest
> rules at the front of the filter.  If ordering is important, you can provide
> priority hints via the seccomp_syscall_priority() API which the library will
> use when generating the filter.

Though as long as priority values are thought of only as "hints", the logic isn't broken and this bug should probably be an RFE.

Comment 4 Paul Moore 2013-12-02 19:37:32 UTC

(In reply to Jiri Jaburek from comment #3)
> (In reply to Paul Moore from comment #2)
> > (In reply to Jiri Jaburek from comment #0)
> > > ... This led me to believe that there's something going on with rules 
> > > which have action == the default action.
> > 
> > If you step back for a minute, think about the problem again for a minute:
> > what is the point of a rule where the desired action is the same as the
> > default?  Nothing, it serves no purpose.  If a rule is added where the
> > action is the same as the default action we actually fail and return an
> > error to the caller (I'm guessing you weren't checking return codes).
> 
> In a system with orderless rules, such rule would indeed serve no purpose.
> However since BPF respects rule order and supports conditional jumps ...

Yes, BPF is like any other state machine, but libseccomp is not.  The libseccomp API does not support rule ordering or make any claims about rule ordering.  I do not see this changing at any point in the near future.

> (In reply to Paul Moore from comment #2)
> > No, libseccomp does not guarantee rule ordering; it focuses on generating
> > the smallest possible filter with a preference for the smallest/quickest
> > rules at the front of the filter.  If ordering is important, you can provide
> > priority hints via the seccomp_syscall_priority() API which the library will
> > use when generating the filter.
> 
> Though as long as priority values are thought of only as "hints", the logic
> isn't broken and this bug should probably be an RFE.

They are simply hints, not a guarantee of a rule ordering.