Bug 2074414

Summary: avc denied for comm="teamd" scontext=system_u:system_r:NetworkManager_t:s0
Product: Red Hat Enterprise Linux 9
Reporter: Hangbin Liu <haliu>
Component: selinux-policy
Assignee: Zdenek Pytela <zpytela>
Status: CLOSED ERRATA
QA Contact: Milos Malik <mmalik>
Severity: high
Priority: high
Version: 9.1
CC: bgalvani, lvrabec, mmalik, nknazeko, ssekidde
Target Milestone: rc
Keywords: Triaged
Target Release: 9.1
Hardware: Unspecified
OS: Linux
Fixed In Version: selinux-policy-34.1.40-1.el9
Doc Type: Bug Fix
Doc Text:
Cause: A policy rule allowing NetworkManager_t to signal unconfined_t was missing. During kernel selftests, teamd, running in the NetworkManager_t domain, checks whether a teamd instance was started externally. When teamd is started from the command line, it runs in the caller's domain, unconfined_t.
Consequence: AVC denials were logged for the teamd command when running kernel selftests.
Fix: NetworkManager_t is now allowed to send generic signals to the unconfined domain.
Result: No AVC denials.
Last Closed: 2022-11-15 11:13:17 UTC
Type: Bug

Description Hangbin Liu 2022-04-12 08:04:25 UTC
Description of problem:
Got an AVC denial for the teamd command when running kernel selftests. Here is the log from the Beaker job https://beaker.engineering.redhat.com/recipes/11757144#task142445370

SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33
selinux-policy-34.1.29-1.el9_0.noarch
----
time->Fri Apr  8 11:01:16 2022
type=PROCTITLE msg=audit(1649430076.436:6807): proctitle=2F7573722F62696E2F7465616D64002D6B002D74006C6167
type=SYSCALL msg=audit(1649430076.436:6807): arch=c000003e syscall=62 success=no exit=-13 a0=1ad7f5 a1=f a2=0 a3=7fbf0f5d6ac0 items=0 ppid=13470 pid=1759420 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="teamd" exe="/usr/bin/teamd" subj=system_u:system_r:NetworkManager_t:s0 key=(null)
type=AVC msg=audit(1649430076.436:6807): avc:  denied  { signal } for  pid=1759420 comm="teamd" scontext=system_u:system_r:NetworkManager_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=0
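
The raw records above can be decoded by hand to see what was actually attempted. The following is a minimal bash sketch, not part of any audit tool; the helper names `decode_proctitle` and `hex_to_dec` are made up for illustration. It assumes the PROCTITLE value is hex-encoded with NUL bytes separating the arguments, and that the syscall arguments a0 and a1 are hexadecimal.

```bash
#!/bin/bash
# Hypothetical helpers for decoding raw audit fields (illustration only).

# Decode a hex-encoded PROCTITLE value; NUL argument separators become spaces.
decode_proctitle() {
    local hex=$1 out="" byte i
    for ((i = 0; i < ${#hex}; i += 2)); do
        byte=${hex:i:2}
        if [[ $byte == 00 ]]; then
            out+=" "
        else
            out+=$(printf '%b' "\\x$byte")
        fi
    done
    printf '%s\n' "$out"
}

# Convert a hexadecimal syscall argument (a0, a1, ...) to decimal.
hex_to_dec() {
    printf '%d\n' "$((16#$1))"
}

decode_proctitle 2F7573722F62696E2F7465616D64002D6B002D74006C6167
hex_to_dec 1ad7f5   # a0: the PID that kill() was aimed at
hex_to_dec f        # a1: the signal number
```

For this record, the command line decodes to `/usr/bin/teamd -k -t lag`, a0 to PID 1759221, and a1 to signal 15 (SIGTERM). The interpreted records quoted in comment 8 show the same kind of fields already decoded.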

Comment 1 Zdenek Pytela 2022-04-12 17:32:53 UTC
Hi,

Do you happen to know which command triggers this denial?
Did it start to happen with some package (kernel, teamd) update?

Comment 2 Hangbin Liu 2022-04-13 03:41:53 UTC
(In reply to Zdenek Pytela from comment #1)
> Hi,
> 
> Do you happen to know which command triggers this denial?

It was triggered by a kselftest. I have optimized the reproducer as follows:
```
#!/bin/bash

for i in `seq 10`;do
        > /var/log/audit/audit.log
        systemctl restart NetworkManager
        ip link add type veth
        ip link set veth1 up

        teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
        ip link set veth0 master team0
        ip link set team0 up
        sleep 2
        teamd -t team0 -k
        modprobe -r veth
        aureport -a | grep teamd && break
done

```

> Did it start to happen with some package (kernel, teamd) update?

Not sure; I didn't have AVC checking enabled before. I need to try this reproducer on older releases first.

Comment 3 Zdenek Pytela 2022-04-13 07:52:54 UTC
Thank you, I still don't understand this part though:

(In reply to Hangbin Liu from comment #2)
> (In reply to Zdenek Pytela from comment #1)
> > Hi,
> > 
> > Do you happen to know which command triggers this denial?
> 
> It was triggered by a kselftest. I have optimized the reproducer as follows:
> ```
> #!/bin/bash
> 
> for i in `seq 10`;do
>         > /var/log/audit/audit.log
>         systemctl restart NetworkManager
>         ip link add type veth
>         ip link set veth1 up
> 
>         teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
The teamd instance is started; it runs in the caller's domain, unconfined_t in this case:

# ps -eo pid,ppid,command,context | grep -e CONTEXT -e teamd
    PID    PPID COMMAND                     CONTEXT
  94401       1 teamd -t team0 -d -c {"runn unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

>         ip link set veth0 master team0
>         ip link set team0 up
>         sleep 2
>         teamd -t team0 -k
teamd is executed to kill the running instance, but the kill() syscall comes from a process in the NetworkManager_t domain. Which process actually sends the signal? The signal seems to be SIGTERM.


>         modprobe -r veth
>         aureport -a | grep teamd && break
> done
> 
> ```

Unfortunately, I was unable to reproduce the problem with this scenario, even when the loop was run 100 times. We would need a reliable one for the purpose of testing.

Comment 4 Hangbin Liu 2022-04-13 08:15:11 UTC
(In reply to Zdenek Pytela from comment #3)
> 
> Unfortunately, I was unable to reproduce the problem with this scenario,
> even when the loop was run 100 times. We would need a reliable one for the
> purpose of testing.

Maybe it's easier to reproduce on a VM? I used `1minutetip rhel8`[1] to get an instance on PSI and
could reproduce this issue easily.



[1] Homepage: http://wiki.test.redhat.com/BaseOs/Projects/1minuteTIP

Comment 5 Zdenek Pytela 2022-04-13 08:24:52 UTC
(In reply to Hangbin Liu from comment #4)
> (In reply to Zdenek Pytela from comment #3)
> > 
> > Unfortunately, I was unable to reproduce the problem with this scenario,
> > even when the loop was run 100 times. We would need a reliable one for the
> > purpose of testing.
> 
> Maybe it's easier to reproduce on a VM? I used `1minutetip rhel8`[1] to get an
> instance on PSI and could reproduce this issue easily.

I used a rhel9 1mt VM with this script. It has now run about 300 times with no AVC.
On a rhel8 system it reproduced immediately.
Is there any substantial difference that would explain why it did not trigger an AVC?

Still, I'd like to understand the interprocess communication.

Comment 6 Hangbin Liu 2022-04-13 09:12:07 UTC
(In reply to Zdenek Pytela from comment #5)
> 
> I used a rhel9 1mt VM with this script. It has now run about 300 times with no AVC.
> On a rhel8 system it reproduced immediately.
> Is there any substantial difference that would explain why it did not trigger an AVC?

I also could not reproduce it on RHEL9 by running the reproducer manually.
But my test jobs could trigger this issue every time.

https://beaker.engineering.redhat.com/recipes/11757144#task142445370
https://beaker.engineering.redhat.com/recipes/11718213#task142085471

> 
> Still, I'd like to understand the interprocess communication.

I don't know either. teamd only calls libdaemon to kill the process.

```teamd/teamd.c:main()
        case DAEMON_CMD_KILL:
                if (daemon_pid_file_is_running() > 0) {
                        err = daemon_pid_file_kill_wait(SIGTERM, 30);
                        if (err)
                                teamd_log_warn("Failed to kill daemon: %s",
                                               strerror(errno));
                        else
                                ret = TEAMD_EXIT_SUCCESS;
                } else {
                        teamd_log_warn("Daemon not running");
                }
                break;

```
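
For readers unfamiliar with the "kill via PID file" pattern used in the snippet above, here is a rough bash sketch of the idea. It is not libdaemon's implementation: `pidfile_kill_wait` is a made-up name, the PID file path is whatever the caller passes in, and the one-second polling interval is an assumption. The point is that the kill fails right away when the signal is refused, which is what the denied kill() (EACCES) in the audit log corresponds to.

```bash
#!/bin/bash
# Sketch only, NOT libdaemon's code: send SIGTERM to the PID stored in a
# PID file, then wait up to <timeout> seconds for that process to exit.
pidfile_kill_wait() {
    local pidfile=$1 timeout=$2 pid waited=0

    # No readable PID file means there is nothing to kill.
    pid=$(cat "$pidfile" 2>/dev/null) || return 1

    # If the kernel (for example SELinux) refuses the signal, stop here.
    kill -TERM "$pid" 2>/dev/null || return 1

    # Poll until the process is gone or the timeout expires.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `pidfile_kill_wait "$pidfile" 30` returns non-zero both when the signal is refused and when the process outlives the timeout, so a caller that only checks the return code cannot tell the two cases apart without looking at the audit log.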

Hi Beniamino, do you have any idea why NM would kill the same process that teamd is trying to kill?

Comment 7 Beniamino Galvani 2022-04-13 09:53:48 UTC
Using your script I got:

  [1649842400.6525] device[5509e18e5a0c14b1] (team0): sys-iface-state: external -> removed
  [1649842400.6526] device[5509e18e5a0c14b1] (team0): unrealize (ifindex 493)
  [1649842400.6526] device[5509e18e5a0c14b1] (team0): ifindex: set ifindex 0 (old-l3cfg: 3a09b5bf450d7785)
  [1649842400.6529] device (team0): state change: activated -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
  [1649842400.6530] device[5509e18e5a0c14b1] (team0): deactivating device (reason 'unmanaged') [3]
  [1649842400.6530] device[5509e18e5a0c14b1] (team0): running: /usr/bin/teamd -k -t team0

NM is tracking the interface as "external", which means that NM tries
to never touch it. In nm-device-team.c it does:

  static void
  deactivate(NMDevice *device)
  {
      NMDeviceTeam        *self = NM_DEVICE_TEAM(device);
      NMDeviceTeamPrivate *priv = NM_DEVICE_TEAM_GET_PRIVATE(self);

      if (nm_device_sys_iface_state_is_external(device))
          return;

      [...]

      teamd_kill(self, NULL, NULL);   // <---- this calls "teamd -k"
  }

However, the log shows that the sys-iface-state has already
transitioned from "external" to "removed", so the check on
*_is_external() doesn't work as expected. The fact that NM tries
to kill a teamd started externally, even if there is no NM
connection for it, seems to be a bug in NM.

--

But note that in general, if NM wants to activate a connection on team0
and already finds a teamd instance started externally for team0, it
will try to kill it via "teamd -k". I think this is expected, and probably
it will give a similar AVC denial. Something like:

  teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
  nmcli connection add type team ifname team0 ip4 172.25.1.1/24 con-name team0+
  nmcli connection up team0+

Comment 8 Zdenek Pytela 2022-04-13 16:27:18 UTC
(In reply to Beniamino Galvani from comment #7)
>   teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
>   nmcli connection add type team ifname team0 ip4 172.25.1.1/24 con-name
> team0+
>   nmcli connection up team0+
This one reproduced on RHEL 9, too. There are 5 attempts audited:

----
type=PROCTITLE msg=audit(04/13/22 12:25:33.714:364) : proctitle=/usr/bin/teamd -k -t team0
type=SYSCALL msg=audit(04/13/22 12:25:33.714:364) : arch=x86_64 syscall=kill success=no exit=EACCES(Permission denied) a0=0x17bd8 a1=SIGTERM a2=0x7ffee23dc640 a3=0x7fb433e44480 items=0 ppid=85272 pid=97273 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=teamd exe=/usr/bin/teamd subj=system_u:system_r:NetworkManager_t:s0 key=(null)
type=AVC msg=audit(04/13/22 12:25:33.714:364) : avc:  denied  { signal } for  pid=97273 comm=teamd scontext=system_u:system_r:NetworkManager_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=0

Thank you for the explanation.
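
The AVC record above names everything a matching type-enforcement rule needs: source type, target type, class, and permission. As a rough illustration, the bash sketch below extracts those fields and prints the raw allow rule they imply. `avc_to_allow` is a hypothetical helper, a toy stand-in for what audit2allow does. This report does not show the actual policy change, so the shipped fix may be written differently (for example through a policy interface rather than a bare allow rule).

```bash
#!/bin/bash
# Hypothetical helper: turn one AVC denial record into the allow rule it
# implies. Illustration only; use audit2allow for real policy work.
avc_to_allow() {
    local line=$1 perms src tgt class
    perms=$(sed -n 's/.*{ \(.*\) }.*/\1/p' <<<"$line")
    src=$(sed -n 's/.*scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p' <<<"$line")
    tgt=$(sed -n 's/.*tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p' <<<"$line")
    class=$(sed -n 's/.*tclass=\([^ ]*\).*/\1/p' <<<"$line")

    # Bail out if any field is missing (the line is not an AVC record).
    [[ -n $perms && -n $src && -n $tgt && -n $class ]] || return 1

    # Several permissions need braces in policy syntax.
    [[ $perms == *" "* ]] && perms="{ $perms }"
    printf 'allow %s %s:%s %s;\n' "$src" "$tgt" "$class" "$perms"
}

avc_to_allow 'type=AVC msg=audit(04/13/22 12:25:33.714:364) : avc:  denied  { signal } for  pid=97273 comm=teamd scontext=system_u:system_r:NetworkManager_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process permissive=0'
```

For the record above this prints `allow NetworkManager_t unconfined_t:process signal;`, which matches the Doc Text description of letting NetworkManager_t send generic signals to the unconfined domain.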

Comment 21 errata-xmlrpc 2022-11-15 11:13:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (selinux-policy bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8283