Bug 2074402 - --device=/dev/kvm denies device access without --privileged
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: podman
Version: 36
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Matthew Heon
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-12 07:42 UTC by Martin Pitt
Modified: 2022-04-12 15:08 UTC
CC List: 12 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-04-12 15:08:08 UTC
Type: Bug
Embargoed:



Description Martin Pitt 2022-04-12 07:42:32 UTC
Description of problem: A recent Fedora CoreOS update broke podman's --device forwarding option.


Version-Release number of selected component (if applicable):

Fedora CoreOS 36.20220411.10.0 x86_64

podman-4.0.2-1.fc36.x86_64
kernel-5.17.2-300.fc36.x86_64

How reproducible: Always


Steps to Reproduce:

sudo podman run -it --rm --device=/dev/kvm quay.io/libpod/busybox head /dev/kvm

Actual results:

Fails with "head: /dev/kvm: Operation not permitted"

Expected results: Opening the device should succeed, and the subsequent read should then fail with "head: /dev/kvm: Input/output error". Of course `head` is not an appropriate tool to really drive KVM, but it is the simplest and fastest way to check access.
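
For CI use, the two outcomes above can be told apart mechanically. This is only a minimal sketch: the function name `classify_kvm_probe` is hypothetical, and it assumes the error texts are exactly the ones quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: classify the error output of `head /dev/kvm`.
# Assumption: the message texts match the ones quoted in this report.
classify_kvm_probe() {
    case "$1" in
        # The device could not be opened at all (the failure reported here).
        *"Operation not permitted"*) echo "denied" ;;
        # The device was opened; the read error is the expected outcome.
        *"Input/output error"*)      echo "accessible" ;;
        # Anything else needs a human to look at it.
        *)                           echo "unknown" ;;
    esac
}
```

It could be fed the captured output of the reproducer, for example `classify_kvm_probe "$(sudo podman run --rm --device=/dev/kvm quay.io/libpod/busybox head /dev/kvm 2>&1)"`. The `-it` flags are left out there so that the output can be captured.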

This used to work for a long time (our CI depends on it), but regressed recently.

It does work with --privileged. It does *not* work with --security-opt=seccomp=unconfined. Is there anything else "in between" that I could test?


Additional info:

$ podman info
host:
  arch: amd64
  buildahVersion: 1.24.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 96
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: cockpit-aws-tasks
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.17.2-300.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 186968645632
  memTotal: 202416189440
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.4-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.4
      commit: 6521fcc5806f20f6187eb933f9f45130c86da230
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 43m 31.09s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.0.2
  Built: 1646319369
  BuiltTime: Thu Mar  3 14:56:09 2022
  GitCommit: ""
  GoVersion: go1.18beta2
  OsArch: linux/amd64
  Version: 4.0.2

Comment 1 Martin Pitt 2022-04-12 07:52:18 UTC
Note: This does not depend on the busybox image (it's just the fastest to download). I also tried --cap-add=sys_admin,sys_rawio,sys_resource but that also does not help.

Comment 2 Daniel Walsh 2022-04-12 10:06:28 UTC
I am not seeing this failure.

$ sudo podman run -it --rm --device=/dev/kvm quay.io/libpod/busybox head /dev/kvm
Place your right index finger on the fingerprint reader
Trying to pull quay.io/libpod/busybox:latest...
Getting image source signatures
Copying blob 9758c28807f2 done  
Copying config f0b02e9d09 done  
Writing manifest to image destination
Storing signatures
head: /dev/kvm: Input/output error

Comment 3 Daniel Walsh 2022-04-12 10:07:55 UTC
I am on an older kernel.

5.17.1-300.fc36.x86_64

Comment 4 Martin Pitt 2022-04-12 11:34:07 UTC
I reported this from an EC2 instance, but I can also reproduce it locally in a Fedora CoreOS VM. To get Fedora 36 there, I have to rebase with

     rpm-ostree rebase "fedora/x86_64/coreos/next"

(stable is still F35, where this bug does not happen).

That image has an even older kernel, but the bug still happens there.

    podman-4.0.2-1.fc36.x86_64
    kernel-5.17.0-300.fc36.x86_64

So this is not just some weird EC2 quirk.

Comment 5 Martin Pitt 2022-04-12 11:35:40 UTC
I forgot:

   # rpm-ostree status
    State: idle
    Deployments:
    ● fedora:fedora/x86_64/coreos/next
                       Version: 36.20220325.1.0 (2022-03-25T19:27:17Z)

So even the "next" image is actually quite old -- apparently EC2 has daily images. But either way, this regression isn't *that* new then.

Comment 6 Matthew Heon 2022-04-12 13:27:19 UTC
Any chance you can try with v4.0.3? I see the following in the release notes for 4.0.3:

"Fixed a bug where devices added to containers by the --device option to podman run and podman create would not be accessible within the container."

That sounds suspiciously similar to your issue.
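
To check whether a given host already has a release with that fix, the installed version can be compared against 4.0.3. A minimal sketch, assuming `sort -V` from GNU coreutils is available; the function name `version_at_least` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Assumption: `sort -V` (version sort, GNU coreutils) is available.
version_at_least() {
    # The smaller of the two versions sorts first; if that is $2,
    # then $1 is equal to or newer than $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least "$(rpm -q --qf '%{VERSION}' podman)" 4.0.3 || echo "podman older than 4.0.3"` would flag the podman-4.0.2 package from the original report.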

Comment 7 Martin Pitt 2022-04-12 15:08:08 UTC
I ran `rpm-ostree update`. That updated to image 36.20220410.1.1 which contains (among other things)

    kernel-core 5.17.0-300.fc36 -> 5.17.1-300.fc36
    skopeo 1:1.6.0-1.fc36 -> 1:1.7.0-1.fc36


... but not yet podman 4.0.3 (https://koji.fedoraproject.org/koji/buildinfo?buildID=1941823)

But this does the trick:

    rpm-ostree override replace https://kojipkgs.fedoraproject.org//packages/podman/4.0.3/1.fc36/x86_64/podman-plugins-4.0.3-1.fc36.x86_64.rpm https://kojipkgs.fedoraproject.org//packages/podman/4.0.3/1.fc36/x86_64/podman-4.0.3-1.fc36.x86_64.rpm

Et voilà, now /dev/kvm access works again \o/

Thanks!
