Created attachment 2078836
SchedMD patch to fix eBPF log errors and correctly constrain devices

Description of problem:
Slurm fails to constrain devices under cgroup v2 because of a kernel eBPF change.

Version-Release number of selected component (if applicable):
slurm < 23.02.4

How reproducible:
Set ConstrainDevices=yes in cgroup.conf and configure GRES resources in gres.conf. Ours defines 5 different NVIDIA H200 MIG instances and a full H200.

Steps to Reproduce:
1. Configure Slurm for GRES devices
2. srun --gres=gpu:h200:mig.1g.18g nvidia-smi
3. Receive a MIG profile other than 1g.18gb

Actual results:
eBPF log error
All NVIDIA devices returned by nvidia-smi
All Slurm jobs pile onto one GPU

Expected results:
Only the selected GPU profile is returned by nvidia-smi.

Additional info:
Backporting the eBPF patch from https://support.schedmd.com/show_bug.cgi?id=17210 resolves the issue: add the patch to the spec file of the Slurm source RPM, ensure the EPEL macro is installed, rebuild the spec file with rpmbuild, and force-upgrade to the resulting RPM files. This approach keeps the system consistent with RHEL 9 and cgroup v2. Without the patch, the likely workaround is to switch to cgroup v1. This patch would be a valuable addition to Slurm versions prior to 23.02.4.
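
To check whether a node is affected, a minimal sketch along the following lines may help. It assumes GNU stat is available and that nvidia-smi -L prints one indented "MIG ..." line per MIG device. The helper names cgroup_version and count_mig_devices are hypothetical and not part of Slurm or the attached patch.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup
# (as printed by `stat -fc %T /sys/fs/cgroup`) to a cgroup version.
cgroup_version() {
    case "$1" in
        cgroup2fs) echo v2 ;;       # unified hierarchy
        tmpfs)     echo v1 ;;       # legacy or hybrid hierarchy
        *)         echo unknown ;;
    esac
}

# Hypothetical helper: count MIG device lines in `nvidia-smi -L`
# output read from stdin. grep -c exits non-zero when the count is 0,
# so the exit status is masked; the count itself is still printed.
count_mig_devices() {
    grep -c '^ *MIG ' || true
}
```

On the compute node, cgroup_version "$(stat -fc %T /sys/fs/cgroup)" should print v2 on a setup like the one in this report. Piping the output of srun --gres=gpu:h200:mig.1g.18g nvidia-smi -L into count_mig_devices would then be expected to give 1 when devices are constrained correctly, and a larger number when the job step can see every MIG instance.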