Version-Release number of selected component:
sssd-common-2.3.1-4.fc33

Additional info:
reporter: libreport-2.14.0
backtrace_rating: 4
cgroup: 0::/system.slice/sssd.service
cmdline: /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --logger=files
crash_function: sss_mmap_cache_pw_store
executable: /usr/libexec/sssd/sssd_nss
journald_cursor: s=b2e7c0583c004ae8bd317a4440255076;i=2ee4a7;b=7c261537394f4d169971a59cd1761dfe;m=55626889;t=5b13c2149d0cf;x=8fed0b78d48ecd6c
kernel: 5.8.14-300.fc33.x86_64
rootdir: /
runlevel: N 5
type: CCpp
uid: 0

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 sss_mmap_cache_pw_store at src/responder/nss/nsssrv_mmap_cache.c:769
 #1 nss_protocol_fill_pwent at src/responder/nss/nss_protocol_pwent.c:308
 #2 nss_protocol_reply at src/responder/nss/nss_protocol.c:91
 #3 nss_getby_done at src/responder/nss/nss_cmd.c:626
 #4 tevent_common_invoke_immediate_handler at ../../tevent_immediate.c:166
 #5 tevent_common_loop_immediate at ../../tevent_immediate.c:203
 #6 epoll_event_loop_once at ../../tevent_epoll.c:917
 #7 std_event_loop_once at ../../tevent_standard.c:110
 #8 _tevent_loop_once at ../../tevent.c:772
 #9 tevent_common_loop_wait at ../../tevent.c:895
Created attachment 1720250 [details] File: backtrace
Created attachment 1720251 [details] File: core_backtrace
Created attachment 1720252 [details] File: cpuinfo
Created attachment 1720253 [details] File: dso_list
Created attachment 1720254 [details] File: environ
Created attachment 1720255 [details] File: exploitable
Created attachment 1720256 [details] File: limits
Created attachment 1720257 [details] File: maps
Created attachment 1720258 [details] File: mountinfo
Created attachment 1720259 [details] File: open_fds
Created attachment 1720260 [details] File: proc_pid_status
Created attachment 1720261 [details] File: var_log_messages
Hi Kamil, do you have a coredump?
Program terminated with signal SIGBUS, Bus error.
#0 0x000055b728aceb15 in sss_mmap_cache_pw_store (_mcc=_mcc@entry=0x55b72a00dd28, name=0x55b72a030380, pw=pw@entry=0x7fff947313c0, uid=uid@entry=173, gid=gid@entry=173, gecos=gecos@entry=0x7fff947313d0, homedir=0x7fff947313e0, shell=0x7fff947313f0) at src/responder/nss/nsssrv_mmap_cache.c:769

This ^^ is `MC_RAISE_BARRIER(rec);`:
```
#define MC_RAISE_BARRIER(m) do { \
    m->b2 = MC_NEXT_BARRIER(m->b1); \
    __sync_synchronize(); \
} while (0)
```

In "open_fds" there is:
```
17:/var/lib/sss/mc/passwd (deleted)
pos: 0
flags: 02100000
mnt_id: 65

18:/var/lib/sss/mc/passwd
pos: 0
flags: 0100002
mnt_id: 65
lock: 1: POSIX ADVISORY WRITE 710 00:20:429251 0 0
```
-- it seems the SIGBUS was triggered by an attempt to access memory mmap-ed from a deleted file.

Is there any chance this file ("/var/lib/sss/mc/passwd") was deleted by some external process?
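To make the deleted-versus-truncated distinction concrete, here is a minimal sketch, assuming Linux and a writable /tmp. It maps a scratch file with MAP_SHARED, then either unlinks or truncates it, and reports whether a write into the mapping raises SIGBUS. The helper names `probe_mapping` and `write_raises_sigbus` and the scratch path are hypothetical and are not part of sssd.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;

/* SIGBUS handler: jump back into write_raises_sigbus(). */
static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if writing the first byte of `map` raises SIGBUS, 0 otherwise. */
static int write_raises_sigbus(volatile char *map)
{
    struct sigaction sa, old;
    int faulted;

    assert(map != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    if (sigsetjmp(probe_env, 1) == 0) {
        map[0] = 1;          /* faults if the page is past end-of-file */
        faulted = 0;
    } else {
        faulted = 1;         /* we got here through the SIGBUS handler */
    }

    sigaction(SIGBUS, &old, NULL);
    return faulted;
}

/*
 * Hypothetical probe: map a scratch file, then unlink it (truncate_file == 0)
 * or shrink it to zero bytes (truncate_file != 0), and touch the mapping.
 * Returns 1 on SIGBUS, 0 if the access succeeded, -1 on setup failure.
 */
int probe_mapping(int truncate_file)
{
    char path[] = "/tmp/mc_probe_XXXXXX";   /* assumed scratch location */
    const size_t len = 4096;
    int fd, rc, faulted;
    char *map;

    fd = mkstemp(path);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); unlink(path); return -1; }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { close(fd); unlink(path); return -1; }

    rc = truncate_file ? ftruncate(fd, 0) : unlink(path);
    faulted = (rc == 0) ? write_raises_sigbus(map) : -1;

    munmap(map, len);
    close(fd);
    unlink(path);            /* no-op if the file was already unlinked */
    return faulted;
}
```

On Linux, `probe_mapping(0)` is expected to report no fault, because an unlinked file's data stays reachable through the open descriptor and the mapping, while `probe_mapping(1)` is expected to report SIGBUS, because the mapped page now lies past the end of the file.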
Created attachment 1720291 [details]
coredump

Yes, here it is. I have no idea how it happened or how to reproduce it.
> Is there any chance this file ("/var/lib/sss/mc/passwd") was deleted by some outer process?

I don't know, this is a fairly fresh F33 VM, and I wasn't doing anything particularly interesting, just playing with abrt. The file is currently there:

$ ll /var/lib/sss/mc/passwd
-rw-rw-r--. 1 root root 9253600 Oct 9 16:12 /var/lib/sss/mc/passwd
*** Bug 1974235 has been marked as a duplicate of this bug. ***
This message is a reminder that Fedora 33 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 33 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Hi,

I ran a couple of tests and was able to reproduce similar crashes with SIGBUS only on files which were shortened, e.g. with the truncate command.

I'm not sure if it is worth trying to protect the memory mapped files against this. I'm also not sure how. Calling fstat() before every memory access will slow things down considerably. Maybe sssd_nss can set an inotify watch to detect such a change, but there would still be a chance that the truncation happens while sssd_nss is working in the files, before it handles the inotify event. The most promising protection I found is F_SEAL_SHRINK, see man fcntl for details, but this requires an anonymous file in tmpfs, see man memfd_create for details.

bye,
Sumit
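As an illustration of the F_SEAL_SHRINK idea, here is a minimal sketch, assuming Linux with glibc 2.27 or later (for the memfd_create() wrapper). It creates an anonymous memory-backed file, sizes it, and seals it against shrinking. The function name `create_shrink_sealed_fd` is hypothetical. This is not how sssd creates its cache files, and it does not address how clients would obtain the descriptor.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>      /* callers inspect errno after a failed ftruncate() */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous file of `size` bytes that can
 * no longer be shrunk. Returns the file descriptor, or -1 on failure.
 */
int create_shrink_sealed_fd(const char *name, off_t size)
{
    int fd;

    assert(name != NULL && size > 0);

    /* MFD_ALLOW_SEALING is required, otherwise F_ADD_SEALS is refused. */
    fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) {
        return -1;
    }

    /* Set the final size first, then forbid any later size reduction. */
    if (ftruncate(fd, size) != 0 ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK) != 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

Once the seal is in place, a later `ftruncate(fd, 0)` on that descriptor fails with EPERM, so a mapping of the file cannot end up past end-of-file through truncation. Growing the file is still allowed unless F_SEAL_GROW is added as well.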
The only good way to handle this is to change how we open the memory mapped files. Instead of opening them directly from the client, we need to introduce a new command call over the pipe that will ask the parent to open them for us, and then pass the fd over the socket to the client.

This method has a few advantages:
- clients will not be allowed to directly open mmapped files, which means the only things to bind mount over (for stuff like container access) are the sockets.
- the server can simply create new files when needed and just *mark* the old files as obsoleted before simply renaming them, or even unlink() them on the spot.
- the server can move the cache files at will. For example, it can decide to create them on tmpfs for speed on machines where populating the cache at every reboot is ok, while keeping them on long lived storage for machines (like laptops) that are frequently rebooting in disconnected mode and need to preserve the caches.

Once this is done, the only case of encountering a truncated file in the client is generally a server bug. The server should never change the size of the cache files; it should always mark and rename/unlink an old cache and create a new one instead.
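The fd-passing step proposed above can be sketched with the standard SCM_RIGHTS ancillary message on a Unix domain socket. This is a minimal illustration only: the function names `send_fd` and `recv_fd` and the one-byte payload are assumptions, and no such command exists in the sssd client protocol as far as this report shows.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send descriptor `fd` over Unix socket `sock`. */
int send_fd(int sock, int fd)
{
    char payload = 'F';   /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {               /* union guarantees cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from `sock`, or -1 on error. */
int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    assert(sock >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }

    /* The kernel installed a new descriptor in this process. */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

With something along these lines, the responder would open (or create) the cache file itself and hand the client a descriptor to mmap, so the client never needs to resolve the cache path on its own.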
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.