Description of problem:

[root@localhost ~]# rpm-ostree install cockpit cockpit-podman cockpit-storaged cockpit-dashboard cockpit-ostree
Checking out tree 5947c35... done
Enabled rpm-md repositories: fedora updates fedora-cisco-openh264
rpm-md repo 'fedora' (cached); generated: 2020-10-19T23:26:56Z
rpm-md repo 'updates' (cached); generated: 2020-11-22T00:51:17Z
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2020-08-25T19:10:34Z
Importing rpm-md... done
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
error: Bus owner changed, aborting. This likely means the daemon crashed; check logs with `journalctl -xe`.

[root@localhost ~]# coredumpctl info
           PID: 3513 (rpm-ostree)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Sun 2020-11-22 17:07:13 UTC (43s ago)
  Command Line: /usr/bin/rpm-ostree start-daemon
    Executable: /usr/bin/rpm-ostree
 Control Group: /system.slice/rpm-ostreed.service
          Unit: rpm-ostreed.service
         Slice: system.slice
       Boot ID: c349b1eb7e1246b3b237981757aee70e
    Machine ID: 4d1fdddf42234ef6a3f89e72dfda9354
      Hostname: localhost.localdomain
       Storage: /var/lib/systemd/coredump/core.rpm-ostree.0.c349b1eb7e1246b3b237981757aee70e.3513.1606064833000000.zst
       Message: Process 3513 (rpm-ostree) of user 0 dumped core.

                Stack trace of thread 4160:
                #0  0x00000000b62f3d30 strchrnul (libc.so.6 + 0x80d30)

Version-Release number of selected component (if applicable):
Fedora 33.20201119.0 (IoT Edition)

Core dump available here: https://easyupload.io/3vfksw (available for 30 days)
What's the output of `rpm -q rpm-ostree`? This is likely a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1890577
[root@localhost ~]# rpm -q rpm-ostree
warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
rpm-ostree-2020.8-1.fc33.armv7hl
Thanks for the report. It looks like this may be a different issue from what has been fixed in 2020.8. Jérôme, can you please follow https://fedoraproject.org/wiki/StackTraces and report back the output of `thread apply all bt full` from a gdb session using the coredump above?
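For anyone following along, the requested trace can be collected roughly as sketched below. This assumes gdb and the matching rpm-ostree debuginfo are already installed on the machine that has the core. The PID comes from the `coredumpctl info` output in the description; the output path and the function name are only illustrative.

```shell
# Sketch only: core_file and collect_backtrace are illustrative names, not from the report.
core_pid=3513                       # PID shown by `coredumpctl info` in the description
core_file=/var/tmp/rpm-ostree.core  # hypothetical location for the extracted core

# Extract the stored core to a plain file, then let gdb print the full
# backtrace of every thread without an interactive session.
collect_backtrace() {
    coredumpctl dump "$core_pid" --output="$core_file" &&
    gdb -batch -ex "thread apply all bt full" /usr/bin/rpm-ostree "$core_file"
}
```

Running `collect_backtrace > backtrace.txt 2>&1` as root would leave the trace in a file that can be attached to the bug.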
I have the same issue when attempting to install fail2ban on a Raspberry Pi 2B (armv7). Same output of rpm -q rpm-ostree as above, and same output of coredumpctl info. I checked journalctl -xe as advised, and systemd-coredump reports process xxx (rpm-ostree) of user 0 dumped core:

Stack trace of thread 2382:
#0  0x000000000b62b5d30 strchrnul (libc.so.6 + 0x80d30)

I am unable to install gdb, as rpm-ostree fails with the same error and stack trace.
Can you install gdb and the same rpm-ostree version + debuginfo in a privileged container with bind mounts so that you can transfer the core dump and try there? Otherwise, you can also use `rpm-ostree usroverlay` and then install `gdb-minimal` directly with rpm. (I think there's one other dep you'd also need to fetch by hand.)
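A rough sketch of the container route follows, assuming podman is available on the host. The image name, the shared directory and the function names are assumptions, not something taken from this report, and an image matching the host architecture (armv7hl here) has to exist for this to work.

```shell
# Sketch only: image, share_dir and both function names are assumptions.
core_pid=3513                               # PID from `coredumpctl info` on the host
share_dir=/var/tmp/rpm-ostree-debug         # hypothetical directory shared with the container
image=registry.fedoraproject.org/fedora:33  # assumed image matching the host release

# On the host: export the core, then start a privileged container with the
# directory bind-mounted at /debug.
start_debug_container() {
    mkdir -p "$share_dir" &&
    coredumpctl dump "$core_pid" --output="$share_dir/core" &&
    podman run --rm -it --privileged -v "$share_dir:/debug" "$image" bash
}

# Inside the container: install gdb, rpm-ostree and its debuginfo, then
# print the full backtrace of every thread from the transferred core.
backtrace_in_container() {
    dnf install -y gdb dnf-plugins-core rpm-ostree &&
    dnf debuginfo-install -y rpm-ostree &&
    gdb -batch -ex "thread apply all bt full" /usr/bin/rpm-ostree /debug/core
}
```

The second function has to be pasted into the container shell before it is called there. Before trusting the trace, it is worth checking that `rpm -q rpm-ostree` inside the container reports the same version as the host (rpm-ostree-2020.8-1.fc33 in this report), since symbols from a different build will not line up with the core.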
I'm experiencing the exact same issue (0x00000000b62a0d60 strchrnul (libc.so.6 + 0x80d60)).

I prepared gdb in a container, and here is the backtrace it generated:

Core was generated by `/usr/bin/rpm-ostree start-daemon'.
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0xb6350d30 in __argz_create_sep (string=<optimized out>, delim=2, argz=0xb38fe008, len=0xb6421e18) at argz-ctsep.c:47
#1  0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) frame
#0  0xb6350d30 in __argz_create_sep (string=<optimized out>, delim=2, argz=0xb38fe008, len=0xb6421e18) at argz-ctsep.c:47
47          --nlen;
(gdb) x/50x $sp
0xb38fdb00: 0xb38fdf42 0x00000001 0x00000001 0xb64ed754
0xb38fdb10: 0xb38fdf52 0x00000001 0xb38fdb3c 0xb6504cec
0xb38fdb20: 0x00000000 0x00000000 0x00000000 0x00000000
0xb38fdb30: 0x00000000 0x00000003 0x000005e8 0xb6422c78
0xb38fdb40: 0x00000000 0x00000000 0x00000020 0x00000000
0xb38fdb50: 0x00000000 0x00000000 0x00000000 0xb6421e18
0xb38fdb60: 0x00000000 0x00000002 0x0052e82d 0x00001034
0xb38fdb70: 0xb38ff580 0x00000000 0xffffffff 0x00000000
0xb38fdb80: 0xb640403c 0x00000000 0x00000000 0xb6327d90
0xb38fdb90: 0x00000011 0x0052e83e 0xb38fe39c 0xb6543e45
0xb38fdba0: 0xb642320c 0x00000001 0xb38fe1c4 0xb6328108
0xb38fdbb0: 0x00000000 0x00000000 0x00000000 0x00000000
0xb38fdbc0: 0x00000000 0x00000000
(gdb) info locals
rp = 0xfbad8000 <error: Cannot access memory at address 0xfbad8000>
wp = <optimized out>
nlen = 3012551580

It looks like some buffer overflow is corrupting the stack.

In my case it is caused by installing k3s-selinux: https://rpm.rancher.io/k3s/stable/common/centos/7/noarch/k3s-selinux-0.2-1.el7_8.noarch.rpm
I just found the request for this in a previous comment:

(gdb) thread apply all bt full

Thread 4 (Thread 0xb4cfd040 (LWP 847)):
#0  0xb63a54e4 in internal_fallocate64 (len=-5462859712275939329, offset=4294967295, fd=1) at ../sysdeps/posix/posix_fallocate64.c:36
        st = {st_dev = 2, __pad1 = 0, __st_ino = 2147483647, st_mode = 4294967295, st_nlink = 3670904576, st_uid = 11806224, st_gid = 3059451596, st_rdev = 754823547118, __pad2 = 0, st_size = -5310270675074856684, st_blksize = -1261450496, st_blocks = -5310232845147732544, st_atim = {tv_sec = 11814408, tv_nsec = 11785064}, st_mtim = {tv_sec = -1090696978, tv_nsec = -1236178984}, st_ctim = {tv_sec = 0, tv_nsec = 0}, st_ino = 13028863522405089280}
        increment = <optimized out>
#1  __GI___posix_fallocate64_l64 (fd=1, offset=<optimized out>, len=0) at ../sysdeps/unix/sysv/linux/posix_fallocate64.c:37
        res = -1271921144
#2  0x00000000 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (Thread 0xb42ff040 (LWP 848)):
#0  0xb63a54e4 in internal_fallocate64 (len=52713615387525119, offset=4294967295, fd=1) at ../sysdeps/posix/posix_fallocate64.c:36
        st = {st_dev = 3, __pad1 = 11865296, __st_ino = 2147483647, st_mode = 4294967295, st_nlink = 3670904576, st_uid = 11865456, st_gid = 11865460, st_rdev = 50961783036135942, __pad2 = 0, st_size = -5310224031730021844, st_blksize = 11865296, st_blocks = 50743217547978744, st_atim = {tv_sec = 11865272, tv_nsec = -1090696698}, st_mtim = {tv_sec = 175, tv_nsec = -1233994756}, st_ctim = {tv_sec = 11814576, tv_nsec = 11872248}, st_ino = 13137395768631314950}
        increment = <optimized out>
#1  __GI___posix_fallocate64_l64 (fd=1, offset=<optimized out>, len=0) at ../sysdeps/unix/sysv/linux/posix_fallocate64.c:37
        res = 12273344
#2  0x00000000 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0xb4cfe020 (LWP 846)):
#0  0xb63a54e4 in internal_fallocate64 (len=-5507796649423929345, offset=4294967295, fd=1) at ../sysdeps/posix/posix_fallocate64.c:36
        st = {st_dev = 23369724511387651, __pad1 = 0, __st_ino = 2147483647, st_mode = 4294967295, st_nlink = 3670904576, st_uid = 11818592, st_gid = 1, st_rdev = 50894194226495489, __pad2 = 32, st_size = -5310270678279127040, st_blksize = 1, st_blocks = 22543235376498408, st_atim = {tv_sec = 11785272, tv_nsec = 4994756}, st_mtim = {tv_sec = 0, tv_nsec = 0}, st_ctim = {tv_sec = 0, tv_nsec = -1236347052}, st_ino = 140643224740}
        increment = <optimized out>
#1  __GI___posix_fallocate64_l64 (fd=1, offset=<optimized out>, len=-4684501980372918760) at ../sysdeps/unix/sysv/linux/posix_fallocate64.c:37
        res = -1282383840
#2  0xb62eb1dc in __libc_start_main (main=0xbefd4e44, argc=-1237180904, argv=0xb62eb1dc <__libc_start_main+344>, init=<optimized out>, fini=0x5234c8 <__libc_csu_fini>, rtld_fini=0xb6fc909c <_dl_fini>, stack_end=0xbefd4e44) at libc-start.c:320
        __p = <optimized out>
        ptr = <optimized out>
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1725690856, -1846021788, 5387368, 0, 4890756, 0, 0, 0, 5582608, 0 <repeats 17 times>, 119, 1, -1090695600, 35032, 520, 11802408, -1261443744, -1237178980, -1090695600, 0, 0, -1238065408, 0, -1237179604, 11780216, -1227823300, -1237180904, -1238349752, -1237097072, -1237174488, -1227839572, -1238349576, -1227823392, -1227823328, -1227823392, -1227823296, -1244734276, -1237174488, -1227839572, -1238349576, -1090695600, 0, -1225571436, -1226043696, -1225608956, -1226043696, 2, -1224962476}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0xb6febfb4, 0xbefd4e50}, data = {prev = 0x0, cleanup = 0x0, canceltype = -1224818764}}}
        not_first_call = -1282383840
#3  0x004aa0c8 in _start ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0xb38ff040 (LWP 2160)):
#0  0xb6350d30 in __argz_create_sep (string=<optimized out>, delim=2, argz=0xb38fe008, len=0xb6421e18) at argz-ctsep.c:47
        rp = 0xfbad8000 <error: Cannot access memory at address 0xfbad8000>
        wp = <optimized out>
        nlen = 3012551580
#1  0x00000000 in ?? ()
No symbol table info available.
Hi, I may have logged in with different credentials. I am James, who tried to install fail2ban.

I was able to install gdb-headless, a dependency of gdb, using rpm-ostree. I then used rpm-ostree usroverlay and installed gdb after fetching the package with wget. I ran rpm-ostree install fail2ban, and it crapped out as usual with the complaint that the bus owner had changed. I then ran gdb and used the command bt, but there was no backtrace.

I su'd to root, ran gdb with file set to rpm-ostree, and ran the command run install fail2ban. The command crapped out as usual, and when I then ran bt to obtain a backtrace I was informed: no stack.

I am inexperienced with gdb. I wonder if I perhaps need to install debug symbols?

I have also opened a bug report for fedora-iot: https://pagure.io/fedora-iot/issue/38
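A note on the "no stack" result: the core dump in the description belongs to `/usr/bin/rpm-ostree start-daemon` (unit rpm-ostreed.service), so the process that segfaults is the daemon, not the `rpm-ostree install` client started under gdb. The client only reports that the bus owner changed and exits, which would leave gdb with nothing to unwind. A minimal sketch of attaching to the daemon instead is below. The function name is illustrative, and this assumes the daemon stays running long enough to attach.

```shell
# Sketch only: attach_to_daemon is an illustrative name, not an rpm-ostree tool.
# Start the daemon, look up its main PID through systemd, and attach gdb to it.
attach_to_daemon() {
    systemctl start rpm-ostreed.service &&
    daemon_pid=$(systemctl show --property=MainPID --value rpm-ostreed.service) &&
    gdb -p "$daemon_pid"
}
```

After attaching, `continue` at the gdb prompt lets the daemon run; the failing `rpm-ostree install` can then be started from a second terminal, and once gdb stops on SIGSEGV, `thread apply all bt full` prints the trace. Debug symbols are still needed for the frames to be readable.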
Paul got a good backtrace in 1906184 for this strchrnul fault. Closing as dupe of that one. *** This bug has been marked as a duplicate of bug 1906184 ***