Description of problem:
ls -l segfaults.

This machine has 2 AMD MP processors. Their instruction set differs somewhat from that of Intel's i686 machines, which could potentially be related to the problem.

Version-Release number of selected component (if applicable):
coreutils-8.23-6.fc22.i686
glibc-2.20.90-20.fc22.i686

How reproducible:
Every time I run ls -l

Additional info:
gdb /usr/bin/ls
GNU gdb (GDB) Fedora 7.8.90.20150202-2.fc22
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/ls...Reading symbols from /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install coreutils-8.23-6.fc22.i686
(gdb) run -l
Starting program: /usr/bin/ls -l
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
total 524

Program received signal SIGSEGV, Segmentation fault.
0xb7dc795c in __stpcpy_ia32 () from /lib/libc.so.6
(gdb) where
#0  0xb7dc795c in __stpcpy_ia32 () from /lib/libc.so.6
#1  0x0804e88a in align_nstrftime.constprop ()
#2  0x0804fa9a in print_long_format ()
#3  0x0805036c in print_current_files ()
#4  0x0804b76c in main ()
It's conceivable, though unlikely, that a tm_mon outside the range 0..11 is being returned on your platform. This defensive patch would protect against that:

diff --git a/src/ls.c b/src/ls.c
index cb9d3d6..4698520 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -3665,7 +3665,8 @@ align_nstrftime (char *buf, size_t size, char const *fmt,
      the replacement is not done.  A malloc here slows ls down by 2%  */
   char rpl_fmt[sizeof (abmon[0]) + 100];
   const char *pb;
-  if (required_mon_width && (pb = strstr (fmt, "%b")))
+  if (required_mon_width && (pb = strstr (fmt, "%b"))
+      && 0 <= tm->tm_mon && tm->tm_mon <= 11)
     {
       if (strlen (fmt) < (sizeof (rpl_fmt) - sizeof (abmon[0]) + 2))
         {

If that doesn't help, it suggests an issue with the stpcpy implementation in libc. In either case, running through a debugger would be informative:

wget ftp://ftp.gnu.org/pub/gnu/coreutils/coreutils-8.23.tar.xz
tar -xf coreutils-8.23.tar.xz
cd coreutils-8.23
patch -p1  # paste the above diff
./configure --quiet && make CFLAGS=-ggdb -j3
gdb -tui -args src/ls -l
(gdb) b align_nstrftime
(gdb) p tm->tm_mon
(gdb) c
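To illustrate the range check from the patch in isolation, here is a minimal sketch. It is not the ls.c code: demo_abmon and safe_abmon are hypothetical names, and the table is hard-coded here, whereas ls fills its abmon table from the locale.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the abmon table in ls.c (assumption: fixed
   English abbreviations, purely for demonstration).  */
static const char *const demo_abmon[12] = {
  "Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Return the abbreviation for tm->tm_mon, or NULL when the month is
   outside 0..11, so the caller never indexes past the table.  This is
   the same condition the defensive patch adds.  */
static const char *
safe_abmon (const struct tm *tm)
{
  if (tm == NULL || tm->tm_mon < 0 || tm->tm_mon > 11)
    return NULL;
  return demo_abmon[tm->tm_mon];
}
```

A caller would fall back to the unmodified format string whenever safe_abmon returns NULL, which is what skipping the %b replacement amounts to in the patch.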
This doesn't seem to have gone as expected. Some source gets listed after starting, but commands don't seem to be working as expected.

│1243    {
│1244      int i;
│1245      struct pending *thispend;
│1246      int n_files;
│1247
│1248      /* The signals that are trapped, and the number of such signals. */
│1249      static int const sig[] =
│1250        {
│1251          /* This one is handled specially. */
│1252          SIGTSTP,
│1253
│1254          /* The usual suspects. */
│1255          SIGALRM, SIGHUP, SIGINT, SIGPIPE, SIGQUIT, SIGTERM,
│1256    #ifdef SIGPOLL
│1257          SIGPOLL,
│1258    #endif
│1259    #ifdef SIGPROF
│1260          SIGPROF,
│1261    #endif
│1262    #ifdef SIGVTALRM
│1263          SIGVTALRM,
│1264    #endif
│1265    #ifdef SIGXCPU
│1266          SIGXCPU,
│1267    #endif
│1268    #ifdef SIGXFSZ
│1269          SIGXFSZ,

For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/ls...done.
(gdb) b align_nstrftime
Breakpoint 1 at 0x804f1df: file src/ls.c, line 3666.
(gdb) p tm->tm_mon
No symbol "tm" in current context.
(gdb) c
The program is not being run.
(gdb)
src/ls -l works normally, while the system ls still segfaults.
I added a run step that might have been missing. The month value looked OK for a few iterations.

(gdb) p tm->tm_mon
$2 = 6
(gdb) c
Continuing.

Breakpoint 1, align_nstrftime (buf=0xbfffd6f5 "\001-. 1 ", size=1001, fmt=0x805c714 "%b %e %Y", tm=0xb7f1ebc0 <_tmbuf>, __utc=0, __ns=0) at src/ls.c:3666
Oops, sorry. Yes, I missed the run step. It's interesting/annoying that only the system version segfaults. Do you have ls aliased to add any time style options etc.? Perhaps you have a TIME_STYLE env variable that isn't passed through to the debugger, in which case you'd need to add an equivalent --time-style option to the command run under the debugger.
Can you reproduce in the debugger for the system version?

# yum install coreutils-debuginfo
$ gdb -tui -args /usr/bin/ls -l
(gdb) b align_nstrftime
(gdb) r
(gdb) p tm->tm_mon
(gdb) c

If not reproducible under the debugger, perhaps the TZ env var is different under the debugger and your shell?
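One way to compare the two environments is a small helper that prints the variables mentioned in this report. This is only a sketch: env_or_unset and print_time_env are hypothetical names, and the choice of TIME_STYLE and TZ simply follows the comments above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of NAME from the environment, or the literal string
   "(unset)" when the variable is not defined.  */
static const char *
env_or_unset (const char *name)
{
  const char *value = getenv (name);
  return value ? value : "(unset)";
}

/* Print the time-related variables discussed in this report, one
   NAME=value pair per line.  */
static void
print_time_env (FILE *out)
{
  static const char *const names[] = { "TIME_STYLE", "TZ" };
  for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
    fprintf (out, "%s=%s\n", names[i], env_or_unset (names[i]));
}
```

Calling print_time_env (stdout) from a tiny main, once from the shell and once under gdb, then comparing the two outputs would show whether either variable differs.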
We had the same issue in Arch Linux with glibc-2.21. This is our proposed fix for glibc: https://sourceware.org/ml/libc-alpha/2015-02/msg00191.html
This might be moot now given the above comment, but for completeness:

Missing separate debuginfos, use: debuginfo-install libacl-2.2.52-7.fc22.i686 libattr-2.4.47-9.fc22.i686 libcap-2.24-7.fc22.i686 libselinux-2.3-6.fc22.i686 pcre-8.36-3.fc22.i686 xz-libs-5.2.0-2.fc22.i686
(gdb) p tm->tm_mon
$1 = 9
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0xb7dc794c in __stpcpy_ia32 () from /lib/libc.so.6
Thanks guys! That confirms the problem is within stpcpy and is most likely the issue that Allan identified. Bruno, I am confused as to why your built version didn't also hit the issue. If it was using the gnulib replacement rather than the glibc version, that would explain things, but I don't know why it would use the replacement. It's not using the replacement if you see HAVE_STPCPY in the output of `grep STPCPY lib/config.h`. It would be useful to confirm that from the directory where you built coreutils as instructed in comment #1.
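As a rough illustration of what a configure-time macro such as HAVE_STPCPY decides, the sketch below picks between the libc function and a portable fallback. It is not gnulib's actual replacement: demo_stpcpy and demo_copy are hypothetical names, and the fallback is just one straightforward way to get stpcpy semantics.

```c
#include <stddef.h>
#include <string.h>

/* Portable fallback with stpcpy semantics: copy SRC including its
   terminating NUL into DEST and return a pointer to that NUL.
   (Hypothetical name; an assumption for illustration only.)  */
static char *
demo_stpcpy (char *dest, const char *src)
{
  size_t len = strlen (src);
  memcpy (dest, src, len + 1);
  return dest + len;
}

/* When the build system found stpcpy in libc it defines HAVE_STPCPY and
   the libc version is called; otherwise the fallback is used.  */
#ifdef HAVE_STPCPY
# define demo_copy stpcpy
#else
# define demo_copy demo_stpcpy
#endif
```

With this arrangement, a build where HAVE_STPCPY is defined ends up in the libc implementation (the one seen in the backtrace), while a build without it never calls libc's stpcpy at all.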
In my testing 'ls -l' wouldn't segfault when ls.c was built with -O0, only when using -O1 or higher optimization levels. This would explain why Bruno's build worked fine.
Any of the ls options produces a segfault on my Socket A systems, including the default aliases.
ls -l seems to be working again.
Thanks for the feedback Bruno. Yep, rawhide 2.21.90-2 uses glibc master branch at commit ebf27d1, which includes the pertinent commit (132a132). http://pkgs.fedoraproject.org/cgit/glibc.git/commit/?id=c6d44c99
(In reply to Bruno Wolff III from comment #12)
> ls -l seems to be working again.

Running what, a fresh installation, or an upgrade? Upgrading F22 on Socket A host kt880 didn't install a newer coreutils or stop the optioned ls segfaulting:

coreutils-8.23-6.fc22.i686
glibc-2.21-1.fc22.i686
glibc-common-2.21-1.fc22.i686
Right, F22 is not updated yet. Note the fix is on the 2.21 branch upstream: https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/release/2.21/master
(In reply to Pádraig Brady from comment #15)
> Right, F22 is not updated yet. Note the fix is on the 2.21 branch upstream:
> https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/release/2.21/master

F22 has the fix for this bug. The glibc-2.21-1 should not fail.

Just to make sure, I've bumped the NVR and kicked off a rebuild to redo the testing. Thus glibc-2.21-2.fc22 should absolutely have the fix for this segfault problem; otherwise it's another problem and we need someone to provide some SOS info about the failure, e.g. a backtrace.

Upcoming build: http://koji.fedoraproject.org/koji/taskinfo?taskID=8967632
The patch wasn't applied to the f22 build due to bugs in rpm. These are tracked at bug #1193603. Carlos is working around the issue in a further build.
Fixed in glibc-2.21-3.fc22.
Final build in progress: http://koji.fedoraproject.org/koji/taskinfo?taskID=8975702
This was fixed in Fedora 22.