Description of problem:
Excessive precision in a printf format string provokes a segfault. I.e., "%1.Ns" for large N.

Version-Release number of selected component (if applicable):

How reproducible:
Consistently, on rawhide (updated a day or two ago) and RHEL 3 & 4.

Steps to Reproduce:
1. LC_ALL=en_US.UTF-8 /usr/bin/printf %1.25000000s x; echo

Actual results:
Segmentation fault

Expected results:
x

Additional info:
You can reproduce the failure using a simple C program, too:

  $ cat kk.c
  #include <stdio.h>
  #include <locale.h>

  int main ()
  {
    setlocale (LC_ALL, "");
    printf ("%1.25000000s", "x");
    return 0;
  }

  $ gcc -O kk.c && LC_ALL=fr_FR.utf8 ./a.out
  zsh: segmentation fault  LC_ALL=fr_FR.utf8 ./a.out
  [Exit 139 (SEGV)]

I looked at libc/stdio-common/vfprintf.c's process_string_arg macro, and spotted this:

  len = prec != -1 ? (size_t) prec : strlen (mbs);                \
  if (__libc_use_alloca (len * sizeof (wchar_t)))                 \
    string = (CHAR_T *) alloca (len * sizeof (wchar_t));          \

I'm not sure it's related -- I haven't used a debugger or rebuilt -- but in that test, "len * sizeof (wchar_t)" can overflow, which leads to allocating far less space than is eventually used -> buffer overrun.

For the record, this started with a report filed against coreutils' printf command: <http://bugs.debian.org/421555>.
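A minimal illustration of that arithmetic, assuming a 32-bit size_t and glibc's 4-byte wchar_t (this is only a sketch of the suspected wraparound, not code from the glibc sources):

  #include <stdio.h>
  #include <wchar.h>

  int main (void)
  {
    /* Hypothetical precision, e.g. from a format like "%1.1073741830s".  */
    size_t len = 1073741830;

    /* With a 32-bit size_t, 1073741830 * 4 = 4294967320 wraps modulo 2^32
       to 24, so __libc_use_alloca would be asked about 24 bytes while the
       conversion later writes up to ~1 billion wide characters.  With a
       64-bit size_t there is no wraparound for this value.  */
    size_t request = len * sizeof (wchar_t);

    printf ("len = %zu, requested bytes = %zu\n", len, request);
    return 0;
  }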
(Copy of my email sent yesterday at 3:00 AM.)

Hi,

I am writing to the personal addresses of a few libc developers because the bug might be a security vulnerability (I don't know the Linux kernel well enough to judge). If it is not, I can open a bug report on Bugzilla if you prefer.

I found a bug through the dpkg program (from the Debian project):

  COLUMNS=10000000 dpkg -l
  => crash with a segfault (SIGSEGV)

After a long investigation (around one week :-)), I am certain that the bug comes from the GNU libc. The crash is not specific to this program; any program that lets the user control a printf() format string may crash.

Smallest C test case:
-------------------------------------------------------------
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>

int main()
{
    setlocale (LC_CTYPE, "");
    printf("%-1.30500200s\n", "Hello");
    return 0;
}
-------------------------------------------------------------
If your locale is not UTF-8, specify another multibyte locale to setlocale(). The value "30500200" just has to be bigger than the current stack size limit.

You can also try with the bash/coreutils printf:
-------------------------------------------------------------
printf '%-1.25000000s' 'Hello'
-------------------------------------------------------------

The bug is located in stdio-common/vfprintf.c, macro "process_string_arg", in this block:
-------------------------------------------------------------
	if (prec != -1)
	  {
	    /* Search for the end of the string, but don't search past
	       the length (in bytes) specified by the precision.  Also
	       don't use incomplete characters.  */
	    if (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MB_CUR_MAX) == 1)
	      len = __strnlen (string, prec);
	    else
	      {
		/* In case we have a multibyte character set the
		   situation is more compilcated.  We must not copy
		   bytes at the end which form an incomplete character. */
		wchar_t ignore[prec];
		const char *str2 = string;
		mbstate_t ps;

		memset (&ps, '\0', sizeof (ps));
		if (__mbsnrtowcs (ignore, &str2, prec, prec, &ps)
		    == (size_t) -1)
		  {
		    done = -1;
		    goto all_done;
		  }
		if (str2 == NULL)
		  len = strlen (string);
		else
		  len = str2 - string - (ps.__count & 7);
	      }
	  }
	else
	  len = strlen (string);
-------------------------------------------------------------
If prec > 1 and LC_CTYPE[_NL_CTYPE_MB_CUR_MAX] > 1, we go into the "complicated" block :-) Now imagine that prec is equal to 30500200: about 120 MB (30500200 elements of 4-byte wchar_t) are "allocated" on the stack by "wchar_t ignore[prec]", whereas Linux limits the stack to 8 MB in its default configuration. The stack should grow downward on demand, but on my computer (i386) gcc just uses a "sub %eax, %esp" instruction to reserve the memory, so the first write far past the guard page hits unmapped memory and Linux raises SIGSEGV.

I don't know the locale API (the mbsnrtowcs() function) well enough to fix the bug.

Victor Stinner
http://www.inl.fr/
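Not the actual upstream fix (I have not looked at it), but one way to compute the clipped length without a prec-sized stack array is to walk the string with mbrlen() and a constant-size mbstate_t; the helper name mb_clip_len below is made up for this sketch:
-------------------------------------------------------------
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Return the length in bytes of STRING, clipped to at most PREC bytes,
   without counting a trailing incomplete multibyte character.  Uses only
   constant stack space instead of a prec-sized VLA.  */
static size_t
mb_clip_len (const char *string, size_t prec)
{
  mbstate_t ps;
  size_t len = 0;

  memset (&ps, '\0', sizeof (ps));
  while (len < prec)
    {
      /* Length in bytes of the next character, looking at no more than
	 the remaining precision.  */
      size_t n = mbrlen (string + len, prec - len, &ps);
      if (n == (size_t) -1)	/* invalid multibyte sequence */
	return (size_t) -1;
      if (n == (size_t) -2)	/* incomplete character at the end */
	break;
      if (n == 0)		/* terminating NUL byte */
	break;
      len += n;
    }
  return len;
}

int main (void)
{
  setlocale (LC_ALL, "");
  /* 30500200 is far larger than the string, but no large buffer is
     allocated; the loop simply stops at the NUL byte.  */
  size_t len = mb_clip_len ("Hello", 30500200);
  printf ("%zu\n", len);	/* prints 5 */
  return 0;
}
-------------------------------------------------------------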
Fixed upstream.
Fixed in glibc-2.5.90-22.