From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040116

Description of problem:
With LANG=en_US, awk gets a fatal internal error.

Version-Release number of selected component (if applicable):
gawk-3.1.3-9

How reproducible:
Always

Steps to Reproduce:
# rpm -q redhat-release
redhat-release-3.94AS-1
# rpm -q gawk
gawk-3.1.3-9
# LANG=en_US awk -F "[[:alnum:]]" ' { print } ' < /etc/fstab
awk: fatal error: internal error
Aborted
# LANG=en_US.UTF-8 awk -F "[[:alnum:]]" ' { print } ' < /etc/fstab
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
/dev/sda2 swap swap defaults 0 0

Actual Results:
with LANG=en_US, awk aborts
with LANG=en_US.UTF-8, awk succeeds

Expected Results:
awk should succeed in both cases

Additional info:
The problem is in the function build_charclass() in regcomp.c. That function defines the macro BUILD_CHARCLASS_LOOP(), which translates characters through casetable[] (defined in eval.c). But this table can contain negative numbers too, and passing a negative number to bitset_set(), which is called inside BUILD_CHARCLASS_LOOP(), crashes awk. The attached patch resolves the problem, but I am unsure whether using unsigned char is the best way, because casetable[] may expect negative chars. A better way might be to rewrite bitset_set(sbcset...), but I think a lot of things depend on it. All tests pass with the patch, but the tests run with LANG=C only...
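To illustrate the failure mode described above, here is a minimal sketch, not the actual regcomp.c code. All names (demo_bitset, demo_bitset_set, translate_signed, translate_unsigned, fill_identity_table) are hypothetical stand-ins, and the range check exists only so the bad index can be observed safely; the comment above suggests the real bitset_set() has no such check. It shows how a value read from a signed char table becomes a negative bitset index, and how a cast to unsigned char keeps it in range.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the single-byte character set bitset. */
#define SBC_MAX      (UCHAR_MAX + 1)
#define WORD_BITS    (sizeof(unsigned int) * CHAR_BIT)
#define BITSET_WORDS ((SBC_MAX + WORD_BITS - 1) / WORD_BITS)

typedef unsigned int demo_bitset[BITSET_WORDS];

/* Set bit idx. Returns 0 on success, -1 if idx is outside 0..UCHAR_MAX.
   The check is an assumption added for the demo only. */
static int demo_bitset_set(demo_bitset set, int idx)
{
    if (idx < 0 || idx >= SBC_MAX)
        return -1;
    set[idx / WORD_BITS] |= 1u << (idx % WORD_BITS);
    return 0;
}

/* Fill an identity table whose entries for bytes >= 0x80 are negative,
   as they would be in a table of plain char on a signed-char platform. */
static void fill_identity_table(signed char table[SBC_MAX])
{
    int i;
    for (i = 0; i < SBC_MAX; i++)
        table[i] = (signed char) (i <= SCHAR_MAX ? i : i - SBC_MAX);
}

/* Lookup without a cast: the result can be negative. */
static int translate_signed(const signed char *table, unsigned char c)
{
    return table[c];
}

/* Lookup with a cast: the result is always in 0..UCHAR_MAX. */
static int translate_unsigned(const signed char *table, unsigned char c)
{
    return (unsigned char) table[c];
}
```

With this sketch, looking up byte 0xE9 through translate_signed() yields a negative value that demo_bitset_set() rejects, while translate_unsigned() yields 0xE9 and the bit is set normally.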
Created attachment 106074 [details] fix?
Note: instead of the strange casetable[] in eval.c, maybe build the translation table (in re.c: make_regexp()) the same way sed does:

    for (i = 0; i < sizeof(translate) / sizeof(char); i++)
        translate[i] = tolower (i);
    new_regex->translate = translate;

To me this looks better than the static definition in awk...
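The suggestion above could look roughly like the following standalone sketch. It assumes an unsigned char table so that every entry is a valid non-negative index; the names translate and build_translate_table are illustrative, and the step of attaching the table to the compiled regex (new_regex->translate) is left out because it depends on gawk's own structures.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical translation table, one entry per possible byte value. */
static unsigned char translate[UCHAR_MAX + 1];

/* Fill the table at run time using the current locale's tolower(),
   instead of relying on a statically defined table. */
static void build_translate_table(void)
{
    int i;
    for (i = 0; i <= UCHAR_MAX; i++)
        translate[i] = (unsigned char) tolower(i);
}
```

Because the table is built from tolower() at run time, it follows whatever locale is active when build_translate_table() is called, and the unsigned element type avoids negative entries.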
http://sources.redhat.com/ml/bug-glibc/2003-09/msg00069.html
Florian, thanks for the link. The problem is definitely the signed/unsigned mismatch in casetable (RE_TRANSLATE_TYPE). Fixed in the devel tree (FC4).
Fixed in the RHEL-4-HEAD too.