Bug 137832
| Summary: | awk aborts under certain locales | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Paul Clements <paul.clements> |
| Component: | gawk | Assignee: | Karel Zak <kzak> |
| Status: | CLOSED RAWHIDE | QA Contact: | Brock Organ <borgan> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.0 | CC: | kzak |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2004-11-17 17:06:20 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 135876 | | |
| Attachments: | | | |

The problem is in regcomp.c, in the function build_charclass(). It defines the macro BUILD_CHARCLASS_LOOP(), which translates characters through casetable[] (defined in eval.c). But this table can also contain negative numbers. If a negative number is passed to bitset_set(), which is called inside BUILD_CHARCLASS_LOOP(), awk crashes. The contributed patch resolves this problem, but I am unsure whether using unsigned char is the best way, because casetable[] may expect negative chars. A better way might be to rewrite bitset_set(sbcset...), but I think a lot of things depend on it. All tests pass with the patch, but the tests run with LANG=C only...

Created attachment 106074
fix?
Note: instead of the strange casetable[] in eval.c, maybe build the translation
table (in re.c: make_regexp()) the same way sed does:
    for (i = 0; i < sizeof(translate) / sizeof(char); i++)
        translate[i] = tolower (i);
    new_regex->translate = translate;
To me this looks better than the static definition in awk...
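
For illustration only, here is a minimal standalone sketch of the sed-style approach from the note above: the table is filled at run time with tolower() and stored as unsigned char, so no entry can be negative. The names build_translate_table() and translate_byte() are hypothetical and are not taken from gawk or sed.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not gawk or sed code): fill a table that maps every
   byte value to its lowercase form in the current locale. The table must
   have exactly one entry per possible byte value. */
static void build_translate_table(unsigned char *table, size_t len)
{
    assert(len == (size_t) UCHAR_MAX + 1);
    for (int i = 0; i <= UCHAR_MAX; i++)
        table[i] = (unsigned char) tolower(i);  /* i is a valid tolower() argument */
}

/* Hypothetical lookup: converting through unsigned char keeps the index in
   0..UCHAR_MAX even on platforms where plain char is signed. */
static unsigned char translate_byte(const unsigned char *table, char c)
{
    return table[(unsigned char) c];
}
```

Because every entry is an unsigned char, a value read from this table can be used directly as an array or bitset index. Whether such a table fits the type gawk's regex code expects (RE_TRANSLATE_TYPE) would have to be checked against the real sources.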
Florian, thanks for the link. The problem is definitely with the unsigned/signed casetable (RE_TRANSLATE_TYPE). Fixed in the devel tree (FC4). Fixed in RHEL-4-HEAD too.
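
To make the signed/unsigned issue concrete, the following is a hedged sketch, not the actual regcomp.c code: a table entry stored in a signed type can be negative for bytes above 127, and using it directly as a bitset index would reach outside the set. All names here (demo_bitset, demo_bitset_set(), demo_bitset_contain(), byte_index()) are made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SBC_MAX      (UCHAR_MAX + 1)                      /* one bit per byte value */
#define WORD_BITS    (sizeof(unsigned long) * CHAR_BIT)
#define BITSET_WORDS ((SBC_MAX + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical bitset type, only loosely modelled on a single-byte charset. */
typedef unsigned long demo_bitset[BITSET_WORDS];

/* Set bit i; the assert documents that the index must lie inside the set. */
static void demo_bitset_set(demo_bitset set, size_t i)
{
    assert(i < SBC_MAX);
    set[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* Return nonzero if bit i is set. */
static int demo_bitset_contain(const demo_bitset set, size_t i)
{
    assert(i < SBC_MAX);
    return (set[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL;
}

/* Convert a possibly negative table entry into an index in 0..UCHAR_MAX.
   Conversion to unsigned char is well defined: -23 becomes 233 when
   UCHAR_MAX is 255. */
static size_t byte_index(signed char entry)
{
    return (unsigned char) entry;
}
```

With a signed entry such as -23, passing the raw value as an index would address memory before the start of the set, while byte_index(-23) selects the bit for byte 233 as intended. Declaring the table itself with an unsigned element type avoids the need for the conversion at each use.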
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040116

Description of problem:
With LANG=en_US, awk gets a fatal internal error.

Version-Release number of selected component (if applicable):
gawk-3.1.3-9

How reproducible:
Always

Steps to Reproduce:
# rpm -q redhat-release
redhat-release-3.94AS-1
# rpm -q gawk
gawk-3.1.3-9
# LANG=en_US awk -F "[[:alnum:]]" ' { print } ' < /etc/fstab
awk: fatal error: internal error
Aborted
# LANG=en_US.UTF-8 awk -F "[[:alnum:]]" ' { print } ' < /etc/fstab
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
/dev/sda2 swap swap defaults 0 0

Actual Results:
with LANG=en_US, awk aborts
with LANG=en_US.UTF-8, awk succeeds

Expected Results:
awk should succeed in both cases

Additional info: