If $LANG contains "utf8", then [^\w] doesn't work right:

    setenv LANG en_US
    echo -n "foo.bar" | \
      perl -e '$_ = <>; print join (" | ", split (/([^\w]+)/)) . "\n";'
    ===> "foo | . | bar" (right)

    setenv LANG en_US.utf8
    echo -n "foo.bar" | \
      perl -e '$_ = <>; print join (" | ", split (/([^\w]+)/)) . "\n";'
    ===> "foo.bar" (wrong!)

It works fine in both cases if you do $_ = "foo.bar" instead of reading the text from stdin.

    This is perl, v5.8.0 built for i386-linux-thread-multi
    (with 1 registered patch, see perl -V for more detail)

    perl-5.8.0-88
    Red Hat Linux release 9 (Shrike)
    Linux 2.4.20-8smp #1 SMP Thu Mar 13 16:43:01 EST 2003 i686 athlon i386

Maybe this is a dup of 102106, I can't tell.

Very sorry for the long delay in processing this bug report. This bug is no longer a problem with the perl in any current Red Hat OS release.