Is this some weird UTF-8 thing or am I being stupid? Stock FC3 grep:

$ rpm -q grep
grep-2.5.1-31.i386
$ cat out
         U CRYPTO_add_lock
         w __cxa_finalize
         U d2i_PrivateKey
         U d2i_PrivateKey_bio
$ grep ' [A-TV-Z] ' out

Actual behaviour:

$ grep ' [A-TV-Z] ' out
         w __cxa_finalize
$

Expected behaviour, as per LANG=C:

$ LANG=C grep ' [A-TV-Z] ' out
$
It's not UTF-8 but locale collation order: in many non-C locales the letters sort as aAbBcC..., so a range like V-Z also takes in lowercase w. Try LANG=en_GB, for example. For the particular case you're after I think it's most portably described by ' [ABCDEFGHIJKLMNOPQRSTVWXYZ] ', believe it or not. See 'man grep', "Regular Expressions", paragraph 5.
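
For what it's worth, here is a rough sketch of the two workarounds side by side: forcing the C locale for the range, and spelling the letters out so that no range is involved at all. The temporary file stands in for the "out" file from the question, and the "T main" line is a hypothetical addition of mine so that there is something that ought to match; the leading whitespace is assumed to look like nm output.

```shell
# Recreate a sample file; leading spaces mimic nm output (an assumption).
# The "T main" line is hypothetical, added so that one line should match.
tmp=$(mktemp)
printf '%s\n' \
  '         U CRYPTO_add_lock' \
  '         w __cxa_finalize' \
  '         U d2i_PrivateKey' \
  '00000000 T main' \
  > "$tmp"

# Range evaluated in the C locale: plain byte order, so A-T and V-Z
# cover uppercase letters only.
c_range=$(LC_ALL=C grep ' [A-TV-Z] ' "$tmp")

# Explicit list: there is no range, so the locale's collation order
# has nothing to reorder.
explicit=$(grep ' [ABCDEFGHIJKLMNOPQRSTVWXYZ] ' "$tmp")

printf 'C locale range: %s\n' "$c_range"
printf 'explicit list:  %s\n' "$explicit"
rm -f "$tmp"
```

Both commands should report only the "T main" line. LC_ALL is used rather than LANG because it overrides any LC_COLLATE or LC_ALL that might already be set in the environment.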