78648 – en_GB.UTF-8 support missing from /usr/share/locale - breaks grep and others

Summary: en_GB.UTF-8 support missing from /usr/share/locale - breaks grep and others

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	grep
Sub Component:
Version:	8.0
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Tim Waugh
QA Contact:	Mike McLean
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-11-26 23:19 UTC by Richard Lloyd
Modified:	2007-04-18 16:48 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-11-26 23:19:37 UTC
Embargoed:

Description Richard Lloyd 2002-11-26 23:19:31 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827

Description of problem:
If I install Red Hat Linux 8.0 and select British English as my
language in Anaconda, the environment variable LANG gets set
to en_GB.UTF-8 (in 7.2 it was set to just en_GB). However, the
directory /usr/share/locale only contains an en_GB directory and
no en_GB.UTF-8 directory, which breaks grep (and no doubt other
programs).

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install RH 8.0, selecting British English as a language when
   in Anaconda.
2. Run this command (grep is undoubtedly not the only broken app,
   but it's the easiest to demo):
   echo "hello" | grep "[A-Z]"

Actual Results:  The command outputs the string "hello".

Expected Results:  It should NOT output anything (upper case A-Z are not
present in the lower case string "hello"). I can imagine this would
break a heck of a lot of scripts on the system...

Additional info:

British English environments set LANG to en_GB.UTF-8 in RH 8.0 -
if you set it to en_GB instead, the grep example works correctly
(and this is the easiest workaround IMHO for this bug - although
you could soft-link en_GB.UTF-8 to en_GB in /usr/share/locale I
guess, but that feels less "correct").
Basically, Red Hat 8.0 does not install an appropriate Unicode
directory for British English, namely /usr/share/locale/en_GB.UTF-8.

One additional point here - Mandrake 9.0 *does* have that Unicode
directory for British English, so it appears that RH 8.0 is indeed
faulty here.

BTW, I think some of Perl's XML modules in RH 8.0 also suffer
in a British English Unicode environment, but I don't have the
full details to hand. Oh, and apologies for filing this under the grep
component, but I couldn't see an obvious "internationalisation"
component (this is a cross-app problem).

Comment 1 Tim Waugh 2002-11-27 09:52:04 UTC

You need to use 'LANG=C' if you want POSIX locale character range matching.
The behaviour you see is correct for the en_GB locale.
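
A minimal sketch of that advice, assuming a POSIX shell and a grep that honours the locale environment variables. It uses LC_ALL=C rather than LANG=C because LC_ALL overrides LANG and every LC_* category; the variable names are illustrative only. The [[:upper:]] character class is an additional alternative not mentioned above: it asks for uppercase letters directly instead of relying on a range, which in some locales follows collation order and can take in lowercase letters.

```shell
# Force the POSIX (C) locale for this one grep invocation, so that the
# range [A-Z] covers only the ASCII uppercase letters.
if echo "hello" | LC_ALL=C grep -q "[A-Z]"; then
    c_result="match"
else
    c_result="no match"
fi
echo "LC_ALL=C with [A-Z]: $c_result"

# Alternative that does not depend on range semantics: the POSIX
# character class [[:upper:]] matches only uppercase letters.
if echo "hello" | grep -q "[[:upper:]]"; then
    class_result="match"
else
    class_result="no match"
fi
echo "[[:upper:]]: $class_result"
```

Both checks should report "no match" for the all-lowercase input. Scripts that depend on ASCII-only ranges can set LC_ALL=C per command, as here, rather than changing the login locale.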
