Bug 100970 - bash: grep: command not found
Summary: bash: grep: command not found
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux Beta
Classification: Retired
Component: bash
Version: beta1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tim Waugh
QA Contact: Ben Levenson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-07-28 08:26 UTC by Nicolas Mailhot
Modified: 2007-04-18 16:56 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-03-11 13:21:28 UTC
Embargoed:



Description Nicolas Mailhot 2003-07-28 08:26:42 UTC
Description of problem:

I'm used to putting spaces around pipes to improve readability. This has been
broken in terminals lately though. A command like :

[nim@ulysse nim]$ ps auwx | grep tomcat

will return

bash:  grep: command not found

The problem is linked to the key sequence I use to type pipe-space with my
keyboard layout : pipe is altgr+6 and thus space is often altgr+space which is
defined as nobreakspace.

Anyway there is no reason for a shell to distinguish between space and
nobreakspace (unlike a text editor) so bash should be taught to treat all spaces
the same way.

How reproducible:

Always

Steps to Reproduce:
1. set fr-latin9 as default keyboard layout
2. type a command involving pipe-space, keeping altgr pressed when space is pressed
3. watch bash complain
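
The steps above can be approximated without a fr-latin9 keyboard. This is only a sketch: it assumes the no-break space arrives as the UTF-8 bytes 0xC2 0xA0, and it forces LC_ALL=C so the run is reproducible (other locales or bash versions may behave differently).

```shell
# U+00A0 NO-BREAK SPACE encoded as UTF-8 (bytes 0xC2 0xA0).
nbsp=$(printf '\302\240')

# The no-break space sits directly after the pipe, as in the report.
# LC_ALL=C is an assumption made here to keep the run reproducible.
status=0
output=$(LC_ALL=C bash -c "echo tomcat |${nbsp}grep tomcat" 2>&1) || status=$?

printf 'exit status: %s\n' "$status"
printf '%s\n' "$output"
```

The no-break space becomes part of the command name, so bash looks for a command whose name starts with those bytes and reports it as not found.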

Additional info:

bash-2.05b-25, probably always been broken

Comment 1 Tim Waugh 2003-07-28 08:32:26 UTC
See the bash man page, in particular the section about the IFS variable.

Comment 2 Nicolas Mailhot 2003-07-28 08:37:44 UTC
So what ?
Is there a good reason nobreakspace should not be added to the default IFS value ?

I really do not want to tweak IFS on dozens of systems unless there is a very
good reason not to add nobreakspace to IFS by default

Comment 3 Tim Waugh 2003-07-28 08:38:54 UTC
Yes: it can be a filename character.

Comment 4 Nicolas Mailhot 2003-07-28 08:49:38 UTC
I'd agree with the thought. Except I've yet to find any app that uses
nobreakspace in file names and users certainly do not.

I can't remember ever not having to escape/quote file names with spaces, because
everyone uses normal space in them.

So I really don't see it as an IFS showstopper. When it's so easy to insert
nobreakspace after a pipe on some layouts, it's a shame it's not in IFS. At
least in text editors it makes sense, because one *sees* the nobreakspace.

Comment 5 Tim Waugh 2003-07-28 09:40:07 UTC
The initial value of IFS is mandated by POSIX.

Sounds like your problem is with the font used in whichever terminal you have
not displaying the non-breaking character correctly as a rotated '[' or whatever.

Comment 6 Nicolas Mailhot 2003-07-28 10:05:54 UTC
Seeing nobreakspaces in shell might help. But is it really possible ? It can't
be done by changing fonts - people will use whatever font they like (vera,
corefonts...) and showing them is app-dependent - do we want to see
nobreakspaces in html pages for example ?

I don't suppose whoever wrote the POSIX entry thought about nobreakspace. Is it
even available without UTF8/latin9 ? I'd hate to be stuck forever because no one
really thought of it and no one wants to discuss what was written long ago.

Comment 7 Than Ngo 2003-07-28 11:17:49 UTC
I'm not sure if it's a bug in urw-fonts. What is correct to display for
altgr+space? I have tried xterm with fixed font, it also has this problem!

Owen: any comment about this issue?


Comment 8 Owen Taylor 2003-07-28 15:01:17 UTC
No-break-space should show as a space; if it displayed anything else,
it wouldn't work for text files. I don't think there is anything
to do here except to change the AltGR+space keyboard mapping
not to produce a no-break-space. (This doesn't seem like a very
useful thing to me as compared to the obvious problem here.)

Reassigning to XFree86 - Mike Harris will probably want a bug filed
in bugs.xfree86.org

Comment 9 Nicolas Mailhot 2003-07-28 15:14:19 UTC
nobreakspace is *very* useful in locales where typographic rules require spaces
before punctuation marks like : (ie in french for example like here). Without it
one can not do any serious editing because most/all the text apps out there will
consider it very smart to break the line before an ending : or » (really, even MS word had
to add a nobreakspace access because despite all the automated workarounds in
its code people still have to handle it manually).

The problem here is bash follows posix and posix clearly didn't think about
nobreakspace. Can't we add the working behaviour to bash, at least in its
non-strict posix mode ? Or ask whatever org handles Posix or LSB their thoughts
on this ?

All the various workarounds proposed so far involved killing one class of app or
another, when the core problem is really in bash defaults.

Comment 10 Tim Waugh 2003-07-28 16:04:03 UTC
Think about what you're asking:

That a non-breaking space character, i.e. one that should *not* be used to
delineate separate words, be used as a marker for splitting words from each other.

It's entirely clear to me that it would be the wrong thing to do, and that bash
is doing exactly the right thing in this case.

Comment 11 Nicolas Mailhot 2003-07-28 16:28:48 UTC
Well I'd agree with you *if* someone somewhere was taking advantage of this
*and* the shell provided some sort of visual hint a non-breaking space is used.

Right now both conditions are false : no app I know of uses it the way you
suggest, and when it's used otherwise (ie as a separator, which does happen) the
shell will issue a very confusing error message (really what would *you* do if
your shell told you bash:  grep: command not found or you found it in some
server logs - cron jobs and such do use pipes)

People do not think of non-breaking spaces in shells. Hell even if an app
somewhere decided to get smart and replace spaces with non-breaking spaces in
filenames it'd be shot down by the number of disgruntled users that could no
longer figure out how to type the file names in their shells (because a space and a
non-breaking space look the same in a term unlike a text processor).

Really from a usability POV, a shell should not differentiate between
space/non-breaking space. Or if it does provide some kind of bloody visual hint.

But the whole default as mandated by posix shows its writers didn't want to get
into makefile-style whitespace mess - tabs are treated exactly like spaces, as
non-breaking spaces should.

Comment 12 Owen Taylor 2003-07-28 16:37:37 UTC
Actually, looking through the POSIX shell grammar, it implies that the
delimitation of tokens should be done based on isspace()/LC_CTYPE.

And in non-C locales, isspace(U+00A0) should be true; so perhaps
in theory this *is* a bash bug. In practice though, I suspect the chance
of things changing here is minuscule. It's certainly something where
we would be very hesitant to deviate from upstream.

(Note that the default IFS only contains U+0020, so U+00A0
won't be interchangeable with space in all circumstances, just when
parsing input into tokens.)

So, if the nobreakspace key combination is desired, I suspect that the
correct resolution is to move it back to bash and WONTFIX. Though 
you could always try to get the change upstream.


Comment 13 Tim Waugh 2003-07-28 16:42:46 UTC
Single UNIX, and POSIX last time I read it, says IFS starts as: the space
character, the tab character, and the newline character.

No bash bug here.  Even if you thought that changing bash would make things
work: what about all the other shells?

Comment 14 Miloslav Trmac 2003-07-28 17:49:10 UTC
FWIW, shells are not used only by users. It is perfectly OK to generate a string
of random non-(0, slash, space, tab, newline) characters and use them in a shell
script for a file name. Such script would break by changing IFS if all uses of
the variable were not quoted ("$key").
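
A minimal sketch of the breakage described above, under stated assumptions: the file name is a made-up example, the no-break space is the single byte 0xA0 (its ISO-8859-15 encoding), the modified IFS is a hypothetical default rather than anything bash ships, and LC_ALL=C is forced so the byte is handled as one character.

```shell
# Assumption: 0xA0 is the single-byte ISO-8859-15 (latin9) no-break space.
nbsp=$(printf '\240')

script='
  # Hypothetical generated file name containing a no-break space.
  key="report${NBSP}2003"
  # Hypothetical default IFS: space, tab, newline plus no-break space.
  IFS=$(printf " \t\n%s" "$NBSP")
  set -- $key;   unquoted=$#   # unquoted expansion is field-split on IFS
  set -- "$key"; quoted=$#     # quoted expansion stays a single word
  echo "$unquoted $quoted"
'
result=$(LC_ALL=C NBSP="$nbsp" bash -c "$script")

printf 'fields (unquoted quoted): %s\n' "$result"
```

With no-break space in IFS the unquoted expansion is cut into two words, while the quoted "$key" still names one file.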

The original problem can be worked around by
alias ' grep'=grep
(where the second space is non-breaking), if there are only a few commands
often used after | (which is true in my case).

Comment 15 Owen Taylor 2003-07-28 18:09:18 UTC
*  Note that tokenization *does not* use IFS.

  Try IFS="@ " shell -c "ls@|@cat"

 It won't work. If you read the appropriate section of POSIX,
 it strongly implies that LC_CTYPE does affect tokenization.

 I'm not saying that bash's behavior should change - it might
 be a noticeable speed hit, it would involve changing lots of
 code, and it might even be a security concern; but, I do think
 that bash's current behavior doesn't comply with the letter
 of POSIX.

* Any time you use filenames in a shell script,
  you should generally quote them so it works if you have (normal)
  spaces in them. If your shell script doesn't handle filenames
  with spaces, it doesn't really matter if it also doesn't handle
  filenames with non-breaking spaces in them.
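
The first point can be checked with a small sketch. It sets IFS inside the script being run rather than passing it through the environment, so that the value is certain to be in effect; the command name echo@hello is just an illustration and is assumed not to exist on the system.

```shell
# 1. Tokenization: even with "@" in IFS, "echo@hello" is read as one word
#    and looked up as a command name, which is expected to fail with 127.
tokenize_status=0
bash -c 'IFS="@ "; echo@hello' >/dev/null 2>&1 || tokenize_status=$?

# 2. Field splitting: the same IFS does split an unquoted expansion.
field_count=$(bash -c 'IFS="@ "; v="a@b@c"; set -- $v; echo $#')

printf 'tokenization exit status: %s\n' "$tokenize_status"
printf 'fields after splitting:   %s\n' "$field_count"
```

IFS only takes part in splitting the results of expansions (and in read), not in cutting the command line itself into tokens.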


Comment 16 Nicolas Mailhot 2003-07-28 19:03:08 UTC
Btw the problem with the aliasing is it breaks as soon as one tries to use a new
command or type two nobreakspaces. So it's a bit helpful, but not much.

I'll post the bug url to gnu.bash.bug so hopefully bash maintainers can chime in.

Comment 17 Brian J. Fox 2003-07-28 20:05:38 UTC
Add non-breaking space to the .inputrc.

As has been pointed out, this is an interactive user-mistyping type of bug, 
not a bug in semantics or behavior.

It should easily be fixable by having non-breaking space produce standard 
space from within readline.
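
A possible shape for such a binding, as a sketch only: it assumes the terminal sends the no-break space as the UTF-8 bytes 0xC2 0xA0 (octal 302 240) and that readline matches that byte sequence before inserting it. The variable name is illustrative; the quoted string itself is in inputrc key-binding syntax and could also be placed on a line of its own in ~/.inputrc.

```shell
# Readline key binding in inputrc syntax: the UTF-8 bytes of U+00A0
# (octal 302 240) are bound to a macro that inserts a plain space.
nbsp_binding='"\302\240": " "'

# bind only makes sense when line editing is active (interactive shell).
case $- in
  *i*) bind "$nbsp_binding" ;;
esac
```

On a latin9 terminal the key sequence would presumably be the single byte "\240" instead.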


Comment 18 Miloslav Trmac 2003-07-28 20:13:20 UTC
Another data point: the libc locale definition for LC_CTYPE
does *not* include U+00A0 (non-breaking space).
(From my reading of ISO C, it is not really obvious
whether it belongs there or not).

I have filed a defect report against the POSIX standard regarding the
locale-dependent definition of token recognition.

Brian's solution looks nice, but I wouldn't want it on default
install. It's as if readline were silently rewriting 'a' to 'b'.

Comment 19 Nicolas Mailhot 2003-07-28 20:17:58 UTC
[ about using .inputrc ]

Except for the wonderful world of shell scripts and crontabs.

The original typing might be interactive, but the consequence may also manifest
itself long after on another computer in a non-interactive environment.

(though as workarounds go this is certainly one of the best suggested yet)


Comment 20 Brian J. Fox 2003-07-28 20:31:18 UTC
Shell scripts and crontabs are not typed interactively, they are typed into an 
editor (and, in the specific case presented here, into an editor running under 
X-windows on a graphical system).  I don't think bash should ignore the 
differences between the two characters, and I don't think that editors should 
hide the values of the characters that users type.

I do think that the purpose of interactive features in the shell is to make 
life easier for the end-user -- those who wish to enter non-breaking spaces 
into the command line may do so by prefixing them with C-v.


Comment 21 Nicolas Mailhot 2003-07-28 20:43:41 UTC
Sure. My (admittedly evil) user wish is to have the shell treat all non
quoted/escaped whitespace as a token separator which is what 99,99 % of users
will want I think.

Whitespace differentiation is ok and has lots of usages in editing (where one
can *see* the differing whitespace effects either directly or indirectly) but it
has no place in the shell IMHO. As long as the shell relies on some sort of
filtering *not* to feed it special whitespace some poor user will find a way to
feed it these chars. Because unix is that flexible.

Comment 22 Miloslav Trmac 2003-07-28 20:51:47 UTC
Nicolas, by using other <blank> characters you are actually asking for trouble.
You type in a script, carefully test it and after you are confident that it works,
you distribute it/sell it/whatever. The poor recipient of your script will find
out that it does not, in fact, work, on his/her system. In fact, it won't work
on *your* system when run with LC_CTYPE=C. (e.g. editing /etc/rc.d/rc.sysinit).
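
One way to catch this before a script leaves your machine is to search it for the no-break space bytes. The sketch below assumes UTF-8 encoding (0xC2 0xA0) and uses a throwaway file with made-up contents in place of a real script.

```shell
# U+00A0 NO-BREAK SPACE encoded as UTF-8 (bytes 0xC2 0xA0).
nbsp=$(printf '\302\240')

# Throwaway stand-in for a real script: line 1 has a no-break space
# after the pipe, line 2 uses an ordinary space.
script=$(mktemp)
printf 'ps auwx |%sgrep tomcat\n' "$nbsp" > "$script"
printf 'ps auwx | grep tomcat\n' >> "$script"

# Report the line numbers that contain the no-break space bytes.
matches=$(LC_ALL=C grep -n -F -- "$nbsp" "$script" | cut -d: -f1)
rm -f "$script"

printf 'lines with no-break space: %s\n' "$matches"
```

For a latin9 file the pattern would be the single byte 0xA0 instead.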

Comment 23 Nicolas Mailhot 2003-07-28 21:30:11 UTC
Hey, I didn't say I actively *wanted* to do this. Don't get me wrong. Just that
this will happen in reality and the shell bloody well should not misbehave when
it encounters unusual whitespace.

[ and even if what you described happened to me why should I feel ashamed ? I
know at least one $wellknown $bigbucks product whose utilities crashed on any
non en_US locale. Of course this was not documented anywhere, one had to find a
fellow non-american to learn the workaround ]

Fact of life : what's "simple" for users is not always so for apps. Apps usually
have to adapt, because users won't ever (the user is evil and dumb - unless you
are the user:)

Here the simple user-understandable rule is whitespace acts as shell separator
unless quoted/escaped.

We all know unicode locale is the near-future, and non-break-space is generic
enough to affect most locales. Or do we want to sprinkle every other script line
with LANG=C like I've seen RedHat doing lately for example ?

Comment 24 Mike A. Harris 2003-08-01 09:11:10 UTC
This is not a bug in XFree86 (or anything else).  It's just simply a user
doing something wrong, and getting something unexpected.  The proper thing
to do is don't do that.

Closing bug as NOTABUG.

Comment 25 Nicolas Mailhot 2003-08-01 09:26:40 UTC
Did we have the Posix defect report answer ?
The general opinion seemed to be the specs are not crystal-clear on this point

Comment 26 Owen Taylor 2003-08-01 13:31:57 UTC
Not sure it should stay open (as mentioned above, it's not something
where we would want to deviate from upstream bash), but definitely not 
an XFree86 issue.


Comment 27 Miloslav Trmac 2003-08-01 19:18:44 UTC
XCU ERN 25 has been filed (see
http://www.opengroup.org/austin/aardvark/finaltext/xcubug.txt),
but expect that to take a few weeks.

I have no idea whether it will be accepted; the standard is quite clear
that LC_CTYPE should be honored. I hope we will get some sort of rationale
at least.

Comment 28 Miloslav Trmac 2003-08-08 12:04:46 UTC
Resolution of the POSIX defect report (XCU ERN 25, to separate tokens
only by <space> or <tab> instead of <blank>):

> The standard clearly states these requirements, and conforming implementations
> must conform to this.
>
> The group feels there is no defect here.

All I can say is I'm fine with current bash behavior, POSIX or not.

Comment 29 Nicolas Mailhot 2004-03-11 13:21:28 UTC
I won't say I agree with the POSIX guys, but it's their call -> closing

