Red Hat Bugzilla – Bug 159098
's' command doesn't work with '|' delimiter and '\|' in expression
Last modified: 2007-11-30 17:07:18 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050516 Firefox/1.0.4
Description of problem:
I'm not sure whether this is a bug in sed or just poorly documented behaviour; the info page for "The `s' Command" simply says: "The `/' characters may be uniformly replaced by any other single character within any given `s' command."
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Run the following command on a machine with an smp or hugemem kernel loaded:
uname -r | sed 's|smp\|enterprise\|bigmem\|hugemem||g'
Actual Results: 2.4.21-27.5smp
Expected Results: 2.4.21-27.5
Changing the delimiter to another character produces the expected result, for example:
uname -r | sed 's_smp\|enterprise\|bigmem\|hugemem__g'
Sed also works as expected if the alternatives are removed from the expression, as in:
uname -r | sed 's|smp||g'
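The three commands above can be tried without an smp kernel by feeding sed a fixed string in place of the uname -r output. This is only a minimal sketch, assuming GNU sed ('\|' as alternation in a basic regular expression is a GNU extension); the sample string is the one from the Actual Results line, and the variable names are illustrative.

```shell
#!/bin/sh
# Fixed release string standing in for the output of `uname -r`.
release='2.4.21-27.5smp'

# '|' as delimiter: each '\|' is read as a literal '|', so nothing matches.
pipe_delim=$(printf '%s\n' "$release" | sed 's|smp\|enterprise\|bigmem\|hugemem||g')

# '_' as delimiter: '\|' keeps its GNU BRE meaning of alternation.
underscore_delim=$(printf '%s\n' "$release" | sed 's_smp\|enterprise\|bigmem\|hugemem__g')

# '|' as delimiter, but no alternation in the expression.
no_alternation=$(printf '%s\n' "$release" | sed 's|smp||g')

printf 'pipe delimiter:       %s\n' "$pipe_delim"
printf 'underscore delimiter: %s\n' "$underscore_delim"
printf 'no alternation:       %s\n' "$no_alternation"
```

Only the first command should leave the "smp" suffix in place.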
If '(' or ')' is used both as the delimiter and in the expression, the expression works provided the delimiter character is escaped wherever it appears in the expression, which means the metacharacter forms can't be used. (Not that I would ever do that in real code, as it looks very confusing.)
If '.', '*', '[', or ']' is used both as the delimiter and in the expression, the expression works provided that character is escaped in the expression, which means it can't be used as a literal (in which case you wouldn't want to use it as a delimiter anyway).
The sed behaviour is right.
When | is used as the delimiter character, a | in the BRE must be escaped.
Therefore \| in the BRE is a literal | character, and there is no way to express
BRE alternation (\|). With sed -r (where EREs are used instead of BREs),
if | is used as the delimiter character, \| in the ERE is ERE alternation, and
there is no way to express a literal | character in the ERE.
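The BRE half of this rule can be checked directly. The following is a minimal sketch, assuming GNU sed: with '|' as the delimiter, '\|' matches only a literal '|', while a delimiter that does not occur in the expression leaves alternation available. The sample inputs and variable names are made up for illustration, and -E is used here as the synonym GNU sed accepts for the -r option mentioned above.

```shell
#!/bin/sh
# BRE with '|' as delimiter: '\|' is a literal '|', so only the text "a|b" matches.
literal_hit=$(printf '%s\n' 'a|b' | sed 's|a\|b|X|')
literal_miss=$(printf '%s\n' 'a' | sed 's|a\|b|X|')

# ERE with a delimiter that does not occur in the expression:
# a bare '|' is alternation and needs no escaping.
ere_alt=$(printf '%s\n' '2.4.21-27.5hugemem' | sed -E 's,smp|enterprise|bigmem|hugemem,,g')

printf '%s\n' "$literal_hit" "$literal_miss" "$ere_alt"
```

Choosing a delimiter such as ',' or '_' that cannot occur in the expression avoids the ambiguity altogether.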
Perhaps man 1p sed is more descriptive about this; it says:
Any backslash used to alter the default meaning of a subsequent character shall
be discarded from the BRE or the replacement before evaluating the BRE or using