Bug 326151 - Regular expression that matches start of paragraph and removes content gets compared against remaining content again
Summary: Regular expression that matches start of paragraph and removes content gets co...
Alias: None
Product: Fedora
Classification: Fedora
Component: openoffice.org
Version: 7
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Caolan McNamara
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2007-10-10 11:59 UTC by Jan Pazdziora
Modified: 2007-11-30 22:12 UTC (History)
0 users

Clone Of:
Last Closed: 2007-10-16 08:37:29 UTC


External Trackers
Tracker ID Priority Status Summary Last Updated
OpenOffice.org 82473 None None None Never

Description Jan Pazdziora 2007-10-10 11:59:59 UTC
Description of problem:

If you have the string "ABC DEF GHI" and want to remove the first word, you
would press Ctrl+F, search for "^[^ ]+", leave the "Replace with" field empty,
and check "Regular expressions" under More Options.

The result of "Replace All" is " DEF GHI", with a space at the beginning of the
string.

If you would like to remove the first word _and_ the leading space, you'll use
regular expression "^[^ ]+ " -- note that here we've appended a space character
to the regular expression. However, the result of such search / replace is
"GHI". So appending a space to a regular expression somehow changed the meaning
of the "^[^ ]+" that preceded it.
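
For comparison, the expected behaviour can be sketched outside OpenOffice.org
with Python's re module, which applies a "^"-anchored pattern only once per
string in its default mode. This is only an illustration of single-pass
semantics, not OpenOffice.org's own regex engine; the helper name remove_once
is made up for this example.

```python
import re


def remove_once(pattern: str, text: str) -> str:
    """Delete whatever the pattern matches in a single pass over text."""
    return re.sub(pattern, "", text)


text = "ABC DEF GHI"

# Without the trailing space in the pattern, only the first word goes away.
first_word = remove_once(r"^[^ ]+", text)

# With the trailing space, the first word and the space after it go away.
first_word_and_space = remove_once(r"^[^ ]+ ", text)

print(repr(first_word))            # ' DEF GHI'
print(repr(first_word_and_space))  # 'DEF GHI'
```

In a single pass the second pattern leaves "DEF GHI", which is the result
expected from "Replace All" here.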

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create new text document.
2. Write string "ABC DEF GHI".
3. Try to remove the first word including the space after it (ABC ) with regular
expression "^[^ ]+ ".

Actual results:

"ABC DEF " gets removed.

Expected results:

Only "ABC " should have been removed.

Additional info:

The problem happens in both Writer and Calc.

Comment 1 Caolan McNamara 2007-10-10 12:55:52 UTC
It seems that "Replace All" is re-run after the first replacement. The
remaining string matches the pattern again, so a second replacement takes
place.

i.e. for replace all, instead of 
echo ABCZDEFZGHI | sed -r -e 's/^[^Z]+Z//'
we have
echo ABCZDEFZGHI | sed -r -e 's/^[^Z]+Z//' | sed -r -e 's/^[^Z]+Z//'

Here's a more concise example, i.e. search string of 
and no replace string, to remove the first 3 characters. With the above example,
"Replace All" keeps removing a block of 3 characters until only 2 are left.
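
The symptom described in this comment can be modelled as a loop that re-runs
the substitution on its own output until the pattern stops matching. The Python
sketch below is a model of the observed behaviour only, not the actual
OpenOffice.org code path. The name replace_all_rerun is hypothetical, and
"^..." is an assumed stand-in for the three-character search string, which is
not preserved in this report.

```python
import re


def replace_all_rerun(pattern: str, text: str) -> str:
    """Model of the bug: delete the first match, then retry on the remainder."""
    regex = re.compile(pattern)
    while True:
        new_text = regex.sub("", text, count=1)
        if new_text == text:
            # Nothing was removed, so the pattern no longer matches.
            return text
        text = new_text


# Same input as the report, with the pattern that includes the trailing space.
buggy_word_result = replace_all_rerun(r"^[^ ]+ ", "ABC DEF GHI")

# Assumed three-character pattern: 11 characters shrink to 8, 5, then 2.
buggy_chars_result = replace_all_rerun(r"^...", "ABC DEF GHI")

print(repr(buggy_word_result))   # 'GHI'
print(repr(buggy_chars_result))  # 'HI'
```

The same loop leaves " DEF GHI" for the pattern "^[^ ]+", because the
remainder then starts with a space and no longer matches. That agrees with the
first case in the description, where the result looked correct.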

Comment 2 Caolan McNamara 2007-10-16 08:37:29 UTC
Moving upstream, affects us all: i.e.
