Bug 326151 - Regular expression that matches start of paragraph and removes content gets compared against remaining content again
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: openoffice.org
Version: 7
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Caolan McNamara
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-10-10 11:59 UTC by Jan Pazdziora
Modified: 2007-11-30 22:12 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-16 08:37:29 UTC
Type: ---
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenOffice.org 82473 0 None None None Never

Description Jan Pazdziora 2007-10-10 11:59:59 UTC
Description of problem:

If you have the string "ABC DEF GHI" and want to remove the first word, press
Ctrl+F, search for "^[^ ]+", leave the "Replace with" field empty, and check
"Regular expressions" under More Options.

The result of "Replace All" is " DEF GHI", with a space at the beginning of the
string.

If you want to remove the first word _and_ the space that follows it, you use
the regular expression "^[^ ]+ " (note the space character appended to the
expression). However, the result of that search and replace is "GHI". So
appending a space to the regular expression somehow changed the meaning of the
"^[^ ]+" that preceded it.

Version-Release number of selected component (if applicable):

openoffice.org-calc-2.2.1-18.2.fc7.x86_64
openoffice.org-writer-2.2.1-18.2.fc7.x86_64

How reproducible:

Deterministic.

Steps to Reproduce:
1. Create new text document.
2. Write string "ABC DEF GHI".
3. Try to remove the first word including the space after it (ABC ) with regular
expression "^[^ ]+ ".
  
Actual results:

"ABC DEF " gets removed.

Expected results:

Only "ABC " should have been removed.
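
For comparison, here is a minimal sketch of the expected result using Python's
re module as a stand-in for a conventional single-pass regex engine. Python is
only an assumption for illustration; it is not what OpenOffice.org uses
internally.

```python
import re

text = "ABC DEF GHI"

# Without re.MULTILINE, "^" anchors only at the very start of the string,
# so each pattern can match at most once.
first_word = re.sub(r"^[^ ]+", "", text)
first_word_and_space = re.sub(r"^[^ ]+ ", "", text)

print(repr(first_word))            # ' DEF GHI'
print(repr(first_word_and_space))  # 'DEF GHI'
```

Both patterns remove text only at the start of the paragraph, which is the
behaviour this report expects from "Replace All".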

Additional info:

The problem happens in both Writer and Calc.

Comment 1 Caolan McNamara 2007-10-10 12:55:52 UTC
It seems that "Replace All" is re-run after the first replacement. The
remaining string then matches the pattern again at the new start of the
paragraph, so a second replacement takes place.

i.e. for "Replace All", instead of
echo ABCZDEFZGHI | sed -r -e 's/^[^Z]+Z//'
we effectively have
echo ABCZDEFZGHI | sed -r -e 's/^[^Z]+Z//' | sed -r -e 's/^[^Z]+Z//'

Here's a more concise example: a search string of
^.{3}
with an empty replace string, intended to remove the first 3 characters. With
the string above, "Replace All" keeps removing a block of 3 characters until
only 2 are left.
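
The behaviour described in this comment can be modelled with a short Python
sketch. The function replace_all_rescanning is hypothetical and only mimics the
observed symptom (the anchored pattern being retried against whatever text
remains); it is not taken from the OpenOffice.org source.

```python
import re

def replace_all_rescanning(pattern, replacement, text, max_passes=100):
    """Hypothetical model: retry an anchored replace on the remaining text."""
    regex = re.compile(pattern)
    steps = [text]
    for _ in range(max_passes):  # guard against patterns that never settle
        new_text, count = regex.subn(replacement, text, count=1)
        if count == 0 or new_text == text:
            break  # nothing matched, or the replace changed nothing
        text = new_text
        steps.append(text)
    return steps

print(replace_all_rescanning(r"^[^ ]+ ", "", "ABC DEF GHI"))
# ['ABC DEF GHI', 'DEF GHI', 'GHI']
print(replace_all_rescanning(r"^.{3}", "", "ABCZDEFZGHI"))
# ['ABCZDEFZGHI', 'ZDEFZGHI', 'FZGHI', 'HI']
```

Under this model the pattern "^[^ ]+" stops after one pass, because the
remaining text " DEF GHI" starts with a space and no longer matches. That would
explain why only the variant with the trailing space shows the problem.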

Comment 2 Caolan McNamara 2007-10-16 08:37:29 UTC
Moving upstream, affects us all: i.e.
http://www.openoffice.org/issues/show_bug.cgi?id=82473

