Bug 3701 - m//g functionality in perl 5.005 changed (broken?) --
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: perl
Version: 6.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Chip Turner
Reported: 1999-06-24 11:11 UTC by ksparger
Modified: 2008-05-01 15:37 UTC (History)
Last Closed: 2003-01-24 18:10:56 UTC

Description ksparger 1999-06-24 11:11:12 UTC
Closest documented change:

/RE/g may match at the same position (with non-zero length)
after a zero-length match (bug fix).

Assume the following scenario:

A file containing:

----------------------------------
;authoritative data for huzzah.dom
@               IN      SOA    backup.test.dom.
root.backup.test.dom
                                (
                                1999062405      ;Serial
Number: YYYYMMDDxx
                                10800   ;Refresh time
                                3600    ;Retry Time
                                604800  ;Time
                                86400   ;Minimum TTL
                                )
                IN      NS      backup.test.dom.
                IN      NS      backup2.test.dom.
                IN      MX      10      mail
                IN      A       10.0.0.254
www             IN      A       10.0.0.254
ftp             IN      A       10.0.0.254
mail            IN      A       10.0.0.254
test    IN      A       10.0.1.100
--------------------

Note that the SOA line and the serial number line are wrapped
here; each is a single line in the actual file.

Assume the following code snippet:

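open(SLURP, 'file');    # 'file' stands in for the zone file shown above
@zone = <SLURP>;
close(SLURP);

$i = 0;
$a = 0;

TOP: while ($zone[$i])
{
    if ($zone[$i] =~ /\(/)
    {
        # Drop everything up to and including the opening parenthesis.
        $zone[$i] =~ s/.*\(//;
        MIDDLE: while ($zone[$i])
        {
            # Collect each run of digits that is not preceded by a ';'.
            while ($zone[$i] =~ /(\d+)/go)
            {
                if ($` !~ /;/)
                {
                    $timeargs[$a] = $1;
                    $a++;
                }
            }
            if ($zone[$i] =~ /\)/)
            {
                last TOP;
            }
            $i++;
        }
    }
    $i++;
}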

-------------

Granted, this isn't 100% the actual code, but hopefully, you
get the idea.  We look for the line with the (, drop
everything preceding it, then search through to the next )
for digits, excluding everything that comes after a
semicolon on a line.

The problem comes in when I attempt the $zone[$i] =~ /(\d+)/g
match.  According to the documentation, it will find a match,
then skip that specific match the next time around, allowing
one to execute a block of code on each match.
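
A minimal sketch of that documented behaviour, assuming a hypothetical sample string rather than the real zone file: in scalar context each evaluation of m//g resumes where the previous match ended, which pos() reports, and the loop ends once no match is left.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line, loosely modelled on the zone file values.
my $line = "10800 3600 604800";
my @found;

# In scalar context, each evaluation of m//g resumes at pos($line),
# the offset where the previous successful match ended.
while ($line =~ /(\d+)/g) {
    push @found, $1;
    print "matched $1, next search starts at offset ", pos($line), "\n";
}

# Once no match is left, the condition is false and the loop ends.
print "found: @found\n";
```

If the search position were not advancing, the first value would be matched on every pass and the loop would never end.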

However, in this case, for some reason, it goes into an
infinite loop.  Note that this was not an issue with my old
perl version (perl-5.004m4-1, as distributed with an earlier
RedHat CD -- perhaps as an update, I don't really remember).

Also, if I drop in a:

"$zone[$i] =~ s/.*$1//;" after the "if ($` !~ /;/)" block,
removing the matched data from the variable, it works
properly.

Between the facts that it worked great with the older
version, and the fact that it works fine if I explicitly
remove the matched data from the variable, I believe the
following:

It is not an error in my code.
It must either be:
	An intentional change of functionality (a reference to
which I was unable to find in the perldelta documentation)
	Or, a bug.

I found the original documentation for this feature in
"Programming Perl, 2nd Ed" by Larry Wall, Tom Christiansen,
& Randal L. Schwartz, published by O'Reilly (aka "the camel
book") on page 71.

One can also find documentation in perlre(1).

While I don't believe this is an error on my part, I
apologize in advance for any wasted time if it turns out to
be so.

Comment 1 Jay Turner 1999-06-28 14:38:59 UTC
This issue has been forwarded to a developer for further action.

Comment 2 ksparger 1999-06-28 15:28:59 UTC
> However, in this case, for some reason, it goes into an
> infinite loop.  Note that this was not an issue with my old
> perl version (perl-5.004m4-1, as distributed with an earlier
> RedHat CD -- perhaps as an update, I don't really remember).

I forgot to mention that it goes into an infinite loop, continually
matching the same thing.  In the example, it will match "1999062405"
forever.

Comment 3 Cristian Gafton 1999-07-29 02:05:59 UTC
The code is incorrect. In the MIDDLE loop that looks like this:

                     while ($zone[$i] =~ /(\d+)/go)
                       {
                                if ($` !~ /;/)
                                {
                                        $timeargs[$a] = $1;
                                        $a++;
                                }
                       }

you are not incrementing $i, so the while will sit on the same thing
forever.

Not a real perl bug.

Comment 4 ksparger 1999-07-29 09:33:59 UTC
Gafton;  I think you are misinterpreting what the code is supposed to
be doing.

"In a scalar context, m//g iterates through the string, returning
true each time it matches, and false when it eventually runs out of
matches.  (In other words, it remembers where it left off last time
and restarts the search at that point. ..." -- p 71 Programming Perl,
Larry Wall, et al.

It's not SUPPOSED to increment $i, because I want to search through
the variable, find the first match for \d+, assign it into a variable,
then find the second match for \d+, assign it to a variable, and
repeat ad nauseam until it has no more matches.

Provided that the /g flag is working properly, it will eventually NOT
match \d+, and thus eventually return FALSE, and thus break out of the
loop, because the while loop will only run while m//g returns TRUE.

As I stated in the summary, it might be an intentional functionality
change, but if it's not, then it's either a bug, or the perl book is
incorrect.  I tend to believe that the perl book is accurate, because
it worked properly under 5.004 as distributed with Red Hat 5.1.

Comment 5 Cristian Gafton 1999-07-29 13:54:59 UTC
I can cite from books too:

"A failed match normally resets the search position to the beginning
of the string, but you can avoid that by adding the /c modifier (e.g.
m//gc). Modifying the target string also resets the search position."
(from the perlop man page)

Something tells me that your "if ($` !~ /;/)" test might do just that
- modify the target string.
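
A small sketch of the perlop rules quoted above, using a hypothetical string: it records pos() after a successful match, after a failed match with and without /c, after merely reading $`, and after assigning to the target string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = "12 34";

$s =~ /\d+/g;                  # matches "12"
my $after_match = pos($s);     # offset just past "12"

$s =~ /[a-z]/g;                # fails: the string has no letters
my $after_fail = pos($s);      # undef, the position was reset

$s =~ /\d+/g;                  # starts over and matches "12" again
$s =~ /[a-z]/gc;               # fails, but /c keeps the position
my $after_fail_c = pos($s);

$s =~ /\d+/g;                  # continues and matches "34"
my $before_read = pos($s);
my $prefix = $`;               # only reads the text before the match
my $after_read = pos($s);

$s = "12 34 56";               # assigning to the target modifies it
my $after_modify = pos($s);    # undef again

for my $pair (
    [ "after match",     $after_match  ],
    [ "after failure",   $after_fail   ],
    [ "after /gc fail",  $after_fail_c ],
    [ "before reading",  $before_read  ],
    [ "after reading",   $after_read   ],
    [ "after assigning", $after_modify ],
) {
    my ($label, $p) = @$pair;
    print "$label: ", (defined $p ? $p : "undef"), "\n";
}
```

In this sketch, reading $` leaves the position unchanged; only the failed match without /c and the assignment reset it.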

Comment 6 ksparger 1999-10-26 21:59:59 UTC
I figured I'd try this once again, since I'm still able to trigger the
problem, and I've done some more research.  The problem is NOT because
the "if ($` !~ /;/)" test resets it -- the test program below
completely removes it.  Note that the problem still appears to be
occurring in perl-5.00503-6 -- it may be a library problem or
something, since I only updated the perl RPM, but I do have the
problem on vanilla 6.0 machines.  Here's the information I have right
now:

Trigger program:

---------------

#!/usr/bin/perl -w

open(ZONE, 'test.dom');
@zone = <ZONE>;

while ($zone[0] =~ /(\d+)/g)
{
  push(@timeargs,$1);
}

print "@timeargs\n";

----------------

Contents of test.dom:
----------------
12345
----------------

Some information:

1.  The problem did not occur in perl-5.004-4
2.  It first appeared (as far as I can tell) in perl-5.00503-2
3.  The problem ONLY occurs when the perl script is setuid, and being
run by a user other than the owner.  It does not occur when the script
is not setuid, or is being run by the owner.
4.  You absolutely MUST get the input from a file, and you absolutely
MUST reference $zone[0] (or $zone[$i], or something with a subscript).
	a.  If you explicitly declare @zone in the perl script instead of
grabbing it from a file, the while loop will not go into an endless
loop.
	b. If you do not use a subscript for referencing the line in $zone
(for example, if you assign $zone[0] to $line, and then reference
$line in the while test condition instead), the problem does not
occur.
5.  Technically, you don't need the push and print statements -
they're just there to show what the program is trying to accomplish.
The while loop will still lock up, even if you remove them.
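
The variant from point 4b can be sketched as follows. This is only an illustration of the shape of that workaround: the temporary file is a stand-in for test.dom so the sketch is self-contained, and it has not been checked against the setuid setup from point 3.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway stand-in for test.dom (an assumption of this sketch).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "12345\n";
close($tmp_fh) or die "close $path: $!";

# Read the file into an array, as the trigger program does.
open(my $zone_fh, '<', $path) or die "open $path: $!";
my @zone = <$zone_fh>;
close($zone_fh);

# Point 4b: copy the element into a plain scalar and match against
# the copy instead of the subscripted array element.
my $line = $zone[0];
my @timeargs;
while ($line =~ /(\d+)/g) {
    push @timeargs, $1;
}

print "@timeargs\n";
```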

Hopefully, you won't just look at this and just discard it again --
I've tested this on standard redhat 6.0 machines, and they all display
this behaviour.  I really don't think it's normal.  If you can't
reproduce it, let me know, and I'll just accept that perhaps my system
is totally bizarre.

Comment 7 Cristian Gafton 2000-08-09 02:45:54 UTC
assigned to nalin

Comment 8 Crutcher Dunnavant 2000-09-18 21:03:25 UTC
Hey, so I've got this now, then.

I'm looking at your code example (the last one) and I'm thinking, that's gonna
only grab one argument, as the regex /(\d+)/ has a + after the digit, meaning
'one or more'.

With the plus, it finds one argument, without it, it finds 5 of them. This is
how it is supposed to act.
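
A quick sketch of that difference, using the same "12345" input as the test file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "12345";

# With +, \d+ consumes the whole run of digits in a single match.
my @with_plus;
while ($input =~ /(\d+)/g) {
    push @with_plus, $1;
}

# Without +, each digit is a separate match.
my @without_plus;
while ($input =~ /(\d)/g) {
    push @without_plus, $1;
}

print scalar(@with_plus), " match(es) with +: @with_plus\n";
print scalar(@without_plus), " match(es) without +: @without_plus\n";
```

Both loops terminate on their own, because the match returns false once no digits remain past the current position.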

Comment 9 ksparger 2000-09-19 01:41:01 UTC
Crutcher,

True, but the match is supposed to return false if it can find no more matches,
at which point it breaks out of the loop, and prints what it found.

Point 3 is especially important -- it only misbehaves if running setuid from
another user.  Observe:

$ id
uid=500(ksparger) gid=500(ksparger) groups=500(ksparger),520(devel)
$ cat > bugzilla.pl
#!/usr/bin/perl -w

open(ZONE, 'test.dom');
@zone = <ZONE>;

while ($zone[0] =~ /(\d+)/g)
{
  push(@timeargs,$1);
}

print "@timeargs\n";
^D
$ cat > test.dom
12345
^D
$ chmod 755 bugzilla.pl
$ ./bugzilla.pl
12345
$ su
Password:
# id
uid=0(root) gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)
# chown root.root bugzilla.pl
# chmod 4755 bugzilla.pl
# ls -al bugzilla.pl
-rwsr-xr-x   1 root     root          145 Sep 18 21:36 bugzilla.pl
# ./bugzilla.pl
12345
# exit
$ id
uid=500(ksparger) gid=500(ksparger) groups=500(ksparger),520(devel)
$ ./bugzilla.pl

.... and then it hangs into infinity.

Surely, in this case, one set of behaviour while running setuid and another set
when not running setuid is not the desired behaviour. :)

Comment 10 Crutcher Dunnavant 2000-10-17 19:16:44 UTC
I cannot reproduce your last results. I have no problems.

Comment 11 Stephen John Smoogen 2003-01-24 18:10:56 UTC
Bug 3701 is being closed because developer was unable to duplicate the problem,
and original reporter has not responded in 2.x years. The current version of
Perl is very very different than the 5.005 that was reported in this bug. 

Please submit a new bug under 8.0 if it is still a problem in that release.

