78919 – The split function has a loop error

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	perl
Sub Component:
Version:	8.0
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Warren Togami
QA Contact:	David Lawrence
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-12-03 15:59 UTC by Need Real Name
Modified:	2007-04-18 16:48 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2005-09-11 11:48:17 UTC
Embargoed:

Attachments
Perl test script (1.03 KB, text/plain) 2002-12-03 16:02 UTC, Need Real Name
Output of your script on my machine (341 bytes, text/plain) 2002-12-03 20:26 UTC, Michael Lee Yohe
Test script and output in a tar ball (10.00 KB, application/octet-stream) 2002-12-04 19:56 UTC, Need Real Name

Description Need Real Name 2002-12-03 15:59:42 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; LM-MS; T312461)

Description of problem:
The split function produces a loop error when attempting to split a string. 
This problem exists in the 5.8.0 version (RH 8.0) of Perl but not the 5.6.1 
version (RH 7.3).

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Execute the splitTest.pl script

Actual Results:  The error occurs at line 34 of the script under Perl 5.8.0, 
which prints:

Start problem split
Works with ver 5.6.1 not 5.8.0

Split loop.


Expected Results:  Start problem split
Works with ver 5.6.1 not 5.8.0

Number of elements = 4
         [ junker.pl ] 
         [ junk.pl ] 
         [ splitTest.pl ] 
         [ test.pl ] 
end of program


Additional info:

Comment 1 Need Real Name 2002-12-03 16:02:45 UTC

Created attachment 87230
Perl test script

Comment 2 Michael Lee Yohe 2002-12-03 16:17:59 UTC

If you backslash the space in the split function does it work?

Alternatively, you can use '\s' which should encompass all whitespace.

Comment 3 Need Real Name 2002-12-03 17:11:50 UTC

(@fieldList) = split /[ \t\n]+/, $fileList; # original produces loop error
#(@fieldList) = split /[' '\t\n]+/, $fileList; # loop error
#(@fieldList) = split (/[' '\t\n]+/, $fileList); # loop error
#(@fieldList) = split /[\ \t\n]+/, $fileList; # loop error
#(@fieldList) = split (/[\ \t\n]+/, $fileList); # loop error
#(@fieldList) = split /[\s\t\n]+/, $fileList; # loop error
#(@fieldList) = split (/[\s\t\n]+/, $fileList); # loop error
#(@fieldList) = split ' ', $fileList; # does work with 5.8

All produce the same loop error with Perl 5.8, all work with Perl 5.6.1.
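For anyone wanting to compare results across machines, the character-class variants above can be run in one pass. This is only a minimal sketch: the sample $fileList below is a hypothetical stand-in for the data in the attached script, and the loop structure is not taken from splitTest.pl.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real $fileList lives in the attached script.
my $fileList = "junker.pl junk.pl\tsplitTest.pl\ntest.pl\n";

# The character-class variants listed above, precompiled with qr//.
my @patterns = (
    qr/[ \t\n]+/,
    qr/[' '\t\n]+/,
    qr/[\ \t\n]+/,
    qr/[\s\t\n]+/,
);

# Split the same string with each pattern and record the field count.
my @counts;
for my $pattern (@patterns) {
    my @fieldList = split $pattern, $fileList;
    push @counts, scalar @fieldList;
    print "$pattern => ", scalar(@fieldList), " fields\n";
}
```

On a Perl that handles these patterns correctly, every variant should report the same number of fields for the same input.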

Comment 4 Michael Lee Yohe 2002-12-03 20:26:59 UTC

Created attachment 87265
Output of your script on my machine

Comment 5 Michael Lee Yohe 2002-12-03 20:31:30 UTC

WORKSFORME.

I see no error when running your script.  I even went so far as to uncomment the
lines which you said caused more loop errors.  My original suggestion to you
works without a problem:

@fieldList = split /\s+/, $fileList;
$count2 = @fieldList;

print "Number of elements = ", $count2 ,"\n";

for $count2 ( @fieldList ) {
    print "\t [ $count2 ] \n";
}

I received the following as output (as expected):
Number of elements = 3
	 [ junker.pl ] 
	 [ junk.pl ] 
	 [ test.pl ] 

Did you try running this on a Red Hat Linux 8.0 box, or under a Cygwin session
(judging from the path of your attachment)?

Comment 6 Need Real Name 2002-12-03 20:41:02 UTC

I ran the script under Cygwin (no problem), Red Hat 7.3 (no problem), Red Hat 
8.0 (problem). I guess I will have to take the script home and run on my RH 8.0 
system, since I only have 3 Intel boxes to play with.

Comment 7 Michael Lee Yohe 2002-12-03 20:46:41 UTC

I'm just wondering because the following:

@fieldList = split /\s+/, $fileList;
@fieldList = split /[\ \t\n]+/, $fileList;

are both Perl clean.  I verified it on two other Red Hat Linux 8 based boxes
without having a problem.

$ rpm -q perl
perl-5.8.0-55

Comment 8 Need Real Name 2002-12-03 20:58:15 UTC

I don't have any problem with:

@fieldList = split /\s+/, $fileList;

either; for my data set it works just like:

(@fieldList) = split ' ', $fileList;

Comment 9 Michael Lee Yohe 2002-12-03 21:49:05 UTC

' ' == as a split pattern, a special case: split on runs of whitespace and skip any leading whitespace.
'\s' == [\ \t\n\r\f] (or the plethora of whitespace that's out there).

Either way - this should not be a bug since I can get your script working fine
on a Red Hat Linux 8 box.
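As a hedged illustration of how these patterns differ when passed to split: the string ' ' as the first argument is a documented special case (leading whitespace is skipped), while /\s+/ and / / are ordinary regexes. The input string and variable names below are hypothetical, chosen only to make the difference visible.

```perl
use strict;
use warnings;

# Hypothetical input: two leading spaces, then mixed separators.
my $fileList = "  junker.pl junk.pl\n\ttest.pl\n";

# ' ' is special-cased: runs of whitespace separate fields and
# leading whitespace is skipped.
my @awk = split ' ', $fileList;

# /\s+/ is an ordinary regex: the leading whitespace matches at the
# start of the string and produces an empty first field.
my @regex = split /\s+/, $fileList;

# / / matches one literal space at a time; tabs and newlines stay
# inside the fields.
my @space = split / /, $fileList;

print "' '   : ", scalar(@awk),   " fields\n";
print "/\\s+/ : ", scalar(@regex), " fields\n";
print "/ /   : ", scalar(@space), " fields\n";
```

For input without leading whitespace, ' ' and /\s+/ return the same fields, which matches what was observed earlier in this report.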

Comment 10 Need Real Name 2002-12-04 16:18:32 UTC

I have executed the script on another RH 8 installation and I have the same 
problem.

@fieldList = split /\s+/, $fileList; #works
@fieldList = split /[\ \t\n]+/, $fileList; #loop error

Comment 11 Michael Lee Yohe 2002-12-04 17:39:50 UTC

$ perl
$fileList = "test.pl\n                 Test.pl\       string.pl\n";
@fieldList = split /[\ \t\n]+/, $fileList;
foreach(@fieldList) { print "$_\n"; }
<< CTRL+D >>
test.pl
Test.pl
string.pl

Still WORKSFORME on two different Red Hat platforms.  Could you please provide
the output of what you are doing since I can't generate your error message?  It
might help (or may not).

Comment 12 Need Real Name 2002-12-04 19:56:59 UTC

Created attachment 87384
Test script and output in a tar ball

Comment 13 Need Real Name 2002-12-04 20:05:50 UTC

I have added an attachment with the current version of the test script and the 
output.

I sent the script out to another organization, and they were able to execute it 
correctly on all but one of their RH 8.0 installs. The three systems that now 
have a problem with the script have one thing in common: all three were 
completely new installations. The other RH 8.0 systems that work were upgraded 
from 7.2 to 8.0.  Do you know how your machines got to be 8.0? If all of your 
systems are upgrades, then this problem may only apply to new 8.0 installations 
versus upgrades to 8.0. Thoughts?

All tests were run on Red Hat 8.0 Linux systems. My administrative system is a 
win2k system; I use Cygwin on it so I don't have to use DOS.

Comment 14 Michael Lee Yohe 2002-12-04 20:14:25 UTC

I see the "Split loop" error in your text file (but have yet to see it on my
system).  I have three Red Hat Linux 8.0 workstations - one fresh install, one
upgraded from Red Hat Linux 7.2->7.3.94 (null)->8.0, one upgraded from Red Hat
Linux 7.1->8.0.  All three have no problems.

Since you've been able to reproduce this problem - run your perl script as
follows (go into debugging mode):

perl -d splitTest.pl

Quick reference:
Enter "s" for each line until you come to the culprit, and submit all the
relevant output when the script fails.

Comment 15 Need Real Name 2002-12-04 20:45:50 UTC

The trace from line 36 until the script terminated.


main::(splitTest.pl:36):        print "Start problem split\nWorks with ver 5.6.1 not 5.8.0\n";
  DB<1> s
Start problem split
Works with ver 5.6.1 not 5.8.0
main::(splitTest.pl:38):        @fieldList = split /\s+/, $fileList; # does work with 5.8
  DB<1> s
main::(splitTest.pl:39):        print "Split 2\n\n";
  DB<1> s
Split 2

main::(splitTest.pl:41):        (@fieldList) = split /[ \t\n]+/, $fileList; # original produces loop error
  DB<1> s
main::CODE(0x8209ef8)(splitTest.pl:41):
41:     (@fieldList) = split /[ \t\n]+/, $fileList; # original produces loop error
  DB<1> s
utf8::CODE(0x8209ef8)(/usr/lib/perl5/5.8.0/utf8.pm:3):
3:      $utf8::hint_bits = 0x00800000;
  DB<1> s
utf8::CODE(0x8209ef8)(/usr/lib/perl5/5.8.0/utf8.pm:5):
5:      our $VERSION = '1.00';
  DB<1> s
utf8::CODE(0x8209ef8)(/usr/lib/perl5/5.8.0/utf8.pm:22):
22:     1;
23:     __END__
  DB<1> s
main::CODE(0x8209ef8)(splitTest.pl:41):
41:     (@fieldList) = split /[ \t\n]+/, $fileList; # original produces loop error
  DB<1> s
main::CODE(0x8209ef8)(splitTest.pl:41):
41:     (@fieldList) = split /[ \t\n]+/, $fileList; # original produces loop error
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:18):
18:         my ($class, $type, $list, $minbits, $none) = @_;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:19):
19:         local $^D = 0 if $^D;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:21):
21:         print STDERR "SWASHNEW @_\n" if DEBUG;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:35):
35:         my $file; ## file to load data from, and also part of the %Cache key.
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:36):
36:         my $ListSorted = 0;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:38):
38:         if ($type)
39:         {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:143):
143:        my $extras;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:144):
144:        my $bits;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:146):
146:        my $ORIG = $list;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:147):
147:        if ($list) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:148):
148:            my @tmp = split(/^/m, $list);
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:149):
149:            my %seen;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:151):
151:            $extras = join '', grep /^[^0-9a-fA-F]/, @tmp;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:154):
154:                grep {/^([0-9a-fA-F]+)/ and not $seen{$1}++} @tmp; # XXX doesn't do ranges right
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:153):
153:                sort { hex $a <=> hex $b }
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:157):
157:        if ($none) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:162):
162:        if ($minbits < 32) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:163):
163:            my $top = 0;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:164):
164:            while ($list =~ /^([0-9a-fA-F]+)(?:\t([0-9a-fA-F]+)?)(?:\t([0-9a-fA-F]+))?/mg) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:172):
172:                $top > 0xffff ? 32 :
173:                $top > 0xff ? 16 :
174:                $top > 1 ? 8 : 1
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:176):
176:        $bits = $minbits if $bits < $minbits;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:178):
178:        my @extras;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:179):
179:        for my $x ($extras) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:180):
180:            pos $x = 0;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:181):
181:            while ($x =~ /^([^0-9a-fA-F\n])(.*)/mg) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:182):
182:                my $char = $1;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:183):
183:                my $name = $2;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:184):
184:                print STDERR "$1 => $2\n" if DEBUG;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:185):
185:                if ($char =~ /[-+!]/) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:181):
181:            while ($x =~ /^([^0-9a-fA-F\n])(.*)/mg) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:179):
179:        for my $x ($extras) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:201):
201:        print STDERR "CLASS = $class, TYPE => $type, BITS => $bits, NONE => $none\nEXTRAS =>\n$extras\nLIST =>\n$list\n" if DEBUG;
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:203):
203:        my $SWASH = bless {
204:            TYPE => $type,
205:            BITS => $bits,
206:            EXTRAS => $extras,
207:            LIST => $list,
208:            NONE => $none,
209:            @extras,
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:212):
212:        if ($file) {
  DB<1> s
utf8::SWASHNEW(/usr/lib/perl5/5.8.0/utf8_heavy.pl:216):
216:        return $SWASH;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:223):
223:        my ($self, $start, $len) = @_;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:224):
224:        local $^D = 0 if $^D;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:225):
225:        my $type = $self->{TYPE};
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:226):
226:        my $bits = $self->{BITS};
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:227):
227:        my $none = $self->{NONE};
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:228):
228:        print STDERR "SWASHGET @_ [$type/$bits/$none]\n" if DEBUG;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:229):
229:        my $end = $start + $len;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:230):
230:        my $swatch = "";
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:231):
231:        my $key;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:232):
232:        vec($swatch, $len - 1, $bits) = 0;  # Extend to correct length.
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:233):
233:        if ($none) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:237):
237:        for ($self->{LIST}) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:238):
238:            pos $_ = 0;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:239):
239:            if ($bits > 1) {
240:              LINE:
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:274):
274:                while (/^([0-9a-fA-F]+)(?:[ \t]+([0-9a-fA-F]+))?/mg) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:237):
237:        for ($self->{LIST}) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:289):
289:        for my $x ($self->{EXTRAS}) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:290):
290:            pos $x = 0;
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:291):
291:            while ($x =~ /^([-+!])(.*)/mg) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:289):
289:        for my $x ($self->{EXTRAS}) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:334):
334:        if (DEBUG) {
  DB<1> s
utf8::SWASHGET(/usr/lib/perl5/5.8.0/utf8_heavy.pl:341):
341:        $swatch;
  DB<1> s
Split loop.
Debugged program terminated.  Use q to quit or R to restart,
  use O inhibit_exit to avoid stopping after program termination,
  h q, h R or h O to get additional info.  
  DB<1>

Comment 16 Michael Lee Yohe 2002-12-04 21:35:57 UTC

Yee-haw - now we are getting somewhere.  I do not see the utf8_heavy.pl
references because my language is not set to "en_US.utf8".  It is instead set to
"en_US".  There is our difference.  And you are accurate in assessing that it is
a bug since Perl does not seem to be handling the split properly when addressing
"en_US.utf8" - I can now duplicate your problem with the "Split loop" bug (boy,
it would have been handy if I had recommended running the Perl debugger in
the first place!)

You can work around this problem temporarily by executing your script as:

LANG=en_US ./splitTest.pl

_OR_

Adding "export LANG=en_US" to your .bashrc script in your home directory.
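A script could also check for itself whether it is running under a UTF-8 locale and warn about it. The following is only a sketch under stated assumptions: the helper name utf8_locale_selected is hypothetical, and it relies on the usual POSIX precedence of LC_ALL over LC_CTYPE over LANG. It detects the condition; it does not fix the underlying Perl behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the given environment selects a
# UTF-8 locale for character handling, 0 otherwise. The first variable
# that is set and non-empty wins (LC_ALL, then LC_CTYPE, then LANG).
sub utf8_locale_selected {
    my (%env) = @_;
    for my $var (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env{$var};
        next unless defined $value && length $value;
        return $value =~ /utf-?8/i ? 1 : 0;
    }
    return 0;
}

# Report on the environment this script was started with.
if (utf8_locale_selected(%ENV)) {
    print "UTF-8 locale in effect; consider running with LANG=en_US\n";
}
```

Passing the environment in as a list, rather than reading %ENV inside the helper, keeps the check easy to exercise with made-up values.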

Comment 17 Need Real Name 2002-12-04 22:55:05 UTC

I added "export LANG=en_US" to the .bashrc script in my home directory, and now 
all works well. 

from an environment dump before
LANG=en_US.UTF-8
after
LANG=en_US

Thanks for your help. I'll follow the resolution of the problem.

Comment 18 Mike A. Harris 2004-01-13 18:07:22 UTC

Chip, is this problem resolved now?  Just curious as I have bugs
reported about this same problem with procinfo's lsdev script.  I'm
unable to reproduce it at all on any OS version on any system, and
other bug reports seem to indicate this problem is fixed.

bug #82432 is the one I'm currently investigating
