62427 – unset leaves array in inconsistent state

Summary: unset leaves array in inconsistent state

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	bash
Sub Component:
Version:	7.2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Tim Waugh
QA Contact:	Ben Levenson
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-03-31 21:16 UTC by Craig Lawson
Modified:	2007-04-18 16:41 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-10-08 17:40:24 UTC
Embargoed:

Attachments
Bash shell script: "bash_array_bug" (613 bytes, text/plain) 2002-03-31 21:18 UTC, Craig Lawson

Description Craig Lawson 2002-03-31 21:16:41 UTC

Description of Problem:
According to bash documentation, "`unset' NAME[SUBSCRIPT] destroys the array
element at index SUBSCRIPT". However, the documentation is not clear on whether
subsequent elements are shifted down. In fact, what actually happens is that the
array length is decremented and the array elements are NOT shifted. This is a
problem, as it makes the array length and contents inconsistent.


Version-Release number of selected component (if applicable): 
bash 2.05.8(1)-release

How Reproducible:
100%

Steps to Reproduce:
1. Assign several elements to an array.
2. unset an element from the beginning of the array.
3. Dump the array length and contents.
(See attached example program "bash_array_bug".)
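
The steps above can be sketched roughly as follows. This is a hypothetical reconstruction, not the attached "bash_array_bug" script (which is not reproduced here); the variable names are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the reproduction steps; NOT the attached script.

A=(a b c d)                      # step 1: assign several elements
count_before=${#A[@]}            # 4 elements are set

unset 'A[1]'                     # step 2: unset an element near the beginning
count_after=${#A[@]}             # 3: the count drops by one

# Step 3: dump the count and contents. Looping from 0 to count-1
# visits indices 0..2 only, so it never reaches A[3].
echo "A(length = $count_after): ${A[*]}"
i=0
while [ "$i" -lt "$count_after" ]; do
    echo "A[$i]: ${A[$i]}"
    i=$((i + 1))
done

last=${A[3]}                     # still "d": nothing was shifted down
```

The quotes in `unset 'A[1]'` only protect the subscript from filename expansion; they do not change which element is removed.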

Actual Results:
The array length is changed, but array elements have not shifted down to replace
the destroyed element.

Expected Results:
I expected either the array elements to be shifted down, or the array length to
be unchanged.

Comment 1 Craig Lawson 2002-03-31 21:18:32 UTC

Created attachment 51567
Bash shell script: "bash_array_bug"

Comment 2 Tim Waugh 2002-10-07 11:12:01 UTC

Since ${#array[*]} counts the number of elements (i.e. members that are set
versus unset), this seems to me to be consistent with the documentation.
 
For example: 'array[5]=set; echo ${#array[*]}' gives '1' as expected. 
 
Replacing the line in dump_A() with: 
  echo "A[$I]: ${A[$I]-(unset)}" 
shows that the unset element is indeed unset.
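
A minimal sketch of the two points in this comment, assuming a small array like the one described in the report. The names `sparse`, `A`, `slot1`, and `slot2` are illustrative and do not come from the attached script.

```shell
#!/usr/bin/env bash
# Sketch only: all names here are illustrative assumptions.

sparse[5]=set
sparse_count=${#sparse[*]}       # 1: only index 5 is set

A=(a b c d)
unset 'A[1]'                     # A[1] is now unset
A[2]=""                          # A[2] is set, but holds an empty string

# ${A[i]-(unset)} substitutes "(unset)" only when A[i] is unset,
# so an unset slot can be told apart from one holding an empty string.
slot1="${A[1]-(unset)}"          # "(unset)"
slot2="${A[2]-(unset)}"          # "" (set but empty)
echo "A[1]: $slot1"
echo "A[2]: $slot2"
```

Note the form without a colon: `${A[i]:-(unset)}` would also substitute for the empty string and so could not distinguish the two cases.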

Comment 3 Craig Lawson 2002-10-08 06:02:56 UTC

Hmm, you're right. ${#array[*]} does count the number of elements which are set.
The source of my confusion is from the documentation, which says:

  `${#name[SUBSCRIPT]}' expands to the length of `${name[SUBSCRIPT]}'.
  If SUBSCRIPT is `@' or `*', the expansion is the number of elements in
  the array.

But that is not the same thing as "the number of elements which are set". If only the
non-null elements are counted, then the problem I have with the current
implementation is that there is no way to count the total number of elements,
either set or unset. However, if we agree that an unset element is a valid
element, then the length should not change when an element is unset; yet it does.

Here is the output from the script I submitted on 2002-03-31 (my apologies for
not including this output in my first submission):

  Array initialized to 4 elements
  A(length = 4): a b c d
  A[0]: a
  A[1]: b
  A[2]: c
  A[3]: d

  Unset A[1]
  A(length = 3): a c d
  A[0]: a
  A[1]: 
  A[2]: c

  Reassign A[1]
  A(length = 4): a second c d
  A[0]: a
  A[1]: second
  A[2]: c
  A[3]: d

In the second step, titled "Unset A[1]", the number of elements drops by 1 and
A[1] has been replaced with null. But where is A[3]? It is not displayed because
the array's length is now one less than before. In the final step, it is visible
again.

To write a script that always displays (or processes) all array elements, we
would have to count the number of null elements encountered, and incrementally
increase the iteration limit. This increases code complexity. Also, a null
element at the end of the array can be determined only by explicitly testing
the value, and doing so will not indicate the true array length anyway as all
elements past the end of the array will always test as null. So there is no way
to tell how many elements an array has by merely examining it. Therefore, the
only reliable way to track the array length is to use an independent variable, and
this technique is cumbersome and counter to the object-oriented implementation
which provides the length with the ${#array[*]} construct.
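
The workaround described above (keep scanning upward until every set element has been seen) might look something like the sketch below. It is only an illustration of the complexity being objected to; the names `seen`, `visited`, and `highest` are assumptions, and the loop relies on `${A[i]+set}` to tell a set element from an unset one.

```shell
#!/usr/bin/env bash
# Sketch only: visit every set element of a sparse array by scanning
# indices upward until as many set elements as ${#A[@]} reports were seen.

A=(a b c d)
unset 'A[1]'

total=${#A[@]}                   # number of set elements (3 here)
seen=0
i=0
visited=""
while [ "$seen" -lt "$total" ]; do
    # ${A[i]+set} expands to "set" only if A[i] is set (even if empty).
    if [ -n "${A[$i]+set}" ]; then
        echo "A[$i]: ${A[$i]}"
        seen=$((seen + 1))
        visited="$visited $i"
    fi
    i=$((i + 1))
done
highest=$((i - 1))               # index of the last set element
```

The scan costs time proportional to the highest index rather than to the number of elements, which is one reason it is unattractive for very sparse arrays.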

I don't like that conclusion, or the added burden on my scripts, and I suggest
that something is amiss with either the design or implementation of bash arrays.
Either (a) a new syntax is required which provides the total number of set and
unset elements, or (b) ${#array[*]} is interpreted as the total number of set and
unset elements, or (c) the array elements are shifted when an element is unset.
My suggestion is (b).

Comment 4 Tim Waugh 2002-10-08 08:43:35 UTC

From my reading of the documentation there is no such thing as an 'unset 
element'.  c.f. normal shell variables: if a shell variable is unset, it's 
just not there in the environment. 
 
If you want to remove an item from an array but still have consecutive 
indices, why not use this kind of thing?: 
 
array[0]=0 
array[1]=1 
array[2]=2 
array[3]=3 
unset array[2] 
array=("${array[@]}") 
 
I think it's clear that this is not a bug but behaviour as advertised.  Please 
re-open if you disagree.
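
To make the effect of the reassignment visible, here is a small sketch built on the same four-element array. The variable names `gap_value`, `new_value`, `count`, and `tail` are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: shows what array=("${array[@]}") does to the indices.

array=(0 1 2 3)
unset 'array[2]'                 # set indices are now 0, 1, 3

gap_value="${array[3]}"          # "3": still stored at its original index

# Expanding "${array[@]}" yields only the set elements, in index order;
# assigning them back renumbers them consecutively from 0.
array=("${array[@]}")

new_value="${array[2]}"          # "3": moved down into the gap
count=${#array[@]}               # 3
tail="${array[3]-(unset)}"       # "(unset)": index 3 no longer exists
```

The double quotes around `${array[@]}` matter: without them, elements containing whitespace would be split into several elements during the reassignment. The original index positions are lost, so this only suits cases where the indices carry no meaning of their own.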

Comment 5 Craig Lawson 2002-10-08 16:35:51 UTC

I agree with you about normal shell variables being unset: they just disappear.
However, unset array elements appear to be different. While it seems that unset
elements are ignored when the array is accessed with ${array[*]}, they most
definitely exist when accessing array elements by index. See my comment of 
2002-10-08, "Unset A[1]": After the "unset A[1]" command, A[1] has no value, yet
the index [1] is still valid. A[1] has not disappeared even though unset.
Definitely not the same behavior as normal shell variables.

I am reopening this bug because I think there is a flaw somewhere. I leave it up
to you folks to resolve whether this is a design, documentation, or
implementation issue.
My suggestion: ${#array[*]} currently counts only elements with values. It should
be changed to return the array length, which would mean both elements with
values and null (unset) elements.

Comment 6 Tim Waugh 2002-10-08 16:46:45 UTC

I can see no design flaw here, or any documentation problem.  Everything works 
as designed, and as documented.  I've given you a method for making bash 
'renumber' array indices. 
 
There *is* no 'array length' other than the number of elements that are set.  
Otherwise every array would have length infinity. (Example: echo 
${a[1000000]}) 
 
You are mistaken in believing that an array can have an element (counting
towards the ${#array[@]} total) that is unset.
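
The "length infinity" point can be illustrated with a short sketch: any index may be read, but reading does not create an element or change the count. The names `far`, `is_set_far`, and `is_set_0` are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: reading an unassigned index yields an empty string
# and leaves the element count untouched.

a=(x y)

far="${a[1000000]}"              # empty: nothing is stored at this index
count=${#a[@]}                   # still 2 after the read

# ${a[i]+yes} expands to "yes" only when a[i] is actually set.
is_set_far="${a[1000000]+yes}"   # empty: the element does not exist
is_set_0="${a[0]+yes}"           # "yes"
echo "count=$count far='$far'"
```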

Comment 7 Craig Lawson 2002-10-08 17:40:15 UTC

Sorry. Forgot to check the "reopen button" last time. It's reopened now.

Thank you for your code to renumber array indices.

We seem to have a strong difference of opinion here about whether an array
can have unset elements. The output of my test script clearly shows that
it can. Your comment of 2002-10-07 agrees with my position, and you also provide
code that detects unset elements. We can argue the semantic difference on
whether an unset element actually exists, or whether it is merely a hole; it
occupies one array index either way.
I understand that bash arrays are sparse, and array elements are created on
access. What is missing from the current design is a way to find all assigned
elements in a sparse array. Renumbering (compacting) is a useful tool, but not
always appropriate: sparse assignment has usefulness beyond what a contiguous
array provides.

I will propose an alternate change for bash arrays: add syntax to provide the
highest non-null index.

Comment 8 Tim Waugh 2002-10-08 17:46:31 UTC

> What is missing from the current design is a way to find all assigned   
> elements in a sparse array.   
   
No: ${array[*]} and ${array[@]} do that perfectly well.
   
I think what you are after is a way of finding all indices for which elements   
are assigned in an array.  This is not really an appropriate forum for   
suggesting that functionality; it would be far better to take it up with the   
GNU bash maintainers yourself.  
  
Changing to 'enhancement', and closing 'WONTFIX'.
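
For readers on newer shells: bash 3.0 and later provide the `${!name[@]}` expansion, which yields the list of assigned indices. It is not available in the bash 2.05 discussed in this report. A minimal sketch, assuming bash 3.0 or newer; `A`, `indices`, and `highest` are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 3.0 or newer for ${!A[@]}.

A=(a b c d)
unset 'A[1]'

indices="${!A[*]}"               # "0 2 3": the indices that are set

# Iterate over the set indices only; for indexed arrays they are
# produced in ascending order, so the last one seen is the highest.
for i in "${!A[@]}"; do
    echo "A[$i]: ${A[$i]}"
    highest=$i
done
```

Iterating over `"${!A[@]}"` visits exactly the set elements without scanning the gaps.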
