From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7
Description of problem:
The attached program shows the problem. In this program code, we are calculating the SVD of a m x n-dimensional matrix where m=5 and n=2. The program allocates space according to the minimal requirements of LAPACK User's manual.
However, the problem goes away when m<=4 above. Most likely, it is an issue when m is sufficiently larger than n.
The attached code is svdd-demo.c. It uses the attached header file (array.h).The datafile is given in test.dat.
Upon installing LAPACK manually, this is not a problem. The RPMs are the issue: I have verified that every release from 3.0.25 through 3.0.30 presents this problem.
Version-Release number of selected component (if applicable):
lapack-3.0-31.fc4, lapack-devel-3.0-31.fc4
How reproducible:
Always
Steps to Reproduce:
1. Use the following to compile: (I use the tcsh shell)
%gcc -ansi -Wall -pedantic svdd-demo.c -lm -llapack
2.
./a.out
3.
Actual Results: The program always ends as:
reading in matrix A from test.dat:
2.117079 2.992091
2.106365 1.970807
2.131384 3.130390
2.344235 2.667219
1.838804 3.243629
** On entry to DGESDD parameter number 12 had an illegal value
STOP 0
for m>=5 in the program (n=2).
However, using the statistical package R as well as a compiled LAPACK provides the correct result.
Expected Results: reading in matrix A from test.dat:
2.117079 2.992091
2.106365 1.970807
2.131384 3.130390
2.344235 2.667219
1.838804 3.243629
svdd returned 0
singular values are:
7.875120 0.769370
The left multiplier matrix, u, is:
-0.465324 0.102357 -0.525449 -0.358025 -0.607221
-0.360284 -0.676273 0.043269 -0.489893 0.413500
-0.480519 0.194368 0.811234 -0.064990 -0.262676
-0.449331 -0.386186 -0.133477 0.791199 -0.071766
-0.469971 0.587587 -0.214743 0.040039 0.621410
The right multiplier matrix, v, is:
-0.595001 -0.803725
-0.803725 0.595001
Additional info:
My hunch is that this is a problem because of some funky optimization that has been used in the RPMS. Also, the statistical package R uses blas, but gives the correct result: hence my guess is that it is the lapack RPM that is the problem.
This is a pretty SEVERE problem which also happens for RHEL4.

Comment 4Tom "spot" Callaway
2005-09-28 17:26:54 UTC

Let me look into this. :/

Comment 5Tom "spot" Callaway
2005-09-28 22:02:24 UTC

Removing optimization from the rpm does not resolve the issue, thus, it doesn't
seem to be the problem.

Comment 6Tom "spot" Callaway
2005-09-28 22:10:32 UTC

OK, its this patch (lapack-20010525.patch). I inherited that patch with the rest
of lapack when I took over maintainership in Fedora Extras. Time to figure out
why it causes this failure.

Comment 7Tom "spot" Callaway
2005-09-28 22:53:26 UTC

Alright, figured this one out.
Once upon a time (Fri May 25 2001), Trond Eivind GlomsrÃ¸d (aka, the pickled fish
eater), decided that for reasons unknown (or at least undocumented), he would
apply all of the patches that upstream had pending for lapack/blas, but had not
yet rolled into a formal release. Given that lapack hadn't had a major release
since May 31, 2000, he was probably trying to patch bugs or just be better
prepared for a future release.
Upstream had made revised versions of specific files in the lapack/blas tree and
put them here:
http://www.netlib.org/lapack/release_notes.html
Trond took those files, and made a patch diff from the original lapack to these
new updated source files and named it lapack-20010525.patch.
One of these changes is described by upstream as:
10/5/2000 Corrected error with LQUERY and setting of WORK(1)
This touches (amongst other files): LAPACK/SRC/dgesdd.f
When we look at the diff of that specific file, we see the following:
diff -uNr LAPACK.orig/SRC/dgesdd.f LAPACK/SRC/dgesdd.f
--- LAPACK.orig/SRC/dgesdd.f Thu Nov 11 20:32:31 1999
+++ LAPACK/SRC/dgesdd.f Fri May 25 16:07:58 2001
@@ -4,7 +4,8 @@
* -- LAPACK driver routine (version 3.0) --
* Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd.,
* Courant Institute, Argonne National Lab, and Rice University
-* October 31, 1999
+* June 30, 1999
+* 8-15-00: Improve consistency of WS calculations (eca)
*
* .. Scalar Arguments ..
CHARACTER JOBZ
@@ -116,16 +117,20 @@
* LWORK (input) INTEGER
* The dimension of the array WORK. LWORK >= 1.
* If JOBZ = 'N',
-* LWORK >= 3*min(M,N) + max(max(M,N),6*min(M,N)).
+* LWORK >= max(14*min(M,N)+4, 10*min(M,N)+2+
+* SMLSIZ*(SMLSIZ+8)) + max(M,N)
+* where SMLSIZ is returned by ILAENV and is equal to the
+* maximum size of the subproblems at the bottom of the
+* computation tree (usually about 25).
* If JOBZ = 'O',
-* LWORK >= 3*min(M,N)*min(M,N) +
-* max(max(M,N),5*min(M,N)*min(M,N)+4*min(M,N)).
+* LWORK >= 5*min(M,N)*min(M,N) + max(M,N) + 9*min(M,N).
* If JOBZ = 'S' or 'A'
-* LWORK >= 3*min(M,N)*min(M,N) +
-* max(max(M,N),4*min(M,N)*min(M,N)+4*min(M,N)).
+* LWORK >= 4*min(M,N)*min(M,N) + max(M,N) + 9*min(M,N).
* For good performance, LWORK should generally be larger.
-* If LWORK < 0 but other input arguments are legal, WORK(1)
-* returns the optimal LWORK.
+*
+* If LWORK = -1, a workspace query is assumed. The optimal
+* size for the WORK array is calculated and stored in WORK(1),
+* and no other work except argument checking is performed.
*
* IWORK (workspace) INTEGER array, dimension (8*min(M,N))
*
It goes on, but the notable change is in those comments. It changes the expected
definition of LWORK, supposedly to improve consistency of WC calculations.
So, I went to your test case, and made the same change:
--- svdd-demo.c.orig 2005-09-28 17:30:11.000000000 -0500
+++ svdd-demo.c 2005-09-28 17:31:15.000000000 -0500
@@ -23,8 +23,7 @@ void dgesdd_( char *jobz,
int svdd(double **a, int m, int n, double *d, double **u, double **v)
{
double *A, *U, *VT;
- int lwork =3*MIN(m,n)*MIN(m,n)
- + MAX(MAX(m,n),4*MIN(m,n)*MIN(m,n)+4*MIN(m,n));
+ int lwork = 4*MIN(m,n)*MIN(m,n) + MAX(m,n) + 9*MIN(m,n);
int liwork = 8*MIN(m,n);
char jobz = 'A';
double *work;
With that change, your test case starts working again:
[spot@swoop ~]$ ./svdd-demo
reading in matrix A from test.dat:
2.117079 2.992091
2.106365 1.970807
2.131384 3.130390
2.344235 2.667219
1.838804 3.243629
svdd returned 0
singular values are:
7.875120 0.769370
The left multiplier matrix, u, is:
-0.465324 0.102357 -0.525449 -0.358025 -0.607221
-0.360284 -0.676273 0.043269 -0.489893 0.413500
-0.480519 0.194368 0.811234 -0.064990 -0.262676
-0.449331 -0.386186 -0.133477 0.791199 -0.071766
-0.469971 0.587587 -0.214743 0.040039 0.621410
The right multiplier matrix, v, is:
-0.595001 -0.803725
-0.803725 0.595001
So, this answers the question of why your test case fails, but it doesn't really
make it clear what we should do.
Unfortunately, whoever wrote that improvement upstream didn't sign their name,
so we can't ask him or her whether its worthwhile to keep it. It doesn't look
likely that lapack will ever have another major release (it hasn't had any touch
since 2001). This has been the Red Hat shipped behavior for lapack since 2001,
and no one has hit this before. It is possible that people are relying on this
patched value for LWORK, and to revert it would break their existing configuration.
So, we can keep this patch and correct the user manual, or we can revert it, and
hope that it doesn't break existing users or make the WC calculations less
consistent. Your feedback here would be very much encouraged. :)

Thanks very much! I have verified that it works. I have also verified that the
querying for optimal lwork which is the ideal way of doing things also works.
I think the patch should remain, since it corrects for real errors. If possible,
the manpages (user manual could be upated with all the corrections in this patch).
http://www.netlib.org/lapack/html/errata.lughttp://www.netlib.org/lapack/release_notes.html
After digging around, I came up with the above websites. Maybe, if possible, the
errata could also be included in the manpages in a good way so that we can use
lapack correctly?
Thanks a lot for looking into this, and so quickly!! I am so proud of Fedora
Core as a user!!

Comment 9Tom "spot" Callaway
2006-04-07 14:09:52 UTC

Closing out this bug.

Note

You need to
log in
before you can comment on or make changes to this bug.

`Created attachment 119323 [details] file svdd-demo.c`

`Created attachment 119324 [details] header file for svdd-demo.c`

`Created attachment 119325 [details] test.dat (data file for svdd-demo.c)`