Bug 673205 - Insert SheetFromFile with csv data fails
Summary: Insert SheetFromFile with csv data fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: openoffice.org
Version: 14
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Eike Rathke
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-01-27 17:52 UTC by David A. De Graaf
Modified: 2012-06-08 23:14 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-08 23:01:39 UTC
Type: ---


Attachments (Terms of Use)

Description David A. De Graaf 2011-01-27 17:52:23 UTC
Description of problem:
Insert Sheet From File results in a blank screen

Version-Release number of selected component (if applicable):
1:openoffice.org-core-3.3.0-19.2.fc14.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1.  See below
2.
3.
  
Actual results:
Blank screen, devoid of cells.

Expected results:
Creation of new sheet with data, as previous versions did.

Additional info:

After a recent `yum update` which provided
    1:openoffice.org-core-3.3.0-19.2.fc14.x86_64
and all its cohort, calc is unusable.
Specifically, it has become impossible to Insert Sheet from File
using a comma-separated data file.

To demonstrate, create a data file /tmp/12-31.csv containing stock
symbols and prices, eg,

$ cat /tmp/12-31.csv
ACI, 35.06
AMT, 51.64
BK, 30.20
BRK-A, 120450.00
  ...

Open any spreadsheet, a new blank one will do.  While standing in
Sheet 1, undertake to add a second sheet using that data file.
Click Insert | SheetFromFile
Select the data file /tmp/12-31.csv
Check  Separated by  + Comma   (and nothing else)
  Observe two columns that look OK
Choose position:  + After current sheet

Result:  A clean white screen (new sheet)  with Column headings, but no
Rows or Row labels.  The Tab highlighted at the bottom is labeled Sheet
1_2.  The available Tabs are  Sheet 1, Sheet 1_2  Sheet 2  and Sheet 3.
Sheets 1, 2, and 3 look normal (empty cells). but Sheet 1_2 has no 
cells at all.  Why is the new sheet called 1_2 and not just 2?

If I click on column header  A,  the cell contents window shows  ACI
If I click on column header  B,  the cell contents window shows  '35.06

Evidently the data was read, but not correctly displayed.
Note also that the numerical value contains a leading ' which renders
it text.  This is a second, serious error.

Due to these, and other, serious errors, I am forced to revert to an
earlier, working version.  The yum manual offers the "downgrade" option.
However, no earlier packages are available on the usual repos.
I was sure that an archive exists, but Google fails to reveal it.
Can anyone seriously think that a new major release of something as
complex as openoffice can ever be bug-free, so that reversion will
never be needed?  Why does yum offer the downgrade option if no
predecessor packages are available?

The only source of earlier versions I could locate were at the
official openoffice.org site
  http://download.openoffice.org/other.html
which provides a vast array of alternatives including
  OOo_3.2.1_Linux_x86-64_install-rpm-wJRE_en-US.tar.gz

Unfortunately, these are not yum-compliant, so I had to forcibly
  yum remove "openoffice*"
and then use the non-yum installer for the 3.2.1 version.


In Summary, several glitches are apparent:

1 - Repair and restore  Insert | SheetFromFile | csv file
2 - Do NOT gratuitously prefix ' on numerical data
3 - Label the new sheet correctly, eg, Sheet 2
4 - Retain prior rpm packages for the life of Fedora 14, somewhere.

Comment 1 Caolan McNamara 2011-01-27 20:10:23 UTC
Can you try openoffice.org-3.3.0-20.2.fc14 from updates-testing. I think this csv problem (insanely created in the last should-have-been-micro-change-update) is reportedly re-fixed in that one.

e.g. yum --enablerepo=updates-testing update openoffice.org-calc

(I've no control over 4.)

Comment 2 David A. De Graaf 2011-01-27 22:43:34 UTC
Caolan McNamara, thanks for your quick response.

I thought I had done that - installed the 3.30-20.2 versions from
updates-testing - and neglected to mention that because it made no
difference.

But that is wrong.

I have reinstalled that updates-testing version and can now confirm
that

1 - Insert | SheetFromFile | csv file  -  does now work.

2 - All the imported numerical values have ' prepended, so they are
    text, not numbers, so they cannot be used.
    This did not happen in earlier versions of openoffice.
    Moreover, I haven't found any way to globally remove the ' character
    from all cells.  This ' character is only visible in the cell-contents
    window; not in the cells themselves.
    Manually editing each cell (about 100) is totally impractical.
    There ought to be some way to do this, globally.

3 - The added sheet is labelled weirdly - Sheet 1_2 instead of
    the expected Sheet 2.  Not a big deal; just odd.

4 - I grant that package retention policies are beyond the scope
    of this BZ.  I wish I knew where to complain about it.

So this problem is only partly fixed by the updates-testing version
and openoffice 3.3.0-20.2 remains unusable (for me).

Comment 3 Kohei Yoshida 2011-02-08 02:01:30 UTC
(In reply to comment #2)

> 2 - All the imported numerical values have ' prepended, so they are
>     text, not numbers, so they cannot be used.

Check your option settings "Detect special numbers" and "Quoted fields as text", try changing them to see if that makes any difference.  I've seen this reported several times in other bugzilla's, and so far changing these option settings have fixed the issue for those users.

>     This did not happen in earlier versions of openoffice.
>     Moreover, I haven't found any way to globally remove the ' character
>     from all cells.  This ' character is only visible in the cell-contents
>     window; not in the cells themselves.

This ' character is prepended when the cell is formatted as text even though the content looks like a number.  So it's an indication that the cell was imported as a text.

Kohei

Comment 4 David A. De Graaf 2011-02-14 22:38:03 UTC
I have discovered a workaround.      (in 3.3.0-20.2.fc14.x86_64)

It is now necessary to check all three boxes:  Comma and Space and
Merge_delimiters in the Text Import panel.  My csv data file includes
a comma and a space between the stock symbol and stock price.  This has
always been an accepted format for csv files, but not anymore.

Why is whitespace no longer ignored?
This is NOT an improvement, IMHO.

There remains the mystery of why a sheet added after, say, Sheet 1
is labelled Sheet 1_2 and not Sheet 2.
Does anyone have a clue?

Further experiments reveal other maddeningly inconsistent behaviours:

1 - If I enclose one of the numerical values in double quotes, eg
BK, "31.23"
and check the Quoted_field_as_text box,
when imported, the quotes are stripped away and replaced by a leading '
so that the field is irretrievably made Text.

If the Quoted_field_as_text box is not checked, the quotes are again
stripped away, but the imported value is a number.

2 - If I create a nonsensical data item with two numbers enclosed in
"", eg
CYT, "54.54 101.0"
and leave unchecked the Quoted_field_as_text box,
the data is imported as a "number", but one which cannot be used in
computation, because it contains a space!

3 - If I add multiple spaces after a comma, they are ignored, provided
I've checked the Merge_delimiters box is checked.  (That seems quite
reasonable.)

4 - If I insert a TAB before the number (and after the comma and
space), the TAB is displayed in the preview window, but is stripped
away when imported.  Instead, the number is left-justified (!), unlike
all the other numerical values.  This is NOT reasonable.

5 - With the vast number of options on the Text Import panel, it is nice
to know that these settings will be remembered.  That's a good feature
AS LONG AS I WANT TO DO THE SAME THING the next time.  Not so nice
otherwise.

6 - Although the Text Import options are remembered, the next panel,
Insert Sheet, wants to know the Position - Before or After current
sheet.  This is NOT remembered.  Where's the sense in that?


In a sane world, the time-honored conventions of *nix would be
observed.  Quotes would demarcate one field regardless of content,
such as white space.  There is no need for the Quoted_field_as_text
option.  If a field is not a number, then it is text.  Simple.
In a CSV data file, white space before or after the comma should be
ignored - including TAB, automatically and always, unless quoted.

There are many more illogical behaviours elicited by the failure to
ignore whitespace.  This new user interface for CSV import seems to be
ill-conceived and poorly thought out.   - IMHO

Comment 5 David Tardon 2011-02-15 06:04:22 UTC
(In reply to comment #4)
> I have discovered a workaround.      (in 3.3.0-20.2.fc14.x86_64)
> 
> It is now necessary to check all three boxes:  Comma and Space and
> Merge_delimiters in the Text Import panel.  My csv data file includes
> a comma and a space between the stock symbol and stock price.  This has
> always been an accepted format for csv files, but not anymore.

It is still accepted, as you have already discovered. It is just not the default.

> Why is whitespace no longer ignored?
> This is NOT an improvement, IMHO.

IMHO because it is explicitly disallowed in the only "standard" for CSV there is, RFC 4180: "Spaces are considered part of a field and should not be ignored."

> There remains the mystery of why a sheet added after, say, Sheet 1
> is labelled Sheet 1_2 and not Sheet 2.
> Does anyone have a clue?

Probably because it is inserted "from" Sheet 1. But I do not know how the name for the new sheet is derived, really...

> Further experiments reveal other maddeningly inconsistent behaviours:
> 
> 1 - If I enclose one of the numerical values in double quotes, eg
> BK, "31.23"
> and check the Quoted_field_as_text box,
> when imported, the quotes are stripped away and replaced by a leading '
> so that the field is irretrievably made Text.

And where is the surprise there?

> If the Quoted_field_as_text box is not checked, the quotes are again
> stripped away, but the imported value is a number.

Again from the RFC 4180: "Each field may or may not be enclosed in double quotes". So if you do not say explicitly the field should be a text, the app will try to import it as a number.

> 2 - If I create a nonsensical data item with two numbers enclosed in
> "", eg
> CYT, "54.54 101.0"
> and leave unchecked the Quoted_field_as_text box,
> the data is imported as a "number", but one which cannot be used in
> computation, because it contains a space!

This looks like a bug.

> 4 - If I insert a TAB before the number (and after the comma and
> space), the TAB is displayed in the preview window, but is stripped
> away when imported.  Instead, the number is left-justified (!), unlike
> all the other numerical values.  This is NOT reasonable.

This is a bug, unless I am missing something.

> 
> 5 - With the vast number of options on the Text Import panel, it is nice
> to know that these settings will be remembered.  That's a good feature
> AS LONG AS I WANT TO DO THE SAME THING the next time.  Not so nice
> otherwise.

Sorry, but a mind-reading interface for the dialog has not been finished yet.

> 
> 6 - Although the Text Import options are remembered, the next panel,
> Insert Sheet, wants to know the Position - Before or After current
> sheet.  This is NOT remembered.  Where's the sense in that?

There is no connection between the two dialogs--they are totally independent. I do not think there is so many items on the latter that it would warrant saving the last configuration. Maybe the default value of Position could be "After current sheet", but that is all...

> In a sane world, the time-honored conventions of *nix would be
> observed.  Quotes would demarcate one field regardless of content,
> such as white space.  There is no need for the Quoted_field_as_text
> option.  If a field is not a number, then it is text.  Simple.
> In a CSV data file, white space before or after the comma should be
> ignored - including TAB, automatically and always, unless quoted.

No, that is just your personal interpretation of what CSV should be like. And because there are many different interpretations of the same, we need this complicated dialog with a lot of tweakable options.

Btw, according to http://en.wikipedia.org/wiki/Comma-separated_values, CSV predates Unix by several years, at least.

Comment 6 David Tardon 2011-02-15 06:15:05 UTC
> Why does yum offer the downgrade option if no
> predecessor packages are available?

They are not available in Fedora repos, but that does not mean they cannot be available from somewhere else. E.g., you can install yum-plugin-local and any package you install since that point will be retained locally.

> 4 - Retain prior rpm packages for the life of Fedora 14, somewhere.

The packages are not retained in the repos by policy, to avoid unnecessary growth of the repos. And that is not just the size of the packages on the mirrors, but also the size of the metadata that are downloaded by yum / packagekit just about every time.

Comment 7 David Tardon 2011-02-15 06:44:04 UTC
(In reply to comment #6)
> > Why does yum offer the downgrade option if no
> > predecessor packages are available?
> 
> They are not available in Fedora repos, but that does not mean they cannot be
> available from somewhere else. E.g., you can install yum-plugin-local and any
> package you install since that point will be retained locally.

Actually, now that I have thought about it a bit more, there should be at least one version of old packages available, in the fedora repo. Are you sure you have not disabled it? Try if

yum downgrade --enablerepo=fedora* --disablerepo=updates* openoffice.org-*

works. If it does not, it is a bug in yum.

Comment 8 Caolan McNamara 2012-01-10 14:08:42 UTC
caolanm->erack: we fixed/backported the csv fails problem ages ago. Considering the other points remaining, is this "as good as it gets" now, or is there any outstanding fixes.

Comment 9 Eike Rathke 2012-06-08 23:01:39 UTC
This went under my radar ... the original problem isn't reproducible anymore with libreoffice-3.4.5.2-15.fc16

Comment 10 Eike Rathke 2012-06-08 23:14:19 UTC
For the records, what was regarded as bug

> 2 - If I create a nonsensical data item with two numbers enclosed in
> "", eg
> CYT, "54.54 101.0"
> and leave unchecked the Quoted_field_as_text box,
> the data is imported as a "number", but one which cannot be used in
> computation, because it contains a space!

is not a bug. The string 54.54 101.0 (including the blank) is not a number, is not imported as such, and consequently can't be calculated with.


For

> 4 - If I insert a TAB before the number (and after the comma and
> space), the TAB is displayed in the preview window, but is stripped
> away when imported.  Instead, the number is left-justified (!), unlike
> all the other numerical values.  This is NOT reasonable.

in LibO 3.5.4 the space and tab are preserved, the result is not a number, and hence the cell content is left-aligned as all textual content is.


Note You need to log in before you can comment on or make changes to this bug.