Bug 531355 - sort join combination produces not in sorted order messages unless LANG=C
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: coreutils
Version: 11
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Ondrej Vasik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-10-27 20:25 UTC by Mike Hanafey
Modified: 2009-10-30 13:25 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-10-30 13:25:36 UTC
Type: ---
Embargoed:


Attachments
2 example files. (10.00 KB, application/x-tar)
2009-10-27 20:25 UTC, Mike Hanafey

Description Mike Hanafey 2009-10-27 20:25:37 UTC
Created attachment 366342
2 example files.

Description of problem:


Version-Release number of selected component (if applicable): coreutils-7.2-4.fc11.x86_64


How reproducible: Always


Steps to Reproduce:
Using the small example files provided as an attachment --

for l in en_US.UTF-8 en_US.utf8 en_US.iso88591 en_US.ISO-8859-1 C; do
  echo "-------------------------------------------------------------------------------------------------"
  echo "LANG=$l"
  export LANG=$l
  sort -o all-primary.csv all-primary.csv
  sort -o db-primary.csv db-primary.csv
  join -v 1 all-primary.csv db-primary.csv
done

  
Actual results:
-------------------------------------------------------------------------------------------------
LANG=en_US.UTF-8
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.utf8
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.iso88591
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.ISO-8859-1
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=C
Industrial CHO and Lipids
Root
null



Expected results:
No sort order messages.

Additional info:

Comment 1 Pádraig Brady 2009-10-27 21:04:13 UTC
I think this happens because sort compares the whole line by default, whereas join compares only the first field. Specifically, in non-C locales the ' ' characters sort differently relative to other characters, so a file sorted on whole lines is not necessarily sorted on its first field. If you want `join` to use the whole line, use -t'\0'.
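
A minimal sketch of that suggestion, assuming GNU coreutils, where `join -t '\0'` selects NUL as the field delimiter so that a line containing no NUL bytes is treated as a single field. The sample data and the `workdir`, `only_in_all`, and `join.err` names are made up for illustration and stand in for the attached files:

```shell
# Hypothetical sample data standing in for the attached example files.
workdir=$(mktemp -d)
printf '%s\n' 'Root' 'null' 'Industrial CHO and Lipids' 'Industrial Enzymes' \
  > "$workdir/all-primary.csv"
printf '%s\n' 'null' 'Industrial Enzymes' > "$workdir/db-primary.csv"

# sort compares whole lines by default, in the current locale.
sort -o "$workdir/all-primary.csv" "$workdir/all-primary.csv"
sort -o "$workdir/db-primary.csv" "$workdir/db-primary.csv"

# Make join compare whole lines as well (same locale as the sort above),
# so both tools agree on the key. Any diagnostics are captured in join.err.
only_in_all=$(join -t '\0' -v 1 \
  "$workdir/all-primary.csv" "$workdir/db-primary.csv" 2>"$workdir/join.err")

printf '%s\n' "$only_in_all"
```

With the key and the locale matching between sort and join, the "not in sorted order" messages should not appear regardless of which locale LANG selects.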

Comment 2 Mike Hanafey 2009-10-30 13:25:36 UTC
Sorry, my mistake. I thought I was joining on whole lines, but I had left off the -t option.

