Created attachment 366342
2 example files.

Description of problem:

Version-Release number of selected component (if applicable):
coreutils-7.2-4.fc11.x86_64

How reproducible:
Always

Steps to Reproduce:
Using the small example files provided as an attachment --

for l in en_US.UTF-8 en_US.utf8 en_US.iso88591 en_US.ISO-8859-1 C; do
    echo "-------------------------------------------------------------------------------------------------"
    echo "LANG=$l"
    export LANG=$l
    sort -o all-primary.csv all-primary.csv
    sort -o db-primary.csv db-primary.csv
    join -v 1 all-primary.csv db-primary.csv
done

Actual results:
-------------------------------------------------------------------------------------------------
LANG=en_US.UTF-8
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.utf8
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.iso88591
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=en_US.ISO-8859-1
Industrial CHO and Lipids
join: file 1 is not in sorted order
join: file 2 is not in sorted order
null
Root
-------------------------------------------------------------------------------------------------
LANG=C
Industrial CHO and Lipids
Root
null

Expected results:
No sort order messages.

Additional info:
I think this is because sort compares the whole line by default, whereas join compares only the first field. Specifically, in non-C locales the ' ' characters sort differently relative to other characters. If you want `join` to use the whole line, use -t'\0'.
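For illustration, the suggestion above could be tried with a small script along these lines. It is only a sketch: the file names and contents are hypothetical stand-ins (the attachment's real data is not reproduced here), it assumes a GNU coreutils `join` that accepts -t'\0', and it pins LC_ALL=C purely so the result is the same on any machine. The point is that sort and join run under the same locale and compare the same key, here the whole line.

```shell
#!/bin/sh
# Use one locale for BOTH sort and join so they agree on collation order.
# C is chosen here only to keep the demo deterministic.
LC_ALL=C
export LC_ALL

# Hypothetical sample data standing in for the attached files.
tmp=$(mktemp -d)
printf '%s\n' 'Root' 'Industrial CHO and Lipids' 'null' 'Shared Item' > "$tmp/all.csv"
printf '%s\n' 'Shared Item' > "$tmp/db.csv"

# Whole-line sort, as in the original report.
sort -o "$tmp/all.csv" "$tmp/all.csv"
sort -o "$tmp/db.csv" "$tmp/db.csv"

# -t '\0' makes NUL the field separator; since the lines contain no NUL,
# each whole line becomes the join key, matching how sort ordered the files.
# -v 1 prints the lines of file 1 that have no match in file 2.
result=$(join -t '\0' -v 1 "$tmp/all.csv" "$tmp/db.csv")
printf '%s\n' "$result"

rm -rf "$tmp"
```

If join's default first-field matching is what is actually wanted, the alternative is to keep join as it is and sort on that same key instead, for example with `sort -k 1b,1`.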
Sorry, my mistake. Thought I was doing this on whole lines. The -t option was left off.