Bug 560127

Summary: 'sort -V' sorts alpha parts incorrectly
Product: [Fedora] Fedora
Reporter: Bruce Jerrick <bmj001>
Component: coreutils
Assignee: Kamil Dudka <kdudka>
Status: CLOSED CANTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: kdudka, ovasik, twaugh
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-09-28 06:13:14 EDT

Description Bruce Jerrick 2010-01-29 17:47:04 EST
Description of problem:
Given input with alpha parts with different lengths, a non-alphanum
separator, and a numeric part, 'sort -V' produces incorrect output.
It seems to be ignoring length differences of the alpha part even
when the separator and numeric parts are the same in all input.

Version-Release number of selected component (if applicable):
coreutils-7.6-8.fc12
coreutils-7.2-7.fc11

How reproducible:
100%

Steps to Reproduce:
sort -V
aaa.1
aaaa.1
^D

Actual results:
aaaa.1
aaa.1

Expected results:
aaa.1
aaaa.1

Additional info:

Without -V, the result is correct.
If the '.1'  is omitted, result is correct.
If just the '.' is omitted, result is correct.
Behavior is the same if '-' is used instead of '.' .

Comment 1 Kamil Dudka 2010-01-29 18:11:03 EST
Thank you for filing the bug!

> Actual results:
> aaaa.1
> aaa.1

The core of the version-compare predicate of 'sort -V' is taken from 'dpkg', which behaves the same way:

$ dpkg --compare-versions aaaa.1 \< aaa.1 && echo LT
LT

It's partially documented within the 'filevercmp' gnulib module:

/* slightly modified verrevcmp function from dpkg
   S1, S2 - compared string
   S1_LEN, S2_LEN - length of strings to be scanned

   This implements the algorithm for comparison of version strings
   specified by Debian and now widely adopted.  The detailed
   specification can be found in the Debian Policy Manual in the
   section on the `Version' control field.  This version of the code
   implements that from s5.6.12 of Debian Policy v3.8.0.1
   http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version */
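
The ordering rule behind this result can be illustrated with a minimal Python sketch of the dpkg-style comparison. This is an illustrative re-implementation, not the gnulib or dpkg source. The names `order` and `verrevcmp` mirror the C functions, `version_sorted` is a hypothetical helper, and the ranking of a non-alphanumeric character as its code plus 256 is an assumption based on the Debian Policy rule that all letters sort before all non-letters. The suffix handling that filevercmp performs before calling verrevcmp is left out.

```python
import functools

DIGITS = "0123456789"

def order(c: str) -> int:
    """Rank one ASCII character for the non-digit comparison (assumed rule)."""
    if c in DIGITS:
        return 0
    if c.isascii() and c.isalpha():
        return ord(c)          # letters rank by their own code
    if c == "~":
        return -1              # tilde sorts before everything, even end of string
    return ord(c) + 256        # other characters rank after all letters

def verrevcmp(s1: str, s2: str) -> int:
    """Return <0, 0 or >0, comparing non-digit and digit runs alternately."""
    i = j = 0
    n1, n2 = len(s1), len(s2)
    while i < n1 or j < n2:
        first_diff = 0
        # Non-digit run: compare character ranks one position at a time.
        while (i < n1 and s1[i] not in DIGITS) or (j < n2 and s2[j] not in DIGITS):
            c1 = order(s1[i]) if i < n1 else 0
            c2 = order(s2[j]) if j < n2 else 0
            if c1 != c2:
                return c1 - c2
            i += 1
            j += 1
        # Digit run: skip leading zeros, then the longer number wins.
        while i < n1 and s1[i] == "0":
            i += 1
        while j < n2 and s2[j] == "0":
            j += 1
        while i < n1 and j < n2 and s1[i] in DIGITS and s2[j] in DIGITS:
            if not first_diff:
                first_diff = ord(s1[i]) - ord(s2[j])
            i += 1
            j += 1
        if i < n1 and s1[i] in DIGITS:
            return 1
        if j < n2 and s2[j] in DIGITS:
            return -1
        if first_diff:
            return first_diff
    return 0

def version_sorted(lines):
    """Hypothetical helper: sort strings with the comparison above."""
    return sorted(lines, key=functools.cmp_to_key(verrevcmp))

print(version_sorted(["aaa.1", "aaaa.1"]))  # ['aaaa.1', 'aaa.1']
print(version_sorted(["aaaa", "aaa"]))      # ['aaa', 'aaaa']
print(version_sorted(["aaaa1", "aaa1"]))    # ['aaa1', 'aaaa1']
```

Under this rule the alphabetic run is not compared by length. At the fourth character, '.' is compared against 'a', and every letter ranks below every non-letter, so "aaaa.1" sorts first. When the separator is absent, the fourth character of the shorter string is a digit or the end of the string, both of which rank as 0, so the shorter string sorts first.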

I propose improving the documentation on the coreutils side as a first step.

> Expected results:
> aaa.1
> aaaa.1

I don't think we can change the implementation to produce the result above without breaking something else.  The key problem is that there are different version schemes, and whichever you choose, it will be a trade-off.

> If the '.1'  is omitted, result is correct.
> If just the '.' is omitted, result is correct.

Same as 'dpkg --compare-versions'.

> Behavior is the same if '-' is used instead of '.' .    

This seems to differ from 'dpkg --compare-versions'; I have not investigated it further yet.

Comment 2 Kamil Dudka 2010-03-03 05:43:14 EST
(In reply to comment #0)
> Actual results:
> aaaa.1
> aaa.1

It bails out here:

$ printf "aaa.1\naaaa.1\n" > in && gdb -q --args sort -V in
(gdb) b filevercmp
Breakpoint 1 at 0x409e52: file filevercmp.c, line 134.
(gdb) r
Breakpoint 1, filevercmp (s1=0x61b4d0 "aaa.1", s2=0x61b4d6 "aaaa.1") at filevercmp.c:134
134       int simple_cmp = strcmp (s1, s2);
(gdb) b 97
Breakpoint 2 at 0x409ca4: file filevercmp.c, line 97.
(gdb) c
Breakpoint 2, verrevcmp (s1=0x61b4d0 "aaa.1", s1_len=5, s2=0x61b4d6 "aaaa.1", s2_len=6) at filevercmp.c:97
97                  return s1_c - s2_c;
(gdb) print s1[s1_pos]
$1 = 46 '.'
(gdb) print s2[s2_pos]
$2 = 97 'a'
(gdb) print s1_c
$3 = 302
(gdb) print s2_c
$4 = 97
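
The two values printed by gdb can be reproduced by hand. The following is a short Python sketch, assuming the ranking that those values suggest: a non-alphanumeric character is ranked as its code plus 256, and a letter as its own code. The function name `order` is used here for illustration only.

```python
def order(c: str) -> int:
    """Assumed rank of a single ASCII character in the non-digit comparison."""
    if c.isdigit():
        return 0
    if c.isalpha():
        return ord(c)
    if c == "~":
        return -1
    return ord(c) + 256

s1_c = order(".")   # 46 + 256 = 302, matching the value printed for s1_c
s2_c = order("a")   # 97, matching the value printed for s2_c
print(s1_c, s2_c, s1_c - s2_c)  # 302 97 205
```

The positive difference means that s1 ("aaa.1") is ordered after s2 ("aaaa.1"), which is the output reported in comment #0.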

Comment 3 Kamil Dudka 2010-03-03 05:52:01 EST
(In reply to comment #1)
> > Behavior is the same if '-' is used instead of '.' .    
> 
> This seems to differ from 'dpkg --compare-versions', not yet investigated
> further.    

The difference lies in dpkg's preprocessing.  When '-' is used, the suffix is cut off before the string reaches verrevcmp():

$ gdb -q --args src/dpkg --compare-versions aaa.1 \< aaaa.1
(gdb) b verrevcmp
Breakpoint 1 at 0x41e390: file vercmp.c, line 42.
(gdb) r
Breakpoint 1, verrevcmp (val=0x7466c0 "aaa.1", ref=0x7466d0 "aaaa.1") at vercmp.c:42

$ gdb -q --args src/dpkg --compare-versions aaa-1 \< aaaa-1
(gdb) b verrevcmp
Breakpoint 1 at 0x41e390: file vercmp.c, line 42.
(gdb) r
Breakpoint 1, verrevcmp (val=0x7466c0 "aaa", ref=0x7466d0 "aaaa") at vercmp.c:42
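
The preprocessing step can be sketched in Python as follows, assuming the Debian version layout [epoch:]upstream_version[-debian_revision], in which the revision is whatever follows the last hyphen. The helper name `split_debian_revision` is hypothetical, and epoch handling is left out.

```python
def split_debian_revision(version: str) -> tuple[str, str]:
    """Split a version at its last hyphen into (upstream, revision).

    A version without a hyphen has an empty revision.
    """
    upstream, sep, revision = version.rpartition("-")
    if not sep:
        return version, ""
    return upstream, revision

print(split_debian_revision("aaa-1"))   # ('aaa', '1')
print(split_debian_revision("aaaa-1"))  # ('aaaa', '1')
print(split_debian_revision("aaa.1"))   # ('aaa.1', '')
```

With the hyphenated input, only "aaa" and "aaaa" reach the comparison, which matches the arguments shown in the second gdb session. 'sort -V' performs no such split, so there the '-' is compared against 'a' exactly as '.' is.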

Comment 4 Kamil Dudka 2010-09-28 06:13:14 EDT
I have no idea how to fix this without breaking something else.  Thanks for understanding!  Feel free to reopen the bug once you have a solution.