Bug 1325964 - Error in `/usr/bin/restraint': free(): invalid next size (fast): 0x00000000018f21f0
Summary: Error in `/usr/bin/restraint': free(): invalid next size (fast): 0x0000000001...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Restraint
Classification: Retired
Component: general
Version: 0.1.23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 0.1.25
Assignee: Artem Savkov
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-04-11 13:49 UTC by Gowrishankar Rajaiyan
Modified: 2016-08-26 04:43 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-24 10:00:48 UTC


Attachments
restraint-client rpm (2.16 MB, application/x-rpm)
2016-04-12 18:17 UTC, Bill Peck
updated libssh version to 0.7.3 (2.16 MB, application/x-rpm)
2016-04-15 13:43 UTC, Bill Peck
debuginfo package (18.33 MB, application/x-rpm)
2016-04-18 13:10 UTC, Bill Peck

Comment 1 Bill Peck 2016-04-11 14:06:50 UTC
Can you upgrade to version 0.1.24 and also install the restraint-debuginfo and glibc-debuginfo packages so that the backtrace is more useful?

Thanks!

Comment 2 Gowrishankar Rajaiyan 2016-04-11 14:18:04 UTC
I couldn't update using http://copr-be.cloud.fedoraproject.org/results/bpeck/restraint/fedora-23-x86_64/

Is there another repo?

Comment 3 Bill Peck 2016-04-11 14:27:43 UTC
Oops, my bad. Too many things going on at once.

I'll get the latest version built there now.

Comment 4 Bill Peck 2016-04-11 14:58:18 UTC
OK, the repo should be good to go now. Sorry about that.

Comment 6 Bill Peck 2016-04-12 15:34:26 UTC
Hi Shanks,

Did you install restraint-debuginfo and glibc-debuginfo as well?  It should be showing more than addresses in the traceback.

I'll grab a fedora 23 system and see if I can read the core file.

Thanks.

Comment 7 Gowrishankar Rajaiyan 2016-04-12 15:55:00 UTC
Yes, I have installed them. I will see if I can find any more cores.

glibc-debuginfo-common-2.22-3.fc23.x86_64
restraint-debuginfo-0.1.24-1.fc23.x86_64
restraint-client-0.1.24-1.fc23.x86_64

Comment 8 Bill Peck 2016-04-12 18:17:17 UTC
Created attachment 1146588 [details]
restraint-client rpm

Comment 9 Bill Peck 2016-04-12 18:18:17 UTC
I've attached an updated restraint client rpm.  This basically just has fixes for signed vs unsigned.  

Can you try this?

Comment 10 Gowrishankar Rajaiyan 2016-04-13 18:34:21 UTC
Hi Bill,

Is there a repo or a wget URL from which I can download this?

Comment 12 Bill Peck 2016-04-13 19:43:01 UTC
(In reply to Gowrishankar Rajaiyan from comment #10)
> Hi Bill,
> 
> Is there are repo or a wget URL from where I can download this ?

Sorry, I don't have a repo made for it right now. Can't you just download it from bugzilla?

Comment 14 Bill Peck 2016-04-14 12:35:57 UTC
Were you able to install the version attached to this bugzilla?

Comment 15 Gowrishankar Rajaiyan 2016-04-14 13:54:47 UTC
(In reply to Bill Peck from comment #14)
> Were you able to install the version attached to this bugzilla?

Oh yes! I should have mentioned that. comment #13 is with the attached build of restraint.

Comment 18 Bill Peck 2016-04-15 13:43:35 UTC
Created attachment 1147650 [details]
updated libssh version to 0.7.3

Comment 19 Bill Peck 2016-04-15 13:46:01 UTC
The core files seem to indicate a problem in the libssh code. There have been numerous bug fixes and security fixes in libssh between versions 0.7.0 and 0.7.3, so it's a good idea to upgrade anyway.

I'm puzzled why this is suddenly happening, unless it has to do with which encryption is chosen and we are now hitting it because of upgraded servers?

In any event, Shanks, can you try upgrading to this version and see if it helps at all?

Thanks!

Comment 20 Gowrishankar Rajaiyan 2016-04-18 06:46:43 UTC
I have libssh2-1.6.0-4.fc23.x86_64 installed, and that's the latest available in the Fedora 23 repo.

Comment 21 Bill Peck 2016-04-18 11:25:17 UTC
I should have been clearer. restraint is statically linked against libssh, so the version I attached here uses the upgraded libssh.

Have you been able to upgrade to this newer restraint yet?

Version attached to this bz: 
restraint-client-0.1.24-1.git.20.b4dee5d.fc23.x86_64.rpm

Comment 23 Bill Peck 2016-04-18 13:08:09 UTC
Here is the backtrace from the latest core file from Shanks.

(gdb) bt full
#0  0x00007f4648826a98 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
        resultvar = 0
        pid = 339
        selftid = 339
#1  0x00007f464882869a in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x672f343662696c2f, sa_sigaction = 0x672f343662696c2f}, 
          sa_mask = {__val = {8026372414452428643, 7815263158107207278, 7307199665335595877, 7077745696801437450, 
              7365405400577892913, 3486684834451174964, 2337418197644357680, 3472328296227680304, 
              3467824696768081952, 4121979157631928864, 3975887029566386530, 7219659675480241254, 
              8083166874389458992, 3472328304817614880, 2321097884313198640, 3762817069363706162}}, 
          sa_flags = 538976288, sa_restorer = 0x5e}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f4648869daa in __libc_message (do_abort=do_abort@entry=2, 
    fmt=fmt@entry=0x7f464897c8a0 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
        ap = {{gp_offset = 40, fp_offset = 32764, overflow_arg_area = 0x7ffc4839af10, 
            reg_save_area = 0x7ffc4839aea0}}
        fd = 2
        on_2 = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
        written = <optimized out>
#3  0x00007f46488724fa in malloc_printerr (ar_ptr=<optimized out>, ptr=<optimized out>, 
    str=0x7f464897c9f8 "free(): invalid next size (normal)", action=3) at malloc.c:5007
        buf = "0000000001725140"
        cp = <optimized out>
        ar_ptr = <optimized out>
        str = 0x7f464897c9f8 "free(): invalid next size (normal)"
        action = 3
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=<optimized out>) at malloc.c:3868
        size = <optimized out>
        fb = <optimized out>
        nextchunk = <optimized out>
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        errstr = <optimized out>
        locked = <optimized out>
#5  0x00007f4648875cac in __GI___libc_free (mem=<optimized out>) at malloc.c:2969
        ar_ptr = <optimized out>
        p = <optimized out>
        hook = <optimized out>
#6  0x0000000000698d1d in CRYPTO_free ()
No symbol table info available.
#7  0x0000000000000000 in ?? ()
No symbol table info available.

Comment 24 Bill Peck 2016-04-18 13:10:59 UTC
Created attachment 1148208 [details]
debuginfo package

Comment 25 Bill Peck 2016-04-18 13:12:29 UTC
Can you provide your dockerfile so we can try and reproduce this?

Thanks

Comment 29 Jan Stancek 2016-04-19 08:54:24 UTC
libssh is corrupting memory:

==21577== Invalid read of size 1
==21577==    at 0x65F30A: ssh_pki_import_pubkey_file (pki.c:1011)
==21577==    by 0x641E6C: ssh_userauth_publickey_auto (auth.c:944)
==21577==    by 0x40BE5B: ssh_establish_connection (ssh.c:56)
==21577==    by 0x40C82A: ssh_start (ssh.c:283)
==21577==    by 0x40AAA0: add_recipe_host (client.c:1461)
==21577==    by 0x40AFB9: main (client.c:1578)

==21577== Invalid write of size 1
==21577==    at 0x65F32D: ssh_pki_import_pubkey_file (pki.c:1012)
==21577==    by 0x641E6C: ssh_userauth_publickey_auto (auth.c:944)
==21577==    by 0x40BE5B: ssh_establish_connection (ssh.c:56)
==21577==    by 0x40C82A: ssh_start (ssh.c:283)
==21577==    by 0x40AAA0: add_recipe_host (client.c:1461)
==21577==    by 0x40AFB9: main (client.c:1578)

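This is a result of .ssh/id_rsa.pub having an unexpected format. The entry in id_rsa.pub is missing the hostname, and there is also no newline at the end of the file, so the loop at line 1011 finds no whitespace to stop on and runs past the end of key_buf into the next malloc chunk, which line 1012 then clobbers: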

 944 int ssh_pki_import_pubkey_file(const char *filename, ssh_key *pkey)
...
1001     q = p = key_buf;
1002     while (!isspace((int)*p)) p++;
1003     *p = '\0';
1004 
1005     type = ssh_key_type_from_name(q);
1006     if (type == SSH_KEYTYPE_UNKNOWN) {
1007         SAFE_FREE(key_buf);
1008         return SSH_ERROR;
1009     }
1010     q = ++p;
1011     while (!isspace((int)*p)) p++;
1012     *p = '\0';
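
For illustration, here is a minimal sketch of a scan that cannot leave the buffer. It is not the libssh code and not necessarily the patch that was sent upstream; field_end() and split_pubkey_line() are hypothetical names, and the sketch assumes the buffer is NUL-terminated. The key point is that isspace('\0') is false, so the loop has to test for the terminator explicitly:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libssh): advance p to the end of the
 * current whitespace-delimited field. Stops at the NUL terminator as well
 * as at whitespace, so the scan never walks off the end of the buffer.
 */
static char *field_end(char *p)
{
    assert(p != NULL);
    while (*p != '\0' && !isspace((unsigned char)*p)) {
        p++;
    }
    return p;
}

/*
 * Split a "type key [comment]" line in place.
 * buf must be NUL-terminated (assumption of this sketch).
 * Returns 0 on success, -1 if the type field is not followed by a key field.
 */
static int split_pubkey_line(char *buf, char **type, char **blob)
{
    char *p;

    assert(buf != NULL && type != NULL && blob != NULL);

    *type = buf;
    p = field_end(buf);
    if (*p == '\0') {
        return -1;              /* only one field, no key data follows */
    }
    *p++ = '\0';                /* terminate the type field */

    *blob = p;
    p = field_end(p);
    if (p == *blob) {
        return -1;              /* empty key field */
    }
    *p = '\0';                  /* p is whitespace or the existing terminator */
    return 0;
}
```

With this shape, an entry such as "ssh-rsa AAAA..." with no comment and no trailing newline ends the second scan at the terminator, and the final write lands inside the buffer.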

Comment 30 Artem Savkov 2016-04-19 12:58:28 UTC
As mentioned above, fixing the id_rsa.pub should resolve the issue. I've sent a libssh patch upstream to see if they have anything to add or object to. After a review there, we can add the patch to restraint until an upstream version with this issue fixed is available.

Comment 31 Gowrishankar Rajaiyan 2016-04-19 14:50:20 UTC
OK, I triggered a few jobs after adding a newline to id_rsa.pub. I haven't seen this issue since. Thanks for looking into this.
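
For anyone who wants to check a key file for the two conditions described in comment 29 (no trailing newline, no third field), a rough sketch follows. check_pubkey_text() and the flag names are hypothetical, made up for this example; it only inspects the first line of text that has already been read into a NUL-terminated string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags for this sketch only. */
enum pubkey_issue {
    PUBKEY_OK                  = 0,
    PUBKEY_NO_TRAILING_NEWLINE = 1 << 0,
    PUBKEY_NO_COMMENT          = 1 << 1
};

/*
 * Inspect the text of a public key file and report which of the two
 * conditions from this bug apply. Returns a bitmask of pubkey_issue flags.
 */
static int check_pubkey_text(const char *text)
{
    int issues = PUBKEY_OK;
    size_t len;
    size_t fields = 0;
    const char *p;

    assert(text != NULL);

    len = strlen(text);
    if (len == 0 || text[len - 1] != '\n') {
        issues |= PUBKEY_NO_TRAILING_NEWLINE;
    }

    /* Count whitespace-separated fields on the first line. */
    p = text;
    while (*p != '\0' && *p != '\n') {
        size_t n;

        p += strspn(p, " \t");          /* skip blanks between fields */
        n = strcspn(p, " \t\r\n");      /* length of the next field */
        if (n == 0) {
            break;                      /* reached end of line or string */
        }
        fields++;
        p += n;
    }
    if (fields < 3) {
        issues |= PUBKEY_NO_COMMENT;    /* type and key only, no hostname */
    }
    return issues;
}
```

A return value of PUBKEY_OK means the first line has a type, a key, and a comment, and the text ends with a newline.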

Comment 32 Artem Savkov 2016-05-16 09:27:03 UTC
(In reply to Artem Savkov from comment #30)
> As mentioned above fixing the id_rsa.pub should resolve the issue. I've sent
> a libssh patch upstream to see if they have anything to add/object. After a
> review there we can add the patch to restraint until we an upstream version
> with this issue fixed is available.

They are taking a while to respond, so I went ahead and added the patch to restraint: https://gerrit.beaker-project.org/#/c/4897/

Comment 33 Jan Stancek 2016-05-16 12:04:08 UTC
(In reply to Artem Savkov from comment #30)
> I've sent a libssh patch upstream to see if they have anything to add/object.

Adding reference to libssh mail archive:
 http://www.libssh.org/archive/libssh/2016-04/0000008.html

Comment 34 Artem Savkov 2016-05-24 10:00:48 UTC
merged

