Bug 1601146 - ntfsclone fails with: ntfsclone: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: ntfs-3g
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Tom "spot" Callaway
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
Reported: 2018-07-14 08:40 UTC by Richard W.M. Jones
Modified: 2018-07-24 13:35 UTC (History)

Fixed In Version: ntfs-3g-2017.3.23-8.fc29
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-24 13:35:36 UTC
Type: Bug
Embargoed:


Attachments
Avoid malloc() of zero bytes (844 bytes, patch)
2018-07-16 08:06 UTC, Jean-Pierre André
Always allocate full clusters (2.15 KB, patch)
2018-07-16 12:25 UTC, Jean-Pierre André

Description Richard W.M. Jones 2018-07-14 08:40:59 UTC
Description of problem:

When using ntfsclone we observe this error which seems to indicate
memory corruption in the binary:

ntfsclone -o - --save-image --metadata --ignore-fs-check /dev/sda2
ntfsclone v2017.3.23 (libntfs-3g)
NTFS volume version: 3.1
Cluster size       : 4096 bytes
Current volume size: 268402688 bytes (269 MB)
Current device size: 268403200 bytes (269 MB)
Scanning volume ...
  0.00 percent completed^M100.00 percent completed
Accounting clusters ...
Space in use       : 1 MB (0.2%)   
Scanning volume ...
  0.00 percent completed^Mntfsclone: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

Version-Release number of selected component (if applicable):

ntfsprogs-2017.3.23-6.fc29.x86_64

I recently upgraded this machine.  This did NOT happen with the previous
version (2:2017.3.23-4.fc28).  However since I also upgraded a lot of
other stuff including glibc, it might have been caused by another component.

How reproducible:

100%

Steps to Reproduce:
1. Run the libguestfs test suite in the tests/ntfs directory.

Comment 1 Jean-Pierre André 2018-07-16 08:06:39 UTC
Created attachment 1459065
Avoid malloc() of zero bytes

Due to a recent change involving the allocation of buffers for clusters, a malloc() of zero bytes can be made when no backup boot sector is present in the image. This patch avoids that situation, although such a malloc() should be allowed, which means the patch might not address the real issue.

Comment 2 Richard W.M. Jones 2018-07-16 09:55:40 UTC
malloc(0) is valid in C, so although this may be some bug it's unlikely
to be this bug.
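
For reference, a minimal C sketch of the point about malloc(0). The C standard leaves the result implementation-defined: it is either a null pointer or a unique pointer that must not be dereferenced, and either result may be passed to free(). The function name below is hypothetical and is not part of ntfs-3g.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
 * Requests zero bytes and releases the result again.
 * Returns 1 once both calls have completed. */
int zero_byte_malloc_is_safe(void)
{
    void *p = malloc(0);  /* implementation-defined: NULL or a unique pointer */

    /* The pointer must never be dereferenced, but it may always be freed:
     * free(NULL) is defined to do nothing. */
    free(p);
    return 1;
}
```

So a zero-byte request on its own should not trip a heap-consistency assertion; something else has to have damaged the heap first.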

Here's a simple reproducer of the original bug that doesn't require any
special privileges or devices:

$ rm -f test.img
$ truncate -s 1G test.img
$ mkfs.ntfs -F test.img
test.img is not a block device.
mkntfs forced anyway.
[...]
$ ntfsclone -o clone --save-image --metadata --ignore-fs-check test.img 
ntfsclone v2017.3.23 (libntfs-3g)
NTFS volume version: 3.1
Cluster size       : 4096 bytes
Current volume size: 1073737728 bytes (1074 MB)
Current device size: 1073741824 bytes (1074 MB)
Scanning volume ...
100.00 percent completed
Accounting clusters ...
Space in use       : 1 MB (0.1%)   
Scanning volume ...
ntfsclone: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)

I was also able to get a stack trace although it's unfortunately
missing vital debug information for some stack frames even though
I believe I have all the required debuginfo installed:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f52db7f2835 in __GI_abort () at abort.c:79
#2  0x00007f52db851a0a in __malloc_assert (
    assertion=assertion@entry=0x7f52db959ea0 "(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)", 
    file=file@entry=0x7f52db9560d2 "malloc.c", line=line@entry=2385, 
    function=function@entry=0x7f52db95a4f0 <__PRETTY_FUNCTION__.12924> "sysmalloc") at malloc.c:298
#3  0x00007f52db853daf in sysmalloc (nb=nb@entry=4112, 
    av=av@entry=0x7f52dbb8dc60 <main_arena>) at malloc.c:2382
#4  0x00007f52db855146 in _int_malloc (
    av=av@entry=0x7f52dbb8dc60 <main_arena>, bytes=bytes@entry=4096)
    at malloc.c:4111
#5  0x00007f52db856387 in __GI___libc_malloc (bytes=bytes@entry=4096)
    at malloc.c:3041
#6  0x00007f52dbbc1922 in ntfs_malloc (size=4096) at misc.c:57
#7  0x00005634397bb707 in ?? ()
#8  0x00005634397bcc29 in ?? ()
#9  0x00005634397ba2d6 in ?? ()
#10 0x00007f52db7f43b3 in __libc_start_main (main=0x5634397b92b0, argc=7, 
    argv=0x7fff7baebe18, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fff7baebe08)
    at ../csu/libc-start.c:308
#11 0x00005634397babaa in ?? ()

I am using ntfs-3g-2017.3.23-6.fc29.x86_64

Comment 3 Richard W.M. Jones 2018-07-16 10:02:01 UTC
A few Valgrind errors too, although again the symbols are
unfortunately unavailable for unknown reasons:

==8242== Syscall param read(buf) points to unaddressable byte(s)
==8242==    at 0x517B3D5: read (read.c:26)
==8242==    by 0x10C105: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x10CE11: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x10E81E: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x10B2D5: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x50B23B2: (below main) (libc-start.c:308)
==8242==  Address 0x54c83c0 is 0 bytes after a block of size 1,024 alloc'd
==8242==    at 0x4C2FB5B: malloc (vg_replace_malloc.c:299)
==8242==    by 0x4E6A921: ntfs_malloc (misc.c:57)
==8242==    by 0x10DF54: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x10B2D5: ??? (in /usr/sbin/ntfsclone)
==8242==    by 0x50B23B2: (below main) (libc-start.c:308)
==8242== 

valgrind: m_mallocfree.c:280 (mk_plain_bszB): Assertion 'bszB != 0' failed.
valgrind: This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.

Comment 4 Jean-Pierre André 2018-07-16 12:25:26 UTC
Created attachment 1459151
Always allocate full clusters

OK, got it. When --ignore-fs-check is used, the rescue procedure processes full clusters, so full clusters must be allocated even when they are not fully used.
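
A minimal sketch of the idea, not the attached patch itself: round the number of bytes actually used up to a whole number of clusters before allocating, so that code which later reads or writes a full cluster stays inside the buffer. The helper names, and the choice to reserve one cluster when nothing is used, are assumptions made for illustration. The figures in the comments (a 1,024-byte block and a 4096-byte cluster) are taken from the Valgrind report and the ntfsclone output earlier in this bug, on the assumption that the 1,024-byte block is the buffer in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: round `used` up to a multiple of `cluster_size`.
 * Written with division so that the intermediate sum cannot overflow. */
size_t round_up_to_cluster(size_t used, size_t cluster_size)
{
    assert(cluster_size > 0);
    size_t clusters = used / cluster_size + (used % cluster_size != 0);
    return clusters * cluster_size;
}

/* Hypothetical helper: allocate a zeroed buffer that spans whole clusters.
 * On success, stores the real capacity in *capacity (if non-NULL). */
void *alloc_cluster_buffer(size_t used, size_t cluster_size, size_t *capacity)
{
    size_t size = round_up_to_cluster(used, cluster_size);

    /* Assumption: reserve one cluster even when no bytes are in use. */
    if (size == 0)
        size = cluster_size;

    void *buf = calloc(1, size);
    if (buf != NULL && capacity != NULL)
        *capacity = size;
    return buf;
}
```

With these helpers, a request for 1024 used bytes on a volume with 4096-byte clusters yields a 4096-byte buffer, so a later full-cluster read no longer runs past the end of the allocation.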

Comment 5 Richard W.M. Jones 2018-07-16 13:12:59 UTC
Scratch build containing this patch:

https://koji.fedoraproject.org/koji/taskinfo?taskID=28335543

Comment 6 Richard W.M. Jones 2018-07-16 13:18:57 UTC
I can confirm that the patch works. I have pushed it to Fedora and
started a build in Rawhide.

Please let us know when the final version of the patch goes upstream
in case the patch in Fedora needs adjustment.

Comment 7 Tom "spot" Callaway 2018-07-24 13:35:36 UTC
Fixed in rawhide, thanks Richard for handling this one while I was on vacation. :)

