Bug 2069264 - [abrt] notmuch: notmuch_tags_valid(): notmuch killed by SIGSEGV
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: notmuch
Version: 35
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Michael J Gruber
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:984ab0912b2464ac1bdc7a4ed03...
Depends On:
Blocks:
Reported: 2022-03-28 15:27 UTC by Jan Hutař
Modified: 2023-01-02 20:48 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-29 10:52:25 UTC
Type: ---
Embargoed:


Attachments
File: backtrace (11.46 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: core_backtrace (2.16 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: cpuinfo (2.52 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: dso_list (598 bytes, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: environ (3.20 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: exploitable (82 bytes, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: limits (1.29 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: maps (3.98 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: mountinfo (2.88 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: open_fds (579 bytes, text/plain), 2022-03-28 15:27 UTC, Jan Hutař
File: proc_pid_status (1.39 KB, text/plain), 2022-03-28 15:27 UTC, Jan Hutař

Description Jan Hutař 2022-03-28 15:27:02 UTC
Description of problem:
Some email must be causing this :-/ neomutt fails because of it as well when I have virtual-mailboxes configured. The command I used in this case was:

notmuch search "tag:zsdidaktis OR from:@zsdidaktis.cz"

Version-Release number of selected component:
notmuch-0.35-2.fc35

Additional info:
reporter:       libreport-2.15.2
backtrace_rating: 4
cgroup:         0::/user.slice/user-1000.slice/session-2.scope
cmdline:        notmuch search $'tag:zsdidaktis OR from:@zsdidaktis.cz'
crash_function: notmuch_tags_valid
executable:     /usr/bin/notmuch
journald_cursor: s=2b2e17a38b1e4b81b5a9d1b5dc73c207;i=4a10;b=b9117c89319c46bfb4da24d1af361aae;m=36b1505a0a;t=5db48d2887322;x=3e305f1fd4ce2258
kernel:         5.16.15-201.fc35.x86_64
rootdir:        /
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (6 frames)
 #0 notmuch_tags_valid at lib/tags.c:51
 #1 _thread_add_message at lib/thread.cc:254
 #2 _notmuch_thread_create at lib/thread.cc:631
 #3 notmuch_threads_get at lib/query.cc:671
 #4 do_search_threads at /usr/src/debug/notmuch-0.35-2.fc35.x86_64/notmuch-search.c:150
 #5 notmuch_search_command at /usr/src/debug/notmuch-0.35-2.fc35.x86_64/notmuch-search.c:845

Comment 1 Jan Hutař 2022-03-28 15:27:06 UTC
Created attachment 1868753 [details]
File: backtrace

Comment 2 Jan Hutař 2022-03-28 15:27:07 UTC
Created attachment 1868754 [details]
File: core_backtrace

Comment 3 Jan Hutař 2022-03-28 15:27:09 UTC
Created attachment 1868755 [details]
File: cpuinfo

Comment 4 Jan Hutař 2022-03-28 15:27:10 UTC
Created attachment 1868756 [details]
File: dso_list

Comment 5 Jan Hutař 2022-03-28 15:27:12 UTC
Created attachment 1868757 [details]
File: environ

Comment 6 Jan Hutař 2022-03-28 15:27:13 UTC
Created attachment 1868758 [details]
File: exploitable

Comment 7 Jan Hutař 2022-03-28 15:27:15 UTC
Created attachment 1868759 [details]
File: limits

Comment 8 Jan Hutař 2022-03-28 15:27:17 UTC
Created attachment 1868760 [details]
File: maps

Comment 9 Jan Hutař 2022-03-28 15:27:18 UTC
Created attachment 1868761 [details]
File: mountinfo

Comment 10 Jan Hutař 2022-03-28 15:27:20 UTC
Created attachment 1868762 [details]
File: open_fds

Comment 11 Jan Hutař 2022-03-28 15:27:21 UTC
Created attachment 1868763 [details]
File: proc_pid_status

Comment 12 Michael J Gruber 2022-03-28 16:26:08 UTC
Thanks for the report.

I suspect this is an internal problem in old code, but can you check whether downgrading to notmuch 0.34 helps? Or are you able to isolate and share the problematic e-mails (possibly by direct e-mail)?

Alternatively, this build might catch the SIGSEGV:

https://koji.fedoraproject.org/koji/taskinfo?taskID=84840776

Still it would be good to have a reproducer.

Comment 13 Jan Hutař 2022-03-28 19:22:57 UTC
Thank you a lot for a quick response!

I have downgraded to https://kojipkgs.fedoraproject.org//packages/notmuch/0.34/1.fc35/x86_64/notmuch-0.34-1.fc35.x86_64.rpm but it is still failing.

Comment 15 Jan Hutař 2022-03-28 20:13:01 UTC
I installed what you suggested (notmuch-0.35-3.fc35.x86_64) and reran the reproducer, and it is still failing:

Stack trace of thread 467515:
#0  0x00007fb9ae00899d __strlen_avx2 (libc.so.6 + 0x17f99d)
#1  0x00007fb9adf28fe3 __strdup (libc.so.6 + 0x9ffe3)
#2  0x00007fb9ae2fcbc3 notmuch_threads_get (libnotmuch.so.5 + 0x25bc3)
#3  0x000055d861784464 notmuch_search_command (notmuch + 0x15464)
#4  0x000055d86177a143 main (notmuch + 0xb143)
#5  0x00007fb9adeb6560 __libc_start_call_main (libc.so.6 + 0x2d560)
#6  0x00007fb9adeb660c __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2d60c)
#7  0x000055d86177a485 _start (notmuch + 0xb485)

Comment 16 Michael J Gruber 2022-03-28 20:38:49 UTC
Thanks for the quick check!

So, as suspected, it's not a regression.

The stack trace with notmuch-0.35-3 is different since I patched a few library functions to do additional checks on a struct pointer (which apparently happens to be NULL in your case). I can't quite make sense of the new trace, though (i.e. where a messy string is passed to strdup).

What happens if you search the terms without explicit OR:

notmuch search tag:zsdidaktis from:@zsdidaktis.cz

or individually:

notmuch search tag:zsdidaktis
notmuch search from:@zsdidaktis.cz

Somehow, one of the matched messages seems to lead to a NULL tags pointer, which is weird - I can't reproduce it simply with a message without tags, for example. Maybe adding `--output=tags` to these searches shows something fishy? Maybe an automatically generated tag with a weird encoding (fishing in the dark)?

In any case, notmuch should not crash, of course, but I'm trying to find the cause.

Comment 17 Jan Hutař 2022-03-30 19:13:17 UTC
Hello!

These are the tags for the message:

    $ notmuch search --offset 405 --limit 1 "tag:zsdidaktis OR from:@zsdidaktis.cz"
    Segmentation fault (core dumped)
    $ notmuch search --offset 405 --limit 1 --output tags "tag:zsdidaktis OR from:@zsdidaktis.cz"
    archive
    attachment
    inbox
    me
    replied
    seznam.cz
    unread
    zsdidaktis

Please note it is perfectly possible that my notmuch database is somehow corrupted, as I ran out of disk space a few times, but e.g. `notmuch compact` works fine.

I'm getting a segfault for all three of these commands:

 * `notmuch search "tag:zsdidaktis" "from:@zsdidaktis.cz"`
 * `notmuch search "tag:zsdidaktis"`
 * `notmuch search "from:@zsdidaktis.cz"`

BTW looks like "summary" and "threads" are the only outputs showing the error:

    $ notmuch search --offset 405 --limit 1 --output=summary "tag:zsdidaktis OR from:@zsdidaktis.cz"
    Segmentation fault (core dumped)
    $ notmuch search --offset 405 --limit 1 --output=threads "tag:zsdidaktis OR from:@zsdidaktis.cz"
    Segmentation fault (core dumped)
    $ notmuch search --offset 405 --limit 1 --output=messages "tag:zsdidaktis OR from:@zsdidaktis.cz"
    id:VI1PR10MB16636E801F616CC93C139117F46E0.PROD.OUTLOOK.COM
    $ notmuch search --offset 405 --limit 1 --output=files "tag:zsdidaktis OR from:@zsdidaktis.cz"
    /home/jhutar/.Maildir/cur/1593426607.M383559P7159.localhost.localdomain:2,S
    $ notmuch search --offset 405 --limit 1 --output=tags "tag:zsdidaktis OR from:@zsdidaktis.cz"
    archive
    attachment
    inbox
    me
    replied
    seznam.cz
    unread
    zsdidaktis
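
The manual `--offset 405 --limit 1` probing above could be automated along these lines. This is only a sketch: `find_crashing_offsets` is a hypothetical helper name, the number of positions to probe is passed in by hand, and it assumes the shell reports a signal-killed process with an exit status above 128 (139 for SIGSEGV, as seen later in this report).

```shell
# Hypothetical helper: probe each result position separately and report
# the offsets at which `notmuch search` is killed by a signal.
# Usage: find_crashing_offsets <count> <search terms...>
find_crashing_offsets() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        # One thread per invocation, so a crash points at a single offset.
        notmuch search --offset "$i" --limit 1 --output=summary "$@" \
            >/dev/null 2>&1
        status=$?
        # Exit statuses above 128 mean "killed by a signal";
        # 139 is 128 + 11 (SIGSEGV).
        if [ "$status" -gt 128 ]; then
            echo "offset $i: exit status $status"
        fi
        i=$((i + 1))
    done
}
```

For the query in this report, a call might look like `find_crashing_offsets 500 "tag:zsdidaktis OR from:@zsdidaktis.cz"`, where 500 is just an upper bound chosen for illustration.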

Comment 18 Michael J Gruber 2022-03-31 10:20:18 UTC
Very keen and helpful observation, and in line with the original stack trace, since do_search_threads() is used in these two modes only. (I originally looked deeper down in the stack/lines up.)

At this point we can take this to the notmuch-devel list or continue here - I'm sure it's neither a packaging problem nor a regression. I suspect a problem in the database which, arguably, notmuch could deal with in a better way.

Things you could try:

`notmuch search --exclude=false` could avoid the codepath which calls into notmuch_message_get_tags(), but the real problem might show up later, as it did with my first patch (which made one codepath behave well).

`NOTMUCH_DEBUG_QUERY=1 notmuch search` will output the underlying Xapian queries which notmuch uses. In particular, they are different for different output modes and might indicate whether there are suspicious Terms or thread entries (G...).

From that we could delve into the Xapian db. I did this once but would have to brush up on my Xapian-fu.

Possible solution for the suspected db problem:

`notmuch reindex` will probably run into the same segfault before reindexing the message ... So, alternatively: back up the tags (selective dump), move the message file out of the mail tree, run `notmuch new` to purge it from the db, then move it back or `notmuch insert` it, and restore the tags afterwards if needed.
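
As a rough illustration of the workaround described above, the steps for a single message might be scripted like this. It is a sketch under assumptions, not a tested procedure: `reimport_message` and the holding directory are hypothetical, only the first file of the message is handled, and the file is moved back and picked up by a second `notmuch new` rather than by `notmuch insert`.

```shell
# Hypothetical helper following the steps above for one message.
# Usage: reimport_message <message-id> <holding-dir>
reimport_message() {
    msgid=$1
    holding=$2
    tagfile="$holding/tags.dump"

    mkdir -p "$holding" || return 1
    # 1. Selective dump: save only this message's tags.
    notmuch dump --output="$tagfile" -- "id:$msgid" || return 1
    # 2. Look up the message file (first match only) and move it
    #    out of the mail tree.
    file=$(notmuch search --output=files "id:$msgid" | head -n 1)
    [ -n "$file" ] || return 1
    mv "$file" "$holding/" || return 1
    # 3. Let notmuch notice the missing file and purge the message.
    notmuch new || return 1
    # 4. Put the file back and index it again.
    mv "$holding/$(basename "$file")" "$file" || return 1
    notmuch new || return 1
    # 5. Restore the saved tags.
    notmuch restore --input="$tagfile"
}
```

The function stops at the first failing step, so if the selective dump itself crashes, the mail file is never moved.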

Comment 19 Michael J Gruber 2022-08-29 10:52:25 UTC
Closing due to the version this was reported against. Please feel free to reopen if problems persist with notmuch 0.36 or the upcoming 0.37 (in updates-testing).

Comment 20 Jan Hutař 2023-01-02 14:57:31 UTC
Hello. Unfortunately the issue is still there:

    $ rpm -q notmuch
    notmuch-0.37-1.fc37.x86_64
    $ notmuch search "tag:zsdidaktis OR from:@zsdidaktis.cz"
    [...more than 400 rows...]
    thread:000000000002515e   2020-03-02 [1/1] [...]
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:(null)   1970-01-01 [0/0] ; (null) ()
    thread:0000000000024b47   2020-02-04 [2/3] [...]
    thread:0000000000024b57   2020-02-03 [1/1] [...]
    thread:0000000000024afa   2020-01-31 [1/1] [...]
    thread:0000000000024a02   2020-01-27 [1/1] [...]
    thread:00000000000249b9   2020-01-24 [1/2] [...]
    Segmentation fault (core dumped)
    $ echo $?
    139

Looks like "--exclude=false" is not helping:

    $ notmuch search --exclude=false "tag:zsdidaktis OR from:@zsdidaktis.cz"
    [...]
    Segmentation fault (core dumped)

Using "NOTMUCH_DEBUG_QUERY=1":

    $ NOTMUCH_DEBUG_QUERY=1 notmuch search "tag:zsdidaktis OR from:@zsdidaktis.cz"
    [...]
    Query string is:
    thread:0000000000021c67
    Exclude query is:
    Query()
    Final query is:
    Query((Tmail AND 0 * G0000000000021c67))
    Segmentation fault (core dumped)

"notmuch reindex" seems to be failing silently:

    $ notmuch reindex "tag:zsdidaktis OR from:@zsdidaktis.cz"
    $ echo $?
    1

In the end I did this (I do not need these messages):

    $ mv $( notmuch search --output=files "tag:zsdidaktis OR from:@zsdidaktis.cz" ) ../DELME/
    $ notmuch new

Thank you and happy new year!

Comment 21 Michael J Gruber 2023-01-02 16:37:17 UTC
Thanks for reporting back, and happy new year to you, too!

> thread:(null)   1970-01-01 [0/0] ; (null) ()

These lines should definitely not be there and indicate a database problem. No, notmuch should not crash, but the root cause is a corrupt db.
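
A quick way to check a search result for such bogus entries, sketched here with a hypothetical helper name (`count_null_threads`), is to count the summary lines whose thread ID is printed as `(null)`:

```shell
# Hypothetical helper: count result lines whose thread ID printed as "(null)".
# Reads `notmuch search` summary output on stdin and prints the count.
count_null_threads() {
    # In a basic regular expression the parentheses match literally.
    grep -c '^thread:(null)'
}
```

Something like `notmuch search "tag:zsdidaktis OR from:@zsdidaktis.cz" | count_null_threads` would then print the number of suspicious lines emitted before any crash.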

>    Query string is:
>    thread:0000000000021c67
>    Exclude query is:
>    Query()
>    Final query is:
>    Query((Tmail AND 0 * G0000000000021c67))

Those are the queries for the individual result threads just before display, not the original query. This indeed means that thread:0000000000021c67 should be the (first) one causing the segfault, but parallelism could fool us.
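
To pull the last per-thread query out of a long debug log, a small filter such as the following could help. This is a sketch: `last_debug_thread` is a hypothetical name, and it only relies on the "Query string is:" / "thread:..." line pairs shown in the output quoted above.

```shell
# Hypothetical helper: print the last "thread:..." query string that appears
# directly after a "Query string is:" line. Reads the debug log on stdin.
last_debug_thread() {
    awk '
        # Normalise the current line by dropping leading whitespace.
        { line = $0; sub(/^[ \t]+/, "", line) }
        # Remember a thread query that directly follows the marker line.
        prev == "Query string is:" && line ~ /^thread:/ { last = line }
        { prev = line }
        END { if (last != "") print last }
    '
}
```

A possible invocation is `NOTMUCH_DEBUG_QUERY=1 notmuch search "tag:zsdidaktis OR from:@zsdidaktis.cz" 2>&1 | last_debug_thread`; the `2>&1` is there on the assumption that the debug lines may be written to stderr.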

Did moving some messages away solve the issue?

I would in fact try the opposite approach: use `notmuch dump` to back up all tags, remove all xapian files (in `$(notmuch config get database.path)/xapian`), rescan your mail (I know this takes a bit), then `notmuch restore` your tags.
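
The rebuild suggested above might be scripted roughly as follows. This is a cautious sketch rather than a verified recipe: `rebuild_database` is a hypothetical helper, the xapian location is taken exactly as written in the comment (the function stops if that directory does not exist in your layout), and the old index is moved aside to a hypothetical `xapian.corrupt` directory instead of being deleted.

```shell
# Hypothetical helper: back up tags, set the old index aside, rescan, restore.
# Usage: rebuild_database <tag-backup-file>
rebuild_database() {
    backup=$1
    # Stop right away if the dump fails: without a tag backup,
    # nothing else should be touched.
    notmuch dump --output="$backup" || return 1
    dbpath=$(notmuch config get database.path) || return 1
    # Path as given in the comment above; bail out if it is not there.
    [ -d "$dbpath/xapian" ] || return 1
    # Move the index aside instead of deleting it, so the step can be undone.
    mv "$dbpath/xapian" "$dbpath/xapian.corrupt" || return 1
    # Rescan the mail into a fresh index, then put the tags back.
    notmuch new || return 1
    notmuch restore --input="$backup"
}
```

Because the dump comes first, a crash during `notmuch dump` leaves the existing database untouched.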

Comment 22 Jan Hutař 2023-01-02 20:48:20 UTC
The problem is that I cannot dump the tags, as I was getting a segfault there too. I understand the issue is that the data is corrupt. I'll remove all messages and pass them through procmail again.

