1167424 – MariaDB crashes with a backtrace: "[ERROR] mysqld got signal 11 ;"

Bug 1167424 - MariaDB crashes with a backtrace: "[ERROR] mysqld got signal 11 ;"

Summary: MariaDB crashes with a backtrace: "[ERROR] mysqld got signal 11 ;"

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	mariadb
Sub Component:
Version:	21
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Honza Horak
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2014-11-24 17:32 UTC by Kashyap Chamarthy
Modified:	2014-12-08 15:24 UTC (History)
CC List:	7 users (show)
Fixed In Version:	mariadb-10.0.15-1.fc22
Clone Of:
Environment:
Last Closed:	2014-12-08 15:24:34 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Complete mariadb.log (38.45 KB, text/plain) 2014-11-24 17:35 UTC, Kashyap Chamarthy	no flags	Details
View All

Description Kashyap Chamarthy 2014-11-24 17:32:34 UTC

Description of problem:

Noticed this crash when testing with DevStack (OpenStack's Developer installation)

[. . .]
2014-11-24 12:05:16.451 TRACE neutron.api.v2.resource DBConnectionError: (OperationalError) (2013, 'Lost connection to MySQL server during query') 'UPDATE routers SET gw_port_id=%s WHERE routers.id = %s' ('6656ea81-5d7c-4a20-990c-8b8098b3c6e9', '41a03fae-640c-4357-9280-fde33787ea70')
[. . .]

Version-Release number of selected component (if applicable):

  - mariadb-10.0.14-7.fc21.x86_64
  - DevStack with Git Master ommit: 59bfc08489f9d322acc0e44f14f27d630d730be4
  - Neutron with Git Master commit: 2f8ec46314ea414c0594d3f2102a1bcc3524f3de

Contextual log snippet with backtrace:

[. . .]
141124 11:59:22 mysqld_safe mysqld from pid file /var/run/mariadb/mariadb.pid ended
141124 11:59:23 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141124 11:59:23 [Note] InnoDB: Using mutexes to ref count buffer pool pages
141124 11:59:23 [Note] InnoDB: The InnoDB memory heap is disabled
141124 11:59:23 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
141124 11:59:23 [Note] InnoDB: Memory barrier is not used
141124 11:59:23 [Note] InnoDB: Compressed tables use zlib 1.2.8
141124 11:59:23 [Note] InnoDB: Using Linux native AIO
141124 11:59:23 [Note] InnoDB: Using CPU crc32 instructions
141124 11:59:23 [Note] InnoDB: Initializing buffer pool, size = 128.0M
141124 11:59:23 [Note] InnoDB: Completed initialization of buffer pool
141124 11:59:23 [Note] InnoDB: Highest supported file format is Barracuda.
141124 11:59:23 [Note] InnoDB: 128 rollback segment(s) are active.
141124 11:59:23 [Note] InnoDB: Waiting for purge to start
141124 11:59:23 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.20-68.0 started; log sequence number 39265321
141124 11:59:23 [Note] Plugin 'FEEDBACK' is disabled.
141124 11:59:23 [Note] Server socket created on IP: '0.0.0.0'.
141124 11:59:24 [Note] Event Scheduler: Loaded 0 events
141124 11:59:24 [Note] /usr/libexec/mysqld: ready for connections.
Version: '10.0.14-MariaDB-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
141124 12:05:16 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.0.14-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=12
max_threads=153
thread_count=8
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467145 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f17d12bbbc8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f17b810adc0 thread_stack 0x48000
/usr/libexec/mysqld(my_print_stacktrace+0x3d)[0x7f17cebc92fd]
/usr/libexec/mysqld(handle_fatal_signal+0x345)[0x7f17ce734c75]
/lib64/libpthread.so.0(+0x100d0)[0x7f17cddcf0d0]
/usr/libexec/mysqld(+0x912f24)[0x7f17ceb11f24]
/usr/libexec/mysqld(+0x804cea)[0x7f17cea03cea]
/usr/libexec/mysqld(+0x8081d0)[0x7f17cea071d0]
/usr/libexec/mysqld(+0x76424e)[0x7f17ce96324e]
/usr/libexec/mysqld(_ZN7handler13ha_update_rowEPKhPh+0x46e)[0x7f17ce73f09e]
/usr/libexec/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesbPySB_+0x1244)[0x7f17ce671aa4]
/usr/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x25cc)[0x7f17ce5d8f9c]
/usr/libexec/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20b)[0x7f17ce5de1db]
/usr/libexec/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x15b0)[0x7f17ce5dfe40]
/usr/libexec/mysqld(_Z24do_handle_one_connectionP3THD+0x1e2)[0x7f17ce6a3af2]
/usr/libexec/mysqld(handle_one_connection+0x40)[0x7f17ce6a3b90]
/lib64/libpthread.so.0(+0x752a)[0x7f17cddc652a]
/lib64/libc.so.6(clone+0x6d)[0x7f17cc3d177d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f175c004cc0): UPDATE routers SET gw_port_id='6656ea81-5d7c-4a20-990c-8b8098b3c6e9' WHERE routers.id = '41a03fae-640c-4357-9280-fde33787ea70'
Connection ID (thread ID): 25
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
[. . .]



Additional info
----------------

MariaDB was running just fine when this occurs:
=======
$ systemctl status mariadb -l
● mariadb.service - MariaDB 10.0 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled)
  Drop-In: /usr/lib/systemd/system/mariadb.service.d
           └─mysql-compat.conf
   Active: active (running) since Mon 2014-11-24 11:59:24 EST; 32min ago
  Process: 23032 ExecStartPost=/usr/libexec/mysql-check-upgrade (code=exited, status=0/SUCCESS)
  Process: 22866 ExecStartPost=/usr/libexec/mysql-wait-ready $MAINPID (code=exited, status=0/SUCCESS)
  Process: 22830 ExecStartPre=/usr/libexec/mysql-prepare-db-dir %n (code=exited, status=0/SUCCESS)
  Process: 22806 ExecStartPre=/usr/libexec/mysql-check-socket (code=exited, status=0/SUCCESS)
 Main PID: 22865 (mysqld_safe)
   CGroup: /system.slice/mariadb.service
           ├─22865 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─26152 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mariadb/mariadb.log --pid-file=/var/run/mariadb/mariadb.pid --socket=/var/lib/mysql/mysql.sock
=======

Comment 1 Kashyap Chamarthy 2014-11-24 17:35:02 UTC

Created attachment 960882 [details]
Complete mariadb.log

Comment 2 Kashyap Chamarthy 2014-11-26 12:03:49 UTC

However, updating to mariadb-10.0.14-8.fc21.x86_64, which is in updates testing:


  $ bodhi -L mariadb | grep f21
         f21-updates-testing  mariadb-10.0.14-8.fc21
                         f21  mariadb-10.0.14-7.fc21
       f21-updates-candidate  mariadb-10.0.14-7.fc21

fixes the issue for me. Added Karma for the said package version in updates-testing in Bodhi

  https://admin.fedoraproject.org/updates/FEDORA-2014-15761/mariadb-10.0.14-8.fc21

Comment 3 Kashyap Chamarthy 2014-11-26 14:13:34 UTC

Spoke too soon :-( On my first test run with mariadb-10.0.14-8.fc21 -- it fixes the issue (the said QUERY runs just fine). However, I re-ran the test two more times with it, I can consistently reproduce the Seg fault ('signal 11')

From /var/log/mariadb/mariadb.log:

[. . .]
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fc9ac274dc0 thread_stack 0x48000
/usr/libexec/mysqld(my_print_stacktrace+0x3d)[0x7fc9c453632d]
/usr/libexec/mysqld(handle_fatal_signal+0x345)[0x7fc9c40a1c75]
/lib64/libpthread.so.0(+0x100d0)[0x7fc9c373c0d0]
/usr/libexec/mysqld(+0x92b920)[0x7fc9c4497920]
/usr/libexec/mysqld(+0x81d6ea)[0x7fc9c43896ea]
/usr/libexec/mysqld(+0x820bd0)[0x7fc9c438cbd0]
/usr/libexec/mysqld(+0x77cc4e)[0x7fc9c42e8c4e]
/usr/libexec/mysqld(_ZN7handler13ha_update_rowEPKhPh+0x46e)[0x7fc9c40ac09e]
/usr/libexec/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesbPySB_+0x1244)[0x7fc9c3fdeaa4]
/usr/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x25cc)[0x7fc9c3f45f9c]
/usr/libexec/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x20b)[0x7fc9c3f4b1db]
/usr/libexec/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x15b0)[0x7fc9c3f4ce40]
/usr/libexec/mysqld(_Z24do_handle_one_connectionP3THD+0x1e2)[0x7fc9c4010af2]
/usr/libexec/mysqld(handle_one_connection+0x40)[0x7fc9c4010b90]
/lib64/libpthread.so.0(+0x752a)[0x7fc9c373352a]
/lib64/libc.so.6(clone+0x6d)[0x7fc9c1d3e77d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fc94c004cc0): UPDATE routers SET gw_port_id='5f32203c-a0b5-4139-a8e1-27909d736103' WHERE routers.id = 'c7f828bd-92df-4630-a404-b11cde9faeb1'
Connection ID (thread ID): 25
Status: NOT_KILLED
[. . .]


So, re-opening the bug.

Comment 4 Daniel Black 2014-11-27 03:12:09 UTC

Hi, saw your #maria IRC channel comments.

You may have seen that 10.0.15 has been released.

Its possible that this might be a result of MDEV-7026.

Looking at the change log I can't see a bug that maps to this bug report.
https://mariadb.com/kb/en/mariadb/development/changelogs/mariadb-10015-changelog/

Is it possible to try 10.0.15 anyway? (https://downloads.mariadb.org/mariadb/repositories/#mirror=aarnet_pty_ltd&distro=Fedora&distro_release=fedora20-amd64&version=10.0 ) I know its not Fedora21 but can you perhaps extract the mysqld binary and place this temporally in /usr/sbin to see if it also generates the error.

Your stack trace is missing some information see: https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

I'm not sure if this is deadlock related however could you enable innodb_print_all_deadlocks=1 to record information if it is.

Can you provide a schema? And a query plan (EXPLAIN UPDATE routers....)?

Does if occur if you manually do this update?

Feed free to do an upstream report with as much info as possible.

For bulk or private uploads see: https://mariadb.com/kb/en/meta/ftp/

Comment 5 Daniel Black 2014-11-27 03:13:39 UTC

if its repeatable perhaps even enabling the general query log.

Comment 6 Kashyap Chamarthy 2014-11-27 12:40:10 UTC

(In reply to Daniel Black from comment #4)
> Hi, saw your #maria IRC channel comments.
>
> 
> You may have seen that 10.0.15 has been released.

I haven't noticed, I picked whatever is the latest from Fedora's Rawhide.

> 
> Its possible that this might be a result of MDEV-7026.

Are you referring to this? https://mariadb.atlassian.net/browse/MDEV-7026 -- Race in InnoDB/XtraDB mutex implementation can stall or hang the server

> 
> Looking at the change log I can't see a bug that maps to this bug report.
> https://mariadb.com/kb/en/mariadb/development/changelogs/mariadb-10015-
> changelog/
> 
>
> Is it possible to try 10.0.15 anyway?
>
> (https://downloads.mariadb.org/mariadb/repositories/
> #mirror=aarnet_pty_ltd&distro=Fedora&distro_release=fedora20-
> amd64&version=10.0 ) I know its not Fedora21 but can you perhaps extract the
> mysqld binary and place this temporally in /usr/sbin to see if it also
> generates the error.

I'll try to test with this when I setup a new test environment, it's a fairly involved setup. Ideally I'd prefer if 10.0.15 lands in Fedora Rawhide - I pinged MariaDB package maintainer about it.

> Your stack trace is missing some information see:
> https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

I'll see if I can get this done this week.

> I'm not sure if this is deadlock related however could you enable
> innodb_print_all_deadlocks=1 to record information if it is.
> 
> Can you provide a schema? And a query plan (EXPLAIN UPDATE routers....)?

Sorry, the said test environment is not available anymore, and I went with PostgreSQL on that machine as it worked well in this case.
 
> Does if occur if you manually do this update?

No, it doesn't. I should have mentioned it in my original description, when I tried manually, it works:

[. . .]
MariaDB [(none)]> use neutron;
MariaDB [neutron]> UPDATE routers SET gw_port_id='ad942f2f-df76-4bc0-9405-e64828008c2a' WHERE routers.id = '1d8428b0-f
333-4f63-af0e-91f93810e303';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0
[. . .]

> Feed free to do an upstream report with as much info as possible.
> 
> For bulk or private uploads see: https://mariadb.com/kb/en/meta/ftp/

Thanks for your response.

Comment 7 Daniel Black 2014-11-27 23:23:34 UTC

> Are you referring to this? https://mariadb.atlassian.net/browse/MDEV-7026 -- Race in InnoDB/XtraDB mutex implementation can stall or hang the server

Yes, though it doesn't sound immediately obvious the wrong values in mutexes that prevent operations when 0 could occur. Of course I could be really wrong.

> I'll try to test with this when I setup a new test environment, it's a fairly involved setup. Ideally I'd prefer if 10.0.15 lands in Fedora Rawhide - I pinged MariaDB package maintainer about it.

I understand. I'm a sysadmin too. Extracting and moving one executable in and out of service is a since compromise until the package maintainer updates. Provided 'ldd ./mysqld' picks up all the shared libraries is a pretty safe bet.

This 10.0.15 release fixes a huge 100+ bugs which is a larger than any recent release (for large values of recent).

Anyway. Don't quite have enough to go on but thanks for reporting. If you do get to try out 10.0.15 in a test environment it would be good to know the result. It seemed fairly repeatable even if it didn't do it on command.

Comment 8 Kashyap Chamarthy 2014-12-02 01:34:19 UTC

(In reply to Daniel Black from comment #7)

[. . .]

> Anyway. Don't quite have enough to go on but thanks for reporting. If you do
> get to try out 10.0.15 in a test environment it would be good to know the
> result.

I just tested MariaDB 10.0.15 from Fedora Rawhide, and I don't see the 
bug in question any more -- not atleast in my three consecutive runs of
the same test: setting up a minimal OpenStack DevStack environment with
its Neutron networking).

So, I can reasonably say: MariaDB 10.0.15 "fixes" this bug.

Thanks for the pointers, Daniel.


As a later artifact for others following the bug, I was at these are the 
Master git commits of OpenStack projects involved when I tested this:
-----------------------------------------------
For DevStack:
commit 3b1647310d8b8b975e74d2601ba841e373b97b20
For Glance:
commit f82574fa686fd3aae2e1284233c40ff05ad8abf2
For Keystone:
commit 10831618368f97a4521aeff8edf3efa5a8a8a663
For Neutron:
commit ab8d6756ebbe23624b39cbdfedcf2c54e65f04fe
For Nova:
commit 93f0f4ad8e7effba0902674bd476677a94946f75
For requirements:
commit 29c2e207d656c001580f59d4da1e4ffddde09397
-----------------------------------------------

Comment 9 Daniel Black 2014-12-02 05:06:50 UTC

glad it worked out. Thanks for testing again. I feel reassured that my 10.0.15+ installs are less likely to crash now.

Comment 10 Jakub Dorňák 2014-12-02 12:36:53 UTC

Hi, now, there is mariadb-10.0.15 available in rawhide.

Comment 11 Kashyap Chamarthy 2014-12-02 13:11:34 UTC

(In reply to Jakub Dorňák from comment #10)
> Hi, now, there is mariadb-10.0.15 available in rawhide.

Thanks Jakub. I tested it and mentioned the result in comment #8 above.

Note You need to log in before you can comment on or make changes to this bug.