Bug 1567344 - Mariadb crashing
Summary: Mariadb crashing
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: mariadb
Version: 7.7
Hardware: All
OS: Linux
urgent
urgent
Target Milestone: rc
: ---
Assignee: Michal Schorm
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-04-13 20:11 UTC by Rupesh Patel
Modified: 2019-08-02 19:43 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-02 19:42:55 UTC
Target Upstream Version:


Attachments (Terms of Use)
alarm_events (1.55 KB, text/plain)
2018-06-06 18:05 UTC, Rupesh Patel
no flags Details
rest_measurement_registry (614 bytes, text/plain)
2018-06-06 18:06 UTC, Rupesh Patel
no flags Details
site_configuration (2.01 KB, text/plain)
2018-06-06 18:06 UTC, Rupesh Patel
no flags Details
syslog_events (1.46 KB, text/plain)
2018-06-06 18:07 UTC, Rupesh Patel
no flags Details
turbine_information (869 bytes, text/plain)
2018-06-06 18:07 UTC, Rupesh Patel
no flags Details

Description Rupesh Patel 2018-04-13 20:11:54 UTC
#Rpm installed #

mariadb-5.5.56-2.el7.x86_64                                 
mariadb-libs-5.5.56-2.el7.x86_64                            
mariadb-server-5.5.56-2.el7.x86_64                          

# Customer run specific query and it crash. 


Database changed
MariaDB [tcm_offline]> SELECT
    ->     *
    -> FROM
    ->     (SELECT
    ->         site_index,
    ->             turbine_id AS wt_id,
    ->             turbine_serial AS wt_serial,
    ->             turbine_name AS wt_turbine,
    ->             display_name AS wt_name,
    ->             (SELECT
    ->                     SUM(CASE
    ->                             WHEN
    ->                                 (severity = 'fault'
    ->                                     OR severity = 'emergency')
    ->                                     AND acknowledged = 'no'
    ->                                     AND syslog_events.state = 'active'
    ->                             THEN
    ->                                 1
    ->                             ELSE 0
    ->                         END)
    ->                 FROM
    ->                     syslog_events
    ->                 WHERE
    ->                     syslog_events.location_serial = turbine_serial) AS unack_syslogs,
    ->             (SELECT
    ->                     SUM(CASE
    ->                             WHEN
    ->                                 (severity = 'fault'
    ->                                     OR severity = 'emergency')
    ->                                     AND syslog_events.state = 'active'
    ->                             THEN
    ->                                 1
    ->                             ELSE 0
    ->                         END)
    ->                 FROM
    ->                     syslog_events
    ->                 WHERE
    ->                     syslog_events.location_serial = turbine_serial) AS syslogs,
    ->             (SELECT
    ->                     GROUP_CONCAT(DISTINCT syslog_events.type_name)
    ->                 FROM
    ->                     syslog_events
    ->                 WHERE
    ->                     syslog_events.location_serial = turbine_serial
    ->                         AND state = 'Active'
    ->                         AND (severity = 'fault'
    ->                         OR severity = 'emergency')) AS types,
    ->             (SELECT
    ->                     DATEDIFF(CURDATE(), MAX(after))
    ->                 FROM
    ->                     rest_measurement_registry
    ->                 WHERE
    ->                     rest_measurement_registry.serial = turbine_serial) AS measurment_is_delayed
    ->     FROM
    ->         tcm_offline.site_configuration
    ->     JOIN turbine_information ON site_index = mysqld_multi_index
    ->     WHERE
    ->         site_index = 2 AND site_index IN (1 , 2)) AS greens
    ->         LEFT JOIN
    ->     (SELECT
    ->         SUBSTRING_INDEX(SUBSTRING_INDEX(MeasurementKey, '/', 2), '/', - 1) AS turbine,
    ->             IFNULL(MAX(IF(AckStatus = 'UnAck', IF(ISNULL(ToTime), CASE
    ->                 WHEN
    ->                     AlarmType LIKE '%Red%'
    ->                         OR AlarmType LIKE 'Overrun'
    ->                 THEN
    ->                     8
    ->                 WHEN AlarmType LIKE '%Yellow%' THEN 7
    ->                 WHEN AlarmType LIKE 'Blue%' THEN 6
    ->                 ELSE 0
    ->             END, 2), IF(ISNULL(ToTime), CASE
    ->                 WHEN
    ->                     AlarmType LIKE '%Red%'
    ->                         OR AlarmType LIKE 'Overrun'
    ->                 THEN
    ->                     5
    ->                 WHEN AlarmType LIKE '%Yellow%' THEN 4
    ->                 WHEN AlarmType LIKE 'Blue%' THEN 3
    ->                 ELSE 0
    ->             END, 1))), - 1) AS severity
    ->     FROM
    ->         alarm_events
    ->     WHERE
    ->         site_index = 2 AND site_index IN (1 , 2)
    ->     GROUP BY turbine) AS alarms ON greens.wt_turbine = alarms.turbine
    -> ORDER BY measurment_is_delayed DESC , severity DESC;
ERROR 2013 (HY000): Lost connection to MySQL server during query
MariaDB [tcm_offline]>

# Crash details #

180327  6:42:02 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

Server version: 5.5.56-MariaDB
key_buffer_size=1073741824
read_buffer_size=131072
max_used_connections=2
max_threads=202
thread_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1491713 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

# Stack strace, #

Thread pointer: 0x7f9193266df0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f918825bdc0 thread_stack 0x48000
/usr/libexec/mysqld(my_print_stacktrace+0x3d)[0x7f918fe5996d]
/usr/libexec/mysqld(handle_fatal_signal+0x515)[0x7f918fa6f285]
/lib64/libpthread.so.0(+0xf5e0)[0x7f918f19c5e0]
/usr/libexec/mysqld(_ZN10Item_fieldC1EP5Field+0xc)[0x7f918fa8625c]
/usr/libexec/mysqld(_ZN14Item_subselect18get_tmp_table_itemEP3THD+0x66)[0x7f918faf21d6]
/usr/libexec/mysqld(_ZN20Item_direct_view_ref18get_tmp_table_itemEP3THD+0x11)[0x7f918fa93f91]
/usr/libexec/mysqld(_ZN4JOIN4execEv+0x184c)[0x7f918f98f81c]
/usr/libexec/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x119)[0x7f918f989b39]
/usr/libexec/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x2db)[0x7f918f98a5fb]
/usr/libexec/mysqld(+0x36b9d9)[0x7f918f9389d9]
/usr/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x45d1)[0x7f918f943001]
/usr/libexec/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x125)[0x7f918f9458c5]
/usr/libexec/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1753)[0x7f918f947923]
/usr/libexec/mysqld(_Z24do_handle_one_connectionP3THD+0x1c2)[0x7f918f9f9632]
/usr/libexec/mysqld(handle_one_connection+0x4a)[0x7f918f9f96da]
/lib64/libpthread.so.0(+0x7e25)[0x7f918f194e25]
/lib64/libc.so.6(clone+0x6d)[0x7f918d99234d]

Comment 5 Rupesh Patel 2018-04-19 17:17:51 UTC
Created attachment 1424222 [details]
table structure

Comment 19 Michal Schorm 2018-05-15 13:14:11 UTC
I'm totally fine with leaving them in the ticket.
Thanks for them

Comment 24 Rupesh Patel 2018-05-17 17:20:05 UTC
Some more information shared by customer. 

Hi Rupesh,

Customer created a new VM and they are able to reproduce the issue after running for 10 mins. Below is the update from customer:

So this time it managed to run for 10 min.

It works until the tables in tcm_offline DB getting populated with data.
The first line we see here is the Enterprise server getting turbine information from the site. 
The second line is the actual alarms and data being synced to the Enterprise server.

2018-05-17 12:00:23  rx : 183650, tx : 514, ok: 7, fail :0, time : 4.101, works
2018-05-17 12:10:28  rx : 43771206, tx : 500, ok : 7, fail : 0, time : 8.097, Fails

I tested the select statement just after I discovered this.

The whole script fails, but picking this out and run it alone, it works:

SELECT 
        SUBSTRING_INDEX(SUBSTRING_INDEX(MeasurementKey, '/', 2), '/', - 1) AS turbine,
            IFNULL(MAX(IF(AckStatus = 'UnAck', IF(ISNULL(ToTime), CASE
                WHEN
                    AlarmType LIKE '%Red%'
                        OR AlarmType LIKE 'Overrun'
                THEN
                    8
                WHEN AlarmType LIKE '%Yellow%' THEN 7
                WHEN AlarmType LIKE 'Blue%' THEN 6
                ELSE 0
            END, 2), IF(ISNULL(ToTime), CASE
                WHEN
                    AlarmType LIKE '%Red%'
                        OR AlarmType LIKE 'Overrun'
                THEN
                    5
                WHEN AlarmType LIKE '%Yellow%' THEN 4
                WHEN AlarmType LIKE 'Blue%' THEN 3
                ELSE 0
            END, 1))), - 1) AS severity
    FROM
        alarm_events
    WHERE
        site_index = 2 AND site_index IN (1 , 2)
    GROUP BY turbine;


Note:- This was also reproduced on Azure machine only.

Comment 29 Rupesh Patel 2018-06-06 18:05:52 UTC
Created attachment 1448402 [details]
alarm_events

Comment 30 Rupesh Patel 2018-06-06 18:06:23 UTC
Created attachment 1448403 [details]
rest_measurement_registry

Comment 31 Rupesh Patel 2018-06-06 18:06:50 UTC
Created attachment 1448404 [details]
site_configuration

Comment 32 Rupesh Patel 2018-06-06 18:07:18 UTC
Created attachment 1448405 [details]
syslog_events

Comment 33 Rupesh Patel 2018-06-06 18:07:49 UTC
Created attachment 1448406 [details]
turbine_information

Comment 48 Rupesh Patel 2019-08-02 19:42:55 UTC
After discussion with the team closing this bug


Note You need to log in before you can comment on or make changes to this bug.