Bug 555716
| Summary: | [qpidd+store] broker rarely segfaults when stressed by perftest | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Frantisek Reznicek <freznice> |
| Component: | qpid-cpp | Assignee: | Andrew Stitcher <astitcher> |
| Status: | CLOSED ERRATA | QA Contact: | Frantisek Reznicek <freznice> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 1.2 | CC: | esammons, gsim |
| Target Milestone: | 1.3 | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Under rare conditions a broker with the persistence storage module could crash with a SIGSEGV signal. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2010-10-14 16:04:29 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
Description
Frantisek Reznicek
2010-01-15 10:14:31 UTC
Additional info for the reproducer. The following snippet shows how the qpidd and perftest parameters are swept (in the qpid_test_qpidd-perftest_performance test), linked at:
http://cvs.devel.redhat.com/cgi-bin/cvsweb.cgi/tests/distribution/MRG/Messaging/qpid_test_qpidd-perftest_performance/runtest.sh?rev=HEAD

    for ((i_loop=0; i_loop<${NR_OF_LOOPS}; i_loop++)); do
      # test start-up settings
      case $((${i_loop}%3)) in
        0) # w/o management w store
          qpidd_test_params_run="${QPIDD_TEST_PARAMS} \
            ${QPIDD_TEST_PARAMS_COMMON_JRNL} --mgmt-enable 0"
          ;;
        1) # w management w store
          qpidd_test_params_run="${QPIDD_TEST_PARAMS} \
            ${QPIDD_TEST_PARAMS_COMMON_JRNL} --mgmt-enable 1"
          ;;
        2) # w/o management w/o store
          qpidd_test_params_run="${QPIDD_TEST_PARAMS} --mgmt-enable 0"
          # rename the msgstore
          if [ -e ${msgstore_fp} ]; then
            mv ${msgstore_fp} ${msgstore_fp}_
          else
            lognl "WARNING: store module not found - skipping ${msgstore_fp} " \
                  "rename ${i_loop_p1}/${NR_OF_LOOPS}"
          fi
          ;;
      esac
      ...
      pt_mode_list="shared fanout topic"
      pt_qt_list="1 2"
      pt_durable_list="yes no"
      pt_npubs_list="1 2 3"
      pt_nsubs_list="1 2 3"
      pt_msg_count_list="200000 400000"
      pt_msg_size_list="128 1024"
      pt_tx_list="0 1 2"
      pt_tx_list="0"
      pt_ac_list="yes no"
      pt_ac_list="no"
      pt_iterations=1
      pt_common="--iterations ${pt_iterations} --summary --unique-data yes"
      pt_common="${pt_common} --log-enable info+"
      # perftest mode loop
      for i_pt_mode in ${pt_mode_list}; do
        pt_mode="--mode ${i_pt_mode}"
        # modify qt in perftest fanout mode
        if [ "${i_pt_mode}" == "fanout" ]; then
          pt_qt_list="1"
        fi
        # qt switch loop
        for i_pt_qt in ${pt_qt_list} ; do
          pt_qt="--qt ${i_pt_qt}"
          # perftest durable loop
          for i_pt_durable in ${pt_durable_list}; do
            pt_durable="--durable ${i_pt_durable}"
            # perftest npubs loop
            for i_pt_npubs in ${pt_npubs_list}; do
              pt_npubs="--npubs ${i_pt_npubs}"
              # perftest nsubs loop
              for i_pt_nsubs in ${pt_nsubs_list}; do
                pt_nsubs="--nsubs ${i_pt_nsubs}"
                # perftest msg count loop
                for i_pt_msg_count in ${pt_msg_count_list}; do
                  pt_msg_count="--count ${i_pt_msg_count}"
                  # perftest msg size loop
                  for i_pt_msg_size in ${pt_msg_size_list}; do
                    pt_msg_size="--size ${i_pt_msg_size}"
                    # perftest tx loop
                    for i_pt_tx in ${pt_tx_list}; do
                      pt_tx="--tx ${i_pt_tx}"
                      # perftest async-commit loop
                      for i_pt_ac in ${pt_ac_list}; do
                        pt_ac="--async-commit ${i_pt_ac}"
                        # randomly select the qpidd port - conditioned
                        if [ "${i_loop}" -lt \
                             "${i_loop_thr_for_qpidd_keep_running}" ]; then
                          mrg_gen_my_rand_in_range 40001 43590
                          QPIDD_PORT=${MY_RAND}
                        fi
                        # collect perftest parameters
                        pt_params="${pt_common} -p ${QPIDD_PORT}"
                        pt_params="${pt_params} ${pt_mode} ${pt_qt} ${pt_durable}"
                        pt_params="${pt_params} ${pt_npubs} ${pt_nsubs}"
                        pt_params="${pt_params} ${pt_msg_count} ${pt_msg_size}"
                        pt_params="${pt_params} ${pt_tx} ${pt_ac}"
                        ...
                        ( /usr/bin/time -f "%e" -o ${TIME_TRANSCRIPT} \
                            perftest ${pt_params} >> ${PERFTEST_TRANSCRIPT} 2>&1; \
                          echo $? > ${TEMP_FILE} ) &
                        ...
                      done
                    done
                  done
                done
              done
            done
          done
        done
      done
    done

In the 1.2 source code it looks most likely that the qpid::broker::Connection timeoutTimer member is 0, causing a null dereference and a SIGSEGV. I think there's a reasonable chance that changes in the Connection class have fixed the bug on the trunk code line, so retesting for this bug would be very helpful.
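To get a feel for how much load one pass of the sweep quoted above generates, the following is a minimal sketch that counts the perftest invocations per outer-loop iteration. The list values are copied from the snippet; the function names `count_words` and `count_perftest_runs` are hypothetical, and the sketch assumes the elided (`...`) parts of the script do not change any of the lists. Note that, as quoted, `pt_qt_list` is narrowed to "1" for fanout and not restored, so the topic mode also runs with a single qt value.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of runtest.sh.

count_words() {
  # Intentionally unquoted: word splitting turns the list into arguments.
  set -- $1
  echo "$#"
}

count_perftest_runs() {
  local pt_mode_list="shared fanout topic"
  local pt_qt_list="1 2"
  local per_qt=1 runs=0 list mode

  # Lists below the qt loop: durable, npubs, nsubs, msg count, msg size,
  # tx and async-commit (the last two after their overrides in the snippet).
  for list in "yes no" "1 2 3" "1 2 3" "200000 400000" "128 1024" "0" "no"; do
    per_qt=$((per_qt * $(count_words "${list}")))
  done

  for mode in ${pt_mode_list}; do
    # Mirrors the snippet: narrowed for fanout and never reset afterwards.
    if [ "${mode}" = "fanout" ]; then
      pt_qt_list="1"
    fi
    runs=$((runs + $(count_words "${pt_qt_list}") * per_qt))
  done

  echo "${runs}"
}

count_perftest_runs   # prints 288
```

Under these assumptions each outer iteration starts 288 perftest runs (72 per qt value; two qt values for shared, one each for fanout and topic). If the elided code restored `pt_qt_list` before the topic mode, the figure would be 360.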
The issue has been fixed (no aborts / crashes detected), tested in four extended week runs on RHEL 4.8 / 5.5 i386 / x86_64 on packages:

    python-qpid-0.7.946106-1.el5
    python-saslwrapper-0.1.934605-2.el5
    qpid-cpp-client-0.7.946106-2.el5
    qpid-cpp-client-devel-0.7.946106-2.el5
    qpid-cpp-client-devel-docs-0.7.946106-2.el5
    qpid-cpp-client-ssl-0.7.946106-2.el5
    qpid-cpp-mrg-debuginfo-0.7.946106-2.el5
    qpid-cpp-server-0.7.946106-2.el5
    qpid-cpp-server-cluster-0.7.946106-2.el5
    qpid-cpp-server-devel-0.7.946106-2.el5
    qpid-cpp-server-ssl-0.7.946106-2.el5
    qpid-cpp-server-store-0.7.946106-2.el5
    qpid-cpp-server-xml-0.7.946106-2.el5
    qpid-java-client-0.7.946106-3.el5
    qpid-java-common-0.7.946106-3.el5
    qpid-tests-0.7.946106-1.el5
    qpid-tools-0.7.946106-4.el5
    ruby-qpid-0.7.946106-2.el5
    ruby-saslwrapper-0.1.934605-2.el5
    saslwrapper-0.1.934605-2.el5
    saslwrapper-devel-0.1.934605-2.el5

-> VERIFIED

This bug seems to have been fixed as part of some other work and there is no information here about that other bug fix. I'm afraid there isn't enough information here to figure out a real release note.
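The verification above reports no aborts or crashes. As an illustration of how a harness could tell a signal death such as SIGSEGV apart from an ordinary failure, here is a minimal sketch based on the shell convention that a process killed by signal N is reported with exit status 128 + N. The helper name `classify_exit_status` is hypothetical and is not taken from the test suite; how the actual test detects broker crashes is not shown in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: describe the outcome encoded in a shell exit status.
classify_exit_status() {
  local status=$1 sig

  if [ "${status}" -eq 0 ]; then
    echo "ok"
  elif [ "${status}" -gt 128 ]; then
    # The shell reports death by signal N as exit status 128 + N;
    # `kill -l N` maps the number back to a signal name.
    sig=$(kill -l $((status - 128)) 2>/dev/null) || sig="unknown"
    echo "killed by SIG${sig}"
  else
    echo "failed with exit code ${status}"
  fi
}

# Example: run a command, then classify the status the shell recorded.
sh -c 'exit 3'
classify_exit_status $?
```

On Linux, status 139 maps to SIGSEGV (signal 11) and 134 to SIGABRT (signal 6), which are the two outcomes the verification comment rules out.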
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Under rare conditions a broker with the persistence storage module could crash with a SIGSEGV signal.
An advisory has been issued which should help resolve the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html