Bug 631782
Summary: | condor_gridmanager segmentation fault when run from command line | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Lubos Trilety <ltrilety> | ||||
Component: | condor | Assignee: | Matthew Farrellee <matt> | ||||
Status: | CLOSED ERRATA | QA Contact: | Tomas Rusnak <trusnak> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 1.0 | CC: | iboverma, ltoscano, matt, trusnak | ||||
Target Milestone: | 2.0 | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | condor-7.5.6-0.1 | Doc Type: | Bug Fix | ||||
Doc Text: |
C: condor_gridmanager deleted uninitialized memory when run as root or when passed -o. This is not a user concern, because condor_gridmanager is not intended to be run directly, as root or otherwise, and runs properly when invoked from Condor.
C: No significant consequence, because condor_gridmanager is invoked properly by Condor itself.
F: Checks were put in place to avoid the improper delete.
R: All is well.
|
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2011-06-23 15:41:13 UTC | Type: | --- | ||||
Bug Blocks: | 693778 | ||||||
Attachments: |
|
Description
Lubos Trilety
2010-09-08 11:36:14 UTC
Analysis of core file:

Core was generated by `condor_gridmanager'.
Program terminated with signal 11, Segmentation fault.
#0 DC_Exit (status=1, shutdown_program=0x0) at daemon_core_main.cpp:270
270 delete daemonCore;
(gdb) info threads
* 1 Thread 12984 DC_Exit (status=1, shutdown_program=0x0) at daemon_core_main.cpp:270
(gdb) bt
#0 DC_Exit (status=1, shutdown_program=0x0) at daemon_core_main.cpp:270
#1 0x00000000004f7430 in main (argc=1, argv=0x7fffec31c778) at daemon_core_main.cpp:1574

condor_gridmanager should probably be in libexec.

This happens when run by root, or as a user when passed only -o.

Program received signal SIGSEGV, Segmentation fault.
DC_Exit (status=1, shutdown_program=0x0) at daemon_core_main.cpp:270
270 delete daemonCore;
(gdb) where
#0 DC_Exit (status=1, shutdown_program=0x0) at daemon_core_main.cpp:270
#1 0x00000000004f7430 in main (argc=1, argv=0x7fff9ed1f258) at daemon_core_main.cpp:1574

gridmanager_main.cpp:

void
main_pre_dc_init( int argc, char* argv[] )
{
...
    } else if ( is_root() ) {
        dprintf( D_ALWAYS, "Don't know what user to run as!\n" );
        DC_Exit( 1 );
...

The problem is that DC_Exit, when called from pre_dc_init, tries to delete dc (daemonCore), which is NULL. Either DC_Exit should be more careful, or calling DC_Exit from pre_dc_init should be illegal.

The -o case arises because pre_dc_init calls usage, which in turn calls DC_Exit.

Created attachment 445965
strace
strace log from the run
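The second option raised in the analysis above, making an exit call illegal before initialization, could look roughly like the following. This is only a sketch under assumptions: the `Phase` enum, `g_phase`, and `checked_exit` are hypothetical names invented for illustration and are not part of Condor's daemon core.

```cpp
#include <stdexcept>

// Hypothetical lifecycle phases; the real daemon core may not track these.
enum class Phase { PreInit, Running };

// Hypothetical global: stays PreInit until initialization has completed.
static Phase g_phase = Phase::PreInit;

// Hypothetical exit wrapper. It rejects calls made before initialization
// instead of touching globals that have not been allocated yet.
int checked_exit(int status) {
    if (g_phase == Phase::PreInit) {
        throw std::logic_error("exit requested before daemon core initialization");
    }
    // A real daemon would tear down its state and call exit(status) here;
    // the sketch just returns the status so the behavior can be observed.
    return status;
}
```

With this shape, a pre-initialization caller would have to print its message and terminate by other means, which turns the misuse into an explicit error rather than a crash.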
Fixed upstream for 7.5.6 --

Author: Matthew Farrellee <matt@redhat>

Added NULL detection around "delete daemonCore" in DC_Exit

The issue was discovered when running condor_gridmanager from the command line. The gridmanager can call DC_Exit from within main_pre_dc_init, which is by definition before the global daemonCore instance is allocated. DC_Exit would blindly attempt to delete a NULL daemonCore.

An alternative fix was to prevent the gridmanager from calling DC_Exit within main_pre_dc_init, but code already in DC_Exit tested for daemonCore == NULL, making it appear that it should handle all cases where daemonCore may be null.

diff --git a/src/condor_daemon_core.V6/daemon_core_main.cpp b/src/condor_daemon_core.V6/daemon_core_main.cpp
index 1301cbc..1821d6e 100644
--- a/src/condor_daemon_core.V6/daemon_core_main.cpp
+++ b/src/condor_daemon_core.V6/daemon_core_main.cpp
@@ -280,9 +280,12 @@ DC_Exit( int status, const char *shutdown_program )
 #endif /* ! WIN32 */
 
 	// Now, delete the daemonCore object, since we allocated it.
-	unsigned long pid = daemonCore->getpid( );
-	delete daemonCore;
-	daemonCore = NULL;
+	unsigned long pid = 0;
+	if (daemonCore) {
+		pid = daemonCore->getpid( );
+		delete daemonCore;
+		daemonCore = NULL;
+	}
 
 	// Free up the memory from the config hash table, too.
 	clear_config();

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
C: condor_gridmanager deleted uninitialized memory when run as root or when passed -o. This is not a user concern, because condor_gridmanager is not intended to be run directly, as root or otherwise, and runs properly when invoked from Condor.
C: No significant consequence, because condor_gridmanager is invoked properly by Condor itself.
F: Checks were put in place to avoid the improper delete.
R: All is well.
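The guard added to DC_Exit can be exercised in isolation with a small stand-in. The sketch below is an assumption-laden illustration, not Condor code: `FakeDaemonCore` and `teardown_daemon_core` are hypothetical names, and only the `getpid()` call and the global `daemonCore` pointer are modeled on the patch. Note that in standard C++ `delete` on a null pointer is a no-op, while calling a member function through a null pointer is undefined behavior; the guard covers both statements, and the sketch does not claim which one actually faulted.

```cpp
// Hypothetical stand-in for Condor's DaemonCore class; only getpid() is modeled.
class FakeDaemonCore {
public:
    explicit FakeDaemonCore(unsigned long pid) : pid_(pid) {}
    unsigned long getpid() const { return pid_; }
private:
    unsigned long pid_;
};

// Global instance, null until initialization allocates it
// (mirrors the global daemonCore pointer named in the patch).
static FakeDaemonCore *daemonCore = nullptr;

// Guarded teardown in the shape of the patched DC_Exit.
// Returns the pid that was read, or 0 when no instance existed yet.
unsigned long teardown_daemon_core() {
    unsigned long pid = 0;
    if (daemonCore) {
        pid = daemonCore->getpid();
        delete daemonCore;
        daemonCore = nullptr;
    }
    return pid;
}
```

Calling `teardown_daemon_core()` before any instance has been allocated returns 0 without dereferencing the pointer, which corresponds to the pre-initialization path described in the commit message; calling it twice in a row is also safe because the pointer is reset after deletion.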
Reproduced on RHEL5/x86_64 with:
$CondorVersion: 7.4.5 Feb 4 2011 BuildID: RH-7.4.5-0.8.el5 PRE-RELEASE $
$CondorPlatform: X86_64-LINUX_RHEL5 $

# condor_gridmanager
Segmentation fault

Retested over supported platforms x86,x86_64/RHEL5,RHEL6 with:
condor-7.6.1-0.4
# condor_gridmanager
# echo $?
1
No core file created. No crash found.
>>> VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html