
Bug 636271

Summary: Incomplete HFS config causes Negotiator to SEGV
Product: Red Hat Enterprise MRG Reporter: Matthew Farrellee <matt>
Component: condor Assignee: Erik Erlandson <eerlands>
Status: CLOSED ERRATA QA Contact: Tomas Rusnak <trusnak>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.3 CC: jthomas, pmackinn, trusnak
Target Milestone: 1.3.2   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: condor-7.4.5-0.2 Doc Type: Bug Fix
Doc Text:
When 'GROUP_LIST' contains a subgroup, but the subgroup's parent is missing, the missing parent group caused the tree construction routine to attempt to access a non-existent parent data structure which resulted in a segmentation fault. With this update, a missing parent is detected during the tree construction routine and a warning is output to an appropriate log file.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-15 12:13:58 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Bug Depends On:    
Bug Blocks: 528800    

Description Matthew Farrellee 2010-09-21 19:36:46 UTC
Description of problem:

Incomplete HFS config causes Negotiator to SEGV


Version-Release number of selected component (if applicable):

7.4.4-0.14


How reproducible:

100%


Steps to Reproduce:
1. Config...

GROUP_NAMES = A1.A2.A3
GROUP_QUOTA_DYNAMIC_A1.A2 = 0.1
GROUP_QUOTA_DYNAMIC_A1.A2.A3 = 0.1

2. Start Negotiator

Actual results:

09/21/10 14:34:17 Group Table : group a1.a2.a3 quota 0.100 usage 0.000 prio 0.00
09/21/10 14:34:17 Sort : sorting group vector
Stack dump for process 1121 at timestamp 1285097657 (9 frames)
condor_negotiator(dprintf_dump_stack+0x63)[0x499203]
condor_negotiator[0x49b1b2]
/lib64/libpthread.so.0[0x3dfda0f440]
condor_negotiator(Matchmaker::negotiationTime()+0x1c95)[0x477d75]
condor_negotiator(TimerManager::Timeout()+0x129)[0x4983e9]
condor_negotiator(DaemonCore::Driver()+0x23e)[0x48690e]
condor_negotiator(main+0x1030)[0x494730]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3dfd61ec5d]
condor_negotiator[0x4633f9]

Expected results:

No crash, and either a message about the incomplete configuration together with defined fallback semantics, or defined semantics for groups not listed in GROUP_NAMES.

Comment 2 Matthew Farrellee 2010-09-23 19:23:17 UTC
*** Bug 636951 has been marked as a duplicate of this bug. ***

Comment 3 Erik Erlandson 2010-11-19 21:19:30 UTC
Latest devel branch has improved config checking:
V7_4-BZ619557-HFS-tree-structure

Functional tests for bad HFS config are attached here:
https://bugzilla.redhat.com/show_bug.cgi?id=619557

Comment 4 Erik Erlandson 2010-12-21 15:27:17 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
This bug manifests when GROUP_LIST contains a subgroup, but is missing the subgroup's parent.  For example, containing the subgroup "a.b", but missing parent group "a".

Consequence:
The missing parent group causes the tree construction routine to attempt to access non-existent parent data structure which results in a seg-fault.

Fix:
Logic was added to the tree construction routine to detect a missing parent and output a warning to log.

Result:
Negotiator will now output a warning to log when it encounters a group with a missing parent group, instead of crashing.
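
The Cause/Fix notes above can be illustrated with a minimal sketch. This is not the condor_negotiator code (which is C++); the names GroupNode and build_group_tree are hypothetical, and two behaviours are assumptions: group names are lowercased (the log above prints "a1.a2.a3" for A1.A2.A3), and a group with a missing parent is skipped after the warning.

```python
import logging

log = logging.getLogger("hfs_sketch")  # hypothetical logger name


class GroupNode:
    """One node of the group hierarchy (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.children = {}


def build_group_tree(group_names):
    """Build a tree from dotted group names such as 'a1.a2.a3'.

    A group whose parent is not listed is reported and skipped
    (assumed behaviour) instead of dereferencing a missing parent.
    """
    root = GroupNode("<none>")
    nodes = {}
    skipped = []
    # Process shallow names first so parents exist before their children.
    ordered = sorted((n.lower() for n in group_names), key=lambda n: n.count("."))
    for full_name in ordered:
        parent_name, _, leaf = full_name.rpartition(".")
        parent = root if parent_name == "" else nodes.get(parent_name)
        if parent is None:
            # This is the check the original tree construction lacked.
            log.warning("group %s has no parent group %s; ignoring it",
                        full_name, parent_name)
            skipped.append(full_name)
            continue
        node = GroupNode(full_name)
        parent.children[leaf] = node
        nodes[full_name] = node
    return root, nodes, skipped


# The reproducer lists only the leaf group, so it is skipped with a warning.
_, _, skipped = build_group_tree(["A1.A2.A3"])
print(skipped)
```

With all three levels listed (A1, A1.A2, A1.A2.A3) the same function builds the full chain and skips nothing.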

Comment 6 Tomas Rusnak 2011-01-07 13:37:13 UTC
Reproduced on x86_64 with condor-7.4.4-0.17

Retested over current packages on all supported platforms RHEL4/RHEL5 on x86/x86_64:

# rpm -qa condor
condor-7.4.5-0.6.el5 (condor-7.4.5-0.6.el4 on RHEL4)

Config used:

GROUP_NAMES = A1.A2.A3
GROUP_QUOTA_DYNAMIC_A1.A2 = 0.1
GROUP_QUOTA_DYNAMIC_A1.A2.A3 = 0.1
ALL_DEBUG = D_FULLDEBUG

01/07/11 08:28:46 ---------- Started Negotiation Cycle ----------
01/07/11 08:28:46 Phase 1:  Obtaining ads from collector ...
01/07/11 08:28:46   Getting all public ads ...
01/07/11 08:28:46 Trying to query collector <localhost:9618>
01/07/11 08:28:46   Sorting 6 ads ...
01/07/11 08:28:46   Getting startd private ads ...
01/07/11 08:28:46 Trying to query collector <localhost:9618>
01/07/11 08:28:46 Got ads: 6 public and 2 private
01/07/11 08:28:46 Public ads include 0 submitter, 2 startd
01/07/11 08:28:46 Phase 1: numDynGroupSlots 2  untrimmedSlotWeightTotal 2.000000 
01/07/11 08:28:46 Phase 2:  Performing accounting ...
01/07/11 08:28:46 Entering compute_significant_attrs()
01/07/11 08:28:46 Leaving compute_significant_attrs() - result=JobUniverse,LastCheckpointPlatform,NumCkpts
01/07/11 08:28:47 Phase 3:  Sorting submitter ads by priority ...
01/07/11 08:28:47 Phase 4.1:  Negotiating with schedds ...
01/07/11 08:28:47     numSlots = 2
01/07/11 08:28:47     slotWeightTotal = 2.000000
01/07/11 08:28:47     pieLeft = 0.000
01/07/11 08:28:47     NormalFactor = 0.000000
01/07/11 08:28:47     MaxPrioValue = 0.000000
01/07/11 08:28:47     NumSubmitterAds = 0
01/07/11 08:28:47  resources used by  are 0.000000
01/07/11 08:28:47  resources used scheddused 0.000000 groupUsed 0.000000
01/07/11 08:28:47  negotiateWithGroup resources used scheddAds length 0 
01/07/11 08:28:47 ---------- Finished Negotiation Cycle ----------

No stack dump or other regression was observed.

>>> VERIFIED
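
A check like the one above can be automated by scanning the negotiator log for the stack dump header seen in the original report. The sketch below is only an illustration: the function name find_stack_dumps is hypothetical, and the pattern is derived solely from the single header line quoted in the Description, so other Condor versions may format it differently.

```python
import re

# Pattern taken from the header line in the Description:
#   Stack dump for process 1121 at timestamp 1285097657 (9 frames)
STACK_DUMP_RE = re.compile(
    r"^Stack dump for process (\d+) at timestamp (\d+) \((\d+) frames\)"
)


def find_stack_dumps(log_lines):
    """Return a (pid, timestamp, frame_count) tuple per stack dump header."""
    hits = []
    for line in log_lines:
        match = STACK_DUMP_RE.match(line)
        if match:
            hits.append(tuple(int(group) for group in match.groups()))
    return hits


# Lines copied from the two logs in this report.
crashing = [
    "09/21/10 14:34:17 Sort : sorting group vector",
    "Stack dump for process 1121 at timestamp 1285097657 (9 frames)",
    "condor_negotiator(dprintf_dump_stack+0x63)[0x499203]",
]
fixed = [
    "01/07/11 08:28:47 ---------- Finished Negotiation Cycle ----------",
]
print(find_stack_dumps(crashing))
print(find_stack_dumps(fixed))
```

An empty result for a log that also contains a "Finished Negotiation Cycle" line corresponds to the verification outcome reported in this comment.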

Comment 7 Martin Prpič 2011-02-08 15:22:51 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,11 +1 @@
-Cause:
+When 'GROUP_LIST' contains a subgroup, but the subgroup's parent is missing, the missing parent group causes the tree construction routine to attempt to access a non-existent parent data structure which results in a segmentation fault. With this update, a missing parent is detected during the tree construction routine and a warning is output to an appropriate log file.
-This bug manifests when GROUP_LIST contains a subgroup, but is missing the subgroup's parent.  For example, containing the subgroup "a.b", but missing parent group "a".
-
-Consequence:
-The missing parent group causes the tree construction routine to attempt to access non-existent parent data structure which results in a seg-fault.
-
-Fix:
-Logic was added to the tree construction routine to detect a missing parent and output a warning to log.
-
-Result:
-Negotiator will now output a warning to log when it encounters a group with a missing parent group, instead of crashing.

Comment 8 Martin Prpič 2011-02-08 15:33:48 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-When 'GROUP_LIST' contains a subgroup, but the subgroup's parent is missing, the missing parent group causes the tree construction routine to attempt to access a non-existent parent data structure which results in a segmentation fault. With this update, a missing parent is detected during the tree construction routine and a warning is output to an appropriate log file.
+When 'GROUP_LIST' contains a subgroup, but the subgroup's parent is missing, the missing parent group caused the tree construction routine to attempt to access a non-existent parent data structure which resulted in a segmentation fault. With this update, a missing parent is detected during the tree construction routine and a warning is output to an appropriate log file.

Comment 9 errata-xmlrpc 2011-02-15 12:13:58 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0217.html