Bug 705437 - schedd crash on Windows due to bug in timed_queue<>
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.0
Hardware: Unspecified
OS: Windows
Priority: low
Severity: high
Target Milestone: 2.0.1
Target Release: ---
Assigned To: Erik Erlandson
QA Contact: MRG Quality Engineering
Keywords: Reopened
Depends On:
Blocks: 723887
Reported: 2011-05-17 12:52 EDT by Erik Erlandson
Modified: 2011-09-07 12:44 EDT
Fixed In Version: condor-7.6.2-0.1
Doc Type: Bug Fix
Doc Text:
The scheduler could have potentially suffered a memory access error due to a missing check for an empty queue. This check has been implemented, thus eliminating the chance of incurring a memory access error.
Last Closed: 2011-09-07 12:44:49 EDT
External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:1249 normal SHIPPED_LIVE Moderate: Red Hat Enterprise MRG Grid 2.0 security, bug fix and enhancement update 2011-09-07 12:40:45 EDT

Description Erik Erlandson 2011-05-17 12:52:28 EDT
Description of problem:
A bug in the new timed_queue<> class, introduced with the schedd performance stats, causes a crash when the schedd runs on Windows (upstream).

This crash is only known to have occurred on Windows, which we do not support for the scheduler; however, it might in principle occur on other operating systems. It is unknown why it has not manifested on RHEL or Fedora.


How reproducible:
100% (?) on Windows. So far 0% on other operating systems.

Steps to Reproduce:
1. Start up the schedd on Windows
  
Actual results:
Crash

Expected results:
normal execution


Additional info:
A fix has been committed to upstream master and can be backported:

Upstream commit diff:

$ git diff l/uw/master~1..l/uw/master
diff --git a/src/condor_utils/timed_queue.h b/src/condor_utils/timed_queue.h
index da2794d..3e88b7d 100644
--- a/src/condor_utils/timed_queue.h
+++ b/src/condor_utils/timed_queue.h
@@ -70,7 +70,7 @@ struct timed_queue : public std::deque<std::pair<time_t, Data> > {
 
     void max_time(size_type t) {
         _max_time = t;
-        if (max_time() > 0) trim_time(base_type::front().first - max_time());
+        if ((!base_type::empty()) && (max_time() > 0)) trim_time(base_type::front().first - max_time());
     }
     size_type max_time() const {
         return _max_time;
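To illustrate the guard added in the diff above, here is a minimal, self-contained sketch. It is not the Condor source: `timed_queue_sketch`, its `add()` method, and the body of `trim_time()` are assumptions made for illustration, including the assumption that the newest entry sits at the front and stale entries are trimmed from the back. Only the `max_time()` setter mirrors the patched line.

```cpp
#include <cstddef>
#include <ctime>
#include <deque>
#include <utility>

// Simplified stand-in for timed_queue<> (hypothetical; not the Condor class).
template <typename Data>
struct timed_queue_sketch : public std::deque<std::pair<time_t, Data> > {
    typedef std::deque<std::pair<time_t, Data> > base_type;
    typedef typename base_type::size_type size_type;

    timed_queue_sketch() : base_type(), _max_time(0) {}

    // Assumed insertion helper: newest entry goes to the front.
    void add(const Data& d, time_t t) {
        base_type::push_front(std::make_pair(t, d));
        if (max_time() > 0) trim_time(t - static_cast<time_t>(max_time()));
    }

    // Mirrors the patched setter: front() is only touched when the
    // queue is non-empty, because front() on an empty deque is
    // undefined behavior.
    void max_time(size_type t) {
        _max_time = t;
        if (!base_type::empty() && max_time() > 0) {
            trim_time(base_type::front().first - static_cast<time_t>(max_time()));
        }
    }

    size_type max_time() const { return _max_time; }

    // Assumed trimming rule: drop entries older than tmin from the back.
    void trim_time(time_t tmin) {
        while (!base_type::empty() && base_type::back().first < tmin) {
            base_type::pop_back();
        }
    }

private:
    size_type _max_time;
};
```

With this guard, calling `max_time(10)` on a freshly constructed queue only stores the limit; before the fix, the same call evaluated `front()` on an empty container.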
Comment 1 Erik Erlandson 2011-05-31 20:00:41 EDT
pushed fix to: UPSTREAM-7.7.0-BZ705437-timed-queue-crash
Comment 2 Erik Erlandson 2011-05-31 20:02:32 EDT
Not sure how to repro this.  It didn't show up when I tested with MALLOC_PERTURB_, and upstream reported they couldn't see it with valgrind either.
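The difficulty in reproducing this is consistent with `std::deque::front()` having undefined behavior on an empty container: the call need not fault, so the outcome can differ between standard library implementations. As a hedged sketch, a debugging helper like the hypothetical `checked_front()` below (not part of Condor) would turn the silent case into a deterministic failure on any platform.

```cpp
#include <deque>
#include <stdexcept>

// Hypothetical debugging helper: behaves like front(), but throws
// instead of invoking undefined behavior when the deque is empty.
template <typename T>
const T& checked_front(const std::deque<T>& q) {
    if (q.empty()) {
        throw std::out_of_range("checked_front: deque is empty");
    }
    return q.front();
}
```

Temporarily routing the `front()` call in `max_time()` through such a helper would be one way to confirm the empty-queue path without relying on platform-specific crashes.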
Comment 6 Erik Erlandson 2011-07-25 18:25:38 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
timed_queue<> data structure was missing a check for empty queue.

Consequence:
Left scheduler open to a potential memory access error.

Fix:
Proper check for empty queue was added to the data structure code.

Result:
The potential memory access error is now eliminated.
Comment 7 Douglas Silas 2011-08-08 10:25:34 EDT
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,11 +1 @@
-Cause:
+The scheduler could have potentially suffered a memory access error due to a missing check for an empty queue. This check has been implemented, thus eliminating the chance of incurring a memory access error.
-timed_queue<> data structure was missing a check for empty queue.
-
-Consequence:
-Left scheduler open to a potential memory access error.
-
-Fix:
-Proper check for empty queue was added to the data structure code.
-
-Result:
-The potential memory access error is now eliminated.
Comment 9 Martin Kudlej 2011-08-09 08:18:50 EDT
Code inspection made by me and ltoscano. -->VERIFIED
Comment 10 errata-xmlrpc 2011-09-07 12:44:49 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html
