Bug 826562 - condor_q doesn't get queue info because of broken .schedd_address file
Status: CLOSED WONTFIX
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.0
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: grid-maint-list
QA Contact: MRG Quality Engineering
Depends On:
Blocks:
Reported: 2012-05-30 10:00 EDT by Martin Kudlej
Modified: 2016-05-26 16:01 EDT
CC List: 2 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-26 16:01:14 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None

Description Martin Kudlej 2012-05-30 10:00:14 EDT
Description of problem:
The .schedd_address file contains an invalid address because it is not cleaned up after a scheduler crash, and it takes priority over the scheduler name (information from the collector).
As a result, plain condor_q produces wrong output, while condor_q -name `condor_config_val SCHEDD_NAME` produces the right output.
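
The stale-file condition described above can be checked with a small helper. This is only a sketch: the function name schedd_address_is_stale is hypothetical, and it assumes that the first line of the address file holds the schedd address in the same textual form that the collector advertises. The condor commands in the usage comments additionally assume that SCHEDD_ADDRESS_FILE is set and that the advertised scheduler Name equals SCHEDD_NAME.

```shell
#!/bin/bash
# Hypothetical helper, not part of condor.
# Returns 0 (stale) when FILE exists and its first line differs from ADDR.
# Returns 1 when FILE is missing/unreadable or its first line matches ADDR.
schedd_address_is_stale() {
  local file=$1 expected=$2 recorded=""
  [ -r "$file" ] || return 1
  # read returns non-zero on a last line without a newline; the value is still set.
  IFS= read -r recorded < "$file" || true
  [ "$recorded" != "$expected" ]
}

# Possible usage on a pool member (assumptions as stated in the lead-in):
#   name=$(condor_config_val SCHEDD_NAME)
#   file=$(condor_config_val SCHEDD_ADDRESS_FILE)
#   addr=$(condor_status -schedd -constraint "Name == \"$name\"" -format '%s\n' MyAddress)
#   if schedd_address_is_stale "$file" "$addr"; then
#     condor_q -name "$name"   # workaround from this report
#   else
#     condor_q
#   fi
```

The helper only compares strings, so it can be tried without a running pool by pointing it at a temporary file.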

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Set up a pool with HA Scheduler + RH HA.
2. Run the following on all machines with a potential HA Schedd:
   i=1; while true; do echo -n "$i..."; ./y.sh; i=$(($i+1)); done
3. Wait until condor_q starts to produce wrong output.
  
Actual results:
condor_q does not work correctly when a .schedd_address file containing a wrong address exists.

Expected results:
condor_q should output information from all available schedulers, i.e. it should consider all schedulers known to the collector.

Additional info:

$ cat y.sh
#!/bin/bash

# Wait until the scheduler is up.
while [ "0$(ps aux | grep condor_schedd | grep -v grep | wc -l)" -eq "0" ]; do
  sleep 1
done

# Let the scheduler run for a while, then simulate a crash.
sleep 10

killall -9 condor_schedd

# Wait until the scheduler has been restarted.
while [ "0$(ps aux | grep condor_schedd | grep -v grep | wc -l)" -eq "0" ]; do
  sleep 1
done

sleep 60

# Compare the PID of the running scheduler with the PID saved in the pid file.
PID=$(pidof condor_schedd)
if [ -n "$PID" ]; then
  SAVED_PID=$(cat /var/run/condor/condor_schedd.pid)

  if [ "0$PID" -eq "0$SAVED_PID" ]; then
    echo "OK...... $PID == $SAVED_PID"
  else
    echo "ERROR... $PID != $SAVED_PID"
  fi
fi
