Bug 1948456
| Summary: | qpid-stat crashes when running in the loop | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Barbora Vassova <bvassova> |
| Component: | qpid-tools | Assignee: | messaging-bugs <messaging-bugs> |
| Status: | NEW --- | QA Contact: | Messaging QE <messaging-qe-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.2 | CC: | ataylor, daduval, jdanek, jross, messaging-bugs |
| Target Milestone: | --- | Flags: | jdanek: needinfo? (mcressma) ataylor: needinfo- |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem:
When running qpid-stat in a loop with some period of sleep between iterations, the qpid-stat command will throw the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/lib/python2.6/site-packages/qpid/selector.py", line 185, in stop
    self.wakeup()
  File "/usr/lib/python2.6/site-packages/qpid/selector.py", line 97, in wakeup
    self.waiter.wakeup()
  File "/usr/lib/python2.6/site-packages/qpid/compat.py", line 119, in wakeup
    self._do_write()
  File "/usr/lib/python2.6/site-packages/qpid/compat.py", line 200, in _do_write
    os.write(self.write_fd, "\0")
OSError: [Errno 9] Bad file descriptor

Version-Release number of selected component (if applicable):
qpid-tools-1.36.0-22+hf5.el6_10.noarch

How reproducible:
not always, when I was reproducing the issue, the script was sometimes able to finish ok

Steps to Reproduce:
1. use the following script

#!/bin/bash
if [ $# -ne 2 ]; then
  echo "$0 <port_number> <sleep_seconds>"
  exit 1
fi
PORT=$1
SLEEP_AMT=$2
echo "started at: $(date)"
for i in $(seq 1 50000); do
  echo "${0} ITER: ${i} $(date '+%T.%N')" | logger -t BEGIN
  qpid-stat -q -b localhost:${PORT} -S queue -L 50000 | grep -v Response >> $0.log
  echo "${0} ITER: ${i} $(date '+%T.%N')" | logger -t END
  grep -i error nohup.out 2>&1>/dev/null && kill -9 $$
  echo "$(date +%c.%N) stayin alive, stayin alive..."
  sleep ${SLEEP_AMT}
done

2. run with
# nohup ./reproducer.sh 5672 1 > nohup.out &

3. after ~2h observe the error.
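
Note on the error itself: the last frame of the traceback is an os.write() on a descriptor, and Errno 9 (EBADF) is what the operating system returns when that descriptor is not open. The following is a minimal standalone sketch of that failure mode only. It does not use qpid, it is written for a current Python 3 rather than the python2.6 in the report, and the function name write_to_closed_pipe is made up for illustration. It says nothing about why the descriptor is closed in qpid-stat.

```python
import errno
import os


def write_to_closed_pipe():
    """Return the errno raised when writing to an already-closed pipe fd."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # simulate the descriptor having been closed earlier
    try:
        os.write(write_fd, b"\0")  # same kind of call as the last traceback frame
    except OSError as exc:
        return exc.errno
    finally:
        os.close(read_fd)
    return None


if __name__ == "__main__":
    # Compare against the symbolic constant rather than a hard-coded number.
    print(write_to_closed_pipe() == errno.EBADF)
```
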
Actual results:
Script errors out with bad descriptor error

Expected results:
Script should finish with "finished at: $(date)"

Additional info:
This has been reported by customer to be possible to patch with:
(output from patch file):

--- /usr/lib/python2.6/site-packages/qpid/selector.py 2017-08-15 19:38:01.000000000 +0000
+++ ./selector.py 2021-02-15 20:58:05.164764081 +0000
@@ -79,6 +79,7 @@
         atexit.register(sel.stop)
         Selector.DEFAULT = sel
         Selector._current_pid = os.getpid()
+        #os.system("echo 'pid: " + str(Selector._current_pid) + "' | logger -t SHAWNDEBUG")
       return Selector.DEFAULT
     finally:
       Selector.lock.release()
@@ -93,8 +94,14 @@
     self.exception = None
 
   def wakeup(self):
-    _check(self.exception)
-    self.waiter.wakeup()
+    Selector.lock.acquire()
+    try:
+      _check(self.exception)
+      self.waiter.wakeup()
+    except:
+      pass
+    finally:
+      Selector.lock.release()
 
   def register(self, selectable):
     self.selectables.add(selectable)
@@ -182,13 +189,18 @@
     """Stop the selector and wait for it's thread to exit. It cannot be re-started"""
     if self.thread and not self.stopped:
       self.stopped = SelectorStopped("qpid.messaging thread has been stopped")
+
+      #os.system("echo 'calling wakeup' | logger -t SHAWNDEBUG")
       self.wakeup()
+
+      #os.system("echo 'calling thread join' | logger -t SHAWNDEBUG")
       self.thread.join(timeout)
 
   def dead(self, e):
     """Mark the Selector as dead if it is stopped for any reason. Ensure there any future
     attempt to use the selector or any of its connections will throw an exception.
     """
+    #os.system("echo 'we b dead' | logger -t SHAWNDEBUG")
     self.exception = e
     try:
       for sel in self.selectables.copy():

With this patch applied, when the exception would have been thrown, there is instead:

No handlers could be found for logger "qpid.messaging"
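
Note on the patch: the bare "except: pass" in wakeup() also swallows whatever _check(self.exception) raises, not only the bad-descriptor error. A narrower variant would ignore only EBADF and let everything else propagate. The sketch below illustrates that idea in isolation; it is not qpid code. ClosedWaiter, BrokenWaiter and wakeup_ignoring_closed_fd are hypothetical names, the waiters are stand-ins that simply raise, and the code targets Python 3. Whether ignoring EBADF at exit is the right fix for qpid is not established here.

```python
import errno


class ClosedWaiter:
    """Hypothetical stand-in for a waiter whose pipe is already closed."""

    def wakeup(self):
        raise OSError(errno.EBADF, "Bad file descriptor")


class BrokenWaiter:
    """Hypothetical stand-in that fails for an unrelated reason."""

    def wakeup(self):
        raise OSError(errno.EPIPE, "Broken pipe")


def wakeup_ignoring_closed_fd(waiter):
    """Call waiter.wakeup(); return False if the fd was already closed.

    Only EBADF is ignored. Any other error propagates to the caller.
    """
    try:
        waiter.wakeup()
    except OSError as exc:
        if exc.errno != errno.EBADF:
            raise
        return False
    return True


if __name__ == "__main__":
    print(wakeup_ignoring_closed_fd(ClosedWaiter()))
```

In contrast to the customer patch, an error such as the EPIPE raised by BrokenWaiter would still surface instead of being silently dropped.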