Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1948456

Summary: qpid-stat crashes when run in a loop
Product: Red Hat Enterprise MRG
Reporter: Barbora Vassova <bvassova>
Component: qpid-tools
Assignee: messaging-bugs <messaging-bugs>
Status: CLOSED UPSTREAM
QA Contact: Messaging QE <messaging-qe-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.2
CC: ataylor, daduval, jdanek, jross, messaging-bugs
Target Milestone: ---
Flags: ataylor: needinfo-
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-02-10 04:00:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Barbora Vassova 2021-04-12 08:39:44 UTC
Description of problem:
When qpid-stat is run in a loop with a sleep between iterations, it intermittently fails at interpreter exit with the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/lib/python2.6/site-packages/qpid/selector.py", line 185, in stop
    self.wakeup()
  File "/usr/lib/python2.6/site-packages/qpid/selector.py", line 97, in wakeup
    self.waiter.wakeup()
  File "/usr/lib/python2.6/site-packages/qpid/compat.py", line 119, in wakeup
    self._do_write()
  File "/usr/lib/python2.6/site-packages/qpid/compat.py", line 200, in _do_write
    os.write(self.write_fd, "\0")
OSError: [Errno 9] Bad file descriptor
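
Errno 9 is EBADF, which os.write() raises when the descriptor it is given is not a valid descriptor open for writing. The traceback does not show why write_fd is invalid at exit, so the following is only a minimal Python sketch of one way to get the same error (writing to a pipe end that has already been closed). The helper name write_to_closed_pipe is hypothetical and not part of qpid.

```python
import errno
import os


def write_to_closed_pipe():
    """Return the errno raised when writing to an already-closed pipe fd."""
    read_fd, write_fd = os.pipe()
    # Close both ends first, so write_fd no longer refers to an open file.
    os.close(read_fd)
    os.close(write_fd)
    try:
        # Same call shape as the os.write() in qpid/compat.py from the traceback.
        os.write(write_fd, b"\0")
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    # Print the symbolic name of the errno that was raised.
    print(errno.errorcode.get(write_to_closed_pipe()))
```

Run standalone, this prints the symbolic name of the error, which should match the "Bad file descriptor" case in the traceback above.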

Version-Release number of selected component (if applicable):
qpid-tools-1.36.0-22+hf5.el6_10.noarch


How reproducible:
Not always. While I was reproducing the issue, the script sometimes ran to completion without error.

Steps to Reproduce:
1. Use the following script (saved as reproducer.sh):

#!/bin/bash
# Run qpid-stat repeatedly against a local broker and stop at the first error.

if [ $# -ne 2 ]; then
  echo "Usage: $0 <port_number> <sleep_seconds>"
  exit 1
fi

PORT=$1
SLEEP_AMT=$2

echo "started at: $(date)"
for i in $(seq 1 50000); do
  echo "${0} ITER: ${i} $(date '+%T.%N')" | logger -t BEGIN
  qpid-stat -q -b "localhost:${PORT}" -S queue -L 50000 | grep -v Response >> "$0.log"
  echo "${0} ITER: ${i} $(date '+%T.%N')" | logger -t END
  # Stop as soon as an error shows up in the captured output (nohup.out, see step 2).
  grep -qi error nohup.out 2>/dev/null && kill -9 $$
  echo "$(date +%c.%N) stayin alive, stayin alive..."
  sleep "${SLEEP_AMT}"
done
echo "finished at: $(date)"

2. Run it with:

# nohup ./reproducer.sh 5672 1 > nohup.out &

3. After about 2 hours, observe the error in nohup.out.


Actual results:
The script stops after qpid-stat fails with "OSError: [Errno 9] Bad file descriptor".

Expected results:
Script should finish with "finished at: $(date)"


Additional info:
The customer reports that the problem can be patched with the following change:

(output from patch file):

--- /usr/lib/python2.6/site-packages/qpid/selector.py	2017-08-15 19:38:01.000000000 +0000
+++ ./selector.py	2021-02-15 20:58:05.164764081 +0000
@@ -79,6 +79,7 @@
         atexit.register(sel.stop)
         Selector.DEFAULT = sel
         Selector._current_pid = os.getpid()
+        #os.system("echo 'pid: " + str(Selector._current_pid) + "' | logger -t SHAWNDEBUG")
       return Selector.DEFAULT
     finally:
       Selector.lock.release()
@@ -93,8 +94,14 @@
     self.exception = None

   def wakeup(self):
-    _check(self.exception)
-    self.waiter.wakeup()
+    Selector.lock.acquire()
+    try:
+      _check(self.exception)
+      self.waiter.wakeup()
+    except:
+      pass
+    finally:
+      Selector.lock.release()

   def register(self, selectable):
     self.selectables.add(selectable)
@@ -182,13 +189,18 @@
     """Stop the selector and wait for it's thread to exit. It cannot be re-started"""
     if self.thread and not self.stopped:
       self.stopped = SelectorStopped("qpid.messaging thread has been stopped")
+
+      #os.system("echo 'calling wakeup' | logger -t SHAWNDEBUG")
       self.wakeup()
+
+      #os.system("echo 'calling thread join' | logger -t SHAWNDEBUG")
       self.thread.join(timeout)

   def dead(self, e):
     """Mark the Selector as dead if it is stopped for any reason.  Ensure there any future
     attempt to use the selector or any of its connections will throw an exception.
     """
+    #os.system("echo 'we b dead' | logger -t SHAWNDEBUG")
     self.exception = e
     try:
       for sel in self.selectables.copy():

With this patch applied, the following message is printed at the point where the exception would otherwise have been thrown:
No handlers could be found for logger "qpid.messaging"
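
The bare "except: pass" in the patched wakeup() hides every error, not just the bad descriptor. As an illustration only, and not the upstream fix, a narrower variant could swallow EBADF alone and let everything else propagate. In the sketch below, PipeWaiter is a hypothetical stand-in for the pipe-based waiter seen in the traceback and safe_wakeup is a hypothetical helper. Neither is part of qpid. The syntax is chosen to also be valid on Python 2.6.

```python
import errno
import os


class PipeWaiter(object):
    """Hypothetical stand-in for the pipe-based waiter in qpid/compat.py."""

    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()

    def wakeup(self):
        # Write one byte to wake whoever is selecting on read_fd.
        os.write(self.write_fd, b"\0")

    def close(self):
        os.close(self.read_fd)
        os.close(self.write_fd)


def safe_wakeup(waiter):
    """Wake the waiter; return False only if its descriptor is already invalid.

    Unlike a bare "except: pass", any error other than EBADF still propagates.
    """
    try:
        waiter.wakeup()
    except OSError as exc:
        if exc.errno != errno.EBADF:
            raise
        return False
    return True
```

With a fresh PipeWaiter, safe_wakeup() returns True. After close() it returns False instead of raising, which is the only case the workaround needs to tolerate at exit.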

Comment 8 Red Hat Bugzilla 2025-02-10 04:00:33 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

Comment 9 Red Hat Bugzilla 2025-06-11 04:25:53 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days