Bug 159965 - Cluster Management window does not refresh after configuration change and traceback
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: redhat-config-cluster
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Jim Parsons
QA Contact: Cluster QE
Reported: 2005-06-09 15:30 EDT by Paul Kennedy
Modified: 2015-04-19 20:46 EDT
Fixed In Version: RHBA-2006-0198
Doc Type: Bug Fix
Last Closed: 2006-03-09 14:49:52 EST
Description Paul Kennedy 2005-06-09 15:30:58 EDT
Description of problem:

--------
Symptom
--------

After creating or deleting a service and propagating the updated configuration,
the Cluster Management window does not get refreshed automatically with the
changed configuration. This symptom occurs after a traceback is caused by trying
to enable, disable, or restart a service. The traceback occurs intermittently.
See traceback output in "Additional info".

The display *does* get refreshed after changing the state of a service (for
example, enabling, disabling, or restarting a service). However, sometimes the
node information in the "Members" display vanishes and cannot be refreshed.

-----------
Workaround
-----------
Restart system-config-cluster or run clustat to view status.

Version-Release number of selected component (if applicable):
system-config-cluster 1.0.12

How reproducible:


Steps to Reproduce:
1.  At the Cluster Configuration window, create or delete a
    service. Save and propagate the configuration file.
2.  At the Cluster Management window, observe that the display reflects 
    the changes resulting from creating or deleting a service.
3.  At the Cluster Management window, and while observing the command line,
    change the state of a service (for example, disable and enable a service)
    several times until a traceback occurs.
4.  At the Cluster Configuration window, create or delete a service. Save
    and propagate the configuration file.
5.  At the Cluster Management window, observe that the changes made in
    the previous step are not refreshed. In addition, sometimes the nodes
    are not displayed in the "Members" display box. 
6.  At the Cluster Management window, change the state of a service (for
    example, disable, enable, or restart a service).
7.  Observe that the display gets refreshed after changing the state of a
    service in step 6. However, if the node display was lost (in step 5), the
    node display does not get refreshed. It remains blank.

  
Actual results:


Expected results:


Additional info:

-----------------
Traceback output 
-----------------
In this example, the traceback happened after enabling and disabling the service
"Mr. Slate" twice. Other occurrences may require more state changes.

Member tng3-3 trying to enable Mr. Slate...failed
Member tng3-3 disabling Mr. Slate...success
Member tng3-3 trying to enable Mr. Slate...failed
Member tng3-3 disabling Mr. Slate...success
rhpl.executil waitpid: No child processes
Traceback (most recent call last):
  File "/usr/share/system-config-cluster/MgmtTab.py", line 232, in onTimer
    self.prep_tree()
  File "/usr/share/system-config-cluster/MgmtTab.py", line 182, in prep_tree
    nodes = self.command_handler.getNodesInfo(self.model_builder.getLockType())
  File "/usr/share/system-config-cluster/CommandHandler.py", line 253, in getNodesInfo
    out,err,res =  rhpl.executil.execWithCaptureErrorStatus("/sbin/cman_tool",args)
  File "/usr/lib/python2.3/site-packages/rhpl/executil.py", line 267, in execWithCaptureErrorStatus
    if os.WIFEXITED(status) and (os.WEXITSTATUS(status) == 0):
UnboundLocalError: local variable 'status' referenced before assignment
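
The last frames above suggest that the wait on the child process failed
("waitpid: No child processes") and that the code then tested a `status`
variable that had never been assigned. The following is a minimal sketch of a
guard against that pattern. It is not the actual rhpl.executil code: the helper
name `wait_for_child` is hypothetical, and the sketch assumes a current
Python 3 interpreter rather than the Python 2.3 shown in the traceback.

```python
import errno
import os


def wait_for_child(pid):
    """Return the child's exit code, or None if no status could be collected.

    Hypothetical helper for illustration only.
    """
    status = None  # bound up front, so later checks can never hit UnboundLocalError
    while True:
        try:
            _, status = os.waitpid(pid, 0)
            break
        except OSError as exc:
            if exc.errno == errno.EINTR:
                # Interrupted by a signal; retry. Python 3.5+ already retries
                # internally, so this branch matters only on older interpreters.
                continue
            if exc.errno == errno.ECHILD:
                # The child is already gone, for example reaped by a SIGCHLD
                # handler elsewhere in the process.
                break
            raise
    if status is not None and os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

With this shape, a child that was already reaped yields None instead of a
traceback, and the caller can decide how to report the missing status.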
Comment 1 Stanko Kupcevic 2005-08-03 16:20:06 EDT
Fixed in Errata Candidate
Comment 2 Corey Marthaler 2005-09-07 15:03:44 EDT
I have still hit this assert a couple of times today while playing around with
starting, stopping, and moving services around. I can't pinpoint exactly what
causes it, though, because it usually works.

Spurious OS signal error in waitpid attempt
Traceback (most recent call last):
  File "/usr/share/system-config-cluster/MgmtTab.py", line 232, in onTimer
    self.prep_tree()
  File "/usr/share/system-config-cluster/MgmtTab.py", line 182, in prep_tree
    nodes = self.command_handler.getNodesInfo(self.model_builder.getLockType())
  File "/usr/share/system-config-cluster/CommandHandler.py", line 252, in getNodesInfo
    out,err,res =  executil.execWithCaptureErrorStatus("/sbin/cman_tool",args)
  File "/usr/share/system-config-cluster/executil.py", line 20, in execWithCaptureErrorStatus
    return __execWithCaptureErrorStatus(BASH_PATH, [BASH_PATH, '-c', command])
  File "/usr/share/system-config-cluster/executil.py", line 91, in __execWithCaptureErrorStatus
    (pid, status) = os.waitpid(childpid, 0)
  File "/usr/share/system-config-cluster/ForkedCommand.py", line 133, in serviceSignalHandler
    if(reaped == EXT_PID):
UnboundLocalError: local variable 'reaped' referenced before assignment
Comment 3 Stanko Kupcevic 2005-09-07 19:35:43 EDT
Fixed in 1.0.16
Comment 4 Corey Marthaler 2005-09-16 16:56:55 EDT
Not sure if this is the exact same bug, but I was running the same scenario
(playing around with services, starting and stopping them) and I hit this
similar traceback:

Traceback (most recent call last):
  File "/usr/share/system-config-cluster/MgmtTab.py", line 234, in onTimer
    self.prep_service_tree()
  File "/usr/share/system-config-cluster/MgmtTab.py", line 205, in prep_service_tree
    services = self.command_handler.getServicesInfo()
  File "/usr/share/system-config-cluster/CommandHandler.py", line 327, in getServicesInfo
    out,err,res =  executil.execWithCaptureErrorStatus(clustat_path,args)
  File "/usr/share/system-config-cluster/executil.py", line 20, in execWithCaptureErrorStatus
    return __execWithCaptureErrorStatus(BASH_PATH, [BASH_PATH, '-c', command])
  File "/usr/share/system-config-cluster/executil.py", line 72, in __execWithCaptureErrorStatus
    i,o,e = select.select(in_list, [], [])
select.error: (4, 'Interrupted system call')


If this is a different bug, let me know and I'll close this one and open a new one. 
Comment 5 Stanko Kupcevic 2005-09-19 12:37:51 EDT
This traceback is a different bug, but since it also originates from the code
that executes shell commands, it fits under the same umbrella. 
Comment 6 Stanko Kupcevic 2005-09-19 13:07:19 EDT
Shouldn't Python's select.select() take care of EINTR and re-enter the select()
syscall?

See traceback in comment #4

Adding misa@redhat.com to CC

I can check for EINTR in s-c-cluster; I am just wondering whether raising an
exception on EINTR is the expected behavior.
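
For illustration, an EINTR check of the kind mentioned here could look roughly
like the sketch below. The wrapper name `select_retry` is hypothetical and this
is not the code that went into system-config-cluster. It is written in
Python 3 syntax. On Python 3.5 and later the interpreter retries interrupted
system calls itself (PEP 475), so an explicit loop like this matters only on
older interpreters, such as the Python 2.3 that produced the traceback in
comment #4.

```python
import errno
import select


def select_retry(rlist, wlist, xlist, timeout=None):
    """Call select.select(), retrying when a signal interrupts the call.

    Hypothetical wrapper for illustration only. Each retry restarts the
    full timeout; a stricter version would track the remaining time.
    """
    while True:
        try:
            return select.select(rlist, wlist, xlist, timeout)
        except select.error as exc:  # alias of OSError on Python 3.3+
            if exc.args[0] != errno.EINTR:
                raise  # a real error, not an interruption
            # EINTR: a signal handler ran during the call; just call again.
```

A caller would use `select_retry(in_list, [], [])` in place of the direct
`select.select(in_list, [], [])` call shown in the traceback.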
Comment 7 Corey Marthaler 2005-10-05 12:47:31 EDT
FYI: hit this again today while attempting to stop a running service.
Comment 8 Red Hat Bugzilla 2005-10-07 12:47:35 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-753.html
Comment 9 Corey Marthaler 2005-10-20 15:08:19 EDT
Why was this bug closed? There was never a final "fixed" message. 
Plus, I just hit this bug again.
Comment 10 Jim Parsons 2005-12-01 16:27:47 EST
Fixed in U3 errata build...
Comment 13 Red Hat Bugzilla 2006-03-09 14:49:55 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0198.html
