Red Hat Bugzilla – Bug 270421
clurgmgrd crashes when resources are deeply nested
Last modified: 2009-04-16 16:22:39 EDT
Description of problem:
The attached config file causes clurgmgrd to crash, and because the daemon
dies, the watchdog reboots the system.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Set up a cluster and use the attached cluster.conf
2. Start ccsd, cman, etc.
3. When rgmanager is started it will crash
Aug 31 06:04:53 node2 clurgmgrd: <notice> Resource Group Manager Starting
Aug 31 06:04:53 node2 clurgmgrd: <info> Loading Service Data
Aug 31 06:04:55 node2 clurgmgrd: <crit> Watchdog: Daemon died, rebooting...
The config attached isn't exactly ideal, but clurgmgrd shouldn't crash. It
should either work, or clurgmgrd should refuse to start.
A backtrace is attached. The original cluster configuration was created by a
customer after misinterpreting our advice. The attached cluster.conf (slightly
modified from the customer's) reproduced the problem for me (on i686, I haven't
Created attachment 182861 [details]
cluster.conf w/ extreme nesting
Created attachment 182881 [details]
Created attachment 182901 [details]
Wow, that's impressive.
You don't have to nest that deep to get the same effect in 4.5 and later...
I agree that it shouldn't crash, but disagree that it should refuse to start,
given that the configuration can be changed at run time (i.e., the user can fix
it later).
Services are considered independent of one another: one service's configuration
should not affect whether or not another independent service is allowed to start.
At any rate, to fix this, we either need to cap maximum tree depth (Conga caps
it at 10 levels, IIRC) or make the query buffers dynamically allocated. In the
case of the former, we need to prominently state that the user's tree is too deep.
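For illustration only, here is a minimal sketch of the second option, a query
buffer that grows on demand instead of having a fixed size. It is not
rgmanager code: struct query, query_append() and the initial capacity of 64
bytes are hypothetical names and values chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable query buffer; not an rgmanager structure. */
struct query {
    char   *buf;  /* NUL-terminated query string, or NULL if empty */
    size_t  len;  /* bytes used, not counting the terminating NUL */
    size_t  cap;  /* bytes allocated */
};

/*
 * Append "/name" to the query, growing the buffer as needed.
 * Returns 0 on success, -1 if memory could not be allocated
 * (the existing contents are left untouched in that case).
 */
static int query_append(struct query *q, const char *name)
{
    size_t name_len = strlen(name);
    size_t need = q->len + 1 + name_len + 1; /* '/' + name + NUL */

    if (need > q->cap) {
        size_t cap = q->cap ? q->cap : 64;   /* assumed starting size */
        char *tmp;

        while (cap < need)
            cap *= 2;
        tmp = realloc(q->buf, cap);
        if (tmp == NULL)
            return -1;
        q->buf = tmp;
        q->cap = cap;
    }

    q->buf[q->len] = '/';
    memcpy(q->buf + q->len + 1, name, name_len + 1); /* copies the NUL too */
    q->len = need - 1;
    return 0;
}
```

A caller would start from a zeroed struct query, call query_append() once per
nesting level, and free() the buf member when done. With this approach the
depth of the tree is limited only by available memory, so a separate sanity
cap on depth may still be worth having.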
Created attachment 198881 [details]
Caps depth at 12 and increases the query buffer sizes to 1k
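A rough sketch of what a depth cap plus bounded query buffer can look like is
shown below. It is not the attached patch: struct res_node,
append_component() and walk_tree() are hypothetical names made up for this
example, and only the two constants (depth 12, 1k buffer) come from the patch
description above.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_TREE_DEPTH 12    /* depth cap, per the patch description */
#define QUERY_BUF_SIZE 1024  /* "1k" query buffer, per the patch description */

/* Hypothetical resource node; not rgmanager's real structure. */
struct res_node {
    const char *name;
    struct res_node *child;    /* first child */
    struct res_node *sibling;  /* next node at the same level */
};

/*
 * Append "/name" at offset len in buf (total capacity size).
 * Returns the new length, or -1 if the component does not fit;
 * on failure the buffer is restored to its previous contents.
 */
static int append_component(char *buf, size_t size, size_t len,
                            const char *name)
{
    int n;

    if (len >= size)
        return -1;
    n = snprintf(buf + len, size - len, "/%s", name);
    if (n < 0 || (size_t)n >= size - len) {
        buf[len] = '\0';  /* undo the truncated write */
        return -1;
    }
    return (int)(len + (size_t)n);
}

/*
 * Depth-first walk that builds a query path for every node.
 * A node below MAX_TREE_DEPTH, or one whose path would overflow
 * the buffer, is reported and skipped together with its subtree.
 * Returns the number of nodes visited; *skipped counts rejected nodes.
 */
static int walk_tree(const struct res_node *node, char *buf, size_t size,
                     size_t len, int depth, int *skipped)
{
    int visited = 0;

    for (; node != NULL; node = node->sibling) {
        int new_len;

        if (depth > MAX_TREE_DEPTH) {
            fprintf(stderr, "Resource tree too deep at \"%s\" "
                    "(max depth %d); skipping\n",
                    node->name, MAX_TREE_DEPTH);
            (*skipped)++;
            continue;
        }
        new_len = append_component(buf, size, len, node->name);
        if (new_len < 0) {
            fprintf(stderr, "Query for \"%s\" exceeds %lu bytes; skipping\n",
                    node->name, (unsigned long)size);
            (*skipped)++;
            continue;
        }
        visited++;
        visited += walk_tree(node->child, buf, size, (size_t)new_len,
                             depth + 1, skipped);
        buf[len] = '\0';  /* drop this node's component before the sibling */
    }
    return visited;
}
```

The walk would be started with an empty buffer of QUERY_BUF_SIZE bytes, len 0
and depth 1. An over-deep subtree is logged and skipped rather than aborting
the whole walk, which matches the point made earlier that one service's
configuration should not stop another independent service from starting.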
Created attachment 198891 [details]
Example output of rg_test given the cluster.conf w/ extreme nesting.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.