Bug 1461649 - glusterd crashes when statedump is taken
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assigned To: Atin Mukherjee
QA Contact: Bala Konda Reddy M
Depends On: 1461655
Blocks: 1417151
Reported: 2017-06-15 01:27 EDT by Raghavendra G
Modified: 2017-09-21 00:59 EDT
CC: 4 users

Fixed In Version: glusterfs-3.8.4-29
Clones: 1461655
Last Closed: 2017-09-21 00:59:42 EDT
Type: Bug


External Trackers
Tracker ID: Red Hat Product Errata RHBA-2017:2774
Priority: normal
Status: SHIPPED_LIVE
Summary: glusterfs bug fix and enhancement update
Last Updated: 2017-09-21 04:16:29 EDT

Description Raghavendra G 2017-06-15 01:27:05 EDT
Description of problem:

* start glusterd
* take statedump
* glusterd crashes
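The steps above can be sketched as a shell session. Note that glusterd dumps state on SIGUSR1 (signal 10 on x86_64 Linux, matching `signum=10` in the backtrace below); the service name and dump directory here are the usual defaults, not taken from this report.

```shell
# Start glusterd on a fresh setup, with no volumes configured.
systemctl start glusterd

# Trigger a statedump: glusterd dumps its state when it receives
# SIGUSR1 (signal 10 on x86_64 Linux).
kill -USR1 "$(pidof glusterd)"

# On an affected build glusterd dies here; on a fixed build the dump
# lands in the statedump directory (typically /var/run/gluster).
ls /var/run/gluster/*.dump.* 2>/dev/null
```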

Thread 1 (Thread 0x7f535fd71700 (LWP 1557)):
#0  glusterd_dump_priv (this=<optimized out>) at glusterd-statedump.c:243
#1  0x00007f5367aaf655 in gf_proc_dump_xlator_info (top=<optimized out>) at statedump.c:502
#2  0x00007f5367aafb92 in gf_proc_dump_info (signum=signum@entry=10, ctx=0xd1e010) at statedump.c:837
#3  0x00000000004089b9 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2083
#4  0x0000003866e07d14 in start_thread (arg=0x7f535fd71700) at pthread_create.c:309
#5  0x0000003866af168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) l
248	                                                "pmap[%d].type", port);
249	                        gf_proc_dump_write (key, "%d", pmap->ports[port].type);
250	                        gf_proc_dump_build_key (key, "glusterd",
251	                                                "pmap[%d].brickname", port);
252	                        gf_proc_dump_write (key, "%s",
253	                                            pmap->ports[port].brickname);
255	                }
256	                /* Dump client details */
257	                glusterd_dump_client_details (priv);
(gdb) p pmap
$2 = (struct pmap_registry *) 0x0
(gdb) l glusterd-statedump.c:243
238	                /* Dump peer details */
239	                GLUSTERD_DUMP_PEERS (&priv->peers, uuid_list, _gf_false);
241	                /* Dump pmap data structure from base port to last alloc */
242	                pmap = priv->pmap;
243	                for (port = pmap->base_port; port <= pmap->last_alloc;
244	                     port++) {
245	                        gf_proc_dump_build_key (key, "glusterd", "pmap_port");
246	                        gf_proc_dump_write (key, "%d", port);
247	                        gf_proc_dump_build_key (key, "glusterd",
(gdb) p pmap
$3 = (struct pmap_registry *) 0x0

I think that since there are no bricks, pmap is NULL, resulting in the crash. This _MAY_ not happen when there are volumes and bricks associated with glusterd.

Comment 2 Atin Mukherjee 2017-06-15 01:46:53 EDT
You are right. This is because the pmap object is created when glusterd allocates the port for the first time, and that happens only when the first volume is started.
Comment 3 Atin Mukherjee 2017-06-15 02:07:04 EDT
upstream patch : https://review.gluster.org/#/c/17549
Comment 5 Atin Mukherjee 2017-06-15 07:01:08 EDT
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/109180
Comment 8 Bala Konda Reddy M 2017-07-07 02:17:01 EDT
BUILD: 3.8.4-32

On a fresh setup, started glusterd, didn't perform any gluster volume operations, and took a statedump of glusterd. No glusterd crashes were seen.

Hence marking the bz as verified.
Comment 10 errata-xmlrpc 2017-09-21 00:59:42 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

