Bug 1461649 - glusterd crashes when statedump is taken
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assigned To: Atin Mukherjee
QA Contact: Bala Konda Reddy M
Depends On: 1461655
Blocks: 1417151
Reported: 2017-06-15 01:27 EDT by Raghavendra G
Modified: 2017-09-21 00:59 EDT
CC: 4 users

Fixed In Version: glusterfs-3.8.4-29
Clones: 1461655
Last Closed: 2017-09-21 00:59:42 EDT
Type: Bug


External Trackers
Tracker ID: Red Hat Product Errata RHBA-2017:2774
Priority: normal
Status: SHIPPED_LIVE
Summary: glusterfs bug fix and enhancement update
Last Updated: 2017-09-21 04:16:29 EDT

Description Raghavendra G 2017-06-15 01:27:05 EDT
Description of problem:

* start glusterd
* take statedump
* glusterd crashes
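The steps above can be sketched as a shell session. Note that glusterd dumps state on SIGUSR1 (signal 10 on x86_64 Linux, matching `signum=10` in the backtrace below); the service name and dump directory here are the usual defaults, not taken from this report.

```shell
# Start glusterd on a fresh setup, with no volumes configured.
systemctl start glusterd

# Trigger a statedump: glusterd dumps its state when it receives
# SIGUSR1 (signal 10 on x86_64 Linux).
kill -USR1 "$(pidof glusterd)"

# On an affected build glusterd dies here; on a fixed build the dump
# lands in the statedump directory (typically /var/run/gluster).
ls /var/run/gluster/*.dump.* 2>/dev/null
```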

Thread 1 (Thread 0x7f535fd71700 (LWP 1557)):
#0  glusterd_dump_priv (this=<optimized out>) at glusterd-statedump.c:243
#1  0x00007f5367aaf655 in gf_proc_dump_xlator_info (top=<optimized out>) at statedump.c:502
#2  0x00007f5367aafb92 in gf_proc_dump_info (signum=signum@entry=10, ctx=0xd1e010) at statedump.c:837
#3  0x00000000004089b9 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2083
#4  0x0000003866e07d14 in start_thread (arg=0x7f535fd71700) at pthread_create.c:309
#5  0x0000003866af168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) l
248	                                                "pmap[%d].type", port);
249	                        gf_proc_dump_write (key, "%d", pmap->ports[port].type);
250	                        gf_proc_dump_build_key (key, "glusterd",
251	                                                "pmap[%d].brickname", port);
252	                        gf_proc_dump_write (key, "%s",
253	                                            pmap->ports[port].brickname);
255	                }
256	                /* Dump client details */
257	                glusterd_dump_client_details (priv);
(gdb) p pmap
$2 = (struct pmap_registry *) 0x0
(gdb) l glusterd-statedump.c:243
238	                /* Dump peer details */
239	                GLUSTERD_DUMP_PEERS (&priv->peers, uuid_list, _gf_false);
241	                /* Dump pmap data structure from base port to last alloc */
242	                pmap = priv->pmap;
243	                for (port = pmap->base_port; port <= pmap->last_alloc;
244	                     port++) {
245	                        gf_proc_dump_build_key (key, "glusterd", "pmap_port");
246	                        gf_proc_dump_write (key, "%d", port);
247	                        gf_proc_dump_build_key (key, "glusterd",
(gdb) p pmap
$3 = (struct pmap_registry *) 0x0

I think that since there are no bricks, pmap is NULL, resulting in the crash. This _MAY_ not happen when there are volumes and bricks associated with glusterd.

Comment 2 Atin Mukherjee 2017-06-15 01:46:53 EDT
You are right. This is because the pmap object is created when glusterd allocates the port for the first time, and that happens only when the first volume is started.
Comment 3 Atin Mukherjee 2017-06-15 02:07:04 EDT
upstream patch : https://review.gluster.org/#/c/17549
Comment 5 Atin Mukherjee 2017-06-15 07:01:08 EDT
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/109180
Comment 8 Bala Konda Reddy M 2017-07-07 02:17:01 EDT
BUILD: 3.8.4-32

On a fresh setup, started glusterd, didn't perform any gluster volume operations, and took a statedump of glusterd. No glusterd crashes were seen.

Hence marking the bz as verified.
Comment 10 errata-xmlrpc 2017-09-21 00:59:42 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

