Description of problem:
When I run the command below on my test gluster cluster, the CLI dumps core.

# gluster volume heal test info healed

Using gdb to find where the program crashed, I found that it crashes in the strftime call in function cmd_heal_volume_brick_out, in file cli-rpc-ops.c.

Version-Release number of selected component (if applicable):
GlusterFS 3.3.0

How reproducible:
Create a replica-stripe volume and halt any node while a client is writing a file. Then bring the downed node back up, let the cluster heal, and run the command:

# gluster volume heal <VOLUME_NAME> info healed

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
While researching this bug, I found that localtime returns NULL (0x00), and passing that result to strftime is what causes the core dump. Looking deeper into localtime, I found that its argument is a pointer to an 8-byte integer (long or uint64_t) on my system, but the code passes a pointer to a 4-byte uint32_t, so localtime reads past the end of the variable. That is the problem.

Below is my fix for function cmd_heal_volume_brick_out.
void
cmd_heal_volume_brick_out (dict_t *dict, int brick)
{
        uint64_t        num_entries = 0;
        int             ret = 0;
        char            key[256] = {0};
        char            *hostname = NULL;
        char            *path = NULL;
        char            *status = NULL;
        uint64_t        i = 0;
        uint64_t        convert_time = 0;
        uint32_t        time = 0;
        char            timestr[256];
        struct tm       *tm = NULL;

        snprintf (key, sizeof (key), "%d-hostname", brick);
        ret = dict_get_str (dict, key, &hostname);
        if (ret)
                goto out;
        snprintf (key, sizeof (key), "%d-path", brick);
        ret = dict_get_str (dict, key, &path);
        if (ret)
                goto out;
        cli_out ("\nBrick %s:%s", hostname, path);
        snprintf (key, sizeof (key), "%d-count", brick);
        ret = dict_get_uint64 (dict, key, &num_entries);
        cli_out ("Number of entries: %"PRIu64, num_entries);
        snprintf (key, sizeof (key), "%d-status", brick);
        ret = dict_get_str (dict, key, &status);
        if (status && strlen (status))
                cli_out ("Status: %s", status);
        for (i = 0; i < num_entries; i++) {
                snprintf (key, sizeof (key), "%d-%"PRIu64, brick, i);
                ret = dict_get_str (dict, key, &path);
                if (ret)
                        continue;
                time = 0;
                snprintf (key, sizeof (key), "%d-%"PRIu64"-time", brick, i);
                ret = dict_get_uint32 (dict, key, &time);
                if (!time) {
                        cli_out ("%s", path);
                } else {
                        /*
                         * localtime requires a pointer to an 8-byte
                         * integer (long or uint64_t) here, so copy the
                         * uint32_t time into the uint64_t convert_time
                         * first. Passing the address of the 4-byte
                         * value directly causes a segmentation fault.
                         */
                        convert_time = time;
                        tm = localtime ((time_t*)(&convert_time));
                        strftime (timestr, sizeof (timestr),
                                  "%Y-%m-%d %H:%M:%S", tm);
                        if (i == 0) {
                                cli_out ("at path on brick");
                                cli_out ("-----------------------------------");
                        }
                        cli_out ("%s %s", timestr, path);
                }
        }
out:
        return;
}
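For comparison, here is a minimal standalone sketch of the same idea that avoids the pointer cast altogether. It assumes nothing about GlusterFS: the helper name format_heal_time and its return convention are hypothetical, made up for illustration only. It copies the 32-bit value into a real time_t by value and checks the result of localtime before handing it to strftime, since an unchecked NULL there is what the crash above comes down to.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not part of GlusterFS): format a 32-bit epoch
 * timestamp as "YYYY-MM-DD HH:MM:SS" in local time.
 * Returns 0 on success, -1 if the conversion or formatting fails.
 */
static int
format_heal_time (uint32_t secs, char *buf, size_t len)
{
        /* Widen by value into a time_t, whatever its size is on this
         * platform, instead of casting the address of a uint32_t. */
        time_t          when = (time_t) secs;
        struct tm       *tm = NULL;

        /* localtime may return NULL; never pass that to strftime. */
        tm = localtime (&when);
        if (tm == NULL)
                return -1;

        /* strftime returns 0 when the result does not fit in buf. */
        if (strftime (buf, len, "%Y-%m-%d %H:%M:%S", tm) == 0)
                return -1;

        return 0;
}
```

In the loop above, a caller could use something like format_heal_time (time, timestr, sizeof (timestr)) and fall back to printing only the path when it returns -1. This is only a sketch under those assumptions, not the upstream fix.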
I ran into the same issue. It's already fixed on GitHub; the patch is https://github.com/gluster/glusterfs/commit/2fde351b8228720bc13f8bea3453b6af1d68c5ad.patch

Saz
*** This bug has been marked as a duplicate of bug 828058 ***