(copy of message sent to the rdist developers)

X-url: http://www.isi.edu/~johnh/
To: rdist-dev
Subject: rdist memory consumption problem (with fix)
Date: Thu, 04 Feb 1999 14:59:30 -0800
From: John Heidemann <johnh.edu>

I've found a serious memory consumption problem in rdist when running
over directories that contain many hard-linked files. The attached
patch fixes this problem at the cost of ~12 more lines of code and a
little more dynamic memory allocation.

Rdist allocates a fixed-size buffer (struct linkbuf in defs.h) for
each file that has more than one hard link. This buffer includes
pre-allocated space for 3 strings, each of length BUFSIZ. Under
Redhat 5.2/linux-2.0.36 BUFSIZ is 8KB, so a lot of this buffer goes to
waste.

For my mh mailbox directory (mh is even worse than news spools at
stressing file systems!), I have ~54k directory entries, ~35k files
with only one link and ~8300 with 2-7 links. With stock rdist this
resulted in memory usage of >90MB and much swapping on my poor laptop.
With the included fix, memory usage for this case is <3MB.

A remaining problem is that rdist does a linear search for links in
its list, but this is a second-order problem compared to swapping.
(If I were to fix the linear-search problem I'd probably grab the Tcl
implementation of resizing hashes since it's debugged and reasonably
small. Would you be willing to accept that kind of fix for the linear
searches?)

The attached patch is against the redhat-5.2 (slightly patched)
version of rdist-6.1.0. I looked at the source for rdist-6.1.5; the
code I changed looked the same to me.

(Thanks for keeping rdist open source so users can find and fix these
kinds of problems!)
   -John Heidemann
    USC/ISI

----------------------------------------

--- ./src/client.c-	Mon Jun  5 07:49:38 1995
+++ ./src/client.c	Thu Feb  4 13:46:51 1999
@@ -301,6 +301,18 @@
 	return(0);
 }
 
+void freelinkinfo(lp)
+	struct linkbuf *lp;
+{
+	if (lp->pathname)
+		free(lp->pathname);
+	if (lp->src)
+		free(lp->src);
+	if (lp->target)
+		free(lp->target);
+	free(lp);
+}
+
 /*
  * Save and retrieve hard link info
  */
@@ -309,6 +321,7 @@
 {
 	struct linkbuf *lp;
 
+	/* xxx: linear search doesn't scale with many links */
 	for (lp = ihead; lp != NULL; lp = lp->nextp)
 		if (lp->inum == statp->st_ino && lp->devnum == statp->st_dev) {
 			lp->count--;
@@ -321,12 +334,14 @@
 	lp->inum = statp->st_ino;
 	lp->devnum = statp->st_dev;
 	lp->count = statp->st_nlink - 1;
-	(void) strcpy(lp->pathname, target);
-	(void) strcpy(lp->src, source);
+	lp->pathname = strdup(target);
+	lp->src = strdup(source);
 	if (Tdest)
-		(void) strcpy(lp->target, Tdest);
+		lp->target = strdup(Tdest);
 	else
-		*lp->target = CNULL;
+		lp->target = NULL;
+	if (!lp->pathname || !lp->src || !(Tdest && lp->target))
+		fatalerr("Cannot malloc memory in linkinfo.");
 
 	return((struct linkbuf *) NULL);
 }
--- ./src/docmd.c-	Tue Apr 26 10:10:09 1994
+++ ./src/docmd.c	Thu Feb  4 13:46:33 1999
@@ -568,7 +568,7 @@
 	if (!nflag) {
 		register struct linkbuf *nextl, *l;
 
-		for (l = ihead; l != NULL; free((char *)l), l = nextl) {
+		for (l = ihead; l != NULL; freelinkinfo(l), l = nextl) {
 			nextl = l->nextp;
 			if (contimedout || IS_ON(opts, DO_IGNLNKS) ||
 			    l->count == 0)
--- ./include/defs.h-	Mon Apr 11 16:19:22 1994
+++ ./include/defs.h	Thu Feb  4 13:36:54 1999
@@ -300,9 +300,9 @@
 	ino_t	inum;
 	dev_t	devnum;
 	int	count;
-	char	pathname[BUFSIZ];
-	char	src[BUFSIZ];
-	char	target[BUFSIZ];
+	char	*pathname;
+	char	*src;
+	char	*target;
 	struct linkbuf *nextp;
 };
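
One detail in the patch is worth a second look: the final check,
!(Tdest && lp->target), is also true whenever Tdest is NULL, so as
written it would report an allocation failure for every link recorded
without a Tdest. The following is a minimal standalone sketch, not the
rdist source, of the same strdup-based allocation with the check
written as (tdest && !lp->target). It also keeps the old fixed-size
layout next to the new one for a size comparison. The names
linkbuf_fixed, linkbuf_sketch, new_link and free_link are made up for
illustration, the inode and device fields are simplified to long, and
the sketch returns NULL where rdist would call fatalerr.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdio.h>	/* BUFSIZ */
#include <stdlib.h>
#include <string.h>

/* Layout before the patch: three BUFSIZ-sized arrays in every entry. */
struct linkbuf_fixed {
	long	inum, devnum;
	int	count;
	char	pathname[BUFSIZ];
	char	src[BUFSIZ];
	char	target[BUFSIZ];
	struct linkbuf_fixed *nextp;
};

/* Layout after the patch: pointers to exactly-sized copies. */
struct linkbuf_sketch {
	long	inum, devnum;
	int	count;
	char	*pathname;
	char	*src;
	char	*target;	/* NULL when there is no Tdest */
	struct linkbuf_sketch *nextp;
};

/* Release an entry and its strings; free(NULL) is a no-op in ISO C. */
void free_link(struct linkbuf_sketch *lp)
{
	if (lp == NULL)
		return;
	free(lp->pathname);
	free(lp->src);
	free(lp->target);
	free(lp);
}

/*
 * Allocate one entry.  tdest may be NULL, which is not an error.
 * Returns NULL only if an allocation fails.
 */
struct linkbuf_sketch *
new_link(const char *pathname, const char *src, const char *tdest)
{
	struct linkbuf_sketch *lp;

	assert(pathname != NULL && src != NULL);
	lp = calloc(1, sizeof(*lp));
	if (lp == NULL)
		return NULL;
	lp->pathname = strdup(pathname);
	lp->src = strdup(src);
	lp->target = tdest ? strdup(tdest) : NULL;
	/* Fail only when a copy that was requested could not be made. */
	if (!lp->pathname || !lp->src || (tdest && !lp->target)) {
		free_link(lp);
		return NULL;
	}
	return lp;
}
```

With BUFSIZ at 8192, as the message reports for Redhat 5.2, the fixed
layout is at least 3 * 8192 = 24576 bytes per entry, while the pointer
layout costs a few dozen bytes plus the actual string lengths.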
Fixed in Raw Hide, rdist-6.1.5-3.
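
The message leaves the linear search over the link list as a known
limitation and suggests a resizing hash table as the eventual fix. As
a rough illustration only, here is a fixed-size chained hash keyed on
(device, inode). It does not resize, it is not the Tcl implementation
the message mentions, and every name in it (link_entry, link_table,
link_hash, link_lookup, link_insert, NBUCKETS) is hypothetical rather
than taken from rdist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>	/* ino_t, dev_t */

#define NBUCKETS 4096	/* assumed size; a real table would grow */

struct link_entry {
	ino_t	inum;
	dev_t	devnum;
	int	count;			/* links not yet seen */
	struct link_entry *next;	/* chain within one bucket */
};

struct link_table {
	struct link_entry *bucket[NBUCKETS];
};

/* Mix device and inode numbers into a bucket index. */
size_t link_hash(dev_t dev, ino_t ino)
{
	unsigned long h = (unsigned long)ino * 2654435761UL;

	h ^= (unsigned long)dev * 40503UL;
	return (size_t)(h % NBUCKETS);
}

/* Return the entry for (dev, ino), or NULL if it was never recorded. */
struct link_entry *
link_lookup(struct link_table *t, dev_t dev, ino_t ino)
{
	struct link_entry *e;

	for (e = t->bucket[link_hash(dev, ino)]; e != NULL; e = e->next)
		if (e->inum == ino && e->devnum == dev)
			return e;
	return NULL;
}

/*
 * Record a file with nlink > 1 hard links, first sighting.
 * Returns the new entry, or NULL on allocation failure.
 */
struct link_entry *
link_insert(struct link_table *t, dev_t dev, ino_t ino, int nlink)
{
	size_t i = link_hash(dev, ino);
	struct link_entry *e;

	assert(nlink > 1);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return NULL;
	e->inum = ino;
	e->devnum = dev;
	e->count = nlink - 1;	/* this sighting accounts for one link */
	e->next = t->bucket[i];
	t->bucket[i] = e;
	return e;
}
```

A caller would allocate a zeroed struct link_table once (for example
with calloc), try link_lookup first, and fall back to link_insert on a
miss. Each lookup then walks one bucket's chain instead of the whole
list.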