Bug 725463 - clustered broker persistent cluster backups may rapidly fill up the remaining disk space
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: messaging-bugs
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-07-25 15:00 UTC by Frantisek Reznicek
Modified: 2020-11-04 18:29 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Frantisek Reznicek 2011-07-25 15:00:34 UTC
Description of problem:

A clustered qpidd broker writes persistent cluster backups to its data directory every couple of minutes:

  [root@dhcp-lab-125 ~]# ll /var/lib/qpidd/
  total 2400
  drwxr-x--- 2 qpidd ais      4096 Jul 22 17:58 cluster
  drwxr-x--- 3 qpidd ais      4096 Jul 22 18:01 _cluster.bak.0001
  drwxr-x--- 3 qpidd ais      4096 Jul 22 18:04 _cluster.bak.0002
  ...
  drwxr-x--- 3 qpidd ais      4096 Jul 23 08:56 _cluster.bak.0057

The _cluster.bak.* directories can easily fill up all free disk space on the machine, which may cause various problems.


This feature therefore needs to be configurable.

Suggestions:
a] Allow defining the maximum number of backups
b] Allow defining the maximum disk space occupied by backups
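
As an illustration of suggestions a] and b], the following is a minimal sketch of an external pruning script, not a qpidd feature. It assumes that the backup suffix is a hexadecimal counter (as the listing in this report suggests) and that the oldest backups are safe to delete, which this report does not confirm. The function names and the max_count / max_bytes parameters are hypothetical.

```python
import os
import re
import shutil
from pathlib import Path

# Assumption: backup directories are named _cluster.bak.<hex counter>.
BACKUP_RE = re.compile(r"^_cluster\.bak\.([0-9a-f]+)$")


def dir_size(path):
    """Return the total size in bytes of regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def list_backups(data_dir):
    """Return backup directories ordered oldest first by hex counter."""
    found = []
    for entry in Path(data_dir).iterdir():
        match = BACKUP_RE.match(entry.name)
        if match and entry.is_dir():
            found.append((int(match.group(1), 16), entry))
    return [path for _, path in sorted(found, key=lambda item: item[0])]


def select_for_removal(data_dir, max_count=None, max_bytes=None):
    """Pick the oldest backups until both limits are satisfied."""
    keep = list_backups(data_dir)
    sizes = {path: dir_size(path) for path in keep}
    doomed = []
    while keep and (
        (max_count is not None and len(keep) > max_count)
        or (max_bytes is not None and sum(sizes[p] for p in keep) > max_bytes)
    ):
        doomed.append(keep.pop(0))
    return doomed


def prune(data_dir, max_count=None, max_bytes=None, dry_run=True):
    """Delete the selected backups unless dry_run is set; return them."""
    doomed = select_for_removal(data_dir, max_count, max_bytes)
    if not dry_run:
        for path in doomed:
            shutil.rmtree(path)
    return doomed
```

For example, prune("/var/lib/qpidd", max_count=10) would only report which backups exceed the limit, because dry_run defaults to True.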


Version-Release number of selected component (if applicable):
2.0 or 1.3.3


How reproducible:
100%

Steps to Reproduce:
1. Run a long-term cluster test (a couple of clients, high load, frequent changes to the cluster configuration)
2. watch "ls -lah /var/lib/qpidd/ ; du -h /var/lib/qpidd/ | tail"
  
Actual results:
/var/lib/qpidd/_cluster.* data grows indefinitely

Expected results:
/var/lib/qpidd/_cluster.* data growth should be configurable


Additional info:

[root@dhcp-lab-125 ~]# ll /var/lib/qpidd/
total 2400
drwxr-x--- 2 qpidd ais      4096 Jul 22 17:58 cluster
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:01 _cluster.bak.0001
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:04 _cluster.bak.0002
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:12 _cluster.bak.0003
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:22 _cluster.bak.0004
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:37 _cluster.bak.0005
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:46 _cluster.bak.0006
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:49 _cluster.bak.0007
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:52 _cluster.bak.0008
drwxr-x--- 3 qpidd ais      4096 Jul 22 18:58 _cluster.bak.0009
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:02 _cluster.bak.000a
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:13 _cluster.bak.000b
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:22 _cluster.bak.000c
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:26 _cluster.bak.000d
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:29 _cluster.bak.000e
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:34 _cluster.bak.000f
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:39 _cluster.bak.0010
drwxr-x--- 3 qpidd ais      4096 Jul 22 19:53 _cluster.bak.0011
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:00 _cluster.bak.0012
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:04 _cluster.bak.0013
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:07 _cluster.bak.0014
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:12 _cluster.bak.0015
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:17 _cluster.bak.0016
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:28 _cluster.bak.0017
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:36 _cluster.bak.0018
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:40 _cluster.bak.0019
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:43 _cluster.bak.001a
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:48 _cluster.bak.001b
drwxr-x--- 3 qpidd ais      4096 Jul 22 20:53 _cluster.bak.001c
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:11 _cluster.bak.001d
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:18 _cluster.bak.001e
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:22 _cluster.bak.001f
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:25 _cluster.bak.0020
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:30 _cluster.bak.0021
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:36 _cluster.bak.0022
drwxr-x--- 3 qpidd ais      4096 Jul 22 21:49 _cluster.bak.0023
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:07 _cluster.bak.0024
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:13 _cluster.bak.0025
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:19 _cluster.bak.0026
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:27 _cluster.bak.0027
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:34 _cluster.bak.0028
drwxr-x--- 3 qpidd ais      4096 Jul 22 22:47 _cluster.bak.0029
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:04 _cluster.bak.002a
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:11 _cluster.bak.002b
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:17 _cluster.bak.002c
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:24 _cluster.bak.002d
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:31 _cluster.bak.002e
drwxr-x--- 3 qpidd ais      4096 Jul 22 23:45 _cluster.bak.002f
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:06 _cluster.bak.0030
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:06 _cluster.bak.0031
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:21 _cluster.bak.0032
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:49 _cluster.bak.0033
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:50 _cluster.bak.0034
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:52 _cluster.bak.0035
drwxr-x--- 3 qpidd ais      4096 Jul 23 00:53 _cluster.bak.0036
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:00 _cluster.bak.0037
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:02 _cluster.bak.0038
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:10 _cluster.bak.0039
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:11 _cluster.bak.003a
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:13 _cluster.bak.003b
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:14 _cluster.bak.003c
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:20 _cluster.bak.003d
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:22 _cluster.bak.003e
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:31 _cluster.bak.003f
drwxr-x--- 3 qpidd ais      4096 Jul 23 01:56 _cluster.bak.0040
drwxr-x--- 3 qpidd ais      4096 Jul 23 02:17 _cluster.bak.0041
drwxr-x--- 3 qpidd ais      4096 Jul 23 02:37 _cluster.bak.0042
drwxr-x--- 3 qpidd ais      4096 Jul 23 03:05 _cluster.bak.0043
drwxr-x--- 3 qpidd ais      4096 Jul 23 03:27 _cluster.bak.0044
drwxr-x--- 3 qpidd ais      4096 Jul 23 03:46 _cluster.bak.0045
drwxr-x--- 3 qpidd ais      4096 Jul 23 04:10 _cluster.bak.0046
drwxr-x--- 3 qpidd ais      4096 Jul 23 04:31 _cluster.bak.0047
drwxr-x--- 3 qpidd ais      4096 Jul 23 04:51 _cluster.bak.0048
drwxr-x--- 3 qpidd ais      4096 Jul 23 05:17 _cluster.bak.0049
drwxr-x--- 3 qpidd ais      4096 Jul 23 05:40 _cluster.bak.004a
drwxr-x--- 3 qpidd ais      4096 Jul 23 05:59 _cluster.bak.004b
drwxr-x--- 3 qpidd ais      4096 Jul 23 06:29 _cluster.bak.004c
drwxr-x--- 3 qpidd ais      4096 Jul 23 06:51 _cluster.bak.004d
drwxr-x--- 3 qpidd ais      4096 Jul 23 07:10 _cluster.bak.004e
drwxr-x--- 3 qpidd ais      4096 Jul 23 07:37 _cluster.bak.004f
drwxr-x--- 3 qpidd ais      4096 Jul 23 07:58 _cluster.bak.0050
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:17 _cluster.bak.0051
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:43 _cluster.bak.0052
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:44 _cluster.bak.0053
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:46 _cluster.bak.0054
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:47 _cluster.bak.0055
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:54 _cluster.bak.0056
drwxr-x--- 3 qpidd ais      4096 Jul 23 08:56 _cluster.bak.0057
-rw-r----- 1 qpidd ais         0 Jul 22 17:58 lock
-rw-r--r-- 1 qpidd ais   1691648 Jul 25 07:07 qpidd.log
-rw------- 1 qpidd qpidd   12288 Jul 15 13:49 qpidd.sasldb
drwxr-x--- 4 qpidd ais      4096 Jul 23 08:56 rhm
-rw-r----- 1 qpidd ais        37 May 18 13:45 systemId
[root@dhcp-lab-125 ~]# du -h /var/lib/qpidd/ | tail
13M     /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl/001c
13M     /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl/0000/test_cluster_toggle_load_w_sesame-00
13M     /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl/0000
13M     /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl/0007/test_cluster_toggle_load_w_sesame-90
13M     /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl/0007
122M    /var/lib/qpidd/_cluster.bak.004f/rhm/jrnl
740K    /var/lib/qpidd/_cluster.bak.004f/rhm/dat
122M    /var/lib/qpidd/_cluster.bak.004f/rhm
122M    /var/lib/qpidd/_cluster.bak.004f
6.2G    /var/lib/qpidd/ 
[root@dhcp-lab-125 ~]# df -ah
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             8.9G  8.4G     0 100% /
proc                     0     0     0   -  /proc
sysfs                    0     0     0   -  /sys
devpts                   0     0     0   -  /dev/pts
tmpfs                 753M     0  753M   0% /dev/shm
none                     0     0     0   -  /proc/sys/fs/binfmt_misc
sunrpc                   0     0     0   -  /var/lib/nfs/rpc_pipefs

Comment 1 Frantisek Reznicek 2011-07-28 10:46:04 UTC
Severity raised to medium.
The shortest interval seen between two broker data backups was 2 minutes, which is in my opinion a very short time.

Updating above suggestions:
a] Allow defining the maximum number of backups and/or the maximum disk space they occupy
b] Allow defining the minimum time between two consecutive backups
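
Suggestion b] could behave roughly as in the sketch below. This is only an illustration of a minimum-interval check, not qpidd code; the class name BackupThrottle and the min_interval_s parameter are hypothetical.

```python
import time


class BackupThrottle:
    """Allow a backup only if min_interval_s has passed since the last one."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.clock = clock  # injectable so the logic can be tested
        self.last_backup = None

    def should_backup(self):
        """Return True and record the time if a backup is allowed now."""
        now = self.clock()
        if self.last_backup is None or now - self.last_backup >= self.min_interval_s:
            self.last_backup = now
            return True
        return False
```

With min_interval_s=600, a backup request arriving 2 minutes after the previous backup would be skipped rather than creating another _cluster.bak.* directory.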

