Cause: sysstat did not check whether an sa daily data file left over from the previous month already existed before appending new statistics to it.
Consequence: In some edge cases (for example, after a short month), sysstat appended new statistics to the old sa daily data file.
Fix: sysstat now checks whether an old sa daily data file exists and, if it does, removes it before writing new data.
Result: sysstat no longer appends new statistics to old sa daily data files.
Description of problem:
In order to retain some additional SAR logging, we locally chose to raise HISTORY in /etc/sysconfig/sysstat from the RPM default of 7 to 28. By an odd quirk, we believe that the SAR logs can become corrupted if HISTORY > 27 and the system is reconfigured with additional devices during February of a non-leap year.
Version-Release number of selected component (if applicable):
sysstat-9.0.4-20.el6.i686
Additional info:
The SAR scripts /usr/lib/sa/sa1 and sa2 both normally use logs in /var/log/sa itself, but if HISTORY is > 28, the scripts use a tree of log directories under /var/log/sa. There may be a problem with this log file layout as well, but I haven't been able to check that case yet.
When HISTORY is < 28, then /usr/lib/sa/sa2's action to expunge old logs which are more than HISTORY days old will mean that "tomorrow's" saXX file never exists prior to /usr/lib/sa/sa1 creating it on the first pass of the day.
However, when HISTORY == 28, then on March 1st in a non-leap year, the log file sa01 will already exist from Feb 1st having not been pre-expunged by the sa2 script. Similarly for March 2nd through 28th.
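The name collision can be reproduced with plain date arithmetic. This is only an illustrative sketch: it assumes GNU date (for the -d option), and 2013 is picked purely as an example non-leap year.

```shell
#!/bin/sh
# Illustration only: assumes GNU date (-d). 2013 is an example non-leap year.
today=2013-03-01

# The day exactly 28 days before March 1st in a non-leap year.
earlier=$(date -u -d "$today 28 days ago" +%F)
echo "$earlier"

# sa1 names the daily file after the day of the month, so both dates
# map to the same file name.
today_file="sa$(date -u -d "$today" +%d)"
earlier_file="sa$(date -u -d "$earlier" +%d)"
echo "$today_file $earlier_file"
```

Both dates resolve to the same saXX name, which is why a file that sa2 has not yet expunged is picked up again by sa1.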
It so happened that we reconfigured a CentOS6 system with an additional disk during February, and seemingly as a consequence the exact layout of the SAR logs changed sufficiently for sadc to regard it as invalid when sa1 attempted to reuse the log of the same name 28 days later.
Therefore, we believe it would be prudent to bullet-proof the sa1 script by pre-expunging a SAR log if it is more than 1 day old, which will prevent a log file from a month ago being re-used "today".
$ diff -u /usr/lib/sa/sa1 sa1.script.tweak
--- /usr/lib/sa/sa1 2012-06-22 11:11:48.000000000 +0100
+++ sa1.script.tweak 2013-03-14 14:16:38.211424137 +0000
@@ -9,13 +9,15 @@
SADC_OPTIONS="-S DISK"
SYSCONFIG_DIR=/etc/sysconfig
[ -r ${SYSCONFIG_DIR}/sysstat ] && . ${SYSCONFIG_DIR}/sysstat
+
+CURRENTDIR=`date +%Y%m`
+DATE=`date +%d`
+CURRENTFILE=sa${DATE}
+DDIR=/var/log/sa
+cd ${DDIR} || exit 1
+
if [ ${HISTORY} -gt 28 ]
then
- CURRENTDIR=`date +%Y%m`
- DATE=`date +%d`
- CURRENTFILE=sa${DATE}
- DDIR=/var/log/sa
- cd ${DDIR} || exit 1
[ -d ${CURRENTDIR} ] || mkdir -p ${CURRENTDIR}
# If ${CURRENTFILE} exists and is a regular file, then make sure
# the file was modified this day (and not e.g. month ago)
@@ -24,11 +26,30 @@
[ -f ${CURRENTFILE} ] &&
[ "`date +%Y%m%d -r ${CURRENTFILE}`" = "${CURRENTDIR}${DATE}" ] &&
mv -f ${CURRENTFILE} ${CURRENTDIR}/${CURRENTFILE}
+ # If ${CURRENTFILE} exists and is a regular file, then make sure
+ # the file was modified this day (and not e.g. month ago).
+ # If it is old, remove it so that it is recreated by sadc afresh
+ if [ -f ${CURRENTFILE} ]; then
+ find ${CURRENTDIR} -type f -name ${CURRENTFILE} -mtime -1 -print | grep -q ${CURRENTFILE}
+ if [ $? -ne 0 ]; then
+ rm -f ${CURRENTDIR}/${CURRENTFILE}
+ fi
+ fi
touch ${CURRENTDIR}/${CURRENTFILE}
# Remove the "compatibility" link and recreate it to point to
# the (new) current file
rm -f ${CURRENTFILE}
ln -s ${CURRENTDIR}/${CURRENTFILE} ${CURRENTFILE}
+else
+ # If ${CURRENTFILE} exists and is a regular file, then make sure
+ # the file was modified this day (and not e.g. month ago).
+ # If it is old, remove it so that it is recreated by sadc afresh
+ if [ ! -L ${CURRENTFILE} -a -f ${CURRENTFILE} ]; then
+ find . -type f -name ${CURRENTFILE} -mtime -1 -print | grep -q ${CURRENTFILE}
+ if [ $? -ne 0 ]; then
+ rm -f ${CURRENTFILE}
+ fi
+ fi
fi
umask 0022
ENDIR=/usr/lib/sa
$
$ cat sa1.script.tweak
#!/bin/sh
# /usr/lib/sa/sa1
# (C) 1999-2009 Sebastien Godard (sysstat <at> orange.fr)
#
#@(#) sysstat-9.0.4
#@(#) sa1: Collect and store binary data in system activity data file.
#
HISTORY=0
SADC_OPTIONS="-S DISK"
SYSCONFIG_DIR=/etc/sysconfig
[ -r ${SYSCONFIG_DIR}/sysstat ] && . ${SYSCONFIG_DIR}/sysstat
CURRENTDIR=`date +%Y%m`
DATE=`date +%d`
CURRENTFILE=sa${DATE}
DDIR=/var/log/sa
cd ${DDIR} || exit 1
if [ ${HISTORY} -gt 28 ]
then
[ -d ${CURRENTDIR} ] || mkdir -p ${CURRENTDIR}
# If ${CURRENTFILE} exists and is a regular file, then make sure
# the file was modified this day (and not e.g. month ago)
# and move it to ${CURRENTDIR}
[ ! -L ${CURRENTFILE} ] &&
[ -f ${CURRENTFILE} ] &&
[ "`date +%Y%m%d -r ${CURRENTFILE}`" = "${CURRENTDIR}${DATE}" ] &&
mv -f ${CURRENTFILE} ${CURRENTDIR}/${CURRENTFILE}
# If ${CURRENTFILE} exists and is a regular file, then make sure
# the file was modified this day (and not e.g. month ago).
# If it is old, remove it so that it is recreated by sadc afresh
if [ -f ${CURRENTFILE} ]; then
find ${CURRENTDIR} -type f -name ${CURRENTFILE} -mtime -1 -print | grep -q ${CURRENTFILE}
if [ $? -ne 0 ]; then
rm -f ${CURRENTDIR}/${CURRENTFILE}
fi
fi
touch ${CURRENTDIR}/${CURRENTFILE}
# Remove the "compatibility" link and recreate it to point to
# the (new) current file
rm -f ${CURRENTFILE}
ln -s ${CURRENTDIR}/${CURRENTFILE} ${CURRENTFILE}
else
# If ${CURRENTFILE} exists and is a regular file, then make sure
# the file was modified this day (and not e.g. month ago).
# If it is old, remove it so that it is recreated by sadc afresh
if [ ! -L ${CURRENTFILE} -a -f ${CURRENTFILE} ]; then
find . -type f -name ${CURRENTFILE} -mtime -1 -print | grep -q ${CURRENTFILE}
if [ $? -ne 0 ]; then
rm -f ${CURRENTFILE}
fi
fi
fi
umask 0022
ENDIR=/usr/lib/sa
cd ${ENDIR}
[ "$1" = "--boot" ] && shift && BOOT=y || BOOT=n
if [ $# = 0 ] && [ "${BOOT}" = "n" ]
then
# Note: Stats are written at the end of previous file *and* at the
# beginning of the new one (when there is a file rotation) only if
# outfile has been specified as '-' on the command line...
exec ${ENDIR}/sadc -F -L ${SADC_OPTIONS} 1 1 -
else
exec ${ENDIR}/sadc -F -L ${SADC_OPTIONS} $* -
fi
$
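For comparison, the staleness test could also be written as a calendar-day comparison, reusing the `date +%Y%m%d -r` idiom that sa1 already contains. The sketch below is not the code from the tweak above, which uses `find -mtime -1` (a rolling 24-hour window). The function name is_stale_sa_file is hypothetical, and the -r option assumes GNU date.

```shell
#!/bin/sh
# Hypothetical helper, not part of sysstat. Returns 0 (true) when the
# given path is a regular file, is not a symlink, and was last modified
# on a calendar day other than today. Assumes GNU date (-r FILE).
is_stale_sa_file() {
    file=$1
    [ -L "$file" ] && return 1      # leave the compatibility symlink alone
    [ -f "$file" ] || return 1      # nothing to expunge if it does not exist
    # Compare the file's modification date with today's date.
    [ "$(date +%Y%m%d -r "$file")" != "$(date +%Y%m%d)" ]
}

# Example use from within the log directory (commented out so that
# sourcing this sketch has no side effects):
# f="sa$(date +%d)"
# is_stale_sa_file "$f" && rm -f "$f"
```

A calendar-day comparison treats a file written late yesterday as stale, whereas `-mtime -1` would still accept it. Which behaviour is preferable depends on how sadc is expected to handle the rotation at midnight.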
Created attachment 898410
sysstat-9.0.4-history-25.patch
This problem is fixed by this patch and lowering the default history value to 25 days.
The patch was updated to revert two changes: the boundary at which the sa data files are stored in directories, and the default history value, which goes from 25 back to 28. These changes could break backward compatibility in some cases.
I am seeing this happen not only in the February / March case, but in April / May as well, and probably with all other 30-day months too.
I have collected sosreports that show this.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-1468.html