Description of problem:
On a system that has 4 processors, 3 of which are disabled, collecting binary CPU activity data and then running 'sar -P ALL -f xxxx' displays nothing besides the header.

Version-Release number of selected component (if applicable):
sysstat-7.0.2-3.el5_5.1

How reproducible:
100%

Steps to Reproduce:
1. Set up a virtual machine with 4 processors.
2. Disable cpu1, cpu2 and cpu3 with "echo 0 >/sys/devices/system/cpu/cpuX/online".
3. sar -P ALL 1 3 -o cpu_binary
4. sar -P ALL -f cpu_binary

Actual results:
No activity information is displayed.

Expected results:
CPU activity information is displayed, as it is when all CPUs are online.

Additional info:
The following is a gdb step trace of next_slice().

# export LANG=C
# gdb sar
(gdb) b next_slice
Breakpoint 1 at 0x408d30: file sa_common.c, line 187.
(gdb) r -P ALL -f sarx.o1
Starting program: /usr/bin/sar -P ALL -f sarx.o1
Linux 2.6.18-164.9.1.el5 (rhel5u4-sartest) 08/19/10

Breakpoint 1, next_slice (uptime_ref=120706461, uptime=120706562, file_hdr=0x60f600, reset=1, interval=1) at sa_common.c:187
187     if (!last_uptime || reset)
(gdb) bt
#0 next_slice (uptime_ref=120706461, uptime=120706562, file_hdr=0x60f600, reset=1, interval=1) at sa_common.c:187
#1 0x0000000000405256 in write_stats (curr=1, dis=1, act=4, read_from_file=, cnt=0x7fff5a57a920, use_tm_start=0, use_tm_end=0, reset=1, want_since_boot=0) at sar.c:895
#2 0x000000000040559a in handle_curr_act_stats (ifd=6, fpos=, curr=0x7fff5a57a936, cnt=0x7fff5a57a920, eosaf=0x7fff5a57a92c, rows=, act=4, reset=0x7fff5a57a928, nr_cpu=4, nr_irq=0) at sar.c:1280
#3 0x0000000000405ee8 in read_stats_from_file (from_file=) at sar.c:1408
#4 0x0000000000406368 in main (argc=, argv=0x7fff5a57acd8) at sar.c:1790
(gdb)
:
(gdb) s
215     f = (((double) ((uptime - uptime_ref) & 0xffffffff)) / cpu_nr) / HZ;
(gdb) p ((double) ((uptime - uptime_ref) & 0xffffffff))
$14 = 101
(gdb) s
216     entry = (unsigned long) f;
(gdb) p f
$15 = 0.2525
(gdb) s
217     if ((f * 10) - (entry * 10) >= 5)
(gdb) p (f * 10) - (entry * 10)
$16 = 2.5249999999999999
(gdb) s
218     entry++;
(gdb) p entry
$17 = 0
(gdb) s
220     min = entry - (file_interval / 2);
(gdb) p entry - (file_interval / 2)
$18 = 0
(gdb) p (file_interval / 2)
$19 = 0
(gdb) s
218     entry++;
(gdb) s
221     max = entry + (file_interval / 2) + (file_interval & 0x1);
(gdb) p entry
$20 = 0
(gdb) p (file_interval & 0x1)
$21 = 0
(gdb) p entry + (file_interval / 2)
$22 = 0
(gdb) s
220     min = entry - (file_interval / 2);
(gdb) s
221     max = entry + (file_interval / 2) + (file_interval & 0x1);
(gdb) s
222     pt1 = (entry / interval) * interval;
(gdb) p (entry / interval) * interval
$24 = 0
(gdb) p entry + (file_interval / 2) + (file_interval & 0x1)
$25 = 0
(gdb) s
222     pt1 = (entry / interval) * interval;
(gdb) p (entry / interval) * interval
$26 = 0
(gdb) s
225     return (((pt1 >= min) && (pt1 < max)) || ((pt2 >= min) && (pt2 < max)));
(gdb) p ((pt1 >= min) && (pt1 < max))
$27 = 0
(gdb) p pt1
$28 = 0
(gdb) p min
$29 = 0
(gdb) p max
$30 = 0
(gdb) s
223     pt2 = ((entry / interval) + 1) * interval;
(gdb) p ((entry / interval) + 1) * interval
$32 = 1
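The arithmetic in the trace can be replayed outside of sar. The following is only a minimal sketch, not the sysstat source: slice_selected() and round_to_seconds() are hypothetical names, HZ is assumed to be 100 (which is what 101 / 4 / HZ = 0.2525 implies), and file_interval is passed in as a parameter because the trace only shows that it evaluates to 0.

```c
#include <assert.h>

/* Truncate, then bump when the first decimal is >= 5, as in the traced code. */
static long round_to_seconds(double f)
{
    long entry = (long) f;

    if ((f * 10) - (entry * 10) >= 5)
        entry++;
    return entry;
}

/*
 * Hypothetical stand-alone version of the slice test seen in the trace.
 * delta_jiffies: (uptime - uptime_ref) & 0xffffffff
 * divisor:       CPU count the delta is divided by (cpu_nr in the trace)
 * hz:            jiffies per second (assumed to be 100)
 * file_interval: seconds between file records (0 in the trace)
 * interval:      interval requested by the user (1 in the trace)
 * Returns 1 if the record would be displayed, 0 if it would be skipped.
 */
int slice_selected(unsigned long delta_jiffies, unsigned int divisor,
                   unsigned int hz, long file_interval, long interval)
{
    assert(divisor > 0 && hz > 0 && interval > 0);

    double f = (((double) delta_jiffies) / divisor) / hz;
    long entry = round_to_seconds(f);
    long min = entry - (file_interval / 2);
    long max = entry + (file_interval / 2) + (file_interval & 0x1);
    long pt1 = (entry / interval) * interval;
    long pt2 = ((entry / interval) + 1) * interval;

    return ((pt1 >= min) && (pt1 < max)) || ((pt2 >= min) && (pt2 < max));
}
```

With the traced values, slice_selected(101, 4, 100, 0, 1) gives entry = 0 and an empty window [0, 0), so the record is skipped. If the same delta is not divided by 4 and file_interval is assumed to round to 1, slice_selected(101, 1, 100, 1, 1) gives the window [1, 2) and the record is selected.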
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
When some processors are disabled at runtime, the number of processors recorded in the header of the binary sar data does not change accordingly. However, the uptime only adds up the uptime of the online CPUs, so when it is still divided by the total number of processors, the interval between records unexpectedly becomes 0.

This issue has already been fixed upstream. Instead of the total uptime of all processors, it uses the uptime of a single processor.

next_slice()
{
    ...
    /* uptime is expressed in jiffies (basis of 1 processor) */
    if (!last_uptime || reset)
        last_uptime = uptime_ref;

    /* Interval cannot be greater than 0xffffffff here */
    f = ((double) ((uptime - last_uptime) & 0xffffffff)) / HZ;
    ...
}

write_stats()
{
    ...
    if (read_from_file) {
        if (!next_slice(file_stats[2].uptime0, file_stats[curr].uptime0, reset, interval))
    ...
}

I have tested with sar-8.0.4, and it does provide better CPU hotplug support.
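To make the difference between the two computations concrete, here is a small sketch under stated assumptions. The function names are hypothetical and do not appear in sysstat, HZ is assumed to be 100, and the rounding step mirrors the one visible in the gdb trace.

```c
#include <assert.h>

/* Truncate, then bump when the first decimal is >= 5. */
static unsigned long round_interval(double f)
{
    unsigned long secs = (unsigned long) f;

    if ((f * 10) - (secs * 10) >= 5)
        secs++;
    return secs;
}

/*
 * Pre-fix behaviour (assumed): the uptime delta is summed over the online
 * CPUs only, but divided by the CPU count stored in the file header.
 */
unsigned long interval_from_summed_uptime(unsigned long long summed_jiffies,
                                          unsigned int header_cpus,
                                          unsigned int hz)
{
    assert(header_cpus > 0 && hz > 0);

    double f = (((double) (summed_jiffies & 0xffffffff)) / header_cpus) / hz;
    return round_interval(f);
}

/*
 * Post-fix behaviour (assumed): the delta is taken from the uptime of a
 * single processor (uptime0), so no division by the CPU count is needed.
 */
unsigned long interval_from_single_cpu_uptime(unsigned long long uptime0_jiffies,
                                              unsigned int hz)
{
    assert(hz > 0);

    double f = ((double) (uptime0_jiffies & 0xffffffff)) / hz;
    return round_interval(f);
}
```

With 4 CPUs online, one second contributes roughly 400 summed jiffies and interval_from_summed_uptime(400, 4, 100) yields 1. With only one CPU online, the 101 jiffies from the trace yield 0 when divided by the header count of 4, while interval_from_single_cpu_uptime(101, 100) yields 1.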
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
When some processors were disabled on a multi-processor system, the sar utility sometimes failed to provide information about the CPU activity. With this update, the uptime of a single processor is used to compute the statistics, rather than the total uptime of all processors, and this bug no longer occurs.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1005.html