Created attachment 861454
Collected output from affected system

Description of problem:
RHEV-M reports "Critical, Low disk space. Host <host> has less than 500 MB of free space left on: /var/log." although the df -h output shows

Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/live-rw  1.5G  464M  1.1G  31% /

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.5 (20140112.0.el6)
vdsm-4.13.2-0.6.el6ev.x86_64

How reproducible:
Unknown at this time, however lsof does not show files needing to be deleted, so this is not an exact regression of https://access.redhat.com/site/solutions/289583

Actual results:
RHEV-M reports low space on /var/log

Expected results:
No messages unless /var/log has less than 500 MB free
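One way to cross-check the alert against the host is to ask df about /var/log itself rather than reading the row for the root filesystem. The sketch below is only an illustration: the 500 MB limit is taken from the alert text, the function names are made up here, and it does not reproduce how vdsm or RHEV-M actually computes free space.

```shell
#!/bin/sh
# Hypothetical helpers; names and logic are illustrative, not taken from vdsm.

# Print the available space, in whole MB, on the filesystem holding "$1".
avail_mb() {
    # -P forces one POSIX-format line per filesystem, -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed (exit 0) when the available space ($1, MB) is below the limit ($2, MB).
is_below_threshold() {
    [ "$1" -lt "$2" ]
}

# Compare a path against a limit; the limit defaults to the 500 MB in the alert.
check_path() {
    path=$1
    limit_mb=${2:-500}
    free=$(avail_mb "$path")
    if is_below_threshold "$free" "$limit_mb"; then
        echo "LOW: $path has ${free} MB free (limit ${limit_mb} MB)"
    else
        echo "OK: $path has ${free} MB free (limit ${limit_mb} MB)"
    fi
}
```

Running `check_path /var/log 500` on the affected host would show whether the filesystem that actually backs /var/log is below the limit, independent of what the root filesystem row says.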
I'll see if I can figure out where this is pulling from.
Can I please see the output of `du -ha /var/log`? It looks like the space may be taken, but I can't see what files are present from the information available.
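If the raw `du -ha` listing is too long to read, a sorted view of the largest files may be quicker to scan. This is a minimal sketch; the function name `largest_files` is hypothetical, and sizes are disk usage in KB as reported by `du -k`.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest regular files under a directory.
# Sizes are in KB as reported by du -k (disk usage, not apparent size).
largest_files() {
    dir=$1
    count=${2:-10}
    # -xdev keeps find on the same filesystem as "$dir".
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

For this bug, `largest_files /var/log 20` would be the call of interest.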
From the VDSM side, logrotate is working, rotating up to 100.

213M /var/log/vdsm
69M /var/log/libvirt
=============================
Total: 282M

It seems we need a cron job to logrotate messages, so I am moving this back to the RHEV-H guys. Please feel free to move it back if it turns out to be a vdsm issue.
Alexander, could you please try to gather the following information:

$ mount
$ findmnt /var/log
$ df -h /var/log
I am not yet sure if this bug is about an incorrect logrotate for the messages file, or if it's about using an incorrect free-space number.
I'm happy to tweak logrotate for messages. But before I do: Alexander, could you try to ascertain why messages is so large? The presence of removed, rotated files indicates that logrotate is at least working (though it appears to be pretty far from a recommended configuration), but 1.3 GB for syslog indicates a problem somewhere...
Marking this as a regression: it happened on 6.4, see bug 947807.
From https://bugzilla.redhat.com/show_bug.cgi?id=1073566 it looks like it should be resolved with the following packages

ovirt-node-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-plugin-cim-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-plugin-ipmi-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-plugin-puppet-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-plugin-rhn-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-plugin-snmp-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-recipe-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-selinux-3.0.1-18.el6_5.8.noarch.rpm
ovirt-node-tools-3.0.1-18.el6_5.8.noarch.rpm

however this is still being seen with the following packages installed

ovirt-node-3.0.1-18.el6_5.noarch
ovirt-node-plugin-rhn-3.0.1-18.el6_5.noarch
ovirt-node-plugin-snmp-3.0.1-18.el6_5.noarch
ovirt-node-plugin-cim-3.0.1-18.el6_5.noarch

And this is seen with the 20140407 build of RHEV-H for 3.3 as well
Moving this to the correct product, though we still need to identify where the problem really lies.
(In reply to Fabian Deutsch from comment #20)
> Moving this to the correct product, tho we still need to identify where the
> problem really lies.

I believe there are two problems. First is whatever was filling the logs so quickly (which appears to have been a vdsm issue fixed elsewhere). Second is that the default Node install has logrotate settings which are optimistic, and leave it open to the possibility of very active logs filling the disk. This addresses the second.
I could reproduce this bug with the following version:

Red Hat Enterprise Virtualization Hypervisor release 6.5 (20140112.0.el6)
vdsm-4.13.2-0.6.el6ev.x86_64
Apologies about the wait on the NEEDINFO here. Here's what my customer has in /etc/logrotate.d/syslog:

# cat /etc/logrotate.d/syslog
/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
    sharedscripts
    rotate 5
    size 15M
    compress
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}

Their newest host (data from 11.13.14) shows

ovirt-node-3.0.1-18.el6.14.noarch Fri Aug 22 04:05:29 2014

-=>>cat helpro0365-2014111117381415727498/etc/logrotate.d/syslog
/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
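The difference between the two stanzas above is the presence of `rotate`, `size` and `compress`. A quick way to check other hosts for the same gap could look like the sketch below. The helper names are hypothetical, and the check only looks at the one file it is given; it does not account for defaults inherited from /etc/logrotate.conf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a logrotate config file contains the
# given directive at the start of a line (ignoring leading whitespace).
has_directive() {
    file=$1
    directive=$2
    grep -Eq "^[[:space:]]*${directive}([[:space:]]|\$)" "$file"
}

# Report which of the rotation limits discussed in this bug are missing.
report_missing_limits() {
    file=$1
    missing=""
    for d in rotate size compress; do
        has_directive "$file" "$d" || missing="$missing $d"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
    else
        echo "ok"
    fi
}
```

Usage would be `report_missing_limits /etc/logrotate.d/syslog`. Note that `postrotate` does not count as `rotate`, because the directive has to start the line.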
Test version:
rhev-hypervisor6-6.6-20141218.0.iso
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch
Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.27.el6ev

Test steps:
1. Registered RHEV-H to RHEV-M and approved it so that the host is Up.
2. Grew /var/log/messages on RHEV-H to 1.5 GB.

[root@dhcp-10-54 log]# df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
/var/log        2.0G   44M  1.8G   3% /tmp/early-logs
[root@dhcp-10-54 log]# dd if=/dev/zero of=messages bs=1M count=1500
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 10.5616 s, 149 MB/s
[root@dhcp-10-54 log]# df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
/var/log        2.0G  1.5G  324M  83% /tmp/early-logs

3. Ran logrotate -f. It appropriately rotated and compressed the logs.

[root@dhcp-10-54 log]# logrotate -f /etc/logrotate.d/syslog
[root@dhcp-10-54 log]# ls
audit      dmesg      lost+found     messages-20150106  sa            tuned
boot.log   dmesg.old  maillog.1.gz   messages.4.gz      sanlock.log   vdsm
core       glusterfs  maillog.2.gz   ntpstats           secure.1.gz   vdsm-reg
cron       journal    mcelog         ovirt.log          secure.4.gz   wtmp
cron.1.gz  lastlog    messages       ovirt-node.log     spooler.1.gz  yum.log
cron.4.gz  libvirt    messages.1.gz  rhsm               spooler.2.gz
[root@dhcp-10-54 log]# df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
/var/log        2.0G   44M  1.8G   3% /tmp/early-logs

Test result:
After step 2, RHEV-M reports "Critical, Low disk space. Host <host> has less than 500 MB of free space left on: /var/log."
After step 3, RHEV-M no longer reports "Critical, Low disk space. Host <host> has less than 500 MB of free space left on: /var/log."

So the patch http://gerrit.ovirt.org/#/c/25096/1/recipe/common-post.ks works well in rhev-hypervisor6-6.6-20141218.0.iso.

[root@dhcp-10-54 admin]# cat /etc/logrotate.d/syslog
/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
    sharedscripts
    rotate 5
    size 15M
    compress
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}

Changing the status of this bug to "verified".
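As a rough sanity check on the patched settings, the space the five syslog files could occupy can be estimated from `rotate 5` and `size 15M`. This is back-of-the-envelope arithmetic, not a guarantee: it assumes each log is rotated at about the size threshold and ignores compression, and logrotate only checks `size` when it runs, so a fast-growing log can exceed the threshold between runs. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical estimate, not a guarantee: logrotate only checks "size" when
# it runs, so a fast-growing log can exceed the threshold between runs.
# Args: number of log files, "rotate" count, "size" threshold in MB.
estimate_log_mb() {
    files=$1
    rotate=$2
    size_mb=$3
    # One live file plus "rotate" kept copies per log, each about size_mb.
    echo $(( files * (rotate + 1) * size_mb ))
}
```

`estimate_log_mb 5 5 15` prints 450, i.e. roughly 450 MB before compression for the five files in the stanza above, against the 2.0G /var/log seen in the test.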
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0160.html