Description of problem: I've got a set of repeated migrations happening, so the logs for a particular guest are getting quite long; after leaving it doing that all weekend I'm noticing something odd: [root@virtlab403 qemu]# ls -l /var/log/libvirt/qemu/rhel6* -rw-------. 1 root root 1827041 Mar 19 05:49 /var/log/libvirt/qemu/rhel6.log -rw-------. 1 root root 2097152 Mar 19 01:53 /var/log/libvirt/qemu/rhel6.log.0 -rw-------. 1 root root 2097152 Mar 18 21:19 /var/log/libvirt/qemu/rhel6.log.1 -rw-------. 1 root root 2097152 Mar 18 16:48 /var/log/libvirt/qemu/rhel6.log.2 -rw-------. 1 root root 5080 Mar 1 03:37 /var/log/libvirt/qemu/rhel6.log-20180301.gz -rw-------. 1 root root 20 Mar 4 03:12 /var/log/libvirt/qemu/rhel6.log-20180304.gz -rw-------. 1 root root 20 Mar 15 15:23 /var/log/libvirt/qemu/rhel6.log-20180315.gz -rw-------. 1 root root 841447 Mar 18 03:11 /var/log/libvirt/qemu/rhel6.log-20180318 so each log file seems to be renamed to .0,.1,... when it hits 2MB - OK but then the rhel6.log-2018-0318 seems shorter than that - so I'm guessing that the log rotation has taken the current rhel6.log and renamed that to add the date; but that's lost the other .1/.1/.2 etc - so anything that happened on 17th is gone. The same pattern seems to be in the other host which it's ping-ponging to. Velrsion number of selected component (if applicable): libvirt-3.9.0-13.el7.x86_64 How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Historically we relied on logrotate scripts to rollover to the guest log files. This is barely acceptable because by the time logrotate runs, the guest could already have inflicted a DOS by filling up all disk space with logs. Thus we introduced virtlogd which owns the logfile, and QEMU just writes to a pipe to virtlogd. Virtlogd does rollover at a specific file size, to guarantee the guest can't perform a DoS. The trouble comes because the user can optionally turn off virtlogd, so we still ship the logrotate config files in the RPM to cope with that scenario. We could mitigate most of the problem by setting a minimum size in the log rotate config file, that is larger than the default rollover size used by virtlogd. That would turn logrotate into a no-op with the out of the box config.
(In reply to Dr. David Alan Gilbert from comment #0) > Description of problem: > > I've got a set of repeated migrations happening, so the logs for a > particular guest are getting quite long; after leaving it doing that all > weekend I'm noticing something odd: > [root@virtlab403 qemu]# ls -l /var/log/libvirt/qemu/rhel6* > -rw-------. 1 root root 1827041 Mar 19 05:49 /var/log/libvirt/qemu/rhel6.log > -rw-------. 1 root root 2097152 Mar 19 01:53 > /var/log/libvirt/qemu/rhel6.log.0 > -rw-------. 1 root root 2097152 Mar 18 21:19 > /var/log/libvirt/qemu/rhel6.log.1 > -rw-------. 1 root root 2097152 Mar 18 16:48 > /var/log/libvirt/qemu/rhel6.log.2 > -rw-------. 1 root root 5080 Mar 1 03:37 > /var/log/libvirt/qemu/rhel6.log-20180301.gz > -rw-------. 1 root root 20 Mar 4 03:12 > /var/log/libvirt/qemu/rhel6.log-20180304.gz > -rw-------. 1 root root 20 Mar 15 15:23 > /var/log/libvirt/qemu/rhel6.log-20180315.gz > -rw-------. 1 root root 841447 Mar 18 03:11 > /var/log/libvirt/qemu/rhel6.log-20180318 > > so each log file seems to be renamed to .0,.1,... when it hits 2MB - OK > but then the rhel6.log-2018-0318 seems shorter than that - so I'm guessing > that the log rotation has taken the current rhel6.log and renamed that to > add the date; but that's lost the other .1/.1/.2 etc - so anything that > happened on 17th is gone. The same pattern seems to be in the other host > which it's ping-ponging to. Can you confirm you're not using virtlogd as the logging backend? If so, I'd suggest giving virtlogd a try whether it helps your use case. I don't think it's worth doing anything from libvirt's POV unless this happens with virtlogd (which is the default) too.
Hmm this is a long time ago; I'm not sure what the config was at the time. However, even on a rhel8 I still see that we've got logrotate happening on all the files in /var/log/libvirt/qemu/ and that's got virtlogd configured - so I think it will still happen. [The other bad logrotate interaction we have is that if you started a VM once, say a guestfs with temporary name, logrotate will create a 0 length file when it rotates the file, and so you permenantly get 5 approximately empty files for a VM you've not used in years]
(In reply to Erik Skultety from comment #3) > Can you confirm you're not using virtlogd as the logging backend? If so, I'd > suggest giving virtlogd a try whether it helps your use case. I don't think > it's worth doing anything from libvirt's POV unless this happens with > virtlogd (which is the default) too. It is exactly the interaction between virtlogd & logrotate that is the problem - they're both trying to rotate the same files per my comment #2
Since this ticket we did an improvement in upstream to try to stop logrotate interfering with virtlogd: commit 975b004d07e472a4f7557f923fed228f3fa02f86 Author: Daniel P. Berrangé <berrange> Date: Fri Jul 5 10:32:53 2019 +0100 logging: ensure virtlogd rollover takes priority over logrotate The virtlogd config is set to rollover logs every 2 MB. Normally a logrotate config file is also installed to handle cases where virtlogd is disabled. This is set to rollover weekly with no size constraint. As a result logrotate can interfere with virtlogd's, rolling over files that virtlogd has already taken care of. This changes logrotate configs to rollover based on a max size constraint of 2 MB + 1 byte. When virtlogd is running the log files will never get this large, making logrotate a no-op. If the user changes the size in virtlogd's config to something larger, they are responsible for also changing the logrotate config suitably. The LXC/libxl drivers don't use virtlogd, but there logrotate config is altered to match the QEMU driver config, for the sake of consistency. Reviewed-by: Jim Fehlig <jfehlig> Signed-off-by: Daniel P. Berrangé <berrange> This merged in v5.6.0, so we got it via the recent rebase
Reproduced on package: libvirt-daemon-kvm-4.5.0-35.module+el8.1.0+4227+b2722cb3.x86_64 Test steps: 1. Check the lograte conf file libvirtd.qemu. It's weekly and without file size. # cat /etc/logrotate.d/libvirtd.qemu /var/log/libvirt/qemu/*.log { weekly missingok rotate 4 compress delaycompress copytruncate } Tested on package: libvirt-daemon-kvm-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64 Test steps: 1. Check the lograte conf file libvirtd.qemu. It's with size 2097153, which is max_size [in virtlogd.conf]+1. # cat /etc/libvirt/virtlogd.conf |grep max_size #max_size = 2097152 # cat /etc/logrotate.d/libvirtd.qemu /var/log/libvirt/qemu/*.log { size 2097153 missingok rotate 4 compress delaycompress copytruncate } 2. Start/stop VM for 5000 times, and check the logs: the virtlogd max_size work well, no logrotate log file created. # for i in {1..5000}; do virsh destroy r8t; sleep 2; virsh start r8t; done # ls -lh /var/log/libvirt/qemu/ | grep r8t -rw-------. 1 root root 954K Sep 24 12:35 r8t.log -rw-------. 1 root root 2.0M Sep 24 12:21 r8t.log.0 -rw-------. 1 root root 2.0M Sep 24 11:51 r8t.log.1 -rw-------. 1 root root 2.0M Sep 24 11:22 r8t.log.2
According to comment8, set the bug status to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3723