Bug 1557933 - big qemu logs vs log rotation
Summary: big qemu logs vs log rotation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Daniel Berrangé
QA Contact: chhu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-03-19 09:54 UTC by Dr. David Alan Gilbert
Modified: 2020-11-14 08:31 UTC (History)
6 users (show)

Fixed In Version: libvirt-5.6.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:11:30 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3723 0 None None None 2019-11-06 07:12:01 UTC

Description Dr. David Alan Gilbert 2018-03-19 09:54:16 UTC
Description of problem:

I've got a set of repeated migrations happening, so the logs for a particular guest are getting quite long;   after leaving it doing that all weekend I'm noticing something odd:
[root@virtlab403 qemu]# ls -l /var/log/libvirt/qemu/rhel6*
-rw-------. 1 root root 1827041 Mar 19 05:49 /var/log/libvirt/qemu/rhel6.log
-rw-------. 1 root root 2097152 Mar 19 01:53 /var/log/libvirt/qemu/rhel6.log.0
-rw-------. 1 root root 2097152 Mar 18 21:19 /var/log/libvirt/qemu/rhel6.log.1
-rw-------. 1 root root 2097152 Mar 18 16:48 /var/log/libvirt/qemu/rhel6.log.2
-rw-------. 1 root root    5080 Mar  1 03:37 /var/log/libvirt/qemu/rhel6.log-20180301.gz
-rw-------. 1 root root      20 Mar  4 03:12 /var/log/libvirt/qemu/rhel6.log-20180304.gz
-rw-------. 1 root root      20 Mar 15 15:23 /var/log/libvirt/qemu/rhel6.log-20180315.gz
-rw-------. 1 root root  841447 Mar 18 03:11 /var/log/libvirt/qemu/rhel6.log-20180318

so each log file seems to be renamed to .0,.1,... when it hits 2MB - OK
but then the rhel6.log-2018-0318 seems shorter than that - so I'm guessing that the log rotation has taken the current rhel6.log and renamed that to add the date; but that's lost the other .1/.1/.2 etc - so anything that happened on 17th is gone.   The same pattern seems to be in the other host which it's ping-ponging to.

Velrsion number of selected component (if applicable):
libvirt-3.9.0-13.el7.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Daniel Berrangé 2018-03-19 11:37:38 UTC
Historically we relied on logrotate scripts to rollover to the guest log files. This is barely acceptable because by the time logrotate runs, the guest could already have inflicted a DOS by filling up all disk space with logs.

Thus we introduced virtlogd which owns the logfile, and QEMU just writes to a pipe to virtlogd. Virtlogd does rollover at a specific file size, to guarantee the guest can't perform a DoS.

The trouble comes because the user can optionally turn off virtlogd, so we still ship the logrotate config files in the RPM to cope with that scenario.

We could mitigate most of the problem by setting a minimum size in the log rotate config file, that is larger than the default rollover size used by virtlogd. That would turn logrotate into a no-op with the out of the box config.

Comment 3 Erik Skultety 2019-03-11 12:35:14 UTC
(In reply to Dr. David Alan Gilbert from comment #0)
> Description of problem:
> 
> I've got a set of repeated migrations happening, so the logs for a
> particular guest are getting quite long;   after leaving it doing that all
> weekend I'm noticing something odd:
> [root@virtlab403 qemu]# ls -l /var/log/libvirt/qemu/rhel6*
> -rw-------. 1 root root 1827041 Mar 19 05:49 /var/log/libvirt/qemu/rhel6.log
> -rw-------. 1 root root 2097152 Mar 19 01:53
> /var/log/libvirt/qemu/rhel6.log.0
> -rw-------. 1 root root 2097152 Mar 18 21:19
> /var/log/libvirt/qemu/rhel6.log.1
> -rw-------. 1 root root 2097152 Mar 18 16:48
> /var/log/libvirt/qemu/rhel6.log.2
> -rw-------. 1 root root    5080 Mar  1 03:37
> /var/log/libvirt/qemu/rhel6.log-20180301.gz
> -rw-------. 1 root root      20 Mar  4 03:12
> /var/log/libvirt/qemu/rhel6.log-20180304.gz
> -rw-------. 1 root root      20 Mar 15 15:23
> /var/log/libvirt/qemu/rhel6.log-20180315.gz
> -rw-------. 1 root root  841447 Mar 18 03:11
> /var/log/libvirt/qemu/rhel6.log-20180318
> 
> so each log file seems to be renamed to .0,.1,... when it hits 2MB - OK
> but then the rhel6.log-2018-0318 seems shorter than that - so I'm guessing
> that the log rotation has taken the current rhel6.log and renamed that to
> add the date; but that's lost the other .1/.1/.2 etc - so anything that
> happened on 17th is gone.   The same pattern seems to be in the other host
> which it's ping-ponging to.

Can you confirm you're not using virtlogd as the logging backend? If so, I'd suggest giving virtlogd a try whether it helps your use case. I don't think it's worth doing anything from libvirt's POV unless this happens with virtlogd (which is the default) too.

Comment 4 Dr. David Alan Gilbert 2019-03-19 11:39:47 UTC
Hmm this is a long time ago; I'm not sure what the config was at the time.

However, even on a rhel8 I still see that we've got logrotate happening on all the files in /var/log/libvirt/qemu/ and that's got virtlogd configured - so I think it will still happen.

[The other bad logrotate interaction we have is that if you started a VM once, say a guestfs with temporary name, logrotate will create a 0 length file when it rotates
the file, and so you permenantly get 5 approximately empty files for a VM you've not used in years]

Comment 5 Daniel Berrangé 2019-03-19 11:46:02 UTC
(In reply to Erik Skultety from comment #3)
> Can you confirm you're not using virtlogd as the logging backend? If so, I'd
> suggest giving virtlogd a try whether it helps your use case. I don't think
> it's worth doing anything from libvirt's POV unless this happens with
> virtlogd (which is the default) too.

It is exactly the interaction between virtlogd & logrotate that is the problem - they're both trying to rotate the same files per my comment #2

Comment 6 Daniel Berrangé 2019-09-11 12:11:32 UTC
Since this ticket we did an improvement in upstream to try to stop logrotate interfering with virtlogd:

commit 975b004d07e472a4f7557f923fed228f3fa02f86
Author: Daniel P. Berrangé <berrange>
Date:   Fri Jul 5 10:32:53 2019 +0100

    logging: ensure virtlogd rollover takes priority over logrotate
    
    The virtlogd config is set to rollover logs every 2 MB.
    
    Normally a logrotate config file is also installed to handle cases where
    virtlogd is disabled. This is set to rollover weekly with no size
    constraint.
    
    As a result logrotate can interfere with virtlogd's, rolling over files
    that virtlogd has already taken care of.
    
    This changes logrotate configs to rollover based on a max size
    constraint of 2 MB + 1 byte. When virtlogd is running the log files will
    never get this large, making logrotate a no-op.
    
    If the user changes the size in virtlogd's config to something larger,
    they are responsible for also changing the logrotate config suitably.
    
    The LXC/libxl drivers don't use virtlogd, but there logrotate config is
    altered to match the QEMU driver config, for the sake of consistency.
    
    Reviewed-by: Jim Fehlig <jfehlig>
    Signed-off-by: Daniel P. Berrangé <berrange>

This merged in v5.6.0, so we got it via the recent rebase

Comment 8 chhu 2019-09-25 01:04:54 UTC
Reproduced on package:
libvirt-daemon-kvm-4.5.0-35.module+el8.1.0+4227+b2722cb3.x86_64

Test steps:
1. Check the lograte conf file libvirtd.qemu. It's weekly and without file size.
# cat /etc/logrotate.d/libvirtd.qemu
/var/log/libvirt/qemu/*.log {
        weekly
        missingok
        rotate 4
        compress
        delaycompress
        copytruncate
}

Tested on package:
libvirt-daemon-kvm-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64

Test steps:
1. Check the lograte conf file libvirtd.qemu. It's with size 2097153,
which is max_size [in virtlogd.conf]+1.

# cat /etc/libvirt/virtlogd.conf |grep max_size
#max_size = 2097152

# cat /etc/logrotate.d/libvirtd.qemu
/var/log/libvirt/qemu/*.log {
        size 2097153
        missingok
        rotate 4
        compress
        delaycompress
        copytruncate
}

2. Start/stop VM for 5000 times, and check the logs: the virtlogd max_size work well,
   no logrotate log file created.
 # for i in {1..5000}; do virsh destroy r8t; sleep 2; virsh start r8t; done
 # ls -lh /var/log/libvirt/qemu/ | grep r8t
-rw-------. 1 root root 954K Sep 24 12:35 r8t.log
-rw-------. 1 root root 2.0M Sep 24 12:21 r8t.log.0
-rw-------. 1 root root 2.0M Sep 24 11:51 r8t.log.1
-rw-------. 1 root root 2.0M Sep 24 11:22 r8t.log.2

Comment 9 chhu 2019-09-25 01:12:22 UTC
According to comment8, set the bug status to verified.

Comment 11 errata-xmlrpc 2019-11-06 07:11:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723


Note You need to log in before you can comment on or make changes to this bug.