Bug 1584946 - Logs under /var/log/containers/httpd are not correctly rotated
Summary: Logs under /var/log/containers/httpd are not correctly rotated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 13.0 (Queens)
Assignee: Michele Baldessari
QA Contact: pkomarov
URL:
Whiteboard:
Duplicates: 1585534 (view as bug list)
Depends On:
Blocks: 1581780 1589836
 
Reported: 2018-06-01 02:26 UTC by Chen
Modified: 2022-08-09 10:36 UTC (History)
CC: 21 users

Fixed In Version: puppet-tripleo-8.3.2-8.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:57:08 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1776533 0 None None None 2018-06-13 08:15:51 UTC
OpenStack gerrit 574835 0 None MERGED Fix logrotate_crond issues 2021-01-27 02:20:38 UTC
OpenStack gerrit 575525 0 None MERGED Fix logrotate_crond issues 2021-01-27 02:20:38 UTC
OpenStack gerrit 575526 0 None MERGED Fix logrotate_crond issues 2021-01-27 02:21:21 UTC
Red Hat Issue Tracker OSP-9113 0 None None None 2022-08-09 10:36:26 UTC
Red Hat Product Errata RHEA-2018:2086 0 None None None 2018-06-27 13:58:54 UTC

Description Chen 2018-06-01 02:26:28 UTC
Description of problem:

We can see access.log.1, but the plain access.log without any suffix is missing.

ls -la /var/log/containers/httpd/*/*
-rw-r--r--. 1 root root 52046652 May 29 14:10 /var/log/containers/httpd/aodh-api/aodh_wsgi_access.log.1
-rw-r--r--. 1 root root        0 May 24 19:14 /var/log/containers/httpd/aodh-api/aodh_wsgi_error.log
-rw-r--r--. 1 root root     5894 May 24 19:14 /var/log/containers/httpd/aodh-api/error_log
-rw-r--r--. 1 root root 50755274 May 29 14:10 /var/log/containers/httpd/cinder-api/cinder_wsgi_access.log.1
-rw-r--r--. 1 root root     4138 May 25 11:58 /var/log/containers/httpd/cinder-api/cinder_wsgi_error.log
-rw-r--r--. 1 root root     5894 May 24 19:13 /var/log/containers/httpd/cinder-api/error_log

Version-Release number of selected component (if applicable):

OSP13 beta

How reproducible:

100%


Comment 3 Chen 2018-06-08 07:41:18 UTC
Hi team,

Please disregard comment #2.

Also if I wrongly set a blocker flag please forgive me.

This issue is reproducible in the OSP13 beta, and it seems critical for production environments if logs under /var/log/containers/httpd cannot be rotated correctly.

Best Regards,
Chen

Comment 4 Keigo Noha 2018-06-11 04:33:09 UTC
Hello Emilien,

My customer has hit this issue.
I am changing the priority/severity of this case to high.

This issue also impacts all customers on RHOSP13.
Could you consider fixing it as soon as possible?

Best Regards,
Keigo Noha

Comment 5 Steve Baker 2018-06-11 20:51:28 UTC
Assigning to PIDONE for triage

Comment 6 Damien Ciabrini 2018-06-12 10:48:04 UTC
Currently the logrotate_crond container has the following postrotate section in its /etc/logrotate-crond.conf configuration:
  postrotate
    /usr/bin/find /var/log/containers -type f \
    \( -mtime +14 -or \
    -atime +14 -or \
    -ctime +14 \) -exec rm -f {} \;;
    /sbin/lsof -nPs +L1 +D /var/log 2>&1|\
    awk '/\S+\s+[0-9]+\s.*\/var\/log\/.*\(deleted\)/ {print $2}' |\
    /bin/xargs -n1 -r -t kill -HUP
  endscript
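For context, the find-based retention branch above can be exercised harmlessly in a scratch directory. This is a sketch with hypothetical file names, using -print instead of -exec rm so nothing real is deleted:

```shell
# Sketch of the retention cleanup from the postrotate script, run against a
# temp dir standing in for /var/log/containers; -print replaces -exec rm.
tmp=$(mktemp -d)
touch "$tmp/fresh.log"                 # recent file: should be kept
touch -d '30 days ago' "$tmp/old.log"  # stale file: should be flagged
expired=$(find "$tmp" -type f \
  \( -mtime +14 -o -atime +14 -o -ctime +14 \) -print)
printf '%s\n' "$expired"               # only old.log is listed
rm -rf "$tmp"
```

Note that the conditions are OR-ed, so matching any one of mtime/atime/ctime older than 14 days is enough for deletion.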

This does not seem to work correctly when run inside the privileged logrotate_crond container:
[root@overcloud-controller-0 crond]# docker exec -it logrotate_crond /bin/bash -c '/sbin/lsof -nPs +L1  +D /var/log 2>&1' | grep -e '/var/log/containers' | wc -l
0

On the host we get:
[root@overcloud-controller-0  crond]# /bin/bash -c '/sbin/lsof -nPs +L1  +D /var/log 2>&1' |  grep -e '/var/log/containers' | wc -l
288

It seems that inside the container we also need to add an additional +D /var/log/containers argument in order for lsof to match the right files under /var/log/containers:
[root@overcloud-controller-0 crond]# docker exec -it logrotate_crond /bin/bash -c '/sbin/lsof -nPs +L1  +D /var/log +D /var/log/containers 2>&1' | grep -e '/var/log/containers' | wc -l
289

On the host:
[root@overcloud-controller-0 crond]# /bin/bash -c '/sbin/lsof -nPs +L1  +D /var/log +D /var/log/containers 2>&1' | grep -e '/var/log/containers'  | wc -l
289

The current postrotate script also appears to send multiple SIGHUPs to the same process; the PID list returned by awk should probably be deduplicated with sort -u.
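The duplication is easy to see on a canned lsof-style listing (fabricated PIDs and paths for illustration; grep stands in for the config's awk regex):

```shell
# Sketch: extract PIDs of processes still holding deleted files under
# /var/log/containers, deduplicated so each process gets one SIGHUP.
# The listing below is fabricated lsof-style output for illustration.
sample='httpd 101 root 2w REG 253,0 0 11 /var/log/containers/httpd/error_log (deleted)
httpd 101 root 3w REG 253,0 0 12 /var/log/containers/httpd/access_log (deleted)
aodh  202 root 2w REG 253,0 0 13 /var/log/containers/aodh/app.log (deleted)'

# Without sort -u, PID 101 would be sent SIGHUP twice.
printf '%s\n' "$sample" |
  grep '/var/log/containers/.*(deleted)' |
  awk '{print $2}' |
  sort -u
# -> 101
#    202
```

The fixed /etc/logrotate-crond.conf verified in comment 19 inserts exactly this sort -u between the awk filter and xargs kill -HUP.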

Comment 9 Michele Baldessari 2018-06-13 08:15:00 UTC
*** Bug 1585534 has been marked as a duplicate of this bug. ***

Comment 19 pkomarov 2018-06-18 07:42:53 UTC
Verified.

$ rhos-release -L
Installed repositories (rhel-7.5):
  13
  ceph-3
  ceph-osd-3
  rhel-7.5
$ cat core_puddle_version
2018-06-15.1



commit verification:

testing gerrit : https://code.engineering.redhat.com/gerrit/#/c/141691/1//COMMIT_MSG

1)

[root@controller-0 ~]# docker ps|grep key
2d5e222db618        192.168.24.1:8787/rhosp13/openstack-keystone:2018-06-15.1                    "/bin/bash -c '/us..."   18 hours ago        Up 18 hours                                 keystone_cron
0e4639066ac9        192.168.24.1:8787/rhosp13/openstack-keystone:2018-06-15.1                    "kolla_start"            18 hours ago        Up 18 hours (healthy)                       keystone
[root@controller-0 ~]# docker stop 0e4639066ac9 2d5e222db618
0e4639066ac9
2d5e222db618
[root@controller-0 ~]# docker ps|grep key

[root@controller-0 ~]# du -sch /var/log/containers/keystone/keystone.log
3.9M    /var/log/containers/keystone/keystone.log
3.9M    total
[root@controller-0 ~]# mv /var/log/containers/keystone/keystone.log /tmp
[root@controller-0 ~]# fallocate -l 22M /var/log/containers/keystone/keystone.log
[root@controller-0 ~]# du -sch /var/log/containers/keystone/keystone.log
22M     /var/log/containers/keystone/keystone.log
22M     total
[root@controller-0 ~]# chown 42425.42425 /var/log/containers/keystone/keystone.log
[root@controller-0 ~]# docker start 0e4639066ac9 2d5e222db618
0e4639066ac9
2d5e222db618
[root@controller-0 ~]# tailf /var/log/containers/keystone/keystone.log
2018-06-18 07:09:16.037 24 INFO keystone.common.wsgi [req-356777dd-f5d6-4291-90a6-15547d4955cd - - - - -] GET http://controller-0.internalapi.localdomain/v3/
2018-06-18 07:09:16.185 19 INFO keystone.common.wsgi [req-41c901ca-828f-42db-9ffb-edcd5febafdb - - - - -] GET http://controller-0.ctlplane.localdomain/v3/
2018-06-18 07:09:16.224 22 INFO keystone.common.wsgi [req-20dea5e5-dc95-4b66-a41d-7fead8d488b2 - - - - -] GET http://controller-0.internalapi.localdomain/v3/
[root@controller-0 ~]# docker ps|grep log
89c18be01506        192.168.24.1:8787/rhosp13/openstack-cron:2018-06-15.1                        "kolla_start"            19 hours ago        Up 19 hours                                  logrotate_crond
[root@controller-0 ~]# docker exec -it logrotate_crond  bash
()[root@89c18be01506 /]# cat /etc/logrotate-crond.conf
/var/log/containers/*/*log /var/log/containers/*/*/*log {
  rotate 14
  size 10M
  missingok
  notifempty

  delaycompress

  postrotate
    /sbin/lsof -nPs +L1 +D /var/log/containers 2>&1|\
    awk '/\S+\s+[0-9]+\s.*\/var\/log\/containers\/.*\(deleted\)/ {print $2}' |\
    sort -u | /bin/xargs -n1 -r -t kill -HUP
  endscript
}

()[root@89c18be01506 /]# /usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf
()[root@89c18be01506 /]# exit
[root@controller-0 ~]# ll /var/log/containers/keystone/
total 86848
-rw-r--r--. 1 42425 42425    23635 Jun 18 07:24 keystone.log
-rw-r--r--. 1 42425 42425 23962612 Jun 18 07:23 keystone.log.1
-rw-r--r--. 1 42425 42425 10649406 Jun 18 05:00 keystone.log.2
-rw-r--r--. 1 42425 42425 10703510 Jun 17 23:01 keystone.log.3
-rw-r--r--. 1 42425 42425 10926984 Jun 17 17:00 keystone.log.4
-rw-r--r--. 1 42425 42425        0 Jun 17 13:01 keystone-tokenflush.log
[root@controller-0 ~]# date
Mon Jun 18 07:24:44 UTC 2018

2)

[root@controller-0 ~]# systemctl status rsyslog
● rsyslog.service - System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-06-17 11:28:39 UTC; 19h ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 1786 (rsyslogd)
    Tasks: 3
   Memory: 198.2M
   CGroup: /system.slice/rsyslog.service
           └─1786 /usr/sbin/rsyslogd -n

Jun 18 04:40:57 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 04:40:57 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 04:40:57 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 04:40:57 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
Jun 18 06:29:42 controller-0 rsyslogd[1786]: imjournal: journal reloaded... [v8.24.0 try http://www.rsyslog.com/e/0 ]
[root@controller-0 ~]# systemctl stop rsyslog

[root@controller-0 ~]# mv /var/log/messages /tmp/
[root@controller-0 ~]# fallocate -l 22M /var/log/messages
[root@controller-0 ~]# du -sch /var/log/messages
22M     /var/log/messages
22M     total
[root@controller-0 ~]# ll /tmp/messages
-rw-------. 1 root root 194721108 Jun 18 07:26 /tmp/messages
[root@controller-0 ~]# ll /var/log/messages
-rw-r--r--. 1 root root 23068672 Jun 18 07:28 /var/log/messages
[root@controller-0 ~]# systemctl start rsyslog
[root@controller-0 ~]# systemctl status rsyslog
● rsyslog.service - System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-06-18 07:28:48 UTC; 5s ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 940235 (rsyslogd)
    Tasks: 3
   Memory: 1.4M
   CGroup: /system.slice/rsyslog.service
           └─940235 /usr/sbin/rsyslogd -n

Jun 18 07:28:48 controller-0 systemd[1]: Starting System Logging Service...
Jun 18 07:28:48 controller-0 rsyslogd[940235]:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="940235" x-info="http://www... start
Jun 18 07:28:48 controller-0 systemd[1]: Started System Logging Service.
Hint: Some lines were ellipsized, use -l to show in full.
[root@controller-0 ~]# tailf /var/log/messages
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.684690563Z" level=debug msg="attach: stdout: end"
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.684759821Z" level=debug msg="Health check for container 93c8bf9feec34233aebb9c1f46885760a76b66c79491bb01765345cc5741b671 done (exitCode=0)"
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.765920525Z" level=debug msg="containerd: process exited" id=7d2e2c32d5b0524f5b9b33de4d5ee571a4bcbe8bf5256225f1f85316bdcefe48 pid=a037348ddcb4474b621abb6b7bc79f40fef24cccf80283414cb2d29b812e5115 status=0 systemPid=940848
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.767201279Z" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"7d2e2c32d5b0524f5b9b33de4d5ee571a4bcbe8bf5256225f1f85316bdcefe48\", Status:0x0, Pid:\"a037348ddcb4474b621abb6b7bc79f40fef24cccf80283414cb2d29b812e5115\", Timestamp:(*timestamp.Timestamp)(0xc422008f90)}"
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.767487966Z" level=debug msg="attach: stderr: end"
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.767522495Z" level=debug msg="attach: stdout: end"
Jun 18 07:28:53 controller-0 dockerd-current: time="2018-06-18T07:28:53.767575362Z" level=debug msg="Health check for container 7d2e2c32d5b0524f5b9b33de4d5ee571a4bcbe8bf5256225f1f85316bdcefe48 done (exitCode=0)"
Jun 18 07:28:54 controller-0 haproxy[11]: Connect from 192.168.24.7:48900 to 192.168.24.10:35357 (keystone_admin/HTTP)
Jun 18 07:28:55 controller-0 haproxy[11]: 172.17.1.10:54830 [18/Jun/2018:07:28:54.809] nova_osapi nova_osapi/controller-1.internalapi.localdomain 0/0/0/772/772 200 398 - - ---- 165/1/0/1/0 0/0 "GET /v2.1/servers/detail?all_tenants=True&changes-since=2018-06-18T07%3A23%3A55.066921%2B00%3A00 HTTP/1.1"
Jun 18 07:28:57 controller-0 haproxy[11]: Connect from 192.168.24.10:48472 to 192.168.24.10:35357 (keystone_admin/HTTP)

[root@controller-0 ~]# docker exec -it logrotate_crond bash
()[root@89c18be01506 /]# /usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf

Observed that no SIGHUP was sent to rsyslog on the host; its main PID and start time are unchanged:

[root@controller-0 ~]# systemctl status rsyslog
● rsyslog.service - System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-06-18 07:28:48 UTC; 12min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 940235 (rsyslogd)
    Tasks: 3
   Memory: 3.5M
   CGroup: /system.slice/rsyslog.service
           └─940235 /usr/sbin/rsyslogd -n

Jun 18 07:28:48 controller-0 systemd[1]: Starting System Logging Service...
Jun 18 07:28:48 controller-0 rsyslogd[940235]:  [origin software="rsyslogd" swVersion="8.24.0" x-pid="940235" x-info="http://www.rsyslog....] start
Jun 18 07:28:48 controller-0 systemd[1]: Started System Logging Service.
Hint: Some lines were ellipsized, use -l to show in full.



OVS service basic verification: 

[root@controller-0 openvswitch]# date
Mon Jun 18 06:54:01 UTC 2018

[root@controller-0 openvswitch]# ll /var/log/openvswitch/
total 8
-rw-r-----. 1 openvswitch hugetlbfs    0 Jun 18 03:44 ovsdb-server.log
-rw-r-----. 1 openvswitch hugetlbfs  290 Jun 17 11:28 ovsdb-server.log-20180618.gz
-rw-r-----. 1 openvswitch hugetlbfs    0 Jun 18 03:44 ovs-vswitchd.log
-rw-r-----. 1 openvswitch hugetlbfs 1675 Jun 17 14:59 ovs-vswitchd.log-20180618.gz

[root@controller-0 openvswitch]# ll /var/run/openvswitch
total 12
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-ex.mgmt
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 11:29 br-ex.snoop
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-int.mgmt
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-int.snoop
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-isolated.mgmt
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 11:29 br-isolated.snoop
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-tun.mgmt
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 12:10 br-tun.snoop
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 11:28 db.sock
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 11:28 ovsdb-server.873.ctl
-rw-r--r--. 1 openvswitch hugetlbfs  4 Jun 17 11:28 ovsdb-server.pid
srwxr-x---. 1 openvswitch hugetlbfs  0 Jun 17 11:28 ovs-vswitchd.1031.ctl
-rw-r--r--. 1 openvswitch hugetlbfs  5 Jun 17 11:28 ovs-vswitchd.pid
-rw-r--r--. 1 root        root      41 Jun 17 11:28 useropts
[root@controller-0 openvswitch]#  ovs-vsctl show
3a5e8479-a2e9-4c05-8e45-a7038737083a
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port int-br-isolated
            Interface int-br-isolated
                type: patch
                options: {peer=phy-br-isolated}
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port br-int
            Interface br-int
                type: internal
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "eth2"
            Interface "eth2"
        Port br-ex
            Interface br-ex
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "vxlan-ac11020c"
            Interface "vxlan-ac11020c"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.18", out_key=flow, remote_ip="172.17.2.12"}
        Port br-tun
            Interface br-tun
                type: internal
        Port "vxlan-ac110215"
            Interface "vxlan-ac110215"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.18", out_key=flow, remote_ip="172.17.2.21"}
        Port "vxlan-ac11020d"
            Interface "vxlan-ac11020d"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.18", out_key=flow, remote_ip="172.17.2.13"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-ac11020b"
            Interface "vxlan-ac11020b"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.17.2.18", out_key=flow, remote_ip="172.17.2.11"}
    Bridge br-isolated
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-isolated
            Interface br-isolated
                type: internal
        Port "vlan50"
            tag: 50
            Interface "vlan50"
                type: internal
        Port "vlan40"
            tag: 40
            Interface "vlan40"
                type: internal
        Port "vlan20"
            tag: 20
            Interface "vlan20"
                type: internal
        Port "vlan30"
            tag: 30
            Interface "vlan30"
                type: internal
        Port phy-br-isolated
            Interface phy-br-isolated
                type: patch
                options: {peer=int-br-isolated}
        Port "eth1"
            Interface "eth1"
    ovs_version: "2.9.0"

Comment 22 errata-xmlrpc 2018-06-27 13:57:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

Comment 23 Bogdan Dobrelya 2018-08-14 10:13:59 UTC
Please see the regression described in the linked upstream bug https://bugs.launchpad.net/tripleo/+bug/1776533/comments/12

