Description of problem:
openshift-ansible installs docker with the default log driver, json-file, but does not limit the log file size. I found that the master-controllers container keeps writing logs; the log file grew to 16 GiB and filled the disk. openshift-ansible should configure the "max-size" option in log-opts to limit the log file size.

Version-Release number of the following components:
v3.9/v3.10

How reproducible:
Always

Steps to Reproduce:
1) Deploy OCP v3.10.5
2) Check the docker configuration file
# cat /etc/sysconfig/docker
OPTIONS=' --selinux-enabled --signature-verification=False'
if [ -z "${DOCKER_CERT_PATH}" ]; then
    DOCKER_CERT_PATH=/etc/docker
fi
3) After running for several days, check the log file
[root@preserve-upgrader6-master-etcd-1 containers]# ls -alh
lrwxrwxrwx. 1 root root 64 Jun 22 09:14 master-controllers-preserve-upgrader6-master-etcd-1_kube-system_controllers-8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b.log -> /var/log/pods/f8a3ee7782c553ccbfe2ee36e97d899a/controllers/1.log
[root@preserve-upgrader6-master-etcd-1 controllers]# ls -alh
lrwxrwxrwx. 1 root root 165 Jun 22 09:14 1.log -> /var/lib/docker/containers/8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b/8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b-json.log
[root@preserve-upgrader6-master-etcd-1 containers]# ls -alh 8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b/8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b-json.log
-rw-r-----. 1 root root 16G Jun 24 23:27 8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b/8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b-json.log
# ls -alh
total 16G
drwx------. 4 root root 166 Jun 24 23:27 .
drwx------. 15 root root 4.0K Jun 25 06:20 ..
-rw-r-----. 1 root root 16G Jun 24 23:27 8e06a362c8ca650eed5b3995823f512e5bb9dfe35c873979f4d04b449d2c267b-json.log
drwx------. 2 root root 6 Jun 22 09:14 checkpoints
-rw-r--r--. 1 root root 7.1K Jun 24 23:27 config.v2.json
-rw-r--r--. 1 root root 1.8K Jun 24 23:27 hostconfig.json
drwxr-xr-x. 3 root root 18 Jun 22 09:14 secrets

Actual results:
The disk is full because the container log file is too big.

Expected results:
The log size is limited.
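For anyone checking other nodes for the same symptom, the manual ls steps above can be approximated with a short script. This is only a sketch: the function name find_large_json_logs and the 1 GiB threshold are made up for illustration, and the search root is the /var/lib/docker/containers path seen in the listing.

```python
import os
from typing import Iterator, Tuple


def find_large_json_logs(root: str, limit_bytes: int) -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for every *-json.log under root larger than limit_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # docker's json-file driver names its log <container-id>-json.log
            if not name.endswith("-json.log"):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The container may have been removed while we were scanning.
                continue
            if size > limit_bytes:
                yield path, size


if __name__ == "__main__":
    # Assumption: 1 GiB is an arbitrary threshold chosen for this example.
    for path, size in find_large_json_logs("/var/lib/docker/containers", 1 << 30):
        print(f"{size / (1 << 30):.1f} GiB  {path}")
```

Reading that directory normally requires root, as in the listing.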
I'm unsure what the right thing to do here is, so I'm deferring to the logging team. If you need help implementing a change, please let me know.
Scott - which team usually handles changes to docker configuration options in openshift-ansible? It isn't the logging team.
Being fixed in https://github.com/openshift/openshift-ansible/pull/8985.
Until the fix is merged, there is a workaround for this problem. When running the installer, add the -e flag and set openshift_docker_log_options like this: -e openshift_docker_log_options='max-size=50m'
The workaround works well, but the fix has not been merged yet.
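To make the effect of the workaround concrete, here is a rough sketch of how such a value could end up as docker daemon flags. It is not the installer's code: the helper name log_opt_flags and the comma-separated input form are assumptions for illustration. Only the --log-opt flag and the max-size and max-file options of the json-file driver are real docker settings.

```python
from typing import List, Sequence, Union


def log_opt_flags(log_options: Union[str, Sequence[str]]) -> List[str]:
    """Turn 'max-size=50m,max-file=3' or ['max-size=50m'] into --log-opt flags."""
    if isinstance(log_options, str):
        # Assumption: a plain string holds comma-separated key=value pairs.
        items = [item.strip() for item in log_options.split(",")]
    else:
        items = [str(item).strip() for item in log_options]
    flags: List[str] = []
    for item in items:
        if not item:
            continue  # skip empty entries such as a trailing comma
        flags += ["--log-opt", item]
    return flags


print(" ".join(log_opt_flags("max-size=50m")))  # prints: --log-opt max-size=50m
```

With a limit like this in place, docker rotates the json log once it reaches the configured size instead of letting it grow without bound.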
Anping, the fix got merged upstream. It is being fixed in the 3.10 branch in https://github.com/openshift/openshift-ansible/pull/9197.
The fix is now in the 3.10 branch as well.
The code is in openshift-ansible-roles-3.10.18-1.git.314.cfe4f91.el7.noarch, but it did not add a default max-size to /etc/sysconfig/docker.
Anping, the max-size option only takes effect when the log-driver is set to "json", so to see it applied you also need to change the log-driver. If your default log-driver is "journald", the log size should be managed automatically, since journald has that functionality built in.
Anping, could you retest based on Urvashi's comments?
openshift-ansible:v3.10.34

[root@qe-anli310master-etcd-zone1-1 ~]# docker info |grep Driver
WARNING: You're not using the default seccomp profile
Storage Driver: overlay2
Logging Driver: json-file
Cgroup Driver: systemd
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
[root@qe-anli310master-etcd-zone1-1 ~]# cat /etc/sysconfig/docker |grep OPTIONS
OPTIONS=' --selinux-enabled --signature-verification=False'
Anping, thanks for the info. The code only set max-size when the driver was "json"; I am fixing it to check for "json-file" in https://github.com/openshift/openshift-ansible/pull/9729. The fix should be in 3.10 soon as well.
Anping, the fix just got merged in 3.10 - https://github.com/openshift/openshift-ansible/pull/9735. Please test it out again.
@Anping, can you please test the fix out and verify? Thanks!
There is still no max-size in /etc/sysconfig/docker with the patch.
@Anping, could I please get the inventory file you are using for the test? I want to walk through it to reproduce the issue.
@Anping, could you see what the value of openshift_docker_log_driver is for you? It looks like the log-driver value is not being set and that is why the installer doesn't pick up the log options. Are you explicitly setting openshift_docker_log_driver to json-file in your inventory?
Neither worked. It seems there is no openshift_docker_log_driver variable for the openshift_docker_log_options_defaults lookup in container_runtime/defaults/main.yml:

openshift_docker_log_options_defaults:
  json-file:
  - "max-size=50m"
openshift_docker_log_options: "{{ openshift_docker_log_options_defaults[openshift_docker_log_driver] | default([]) }}"
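The lookup in those defaults behaves like a dictionary access with a fallback. A Python analogue, offered only as a sketch (the names LOG_OPTIONS_DEFAULTS and default_log_options are invented here, and Jinja2's handling of undefined variables is approximated with None), shows why nothing is written unless the driver value is exactly "json-file":

```python
from typing import Dict, List, Optional

# Mirrors the defaults quoted above; only json-file has an entry.
LOG_OPTIONS_DEFAULTS: Dict[str, List[str]] = {"json-file": ["max-size=50m"]}


def default_log_options(log_driver: Optional[str]) -> List[str]:
    """Rough analogue of `defaults[driver] | default([])`."""
    if log_driver is None:
        # Assumption: an unset driver is modelled as None in this sketch.
        return []
    return list(LOG_OPTIONS_DEFAULTS.get(log_driver, []))


print(default_log_options("json-file"))  # ['max-size=50m']
print(default_log_options("json"))       # []
print(default_log_options(None))         # []
```

An unset driver, or a near miss such as "json", falls through to the empty list, so no max-size reaches /etc/sysconfig/docker.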
@Anping, it is there at https://github.com/openshift/openshift-ansible/blob/master/roles/container_runtime/defaults/main.yml#L19. Can you set that in your inventory to json-file and see if it picks up the max-size?
It works when I set openshift_docker_log_driver=json-file in the inventory file.
@Anping, it works as intended: when you set the log-driver in the inventory to json-file, the installer picks up the log-options for it, which include the max-size setting. Were you expecting different behavior?
Yes, it would be great if the playbook could detect the current docker log driver and set the correct max-size.
@Anping, I double-checked, and the installer is not supposed to inspect the system to check what the driver is. If you set the driver in your inventory, the installer will know whether to set the max-size or not.
@Urvashi, let me check whether there are default values.
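Since the installer does not inspect the host, the driver has to be read by hand before it goes into the inventory. A minimal sketch that pulls it out of `docker info` text like the output pasted earlier in this report; the function name logging_driver is invented for this example, and the sample is trimmed from that output:

```python
import re
from typing import Optional

# Matches the "Logging Driver: <name>" line printed by `docker info`.
_LOGGING_DRIVER = re.compile(r"^\s*Logging Driver:\s*(\S+)\s*$", re.MULTILINE)


def logging_driver(docker_info_text: str) -> Optional[str]:
    """Return the logging driver named in `docker info` output, or None."""
    match = _LOGGING_DRIVER.search(docker_info_text)
    return match.group(1) if match else None


sample = """\
Storage Driver: overlay2
Logging Driver: json-file
Cgroup Driver: systemd
"""
# Prints the inventory line an operator would add for this host.
print(f"openshift_docker_log_driver={logging_driver(sample)}")
```

For the sample this prints openshift_docker_log_driver=json-file, which is the setting that made max-size appear in the test above.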
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0405