Bug 1564179
| Field | Value |
| --- | --- |
| Summary | first master's docker-daemon has higher debug level than the rest of the cluster |
| Product | OpenShift Container Platform |
| Component | Installer |
| Version | 3.9.0 |
| Target Release | 3.9.z |
| Target Milestone | --- |
| Status | CLOSED ERRATA |
| Type | Bug |
| Reporter | Nicholas Schuetz <nick> |
| Assignee | Jay Boyd <jaboyd> |
| QA Contact | Weihua Meng <wmeng> |
| CC | aos-bugs, jokerman, mmccomas, nick, smunilla, wsun |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Flags | wmeng: needinfo- |
| Doc Type | Bug Fix |
| Regression | --- |
| Story Points | --- |
| Last Closed | 2018-05-17 06:43:35 UTC |

Doc Text:

> Cause: Service Catalog pods had a high log verbosity set by default.
> Consequence: Service Catalog pods on the master node produced a large amount of log data.
> Fix: The default log verbosity was reset to a lower level.
Description (Nicholas Schuetz, 2018-04-05 14:57:06 UTC)
It's worth noting that this occurs whether I use the first master as the install host or not. I get the same result when using an external bastion host (not a cluster member node). Also, I use 10 GB root volume sizes, so it's possible that there is a weekly job that "vacuums" the journal and docker is filling it up before that can occur (within a couple of days). This will likely be an issue on AWS/cloud installs, where the default root disk is also 10 GB. The journal is filling up at a rate of 528.0 MB per day. Again, this is default behavior, and I have to add a script in /etc/cron.daily to keep it from causing master01 to fail after a few days.

---

Have you been able to confirm whether or not this is triggered by the installer? I don't see any code in 3.9 that would trigger docker logging to be set to debug level. Which specific option is set incorrectly? Can you provide your inventory so we have a proposed reproducer?

---

Hosts file used for deployment below.

```ini
[OSEv3:children]
masters
nodes
new_nodes
etcd
lb
glusterfs

## Set variables common for all OSEv3 hosts
[OSEv3:vars]
openshift_deployment_type=openshift-enterprise
#openshift_deployment_type=origin
#containerized=true

## internal image repos
##openshift_additional_repos=[{'id': 'ose-devel', 'name': 'rhaos-3.9', 'baseurl': 'http://repo.home.nicknach.net/repo/rhaos-3.9', 'enabled': 1, 'gpgcheck': 0}]
openshift_docker_additional_registries=repo.home.nicknach.net
openshift_docker_insecure_registries=repo.home.nicknach.net
openshift_docker_blocked_registries=registry.access.redhat.com,docker.io
oreg_url=repo.home.nicknach.net/openshift3/ose-${component}:${version}
openshift_examples_modify_imagestreams=true
openshift_metrics_image_prefix=repo.home.nicknach.net/openshift3/
openshift_metrics_image_version=v3.9.14
openshift_logging_image_prefix=repo.home.nicknach.net/openshift3/
openshift_logging_image_version=v3.9.14
ansible_service_broker_image_prefix=repo.home.nicknach.net/openshift3/ose-
ansible_service_broker_image_tag=v3.9.14
ansible_service_broker_etcd_image_prefix=repo.home.nicknach.net/rhel7/
ansible_service_broker_etcd_image_tag=latest
openshift_service_catalog_image_prefix=repo.home.nicknach.net/openshift3/ose-
openshift_service_catalog_image_version=v3.9.14
openshift_cockpit_deployer_prefix=repo.home.nicknach.net/openshift3/
openshift_web_console_prefix=repo.home.nicknach.net/openshift3/ose-
openshift_web_console_version=v3.9.14
openshift_prometheus_image_prefix=repo.home.nicknach.net/openshift3/
openshift_prometheus_image_version=v3.9.14
openshift_prometheus_alertmanager_image_prefix=repo.home.nicknach.net/openshift3/
openshift_prometheus_alertmanager_image_version=v3.9.14
openshift_prometheus_alertbuffer_image_prefix=repo.home.nicknach.net/openshift3/
openshift_prometheus_alertbuffer_image_version=v3.9.14
openshift_prometheus_node_exporter_image_prefix=repo.home.nicknach.net/openshift3/
openshift_prometheus_node_exporter_image_version=v3.9.14
openshift_prometheus_proxy_image_prefix=repo.home.nicknach.net/openshift3/
openshift_prometheus_proxy_image_version=v3.9.14
template_service_broker_prefix=repo.home.nicknach.net/openshift3/ose-
template_service_broker_version=v3.9.14
openshift_storage_glusterfs_image=repo.home.nicknach.net/rhgs3/rhgs-server-rhel7
openshift_storage_glusterfs_version=latest
openshift_storage_glusterfs_heketi_image=repo.home.nicknach.net/rhgs3/rhgs-volmanager-rhel7
openshift_storage_glusterfs_heketi_version=latest

# release ver
#openshift_release=v3.9.14
#openshift_image_tag=v3.9.14

## enable ntp
#openshift_clock_enabled=false

## disable template imports
#openshift_install_examples=false

## If ansible_ssh_user is not root, ansible_sudo must be set to true
ansible_ssh_user=root
#ansible_ssh_user=cloud-user
#ansible_sudo=true
#ansible_become=yes

## authentication stuff
## htpasswd file auth
#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
#openshift_master_htpasswd_users={'ocpuser':'welcome1'}

## ldap auth (AD)
#openshift_master_identity_providers=[{"name":"NNWIN","challenge":true,"login":true,"kind":"LDAPPasswordIdentityProvider","attributes":{"id":["dn"],"email":["mail"],"name":["cn"],"preferredUsername":["sAMAccountName"]},"bindDN":"CN=SVC-nn-ose,OU=SVC,OU=FNA,DC=nnwin,DC=ad,DC=nncorp,DC=com","bindPassword":"<REDACTED>","insecure":true,"url":"ldap://uswin.nicknach.com:389/DC=uswin,DC=ad,DC=nncorp,DC=com?sAMAccountName?sub"}]
#openshift_master_ldap_ca_file=/etc/ssl/certs/NNWINDC_Cert_Chain.pem

## ldap auth (IPA)
openshift_master_identity_providers=[{"name":"myipa","challenge":true,"login":true,"kind":"LDAPPasswordIdentityProvider","attributes":{"id":["dn"],"email":["mail"],"name":["cn"],"preferredUsername":["uid"]},"bindDN":"","bindPassword":"","ca":"my-ldap-ca-bundle.crt","insecure":false,"url":"ldap://gw.home.nicknach.net/cn=users,cn=accounts,dc=home,dc=nicknach,dc=net?uid"}]
openshift_master_ldap_ca_file=~/my-ldap-ca-bundle.crt

#openshift_master_named_certificates=[{"certfile": "/etc/origin/master/ocp.nicknach.net.crt", "keyfile": "/etc/origin/master/ocp.nicknach.net.key", "names": ["console.ocp.nicknach.net"]}]
#openshift_master_overwrite_named_certificates=false

## registry on nfs
openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_host=storage.home.nicknach.net
openshift_hosted_registry_storage_nfs_directory=/data/openshift/enterprise
#openshift_hosted_registry_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_hosted_registry_storage_volume_name=docker-registry
openshift_hosted_registry_storage_volume_size=20Gi

# etcd on nfs
openshift_hosted_etcd_storage_kind=nfs
openshift_hosted_etcd_storage_access_modes=["ReadWriteOnce"]
openshift_hosted_etcd_storage_host=storage.home.nicknach.net
openshift_hosted_etcd_storage_nfs_directory=/data/openshift/enterprise
#openshift_hosted_etcd_storage_nfs_options="*(rw,root_squash,sync,no_wdelay)"
openshift_hosted_etcd_storage_volume_name=etcd
openshift_hosted_etcd_storage_volume_size=1Gi
openshift_hosted_etcd_storage_labels={'storage':'etcd'}

# logging on nfs
openshift_logging_install_logging=true
openshift_logging_storage_kind=nfs
openshift_logging_storage_access_modes=['ReadWriteOnce']
openshift_logging_storage_host=storage.home.nicknach.net
openshift_logging_storage_nfs_directory=/data/openshift/enterprise
#openshift_logging_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_logging_storage_volume_name=logging
openshift_logging_storage_volume_size=10Gi
openshift_logging_storage_labels={'storage':'logging'}
openshift_logging_es_pv_selector=region=infra

# metrics on nfs
openshift_metrics_install_metrics=true
openshift_metrics_storage_kind=nfs
openshift_metrics_storage_access_modes=['ReadWriteOnce']
openshift_metrics_storage_host=storage.home.nicknach.net
openshift_metrics_storage_nfs_directory=/data/openshift/enterprise
#openshift_metrics_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_metrics_storage_volume_name=metrics
openshift_metrics_storage_volume_size=15Gi
openshift_metrics_storage_labels={'storage':'metrics'}
openshift_metrics_hawkular_nodeselector={'region':'infra'}
openshift_metrics_heapster_nodeselector={'region':'infra'}
openshift_metrics_cassandra_nodeselector={'region':'infra'}

# prometheus on nfs
openshift_hosted_prometheus_deploy=true
openshift_prometheus_storage_kind=nfs
openshift_prometheus_storage_access_modes=['ReadWriteOnce']
openshift_prometheus_storage_host=storage.home.nicknach.net
openshift_prometheus_storage_nfs_directory=/data/openshift/enterprise
#openshift_prometheus_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_prometheus_storage_volume_name=prometheus
openshift_prometheus_storage_volume_size=7Gi
openshift_prometheus_storage_labels={'storage':'prometheus'}
openshift_prometheus_node_selector={'region':'infra'}
openshift_prometheus_storage_type='pvc'

# For prometheus-alertmanager
openshift_prometheus_alertmanager_storage_kind=nfs
openshift_prometheus_alertmanager_storage_access_modes=['ReadWriteOnce']
openshift_prometheus_alertmanager_storage_host=storage.home.nicknach.net
openshift_prometheus_alertmanager_storage_nfs_directory=/data/openshift/enterprise
#openshift_prometheus_alertmanager_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_prometheus_alertmanager_storage_volume_name=prometheus-alertmanager
openshift_prometheus_alertmanager_storage_volume_size=6Gi
openshift_prometheus_alertmanager_storage_labels={'storage':'prometheus-alertmanager'}
openshift_prometheus_alertmanager_storage_type='pvc'

# For prometheus-alertbuffer
openshift_prometheus_alertbuffer_storage_kind=nfs
openshift_prometheus_alertbuffer_storage_access_modes=['ReadWriteOnce']
openshift_prometheus_alertbuffer_storage_host=storage.home.nicknach.net
openshift_prometheus_alertbuffer_storage_nfs_directory=/data/openshift/enterprise
#openshift_prometheus_alertbuffer_storage_nfs_options='*(rw,root_squash,sync,no_wdelay)'
openshift_prometheus_alertbuffer_storage_volume_name=prometheus-alertbuffer
openshift_prometheus_alertbuffer_storage_volume_size=5Gi
openshift_prometheus_alertbuffer_storage_labels={'storage':'prometheus-alertbuffer'}
openshift_prometheus_alertbuffer_storage_type='pvc'

# disable checks
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability,package_availability,package_version

# cluster stuff (uncomment for multi-master mode)
openshift_master_cluster_method=native
openshift_master_cluster_hostname=api.ocp.nicknach.net
openshift_master_cluster_public_hostname=console.ocp.nicknach.net

## cns
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
#openshift_hosted_registry_storage_kind=glusterfs
#openshift_metrics_install_metrics=true
#openshift_metrics_storage_kind=dynamic
#openshift_logging_es_pvc_size=10Gi
#openshift_logging_install_logging=true
#openshift_logging_storage_kind=dynamic
#openshift_storage_glusterfs_block_deploy=true
#openshift_storage_glusterfs_registry_namespace=infra-storage
#openshift_storage_glusterfs_registry_storageclass=false
#openshift_storage_glusterfs_registry_block_deploy=true
#openshift_storage_glusterfs_registry_block_host_vol_size=50
#openshift_storage_glusterfs_registry_block_storageclass=true
#openshift_storage_glusterfs_registry_block_storageclass_default=true
#openshift_storageclass_default=false

## cloud provider configs
## AWS
#openshift_cloudprovider_kind=aws
#openshift_cloudprovider_aws_access_key=
#openshift_cloudprovider_aws_secret_key=
## GCE
#openshift_cloudprovider_kind=gce
## Openstack
#openshift_cloudprovider_kind=openstack
#openshift_cloudprovider_openstack_auth_url=https://controller.home.nicknach.com:35357/v2.0
#openshift_cloudprovider_openstack_username=svc-openshift-np
#openshift_cloudprovider_openstack_password=kX7mE10dkX7mE10d
#openshift_cloudprovider_openstack_tenant_id=f741ba7204ec47c9886c050891dd592e
#openshift_cloudprovider_openstack_tenant_name=nn-dev
#openshift_cloudprovider_openstack_region=RegionOne
#openshift_cloudprovider_openstack_lb_subnet_id=d7c61f2a-d591-461d-af28-308ade046c0d

## set the router region
#openshift_hosted_manage_router=true
#openshift_hosted_router_selector=region=infra

## domain stuff
openshift_master_default_subdomain=apps.ocp.nicknach.net

## network stuff
#os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

# set these if you are behind a proxy
#openshift_http_proxy=http://192.168.0.254:3128
#openshift_https_proxy=http://192.168.0.254:3128
#openshift_no_proxy=

## use these if there is a conflict with the docker bridge and/or SDN networks
#osm_cluster_network_cidr=10.129.0.0/14
#openshift_portal_net=172.31.0.0/16

## use these if you want to switch the console/api port to something other than 8443
#openshift_master_public_api_url=https://api.ocp.nicknach.net:443
#openshift_master_public_console_url=https://console.ocp.nicknach.net:443/console
#openshift_master_api_port=443
#openshift_master_console_port=443

## adjust max pods for scale testing
#openshift_node_kubelet_args={'pods-per-core': ['15'], 'max-pods': ['500'], 'image-gc-high-threshold': ['85'], 'image-gc-low-threshold': ['80']}

## adjust scheduler
#osm_controller_args={'node-monitor-period': ['2s'], 'node-monitor-grace-period': ['16s'], 'pod-eviction-timeout': ['30s']}
#osm_controller_args={'resource-quota-sync-period': ['10s']}

## load balancer
[lb]
lb.ocp.nicknach.net

## host group for etcd (uncomment for multi-master)
[etcd]
master01.ocp.nicknach.net
master02.ocp.nicknach.net
master03.ocp.nicknach.net

## host group for masters
[masters]
master01.ocp.nicknach.net
master02.ocp.nicknach.net
master03.ocp.nicknach.net

[nodes]
master01.ocp.nicknach.net openshift_node_labels="{'region': 'masters', 'zone': 'a', 'role': 'master'}" openshift_schedulable=true
master02.ocp.nicknach.net openshift_node_labels="{'region': 'masters', 'zone': 'a', 'role': 'master'}" openshift_schedulable=true
master03.ocp.nicknach.net openshift_node_labels="{'region': 'masters', 'zone': 'a', 'role': 'master'}" openshift_schedulable=true
infra01.ocp.nicknach.net openshift_node_labels="{'region': 'infra', 'zone': 'a', 'role': 'infra'}" openshift_schedulable=true
infra02.ocp.nicknach.net openshift_node_labels="{'region': 'infra', 'zone': 'a', 'role': 'infra'}" openshift_schedulable=true
infra03.ocp.nicknach.net openshift_node_labels="{'region': 'infra', 'zone': 'a', 'role': 'infra'}" openshift_schedulable=true
node01.ocp.nicknach.net openshift_node_labels="{'region': 'primary', 'zone': 'a', 'role': 'compute'}" openshift_schedulable=true
node02.ocp.nicknach.net openshift_node_labels="{'region': 'primary', 'zone': 'a', 'role': 'compute'}" openshift_schedulable=true
node03.ocp.nicknach.net openshift_node_labels="{'region': 'primary', 'zone': 'a', 'role': 'compute'}" openshift_schedulable=true

## if using gluster (Container Native Storage)
[glusterfs]
node01.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'
node02.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'
node03.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'

#[glusterfs_registry]
#infra01.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'
#infra02.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'
#infra03.ocp.nicknach.net glusterfs_devices='[ "/dev/vdc" ]'

[new_nodes]
## hold for use when adding new nodes
```

---

Created attachment 1419442 [details]
dockerd-current logs on master01

Here is a sample of the excessive logs being displayed on master01.
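At the reported growth rate of 528.0 MB per day, back-of-envelope arithmetic shows why a 10 GB root volume can fail within days. A minimal sketch, assuming (hypothetically) that about 2 GiB of the root volume remains free for the journal after the OS and container images:

```python
# Rough estimate of how long master01's root volume survives the
# reported journal growth. The 528.0 MB/day rate is from this report;
# the 2 GiB free-space figure is an assumption for illustration only.
FILL_RATE_MB_PER_DAY = 528.0
ASSUMED_FREE_MB = 2 * 1024  # hypothetical free space on the 10 GB root volume

days_until_full = ASSUMED_FREE_MB / FILL_RATE_MB_PER_DAY
print(f"root volume full in about {days_until_full:.1f} days")  # about 3.9 days
```

Under that assumption the volume fills in under four days, consistent with the reporter's observation that master01 fails "after a few days" without a daily cleanup job.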
Ah ha! The apiserver (deployed to master01) appears to be the culprit:

```
kube-service-catalog   apiserver-lc6gd   1/1   Running   1   4d   10.129.0.9   master01.ocp.nicknach.net
```

---

This was fixed in master here:
https://github.com/openshift/openshift-ansible/pull/7681/files#diff-f5c4b4675369f72d180a86be3772fe87R43

Needs a backport of at least the verbosity log changes. Merged on April 12.

---

Fixed.

---

openshift-ansible-3.9.22-1.git.0.2e15102.el7.noarch.rpm

```
# oc describe pod apiserver-jc96t
Command:
  /usr/bin/service-catalog
Args:
  apiserver
  --storage-type etcd
  --secure-port 6443
  --etcd-servers https://qe-wmengrpm39-master-etcd-1:2379
  --etcd-cafile /etc/origin/master/master.etcd-ca.crt
  --etcd-certfile /etc/origin/master/master.etcd-client.crt
  --etcd-keyfile /etc/origin/master/master.etcd-client.key
  -v 3

# oc describe pod controller-manager-r9nnt
Command:
  /usr/bin/service-catalog
Args:
  controller-manager
  --port 8080
  -v 3
```

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1566
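For reference, the daily cleanup script the reporter describes adding to /etc/cron.daily could look like the following (a sketch only; the filename and the 500M size cap are assumptions, not the reporter's actual script):

```shell
#!/bin/bash
# Hypothetical /etc/cron.daily/journal-vacuum: cap the persistent
# journal so verbose docker logging cannot fill the root volume
# between the less frequent built-in cleanups.
journalctl --vacuum-size=500M
```

Note that `journalctl --vacuum-size` removes archived journal files until the total falls under the given size; it does not truncate the currently active journal file, so actual usage can sit somewhat above the cap.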