Bug 1798617 - collectd coredump on a ceph/compute node filling up the / filesystem
Summary: collectd coredump on a ceph/compute node filling up the / filesystem
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: collectd
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: z13
Target Release: ---
Assignee: Ryan McCabe
QA Contact: Leonid Natapov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-02-05 16:55 UTC by David Hill
Modified: 2023-09-07 21:46 UTC (History)
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-11 13:28:57 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Issue Tracker OSP-28332 (last updated 2023-09-07 21:46:54 UTC)
Red Hat Knowledge Base (Solution) 4855731 (last updated 2020-03-02 18:43:46 UTC)

Description David Hill 2020-02-05 16:55:57 UTC
Description of problem:
collectd coredumped on a ceph/compute node, filling up the / filesystem. We have the coredumps from when it happened, as well as a sosreport for that compute node.

The container's healthcheck log shows the probe repeatedly failing to reach the collectd daemon:
                "Log": [
                    {
                        "Start": "2020-02-04T13:57:42.139076467-08:00",
                        "End": "2020-02-04T13:57:42.73061108-08:00",
                        "ExitCode": 1,
                        "Output": "ERROR: Failed to connect to daemon at unix:/var/run/collectd-socket: Connection refused.\n"
                    },
                    {
                        "Start": "2020-02-04T13:58:12.73168984-08:00",
                        "End": "2020-02-04T13:58:13.349369442-08:00",
                        "ExitCode": 1,
                        "Output": "ERROR: Failed to connect to daemon at unix:/var/run/collectd-socket: Connection refused.\n"
                    },
                    {
                        "Start": "2020-02-04T13:58:43.349623844-08:00",
                        "End": "2020-02-04T13:58:43.922973732-08:00",
                        "ExitCode": 1,
                        "Output": "ERROR: Failed to connect to daemon at unix:/var/run/collectd-socket: Connection refused.\n"
                    },
                    {
                        "Start": "2020-02-04T13:59:13.923192814-08:00",
                        "End": "2020-02-04T13:59:14.532747201-08:00",
                        "ExitCode": 1,
                        "Output": "ERROR: Failed to connect to daemon at unix:/var/run/collectd-socket: Connection refused.\n"
                    },
                    {
                        "Start": "2020-02-04T13:59:44.533008885-08:00",
                        "End": "2020-02-04T13:59:45.070755975-08:00",
                        "ExitCode": 1,
                        "Output": "ERROR: Failed to connect to daemon at unix:/var/run/collectd-socket: Connection refused.\n"
                    }
                ]
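
The repeating "Connection refused" entries mean the collectd unix socket is gone, consistent with the daemon having crashed; docker re-runs the probe roughly every 30 seconds while the container restarts. To confirm what is actually consuming /, a minimal sketch, assuming standard RHEL 7 locations (either abrt or systemd-coredump may own core handling on an OSP 13 node, so both paths are checked):

    # Largest consumers directly under /, staying on one filesystem (-x).
    du -xsh /* 2>/dev/null | sort -h | tail

    # Which handler receives cores, and how much space the usual
    # destinations are using.
    sysctl kernel.core_pattern
    du -sh /var/spool/abrt /var/lib/systemd/coredump 2>/dev/null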

Container image labels for the collectd container:

                "architecture": "x86_64",
                "authoritative-source-url": "registry.access.redhat.com",
                "batch": "20190224.1",
                "build-date": "2019-04-09T13:29:51.238587",
                "com.redhat.build-host": "cpt-0013.osbs.prod.upshift.rdu2.redhat.com",
                "com.redhat.component": "openstack-collectd-container",
                "com.redhat.license_terms": "https://www.redhat.com/licenses/eulas",
                "config_data": "{\"healthcheck\": {\"test\": \"/openstack/healthcheck\"}, \"image\": \"satpol01.mgmt:5000/t-mobile_magentabox-production-composite_openstack_13-osp13_containers-collectd:latest\", \"pid\": \"host\", \"environment\": [\"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS\", \"TRIPLEO_CONFIG_HASH=17cbf15232999c8cd2ddfba815cea138\"], \"user\": \"root\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/kolla/config_files/collectd.json:/var/lib/kolla/config_files/config.json:ro\", \"/var/lib/config-data/puppet-generated/collectd/:/var/lib/kolla/config_files/src:ro\", \"/var/log/containers/collectd:/var/log/collectd:rw\", \"/var/run/openvswitch:/var/run/openvswitch:ro\", \"/var/run/ceph:/var/run/ceph:ro\", \"/var/run/libvirt:/var/run/libvirt:ro\"], \"net\": \"host\", \"privileged\": true, \"restart\": \"always\"}",
                "config_id": "tripleo_step5",
                "container_name": "collectd",
                "description": "Red Hat OpenStack Platform 13.0 collectd",
                "distribution-scope": "public",
                "io.k8s.description": "Red Hat OpenStack Platform 13.0 collectd",
                "io.k8s.display-name": "Red Hat OpenStack Platform 13.0 collectd",
                "io.openshift.tags": "rhosp osp openstack osp-13.0",
                "managed_by": "paunch",
                "name": "rhosp13/openstack-collectd",
                "release": "61.1554788831",
                "summary": "Red Hat OpenStack Platform 13.0 collectd",
                "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/rhosp13/openstack-collectd/images/13.0-61.1554788831",
                "vcs-ref": "33ba785c229d2855db5cd8e83f2303fbf12c450f",
                "vcs-type": "git",
                "vendor": "Red Hat, Inc.",
                "version": "13.0"

Version-Release number of selected component (if applicable):


How reproducible:
Happened once; not reliably reproducible.

Steps to Reproduce:
1. Unknown; collectd coredumped during normal operation.

Actual results:
collectd coredumped and the resulting core files filled up the / filesystem.

Expected results:
collectd should not coredump, and core files should not be able to fill up the / filesystem.

Additional info:
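If systemd-coredump is the core handler on the host (check kernel.core_pattern; abrt is also common on RHEL 7), its disk usage can be capped so that a crashing daemon cannot fill /. An illustrative sketch, with example values only:

    # Illustrative only: limit how much space systemd-coredump may use.
    # coredump.conf is re-read each time a core arrives, so no restart
    # is needed; a repeated [Coredump] section merges with the defaults.
    cat >> /etc/systemd/coredump.conf <<'EOF'
    [Coredump]
    MaxUse=1G
    KeepFree=2G
    EOF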

Comment 5 Matthias Runge 2020-05-06 15:25:45 UTC
Is this still an issue or can we close this? It isn't really reproducible from our side.

Comment 6 David Hill 2020-05-11 12:11:00 UTC
The case is closed, so go ahead and close this BZ too.

Comment 7 Matthias Runge 2020-05-11 13:28:57 UTC
Thank you, will do.

Comment 8 Alexander Stafeyev 2020-11-26 08:13:37 UTC
(In reply to Matthias Runge from comment #5)
> Is this still an issue or can we close this? It isn't really reproducible
> from our side.

Hi Matthias, on which zstream did you try to reproduce?
Thanks

