1439910 – Large partitions make Cassandra unstable and cause requests to fail in Hawkular Metric

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Hawkular
Sub Component:
Version:	3.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.4.z
Assignee:	Matt Wringe
QA Contact:	Liming Zhou
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1440548
Depends On:	1422271 1439912
Blocks:	1439852

Reported:	2017-04-06 20:00 UTC by John Sanda
Modified:	2023-09-14 03:56 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1422271
Environment:
Last Closed:	2017-05-18 09:27:59 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	HWKMETRICS-606	0	Major	Closed	Large (> 100 MB) partitions in metrics_idx table make Cassandra unstable	2020-05-21 16:00:11 UTC
Red Hat Product Errata	RHBA-2017:1235	0	normal	SHIPPED_LIVE	OpenShift Container Platform 3.5, 3.4, 3.3, and 3.1 bug fix update	2017-05-18 13:15:52 UTC

Comment 3 Matt Wringe 2017-04-27 20:35:20 UTC

*** Bug 1440548 has been marked as a duplicate of this bug. ***

Comment 6 Liming Zhou 2017-05-10 08:53:22 UTC

@mwringe,

This bug is similar to bug 1439912, which covers a different OCP version. In that bug, @juzhao asked whether the following scenario is sufficient to verify the fix:
###
Thanks a lot. I see compaction_large_partition_warning_threshold_mb=100 in the hawkular-cassandra pod log, so I think we can verify this fix with the following steps:

1. Create a lot of projects to consume memory, CPU and network resources, so that data is written to Cassandra partitions.

2. Check the hawkular-cassandra and hawkular-metrics pod logs, and make sure there are no warnings such as:
"WARN  18:29:53 Writing large partition hawkular_metrics/metrics_idx:ops-health-monitoring:2 (****** bytes)"

Do you think this approach is good enough to verify this defect?
###
So my question is the same: are the above steps OK to verify this bug?

Thanks,
lizhou

Comment 7 Matt Wringe 2017-05-10 18:43:16 UTC

@jsanda: can you provide a test case which can be used to verify this is fixed?

Comment 10 Junqi Zhao 2017-05-16 05:49:25 UTC

Vlaad(vlaad) created 6500 pods and deleted them under one project. I then checked the hawkular-cassandra and hawkular-metrics pod logs, and there were no warnings such as:
"WARN  18:29:53 Writing large partition hawkular_metrics/metrics_idx:ops-health-monitoring:2 (****** bytes)"

Set it to VERIFIED
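
The log check used for verification above could be automated along these lines. This is only a minimal sketch, not part of the fix or of any Hawkular tooling: the function name find_large_partition_warnings and the regular expression are assumptions, and the pattern is modeled solely on the warning line quoted in this bug, so other Cassandra versions may format the message differently.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed pattern, modeled on the warning quoted in this bug:
#   Writing large partition <keyspace>/<table>:<partition key> (<N> bytes)
# A line whose size is masked (for example "****** bytes") will not match.
LARGE_PARTITION_RE = re.compile(
    r"Writing large partition "
    r"(?P<keyspace>[^/\s]+)/(?P<table>[^:\s]+):(?P<key>\S+) "
    r"\((?P<size>\d+) bytes\)"
)


class LargePartitionWarning(NamedTuple):
    keyspace: str
    table: str
    key: str
    size_bytes: int


def find_large_partition_warnings(lines: Iterable[str]) -> List[LargePartitionWarning]:
    """Return one record for every log line that reports a large partition."""
    found = []
    for line in lines:
        match = LARGE_PARTITION_RE.search(line)
        if match:
            found.append(
                LargePartitionWarning(
                    keyspace=match.group("keyspace"),
                    table=match.group("table"),
                    key=match.group("key"),
                    size_bytes=int(match.group("size")),
                )
            )
    return found


# Hypothetical sample input; the byte count is made up for illustration.
sample_log = [
    "WARN  18:29:53 Writing large partition "
    "hawkular_metrics/metrics_idx:ops-health-monitoring:2 (104857601 bytes)",
    "INFO  18:29:54 unrelated log line",
]
for warning in find_large_partition_warnings(sample_log):
    print(warning.keyspace, warning.table, warning.key, warning.size_bytes)
```

In practice the lines would come from the pod logs (for example, the output of oc logs for the hawkular-cassandra pod saved to a file and read line by line). An empty result would correspond to the "no such warning" condition used to set this bug to VERIFIED.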

Comment 12 errata-xmlrpc 2017-05-18 09:27:59 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1235

Comment 13 Red Hat Bugzilla 2023-09-14 03:56:05 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
