Bug 1613124 - Cassandra error : org.apache.cassandra.io.FSReadError: java.io.IOException: Channel not open for writing - cannot extend file to required size [NEEDINFO]
Status: CLOSED DEFERRED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.9.0
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.9.z
Assigned To: John Sanda
QA Contact: Junqi Zhao
Reported: 2018-08-07 00:44 EDT by Venkata Tadimarri
Modified: 2018-09-24 13:47 EDT

Last Closed: 2018-09-24 13:47:37 EDT
Type: Bug
jsanda: needinfo? (ktadimar)


Description Venkata Tadimarri 2018-08-07 00:44:07 EDT
Description of problem:

Metrics periodically fail with connection refused errors. Restarting the Cassandra pod fixes the problem, but the same issue recurs after approximately 6 hours, and the pod has to be restarted again to restore metrics.

Errors in the log:

WARN  [CompactionExecutor:3] 2018-08-05 21:37:34,306 BigTableWriter.java:171 - Writing large partition hawkular_metrics/metrics_idx:automation-deploy-1\:9abb8a21-9383-11e8-80d0-001a4a408a0b:0 (135955925 bytes to sstable /cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-148-big-Data.db)
ERROR [Reference-Reaper:1] 2018-08-05 21:37:35,734 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3c801e0c) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@843577324:[Memory@[0..14), Memory@[0..168)] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2018-08-05 21:37:35,734 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3678757c) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@1937440215:/cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-148-big-Index.db was not released before the reference was garbage collected
ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,042 CassandraDaemon.java:207 - Exception in thread Thread[CompactionExecutor:3,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Channel not open for writing - cannot extend file to required size
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:156) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:280) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:216) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:142) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:74) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:58) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.updateRegions(MmappedSegmentedFile.java:128) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.complete(MmappedSegmentedFile.java:111) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:177) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.SegmentedFile$Builder.buildIndex(SegmentedFile.java:198) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:244) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:88) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_171]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]
Caused by: java.io.IOException: Channel not open for writing - cannot extend file to required size
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:901) ~[na:1.8.0_171]
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:152) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        ... 26 common frames omitted
ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,043 StorageService.java:417 - Stopping gossiper
WARN  [CompactionExecutor:3] 2018-08-05 21:37:36,044 StorageService.java:308 - Stopping gossip by operator request
INFO  [CompactionExecutor:3] 2018-08-05 21:37:36,044 Gossiper.java:1492 - Announcing shutdown
INFO  [CompactionExecutor:3] 2018-08-05 21:37:36,044 StorageService.java:2007 - Node hawkular-cassandra-1-9r7lv/100.85.2.250 state jump to shutdown
ERROR [CompactionExecutor:3] 2018-08-05 21:37:38,045 StorageService.java:427 - Stopping native transport
INFO  [CompactionExecutor:3] 2018-08-05 21:37:38,056 Server.java:180 - Stop listening for CQL clients
WARN  [CompactionExecutor:3] 2018-08-05 21:38:10,768 BigTableWriter.java:171 - Writing large partition hawkular_metrics/metrics_idx:automation-deploy-1\:9abb8a21-9383-11e8-80d0-001a4a408a0b:0 (135955925 bytes to sstable /cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-149-big-Data.db)
WARN  [Service Thread] 2018-08-05 22:17:10,211 GCInspector.java:282 - ParNew GC in 2243ms.  CMS Old Gen: 441171344 -> 441202720; Par Eden Space: 286130176 -> 0; Par Survivor Space: 7255728 -> 19842080
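
In the log above, the FSReadError, "Stopping gossiper" and "Stopping native transport" entries are all logged on the same CompactionExecutor thread, which would be consistent with the node closing its CQL port right after the failed compaction and clients then seeing connection refused. A minimal Python sketch for pulling that sequence out of a longer log follows. The names `parse_entry` and `errors_by_thread` are hypothetical helpers written for this illustration, and the regular expression only assumes the line layout visible in this report.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional

# Header layout as seen in this report (an assumption about the log pattern):
# LEVEL [thread] YYYY-MM-DD HH:MM:SS,mmm Source.java:LINE - message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+\.java:\d+)\s+-\s+(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    level: str
    thread: str
    timestamp: str
    source: str
    message: str


def parse_entry(line: str) -> Optional[LogEntry]:
    """Return a LogEntry for a header line, or None for stack-trace lines."""
    match = LOG_LINE.match(line)
    return LogEntry(**match.groupdict()) if match else None


def errors_by_thread(lines: Iterable[str]) -> Dict[str, List[LogEntry]]:
    """Group ERROR entries by thread name, keeping them in log order."""
    grouped: Dict[str, List[LogEntry]] = {}
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry.level == "ERROR":
            grouped.setdefault(entry.thread, []).append(entry)
    return grouped


# Lines copied from the report above.
SAMPLE = [
    "ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,042 CassandraDaemon.java:207 - Exception in thread Thread[CompactionExecutor:3,1,main]",
    "        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]",
    "ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,043 StorageService.java:417 - Stopping gossiper",
    "INFO  [CompactionExecutor:3] 2018-08-05 21:37:36,044 Gossiper.java:1492 - Announcing shutdown",
    "ERROR [CompactionExecutor:3] 2018-08-05 21:37:38,045 StorageService.java:427 - Stopping native transport",
]
```

Calling `errors_by_thread(SAMPLE)` groups the three ERROR entries under `CompactionExecutor:3`; the same function can be fed the lines of a full pod log to check whether each outage starts with the same exception.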
 

Version-Release number of selected component (if applicable):

atomic-openshift-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-clients-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-docker-excluder-3.9.30-1.git.0.dec1ba7.el7.noarch
atomic-openshift-excluder-3.9.30-1.git.0.dec1ba7.el7.noarch
atomic-openshift-node-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-sdn-ovs-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-registries-1.22.1-22.git5a342e3.el7.x86_64
Comment 2 Venkata Tadimarri 2018-08-07 18:17:33 EDT
Update: 

After increasing the memory to 8 GB, the issue has not recurred in the 15 hours since.
Comment 4 John Sanda 2018-08-14 09:28:16 EDT
Cassandra uses mmap for most I/O operations, meaning writes go to the operating system page cache in memory. Besides the JVM heap, Cassandra relies on the OS for memory management. The error is likely caused by the memory limit. I recommend removing the limit.
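
To illustrate what memory-mapped I/O means here, the following is a minimal Python sketch, not Cassandra's own code (which is Java and, per the stack trace above, goes through `FileChannelImpl.map`). It maps a temporary file, writes through the mapping, and reads the bytes back through an ordinary file handle. The function names `write_via_mmap` and `demo` are hypothetical. The point is only that the mapped pages are file-backed memory managed by the operating system rather than by the application's own heap.

```python
import mmap
import os
import tempfile


def write_via_mmap(path: str, payload: bytes) -> None:
    """Write payload to an existing file through a memory mapping."""
    with open(path, "r+b") as f:
        # The file must already be as large as the region being mapped.
        f.truncate(len(payload))
        with mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_WRITE) as mapped:
            # Storing into the mapping dirties pages owned by the OS;
            # no explicit write() call is made by the application.
            mapped[:] = payload
            # Ask the OS to write the dirty pages back to the file.
            mapped.flush()


def demo() -> bytes:
    """Round-trip a small payload through a mapped temporary file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_via_mmap(path, b"hawkular")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

Because memory used this way sits outside the JVM heap, sizing the heap alone does not account for it, which is the reasoning behind the recommendation about the memory limit.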
