Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1613124

Summary: Cassandra error : org.apache.cassandra.io.FSReadError: java.io.IOException: Channel not open for writing - cannot extend file to required size
Product: OpenShift Container Platform
Reporter: Venkata Tadimarri <ktadimar>
Component: Hawkular
Assignee: Ruben Vargas Palma <rvargasp>
Status: CLOSED DEFERRED
QA Contact: Junqi Zhao <juzhao>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.9.0
CC: aos-bugs, ktadimar, mmariyan
Target Milestone: ---
Target Release: 3.9.z
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-24 17:47:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Venkata Tadimarri 2018-08-07 04:44:07 UTC
Description of problem:

The metrics service periodically returns connection refused errors. Restarting the Cassandra pod fixes the problem, but the issue recurs after approximately 6 hours, and another pod restart is then required to restore metrics.

Errors in the log:

WARN  [CompactionExecutor:3] 2018-08-05 21:37:34,306 BigTableWriter.java:171 - Writing large partition hawkular_metrics/metrics_idx:automation-deploy-1\:9abb8a21-9383-11e8-80d0-001a4a408a0b:0 (135955925 bytes to sstable /cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-148-big-Data.db)
ERROR [Reference-Reaper:1] 2018-08-05 21:37:35,734 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3c801e0c) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@843577324:[Memory@[0..14), Memory@[0..168)] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2018-08-05 21:37:35,734 Ref.java:223 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3678757c) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@1937440215:/cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-148-big-Index.db was not released before the reference was garbage collected
ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,042 CassandraDaemon.java:207 - Exception in thread Thread[CompactionExecutor:3,1,main]
org.apache.cassandra.io.FSReadError: java.io.IOException: Channel not open for writing - cannot extend file to required size
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:156) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:280) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:216) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:142) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:74) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:58) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.updateRegions(MmappedSegmentedFile.java:128) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.MmappedSegmentedFile$Builder.complete(MmappedSegmentedFile.java:111) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:177) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.util.SegmentedFile$Builder.buildIndex(SegmentedFile.java:198) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:244) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:172) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:124) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:88) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_171]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]
Caused by: java.io.IOException: Channel not open for writing - cannot extend file to required size
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:901) ~[na:1.8.0_171]
        at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:152) ~[apache-cassandra-3.0.15.redhat-1.jar:3.0.15.redhat-1]
        ... 26 common frames omitted
ERROR [CompactionExecutor:3] 2018-08-05 21:37:36,043 StorageService.java:417 - Stopping gossiper
WARN  [CompactionExecutor:3] 2018-08-05 21:37:36,044 StorageService.java:308 - Stopping gossip by operator request
INFO  [CompactionExecutor:3] 2018-08-05 21:37:36,044 Gossiper.java:1492 - Announcing shutdown
INFO  [CompactionExecutor:3] 2018-08-05 21:37:36,044 StorageService.java:2007 - Node hawkular-cassandra-1-9r7lv/100.85.2.250 state jump to shutdown
ERROR [CompactionExecutor:3] 2018-08-05 21:37:38,045 StorageService.java:427 - Stopping native transport
INFO  [CompactionExecutor:3] 2018-08-05 21:37:38,056 Server.java:180 - Stop listening for CQL clients
WARN  [CompactionExecutor:3] 2018-08-05 21:38:10,768 BigTableWriter.java:171 - Writing large partition hawkular_metrics/metrics_idx:automation-deploy-1\:9abb8a21-9383-11e8-80d0-001a4a408a0b:0 (135955925 bytes to sstable /cassandra_data/data/hawkular_metrics/metrics_idx-d7ed7420945e11e8a51f5d38c63d79ac/mc-149-big-Data.db)
WARN  [Service Thread] 2018-08-05 22:17:10,211 GCInspector.java:282 - ParNew GC in 2243ms.  CMS Old Gen: 441171344 -> 441202720; Par Eden Space: 286130176 -> 0; Par Survivor Space: 7255728 -> 19842080
 

Version-Release number of selected component (if applicable):

atomic-openshift-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-clients-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-docker-excluder-3.9.30-1.git.0.dec1ba7.el7.noarch
atomic-openshift-excluder-3.9.30-1.git.0.dec1ba7.el7.noarch
atomic-openshift-node-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-openshift-sdn-ovs-3.9.30-1.git.0.dec1ba7.el7.x86_64
atomic-registries-1.22.1-22.git5a342e3.el7.x86_64

Comment 2 Venkata Tadimarri 2018-08-07 22:17:33 UTC
Update: 

After increasing the memory to 8 GB, the issue has not recurred for 15 hours so far.

Comment 4 John Sanda 2018-08-14 13:28:16 UTC
Cassandra uses mmap for most I/O operations, meaning writes go to the operating system page cache in memory. Besides the JVM heap, Cassandra relies on the OS for memory management. The error is likely caused by having a memory limit. I recommend removing the limit.
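
For reference, the message in the trace comes from the JDK's FileChannel.map call rather than from Cassandra itself. The JDK raises it when the requested mapping extends past the current end of the file and the channel was not opened for writing, so the file cannot be grown. The following is a minimal sketch of that condition only; the class name, the temporary file, and the sizes are made up for illustration, and it does not show why the index file in this report was shorter than the region Cassandra tried to map.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: class, file name, and sizes are assumptions, not taken from the bug.
public class MmapExtendDemo {

    // Maps a region larger than the file through a read-only channel and
    // returns the resulting exception message, or null if the mapping succeeded.
    static String mapBeyondEndOfFile() throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".db"); // hypothetical scratch file
        try {
            Files.write(file, new byte[16]); // the file is only 16 bytes long
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Asking for 4096 bytes would require extending the file,
                // which a channel opened only for reading cannot do.
                channel.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
                return null;
            } catch (IOException e) {
                return e.getMessage();
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(mapBeyondEndOfFile());
    }
}
```

Running main prints the exception message, which can be compared against the "Caused by" line in the log above.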

Comment 7 Red Hat Bugzilla 2023-09-15 00:11:24 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days