Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1486874

Summary: [starter-us-west-2] hawkular-cassandra-1 in crashloopbackoff
Product: OpenShift Container Platform Reporter: Justin Pierce <jupierce>
Component: HawkularAssignee: Matt Wringe <mwringe>
Status: CLOSED NOTABUG QA Contact: Junqi Zhao <juzhao>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.6.0CC: aos-bugs, jcantril, jgoulding, jupierce, sbain, sten
Target Milestone: ---   
Target Release: 3.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-10-17 13:40:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Pod log none

Description Justin Pierce 2017-08-30 16:48:54 UTC
Description of problem:
[root@starter-us-west-2-master-0e70a ~]# oc get rc
NAME                      DESIRED   CURRENT   READY     AGE
hawkular-cassandra-1      1         1         0         124d
hawkular-cassandra-2      1         1         1         124d
hawkular-metrics          1         1         0         124d
heapster                  1         1         0         124d

[root@starter-us-west-2-master-0e70a ~]# oc get pods
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-3cv4w   0/1       CrashLoopBackOff   1570       5d
hawkular-cassandra-2-n070r   1/1       Running            0          12d
hawkular-metrics-l97mb       0/1       Running            2139       12d
heapster-bd4hf               0/1       Running            891        5d


Version-Release number of selected component (if applicable):
oc v3.6.173.0.5
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://internal.api.starter-us-west-2.openshift.com:443
openshift v3.6.173.0.5
kubernetes v1.6.1+5115d708d7


How reproducible:
Some of our starter clusters are working fine.

Steps to Reproduce:
1. Metrics was installed using openshift-ansible
2.
3.

Actual results:
hawkular-cassandra-1 in crashloopbackoff.


Additional info:

[root@starter-us-west-2-master-0e70a ~]# oc logs hawkular-cassandra-1-3cv4w
The MAX_HEAP_SIZE envar is not set. Basing the MAX_HEAP_SIZE on the available memory limit for the pod (2000000000).
The memory limit is less than 2GB. Using 1/2 of available memory for the max_heap_size.
The MAX_HEAP_SIZE has been set to 953M
THE HEAP_NEWSIZE envar is not set. Setting to 372M based on the CPU_LIMIT of 3725. [100M per CPU core]
About to generate seeds
Trying to access the Seed list [try #1]
Setting seeds to be ;; connection timed out; trying next origin,;; connection timed out; trying next origin,;; connection timed out; trying next origin,;; connection timed out; trying next origin,;; connection timed out; no servers could be reached
sed: -e expression #1, char 13: unterminated `s' command
Creating the Cassandra keystore from the Secret's cert data
Converting the PKCS12 keystore into a Java Keystore
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Entry for alias cassandra successfully imported.
Import command completed:  1 entries successfully imported, 0 entries failed or cancelled
[Storing /opt/apache-cassandra/conf/.keystore]
Building the trust store for inter node communication
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Certificate was added to keystore
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Certificate was added to keystore
/opt/apache-cassandra/bin/cassandra-docker.sh: line 307: cd: /home/jboss: Permission denied
Building the trust store for client communication
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Certificate was added to keystore
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Certificate was added to keystore
Generating self signed certificates for the local client for cqlsh
Generating a 4096 bit RSA private key
...............................................................................................................................................++
............++
writing new private key to '.cassandra.local.client.key'
-----
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Certificate was added to keystore
The previous version of Cassandra was 3.0.14.redhat-1. The current version is 3.0.14.redhat-1
cat: /etc/ld.so.conf.d/*.conf: No such file or directory
getopt: invalid option -- 'R'
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
OpenJDK 64-Bit Server VM warning: Cannot open file /opt/apache-cassandra/logs/gc.log due to No such file or directory

CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.deserializeLargeSubset (Lorg/apache/cassandra/io/util/DataInputPlus;Lorg/apache/cassandra/db/Columns;I)Lorg/apache/cassandra/db/Columns;
CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubset (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;ILorg/apache/cassandra/io/util/DataOutputPlus;)V
CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubsetSize (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;I)I
CompilerOracle: dontinline org/apache/cassandra/db/transform/BaseIterator.tryGetMoreContents ()Z
CompilerOracle: dontinline org/apache/cassandra/db/transform/StoppingTransformation.stop ()V
CompilerOracle: dontinline org/apache/cassandra/db/transform/StoppingTransformation.stopInPartition ()V
CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.doFlush (I)V
CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.writeExcessSlow ()V
CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.writeSlow (JI)V
CompilerOracle: dontinline org/apache/cassandra/io/util/RebufferingInputStream.readPrimitiveSlowly (I)J
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/vint/VIntCoding.encodeVInt (JI)[B
INFO  [main] 2017-08-30 16:46:30,624 YamlConfigurationLoader.java:85 - Configuration location: file:/opt/apache-cassandra-3.0.14.redhat-1/conf/cassandra.yaml
INFO  [main] 2017-08-30 16:46:30,734 Config.java:457 - Node configuration:[allocate_tokens_for_keyspace=null; authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_bootstrap=true; auto_snapshot=true; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_address=null; broadcast_rpc_address=null; buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=1000; client_encryption_options=<REDACTED>; cluster_name=hawkular-metrics; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_compression=LZ4Compressor; commitlog_directory=/cassandra_data/commitlog; commitlog_max_compression_buffers_in_pool=3; commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=null; commitlog_sync_period_in_ms=10000; commitlog_total_space_in_mb=null; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_compactors=null; concurrent_counter_writes=32; concurrent_materialized_view_writes=32; concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32; counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; data_file_directories=[Ljava.lang.String;@52aa2946; disk_access_mode=auto; disk_failure_policy=stop; disk_optimization_estimate_percentile=0.95; disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd; dynamic_snitch=true; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_scripted_user_defined_functions=false; enable_user_defined_functions=false; enable_user_defined_functions_threads=true; encryption_options=null; endpoint_snitch=SimpleSnitch; file_cache_size_in_mb=512; gc_log_threshold_in_ms=200; gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[]; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; hints_compression=null; hints_directory=null; hints_flush_period_in_ms=10000; incremental_backups=false; index_interval=null; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; initial_token=null; inter_dc_stream_throughput_outbound_megabits_per_sec=200; inter_dc_tcp_nodelay=false; internode_authenticator=null; internode_compression=all; internode_recv_buff_size_in_bytes=null; internode_send_buff_size_in_bytes=null; key_cache_keys_to_save=2147483647; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=hawkular-cassandra-1-3cv4w; listen_interface=null; listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null; max_streaming_retries=3; max_value_size_in_mb=256; memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null; memtable_flush_writers=null; memtable_heap_space_in_mb=null; memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50; native_transport_max_concurrent_connections=-1; native_transport_max_concurrent_connections_per_ip=-1; native_transport_max_frame_size_in_mb=256; native_transport_max_threads=128; native_transport_port=9042; native_transport_port_ssl=null; num_tokens=256; otc_backlog_expiration_interval_ms=200; otc_coalescing_enough_coalesced_messages=8; otc_coalescing_strategy=TIMEHORIZON; otc_coalescing_window_us=200; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_cache_max_entries=1000; permissions_update_interval_in_ms=-1; permissions_validity_in_ms=2000; phi_convict_threshold=8.0; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_scheduler_id=null; request_scheduler_options=null; request_timeout_in_ms=10000; role_manager=CassandraRoleManager; roles_cache_max_entries=1000; roles_update_interval_in_ms=-1; roles_validity_in_ms=2000; row_cache_class_name=org.apache.cassandra.cache.OHCProvider; row_cache_keys_to_save=2147483647; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=hawkular-cassandra-1-3cv4w; rpc_interface=null; rpc_interface_prefer_ipv6=false; rpc_keepalive=true; rpc_listen_backlog=50; rpc_max_threads=2147483647; rpc_min_threads=16; rpc_port=9160; rpc_recv_buff_size_in_bytes=null; rpc_send_buff_size_in_bytes=null; rpc_server_type=sync; saved_caches_directory=null; seed_provider=org.apache.cassandra.locator.SimpleSeedProvider{seeds=${SEEDS}}; server_encryption_options=<REDACTED>; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=false; storage_port=7000; stream_throughput_outbound_megabits_per_sec=200; streaming_socket_timeout_in_ms=86400000; thrift_framed_transport_size_in_mb=15; thrift_max_message_length_in_mb=16; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; unlogged_batch_across_partitions_warn_threshold=10; user_defined_function_fail_timeout=1500; user_defined_function_warn_timeout=500; user_function_timeout_policy=die; windows_timer_interval=1; write_request_timeout_in_ms=2000]
INFO  [main] 2017-08-30 16:46:30,734 DatabaseDescriptor.java:323 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO  [main] 2017-08-30 16:46:30,856 DatabaseDescriptor.java:430 - Global memtable on-heap threshold is enabled at 229MB
INFO  [main] 2017-08-30 16:46:30,856 DatabaseDescriptor.java:434 - Global memtable off-heap threshold is enabled at 229MB
WARN  [main] 2017-08-30 16:46:30,900 SimpleSeedProvider.java:60 - Seed provider couldn't lookup host ${SEEDS}
Exception (org.apache.cassandra.exceptions.ConfigurationException) encountered during startup: The seed provider lists no seeds.
The seed provider lists no seeds.
ERROR [main] 2017-08-30 16:46:30,902 CassandraDaemon.java:710 - Exception encountered during startup: The seed provider lists no seeds.

Comment 1 Jeff Cantrill 2017-08-30 17:08:37 UTC
It's interesting that one pod started and the other did not.  Can you provide additional info:

* DC definitions
* Pod definitions
* Logs from both Cassandra pods
* configmaps if any

Comment 3 Matt Wringe 2017-10-17 13:40:42 UTC
Closing as I think this is out of date. If we still have an issue here, please re-open

Comment 4 stbain 2018-06-28 21:22:26 UTC
Created attachment 1455404 [details]
Pod log

Failing in a manner similar to this bug. Consider re-opening?