I run a 4-node Cassandra cluster with a replication factor of 2. Each node holds about 2.7 TB of Cassandra data.
Three days ago one of the Cassandra nodes went down. When I tried to start the Cassandra service and checked system.log, I found LEAK DETECTED errors for several CFs:
ERROR [Reference-Reaper:1] 2017-05-10 13:03:00,779 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@565b5b35) to class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@408212172:/raid0/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-15171 was not released before the reference was garbage collected
LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3e2430d) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@554817289:[Memory@[0..4), Memory@[0..18)] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2017-05-10 13:03:00,787 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@2ff9f824) to class org.apache.cassandra.io.util.MmappedSegmentedFile$Cleanup@1142037527:/raid0/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-15172-Index.db was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2017-05-10 13:03:00,788 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@35c52c94) to class org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier@603046944:/raid0/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-15172 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2017-05-10 13:03:00,788 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@14834365) to class org.apache.cassandra.io.util.MmappedSegmentedFile$Cleanup@901621352:/raid0/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-15171-Index.db was not released before the reference was garbage collected
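To see which resource types and files these messages refer to, the lines can be grouped with a small script. This is only a rough sketch: the regular expression is inferred from the log lines above, and the function name `summarize_leaks` is hypothetical, not part of Cassandra.

```python
import re
from collections import Counter

# Pattern inferred from the LEAK DETECTED lines above (an assumption,
# not an official log format): captures the leaked class and the resource.
LEAK_RE = re.compile(
    r"LEAK DETECTED: a reference \(\S+\) to class "
    r"(?P<cls>[\w.$]+)@\d+:(?P<resource>.*?) was not released"
)

def summarize_leaks(lines):
    """Count leak reports per class and collect the distinct resources named."""
    by_class = Counter()
    resources = set()
    for line in lines:
        match = LEAK_RE.search(line)
        if match:
            by_class[match.group("cls")] += 1
            resources.add(match.group("resource"))
    return by_class, resources
```

It could be run over the whole log, for example `summarize_leaks(open("system.log"))`, to check whether the reports are limited to system.compaction_history or also touch other tables.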
I read several links and blog posts about LEAK DETECTED. Some people say it is a long-GC issue, so I added the following lines to cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramBeforeFullGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramAfterFullGC"
After that I checked system.log and found the following lines:
INFO [CompactionExecutor:4] 2017-05-12 19:16:16,892 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29601 ms
INFO [CompactionExecutor:7] 2017-05-12 23:16:16,563 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29604 ms
INFO [CompactionExecutor:10] 2017-05-13 03:16:16,838 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29875 ms
INFO [CompactionExecutor:13] 2017-05-13 07:16:16,849 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29891 ms
INFO [CompactionExecutor:16] 2017-05-13 11:16:16,737 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29779 ms
INFO [CompactionExecutor:19] 2017-05-13 15:16:16,848 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29889 ms
INFO [CompactionExecutor:22] 2017-05-13 19:16:17,009 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29729 ms
INFO [CompactionExecutor:25] 2017-05-13 23:16:16,476 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29514 ms
INFO [CompactionExecutor:28] 2017-05-14 03:16:16,648 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29685 ms
INFO [CompactionExecutor:31] 2017-05-14 07:16:16,724 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29760 ms
INFO [CompactionExecutor:34] 2017-05-14 11:16:16,709 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29715 ms
INFO [CompactionExecutor:37] 2017-05-14 15:16:16,515 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29545 ms
INFO [CompactionExecutor:40] 2017-05-14 19:16:16,745 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29776 ms
INFO [CompactionExecutor:43] 2017-05-14 23:16:16,504 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29532 ms
INFO [CompactionExecutor:46] 2017-05-15 03:16:16,470 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29496 ms
INFO [CompactionExecutor:49] 2017-05-15 07:16:16,519 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29545 ms
INFO [CompactionExecutor:52] 2017-05-15 11:16:16,385 AutoSavingCache.java:302 - Saved KeyCache (915816 items) in 29411 ms
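The KeyCache entries above can be summarized the same way, to show how often the cache is saved and how long each save takes. Again a minimal sketch: the regular expression is inferred from the lines above, and `parse_saves` and `save_intervals_hours` are hypothetical helper names. If the interval comes out at about 4 hours, that would be consistent with the default `key_cache_save_period` of 14400 seconds in cassandra.yaml, which would suggest these are routine periodic saves rather than errors.

```python
import re
from datetime import datetime

# Pattern inferred from the "Saved KeyCache" lines above (an assumption).
SAVE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"AutoSavingCache\.java:\d+ - "
    r"Saved KeyCache \((?P<items>\d+) items\) in (?P<ms>\d+) ms"
)

def parse_saves(lines):
    """Return (timestamp, item_count, duration_ms) for each KeyCache save line."""
    saves = []
    for line in lines:
        match = SAVE_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            saves.append((ts, int(match.group("items")), int(match.group("ms"))))
    return saves

def save_intervals_hours(saves):
    """Hours elapsed between consecutive saves, rounded to two decimals."""
    return [
        round((later[0] - earlier[0]).total_seconds() / 3600, 2)
        for earlier, later in zip(saves, saves[1:])
    ]
```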
After 3 days the Cassandra service is down. Please help me resolve this issue.
System information:
Cassandra Version = 2.1.7
OS = Ubuntu 12.04
CPU Core = 4
RAM = 28GB