У нас есть два центра обработки данных (DC1 в Европе, DC2 в Северной Америке) кластера DatastaxEnterprise Solr (версия 4.5) на CentOs:
DC1: 2 nodes with rf set to 2
DC2: 1 nodes with rf set to 1
Каждый узел имеет 2 ядра и 4 ГБ оперативной памяти. Мы создали только одно пространство ключей, 2 узла DC1 имеют по 400 МБ данных каждый, а узел в DC2 пуст.
если я запускаю восстановление nodetool на узле в DC2, команда работает хорошо в течение примерно 20/30 минут, а затем перестает работать, оставшись зависшим.
В журналах узла в DC2 я могу прочитать это:
WARN [NonPeriodicTasks:1] 2014-10-01 05:57:44,188 WorkPool.java (line 398) Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
ERROR [NonPeriodicTasks:1] 2014-10-01 05:57:44,190 CassandraDaemon.java (line 199) Exception in thread Thread[NonPeriodicTasks:1,5,main]
org.apache.solr.common.SolrException: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:351)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.doCommit(AbstractSolrSecondaryIndex.java:994)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.forceBlockingFlush(AbstractSolrSecondaryIndex.java:139)
at org.apache.cassandra.db.index.SecondaryIndexManager.flushIndexesBlocking(SecondaryIndexManager.java:338)
at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:144)
at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:113)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
at com.datastax.bdp.concurrent.WorkPool.doFlush(WorkPool.java:399)
at com.datastax.bdp.concurrent.WorkPool.flush(WorkPool.java:339)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.flushIndexUpdates(AbstractSolrSecondaryIndex.java:484)
at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:278)
... 12 more
WARN [commitScheduler-3-thread-1] 2014-10-01 05:58:47,351 WorkPool.java (line 398) Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
ERROR [commitScheduler-3-thread-1] 2014-10-01 05:58:47,352 SolrException.java (line 136) auto commit error...:org.apache.solr.common.SolrException: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:351)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log.
millis, consider increasing it, or reducing load on the node.
at com.datastax.bdp.concurrent.WorkPool.doFlush(WorkPool.java:399)
at com.datastax.bdp.concurrent.WorkPool.flush(WorkPool.java:339)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.flushIndexUpdates(AbstractSolrSecondaryIndex.java:484)
at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:278)
... 8 more
Я безуспешно пытался увеличить время ожидания в файле cassandra.yaml. Спасибо