diff --git a/docs/topics/Config-options.md b/docs/topics/Config-options.md
index 71eb3f6b6f..fddebe7d42 100644
--- a/docs/topics/Config-options.md
+++ b/docs/topics/Config-options.md
@@ -35,6 +35,8 @@ A preprocessor node serves a bulk ingest HTTP api, which is then transformed, ra
 ## indexerConfig
+Configuration options for the indexer node.
+
 ### maxMessagesPerChunk
 Maximum number of messages that are created per chunk before closing and uploading to S3. This should be roughly
 equivalent to the `maxBytesPerChunk`, such that a rollover is triggered at roughly the same time regardless of
@@ -82,33 +84,114 @@ the user. Enables queries such as `value` instead of specifying the field name e
 ### staleDurationSecs
+```yaml
+indexerConfig:
+  staleDurationSecs: 7200
+```
+
+How long a stale chunk, or a chunk no longer being written to, can remain on an indexer before being deleted. If the
+[indexerConfig.maxChunksOnDisk](Config-options.md#maxchunksondisk) limit is reached first, the chunk will
+be removed.
+
 ### dataDirectory {id=indexer-data-directory}
+```yaml
+indexerConfig:
+  dataDirectory: /mnt/localdisk
+```
+
+
+Path of the data directory to use. Generally recommended to be instance storage backed by NVMe disks, or memory-mapped
+storage such as [tmpfs](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), for best performance.
+
+
 ### maxOffsetDelayMessages
+```yaml
+indexerConfig:
+  maxOffsetDelayMessages: 300000
+```
+
+Maximum number of messages that the indexer can lag behind on startup before creating an async recovery task. If the
+current message lag exceeds this, the indexer will immediately start indexing at the current time, and create a task to
+be indexed by a recovery node from the last persisted offset to where the indexer started from.
 ### defaultQueryTimeoutMs {id=indexer-default-query-timeout-ms}
+```yaml
+indexerConfig:
+  defaultQueryTimeoutMs: 6500
+```
+
+
+Timeout for searching an individual chunk.
Should be set to a value below the `serverConfig.requestTimeoutMs` to
+ensure that post-processing can occur before reaching the overall request timeout.
+
+
 ### readFromLocationOnStart
+```yaml
+
+indexerConfig:
+  readFromLocationOnStart: LATEST
+```
+
+Defines where to read from Kafka when initializing a new cluster.
+
+
+
+Use the oldest Kafka offset when initializing the cluster, to include all messages currently on Kafka.
+
+
+Use the latest Kafka offset when initializing the cluster. Indexing starts with new messages from the cluster
+initialization time onwards. See `indexerConfig.createRecoveryTasksOnStart` for
+an additional config parameter related to using LATEST.
+
+
+
+This config is only used when initializing a new cluster, when no existing offsets are found in Zookeeper.
+
 ### createRecoveryTasksOnStart
+```yaml
+
+indexerConfig:
+  createRecoveryTasksOnStart: true
+```
+
+Defines whether recovery tasks should be created when initializing a new cluster.
+
+This only applies when `indexerConfig.readFromLocationOnStart` is set to LATEST.
+This config is only used when initializing a new cluster, when no existing offsets are found in Zookeeper.
+
 ### maxChunksOnDisk
+```yaml
+indexerConfig:
+  maxChunksOnDisk: 3
+```
+How many stale chunks, or chunks no longer being written to, can remain on an indexer before being deleted. If the
+[indexerConfig.staleDurationSecs](Config-options.md#staledurationsecs) limit is reached first, the chunk
+will be removed.
+
 ### serverConfig {id=indexer-server-config}
 ```yaml
 indexerConfig:
   serverConfig:
     serverPort: 8081
-    serverAddress: localhost
-    requestTimeoutMs: 5000
+    serverAddress: 10.0.100.1
+    requestTimeoutMs: 7000
 ```
-
+
+Port used for application HTTP traffic.
+Address at which this instance is accessible by other Astra components. Used for inter-node communication and is
+registered to Zookeeper.
+Request timeout for all HTTP traffic, after which the request is cancelled.
@@ -145,6 +228,15 @@ String value of JSON encoded properties to set on Kafka consumer. Any valid Kafk
 ## s3Config
+S3 configuration options common to indexer, recovery, cache, and manager nodes.
+
+
+
+S3-compatible APIs from vendors other than Amazon may work, but are not guaranteed. Astra uses the
+[AWS CRT client](https://github.com/awslabs/aws-crt-java) for improved performance, but this library is expected to be
+less compatible with other vendors.
+
+
 ```yaml
 s3Config:
   s3AccessKey: access
@@ -155,17 +247,34 @@ s3Config:
   s3TargetThroughputGbps: 25
 ```
-### s3AccessKey
-
-### s3SecretKey
-
-### s3Region
+
+
+AWS access key. If both the access key and secret key are empty, the AWS default credentials provider is used.
+
+
+AWS secret key. If both the access key and secret key are empty, the AWS default credentials provider is used.
+
+
-### s3EndPoint
+AWS region, e.g. `us-east-1`, `us-west-2`
+
+
+S3 endpoint to use. If this setting is null or empty, the endpoint is not overridden and the default
+provided by the AWS client is used.
+
+
+AWS S3 bucket name
-### s3Bucket
+A separate bucket per cluster is recommended for better cost tracking and improved performance.
+
+
+Throughput target in gigabits per second. This configuration controls how many concurrent connections will be
+established in the AWS CRT client. Recommended to be set to match the maximum bandwidth of the underlying host.
+
+
-### s3TargetThroughputGbps
+See also the `astra.s3CrtBlobFs.maxNativeMemoryLimitBytes`
+system properties config.
 ## tracingConfig
@@ -179,33 +288,56 @@ tracingConfig:
 ### zipkinEndpoint
+Full path to the Zipkin [POST spans endpoint](https://zipkin.io/zipkin-api/#/default/post_spans). Spans are submitted as
+a JSON array of span data.
+
 ### commonTags
+Optional common tags to annotate on all submitted Zipkin traces. Can be overwritten by spans at runtime if keys
+collide.
-Recommended common tags:
-
-
-
-
-
-
+Recommended common tags: `clusterName`, `env`
 ## queryConfig
+Configuration options for the query node.
+
 ### serverConfig {id=query-server-config}
 ```yaml
 queryConfig:
   serverConfig:
     serverPort: 8081
-    serverAddress: localhost
-    requestTimeoutMs: 5000
+    serverAddress: 10.0.100.2
+    requestTimeoutMs: 60000
 ```
-
 ### defaultQueryTimeout
+```yaml
+queryConfig:
+  defaultQueryTimeout: 55000
+```
+
+Query timeout for individual indexer and cache nodes when performing a query. This value should be set lower than the
+`queryConfig.serverConfig.requestTimeoutMs` and equal to or greater than
+the `indexerConfig.serverConfig.requestTimeoutMs` and
+`cacheConfig.serverConfig.requestTimeoutMs`.
+
+When setting a timeout, ensure that the
+`queryConfig.serverConfig.requestTimeoutMs` is at least a few
+seconds higher than the `queryConfig.defaultQueryTimeout`
+to allow for aggregation post-processing to occur.
+
+
 ### managerConnectString
+```yaml
+queryConfig:
+  managerConnectString: 10.0.100.2:8085
+```
+
+experimental
+Host address for the manager node, used for on-demand recovery requests.
 ## metadataStoreConfig
 ```yaml
@@ -221,28 +353,63 @@ metadataStoreConfig:
 ### zookeeperConfig
+
+Zookeeper connection string: a list of servers to connect to, or a common service discovery endpoint
+(e.g., a [Consul endpoint](https://www.consul.io/)).
+
+Common prefix to use for Astra data. Useful when using a common Zookeeper installation.
+
+A shared Zookeeper cluster is only recommended for very small Astra clusters.
+Zookeeper session timeout in milliseconds.
+Zookeeper connection timeout in milliseconds.
+How long to wait between retries when attempting to reconnect a Zookeeper session. Will retry up to the
+`zkSessionTimeoutMs`.
 ## cacheConfig
+Configuration options for the cache node.
+
 ### slotsPerInstance
 ### replicaSet
 ### dataDirectory {id=cache-data-directory}
+```yaml
+cacheConfig:
+  dataDirectory: /mnt/localdisk
+```
+
+
+
 ### defaultQueryTimeoutMs {id=cache-default-query-timeout-ms}
+```yaml
+cacheConfig:
+  defaultQueryTimeoutMs: 50000
+```
+
+
+
 ### serverConfig {id=cache-server-config}
+```yaml
+cacheConfig:
+  serverConfig:
+    serverPort: 8081
+    serverAddress: 10.0.100.3
+    requestTimeoutMs: 55000
+```
+
 ## managerConfig
@@ -251,8 +418,15 @@ metadataStoreConfig:
 ### scheduleInitialDelayMins
 ### serverConfig {id=manager-server-config}
+```yaml
+managerConfig:
+  serverConfig:
+    serverPort: 8081
+    serverAddress: 10.0.100.10
+    requestTimeoutMs: 30000
+```
-
+
 ### replicaCreationServiceConfig
@@ -269,17 +443,40 @@ metadataStoreConfig:
 ### replicaRestoreServiceConfig
 ## clusterConfig
+Cluster configuration options common to all node types.
+
 ```yaml
 clusterConfig:
   clusterName: astra_local
   env: local
 ```
+
+
+Unique name assigned to this cluster. Should be identical for all node types in the cluster, and is used for metrics
+instrumentation.
+
+
+Environment string for this cluster. Should be identical for all node types deployed to a single environment, and is
+used for metrics instrumentation.
+
+
+
 ## recoveryConfig
+Configuration options for the recovery indexer node.
+
 ### serverConfig {id=recovery-server-config}
-
+```yaml
+recoveryConfig:
+  serverConfig:
+    serverPort: 8081
+    serverAddress: 10.0.100.4
+    requestTimeoutMs: 10000
+```
+
+
 ### kafkaConfig {id=kafka-recovery}
@@ -299,6 +496,8 @@ clusterConfig:
 ## preprocessorConfig
+Configuration options for the preprocessor node.
+
 ### kafkaStreamConfig
 ```yaml
 preprocessorConfig:
@@ -309,25 +508,7 @@ preprocessorConfig:
     processingGuarantee: at_least_once
     additionalProps: ""
 ```
-kafkaStreamConfig is deprecated and unsupported
-
-
-
-DEPRECATED
-
-
-DEPRECATED
-
-
-DEPRECATED
-
-
-DEPRECATED
-
-
-DEPRECATED
-
-
+kafkaStreamConfig is deprecated and unsupported.
 ### kafkaConfig {id=kafka-preprocessor}
 ```yaml
@@ -361,18 +542,86 @@ For valid formatting options refer to [Schema documentation.](Schema.md#schema-f
 ### serverConfig {id=preprocessor-server-config}
+```yaml
+preprocessorConfig:
+  serverConfig:
+    serverPort: 8081
+    serverAddress: 10.0.100.5
+    requestTimeoutMs: 55000
+```
+
+
+
 ### upstreamTopics
+```yaml
+preprocessorConfig:
+  upstreamTopics: ""
+```
+upstreamTopics is deprecated.
+Should always be set to ""
 ### downstreamTopic
+```yaml
+preprocessorConfig:
+  downstreamTopic: ""
+```
+downstreamTopic is deprecated.
+Should always be set to ""
 ### preprocessorInstanceCount
+```yaml
+preprocessorConfig:
+  preprocessorInstanceCount: 2
+```
+Indicates how many instances of the preprocessor are currently deployed. Used to scale rate limiters so that each
+preprocessor instance allows `total rate limit / preprocessor instance count` through before rate limiting is applied.
 ### dataTransformer
+dataTransformer is deprecated.
+
+```yaml
+preprocessorConfig:
+  dataTransformer: json
+```
+Should always be set to json
 ### rateLimiterMaxBurstSeconds
+```yaml
+preprocessorConfig:
+  rateLimiterMaxBurstSeconds: 1
+```
+Defines how many seconds of unused rate-limiting permits can accumulate before the total stops increasing.
+
+Must be greater than or equal to 1.
 ### kafkaPartitionStickyTimeoutMs
+```yaml
+preprocessorConfig:
+  kafkaPartitionStickyTimeoutMs: 0
+```
+
+kafkaPartitionStickyTimeoutMs is deprecated.
+Should always be set to 0
+
 ### useBulkApi
-### rateLimitExceededErrorCode
\ No newline at end of file
+```yaml
+preprocessorConfig:
+  useBulkApi: true
+```
+useBulkApi is deprecated.
+Should always be set to true
+
+Enable bulk ingest API, replacing the Kafka Streams API _(deprecated)_.
+
+### rateLimitExceededErrorCode
+
+```yaml
+preprocessorConfig:
+  rateLimitExceededErrorCode: 400
+```
+
+Error code to return when the rate limit of the preprocessor is exceeded. If using OpenSearch
+[Data Prepper](https://opensearch.org/docs/latest/data-prepper/), a return code of `400` or `404` marks the request
+as non-retryable and sends it to the dead letter queue.
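
The timeout guidance added in this diff spans several sections (indexer, cache, and query), so a quick consistency check may help when reviewing a deployment's values. The following is a minimal sketch only: `check_timeouts` and the flat dictionary keys are hypothetical names invented for illustration and are not part of Astra. The numbers are the example values from the YAML snippets in the diff.

```python
from typing import Dict, List

# Example values copied from the YAML snippets in the diff.
# The flat key names are an assumption made for this sketch.
EXAMPLE_TIMEOUTS: Dict[str, int] = {
    "indexer.defaultQueryTimeoutMs": 6500,
    "indexer.requestTimeoutMs": 7000,
    "cache.requestTimeoutMs": 55000,
    "query.defaultQueryTimeout": 55000,
    "query.requestTimeoutMs": 60000,
}


def check_timeouts(t: Dict[str, int]) -> List[str]:
    """Return a list of violated ordering rules (empty if all rules hold)."""
    problems: List[str] = []

    # Indexer chunk search timeout should be below the indexer request timeout.
    if not t["indexer.defaultQueryTimeoutMs"] < t["indexer.requestTimeoutMs"]:
        problems.append("indexer defaultQueryTimeoutMs must be below indexer requestTimeoutMs")

    # Query node timeout for downstream nodes should be below its own request timeout.
    if not t["query.defaultQueryTimeout"] < t["query.requestTimeoutMs"]:
        problems.append("query defaultQueryTimeout must be below query requestTimeoutMs")

    # Query node timeout should be equal to or greater than indexer and cache request timeouts.
    for node in ("indexer", "cache"):
        if not t["query.defaultQueryTimeout"] >= t[f"{node}.requestTimeoutMs"]:
            problems.append(f"query defaultQueryTimeout must be >= {node} requestTimeoutMs")

    return problems


if __name__ == "__main__":
    for problem in check_timeouts(EXAMPLE_TIMEOUTS):
        print(problem)
```

With the example values the function returns an empty list. Lowering `query.requestTimeoutMs` to 55000 would report the second rule, since no headroom would remain for aggregation post-processing.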