public static interface KafkaStreamingSourceOptions.Builder extends SdkPojo, CopyableBuilder<KafkaStreamingSourceOptions.Builder,KafkaStreamingSourceOptions>

| Modifier and Type | Method and Description |
|---|---|
| KafkaStreamingSourceOptions.Builder | addRecordTimestamp(String addRecordTimestamp)<br>When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. |
| KafkaStreamingSourceOptions.Builder | assign(String assign)<br>The specific TopicPartitions to consume. |
| KafkaStreamingSourceOptions.Builder | bootstrapServers(String bootstrapServers)<br>A list of bootstrap server URLs, for example, b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. |
| KafkaStreamingSourceOptions.Builder | classification(String classification)<br>An optional classification. |
| KafkaStreamingSourceOptions.Builder | connectionName(String connectionName)<br>The name of the connection. |
| KafkaStreamingSourceOptions.Builder | delimiter(String delimiter)<br>Specifies the delimiter character. |
| KafkaStreamingSourceOptions.Builder | emitConsumerLagMetrics(String emitConsumerLagMetrics)<br>When this option is set to 'true', for each batch, it emits to CloudWatch a metric for the duration between the time the oldest record was received by the topic and the time it arrives in Glue. |
| KafkaStreamingSourceOptions.Builder | endingOffsets(String endingOffsets)<br>The point at which a batch query ends. |
| KafkaStreamingSourceOptions.Builder | includeHeaders(Boolean includeHeaders)<br>Whether to include the Kafka headers. |
| KafkaStreamingSourceOptions.Builder | maxOffsetsPerTrigger(Long maxOffsetsPerTrigger)<br>The rate limit on the maximum number of offsets that are processed per trigger interval. |
| KafkaStreamingSourceOptions.Builder | minPartitions(Integer minPartitions)<br>The desired minimum number of partitions to read from Kafka. |
| KafkaStreamingSourceOptions.Builder | numRetries(Integer numRetries)<br>The number of times to retry before failing to fetch Kafka offsets. |
| KafkaStreamingSourceOptions.Builder | pollTimeoutMs(Long pollTimeoutMs)<br>The timeout in milliseconds to poll data from Kafka in Spark job executors. |
| KafkaStreamingSourceOptions.Builder | retryIntervalMs(Long retryIntervalMs)<br>The time in milliseconds to wait before retrying to fetch Kafka offsets. |
| KafkaStreamingSourceOptions.Builder | securityProtocol(String securityProtocol)<br>The protocol used to communicate with brokers. |
| KafkaStreamingSourceOptions.Builder | startingOffsets(String startingOffsets)<br>The starting position in the Kafka topic to read data from. |
| KafkaStreamingSourceOptions.Builder | startingTimestamp(Instant startingTimestamp)<br>The timestamp of the record in the Kafka topic to start reading data from. |
| KafkaStreamingSourceOptions.Builder | subscribePattern(String subscribePattern)<br>A Java regex string that identifies the topic list to subscribe to. |
| KafkaStreamingSourceOptions.Builder | topicName(String topicName)<br>The topic name as specified in Apache Kafka. |

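Methods inherited from interface SdkPojo: equalsBySdkFields, sdkFields
Methods inherited from interface CopyableBuilder: copy
Methods inherited from interface SdkBuilder: applyMutation, build

KafkaStreamingSourceOptions.Builder bootstrapServers(String bootstrapServers)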
A list of bootstrap server URLs, for example,
b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in
the API call or defined in the table metadata in the Data Catalog.
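Because the option is a single String even though it describes a list, a caller with several brokers has to combine them first. The following is a minimal sketch, assuming the usual Kafka convention of a comma-separated list of host:port pairs (the source does not state the separator). The class name, the helper name, and the second broker address are hypothetical and only serve the illustration.

```java
import java.util.List;

final class BootstrapServersExample {

    // Validates each entry as host:port and joins the entries with commas.
    // Assumption: comma-separated host:port pairs, as is conventional for Kafka clients.
    static String joinBootstrapServers(List<String> brokers) {
        for (String broker : brokers) {
            int colon = broker.lastIndexOf(':');
            if (colon <= 0 || colon == broker.length() - 1) {
                throw new IllegalArgumentException("Expected host:port but got: " + broker);
            }
            // Throws NumberFormatException (an IllegalArgumentException) if the port is not numeric.
            Integer.parseInt(broker.substring(colon + 1));
        }
        return String.join(",", brokers);
    }

    public static void main(String[] args) {
        // The second address is a made-up sibling of the documented example address.
        String servers = joinBootstrapServers(List.of(
                "b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094",
                "b-2.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094"));
        System.out.println(servers);
    }
}
```

The resulting string is the kind of value that would be passed to bootstrapServers(String).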
bootstrapServers - A list of bootstrap server URLs, for example,
b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be
specified in the API call or defined in the table metadata in the Data Catalog.

KafkaStreamingSourceOptions.Builder securityProtocol(String securityProtocol)
The protocol used to communicate with brokers. The possible values are "SSL" or
"PLAINTEXT".
securityProtocol - The protocol used to communicate with brokers. The possible values are "SSL" or
"PLAINTEXT".

KafkaStreamingSourceOptions.Builder connectionName(String connectionName)
The name of the connection.
connectionName - The name of the connection.

KafkaStreamingSourceOptions.Builder topicName(String topicName)
The topic name as specified in Apache Kafka. You must specify at least one of "topicName",
"assign" or "subscribePattern".
topicName - The topic name as specified in Apache Kafka. You must specify at least one of "topicName",
"assign" or "subscribePattern".

KafkaStreamingSourceOptions.Builder assign(String assign)
The specific TopicPartitions to consume. You must specify at least one of
"topicName", "assign" or "subscribePattern".
assign - The specific TopicPartitions to consume. You must specify at least one of
"topicName", "assign" or "subscribePattern".

KafkaStreamingSourceOptions.Builder subscribePattern(String subscribePattern)
A Java regex string that identifies the topic list to subscribe to. You must specify at least one of
"topicName", "assign" or "subscribePattern".
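To make the topic-selection options more concrete, the sketch below uses only java.util.regex to show which topic names a given pattern would select, and checks the "at least one of" rule stated for topicName, assign, and subscribePattern. It assumes the pattern must match the whole topic name; the class, the method names, and the sample topics are hypothetical and not part of the SDK.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TopicSelectionExample {

    // Returns the topics whose full name matches the regex.
    // Assumption: whole-name matching, as Matcher.matches() does.
    static List<String> matchingTopics(String subscribePattern, List<String> topics) {
        Pattern pattern = Pattern.compile(subscribePattern);
        return topics.stream()
                .filter(topic -> pattern.matcher(topic).matches())
                .collect(Collectors.toList());
    }

    // True when at least one of the three selectors is non-null and non-empty.
    static boolean hasTopicSelector(String topicName, String assign, String subscribePattern) {
        return isSet(topicName) || isSet(assign) || isSet(subscribePattern);
    }

    private static boolean isSet(String value) {
        return value != null && !value.isEmpty();
    }

    public static void main(String[] args) {
        List<String> topics = List.of("orders-eu", "payments", "orders-us");
        System.out.println(matchingTopics("orders-.*", topics));
        System.out.println(hasTopicSelector(null, null, "orders-.*"));
    }
}
```

Testing a pattern locally in this way can catch a regex that selects more topics than intended before the value is passed to subscribePattern(String).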
subscribePattern - A Java regex string that identifies the topic list to subscribe to. You must specify at least one of
"topicName", "assign" or "subscribePattern".

KafkaStreamingSourceOptions.Builder classification(String classification)
An optional classification.
classification - An optional classification.

KafkaStreamingSourceOptions.Builder delimiter(String delimiter)
Specifies the delimiter character.
delimiter - Specifies the delimiter character.

KafkaStreamingSourceOptions.Builder startingOffsets(String startingOffsets)
The starting position in the Kafka topic to read data from. The possible values are "earliest"
or "latest". The default value is "latest".
startingOffsets - The starting position in the Kafka topic to read data from. The possible values are
"earliest" or "latest". The default value is "latest".

KafkaStreamingSourceOptions.Builder endingOffsets(String endingOffsets)
The point at which a batch query ends. Possible values are either "latest" or a JSON string
that specifies an ending offset for each TopicPartition.
endingOffsets - The point at which a batch query ends. Possible values are either "latest" or a JSON
string that specifies an ending offset for each TopicPartition.

KafkaStreamingSourceOptions.Builder pollTimeoutMs(Long pollTimeoutMs)
The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is
512.
pollTimeoutMs - The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is
512.

KafkaStreamingSourceOptions.Builder numRetries(Integer numRetries)
The number of times to retry before failing to fetch Kafka offsets. The default value is 3.
numRetries - The number of times to retry before failing to fetch Kafka offsets. The default value is
3.

KafkaStreamingSourceOptions.Builder retryIntervalMs(Long retryIntervalMs)
The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.
retryIntervalMs - The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is
10.

KafkaStreamingSourceOptions.Builder maxOffsetsPerTrigger(Long maxOffsetsPerTrigger)
The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total
number of offsets is proportionally split across topicPartitions of different volumes. The
default value is null, which means that the consumer reads all offsets until the known latest offset.
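The proportional split can be pictured with a little arithmetic. The sketch below is an illustration only, not the algorithm Glue or Spark actually runs: it assumes each partition receives a share of the limit proportional to its pending offsets, rounded down, and that everything is read when the total backlog is within the limit. The class name, method name, and partition labels are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OffsetsSplitExample {

    // Splits the per-trigger limit across partitions in proportion to their backlog.
    // Assumption: simple floor rounding, so the shares may sum to slightly less than the limit.
    static Map<String, Long> splitProportionally(long maxOffsetsPerTrigger, Map<String, Long> backlog) {
        if (maxOffsetsPerTrigger < 0) {
            throw new IllegalArgumentException("maxOffsetsPerTrigger must not be negative");
        }
        long total = backlog.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Long> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : backlog.entrySet()) {
            long available = entry.getValue();
            long share = (total <= maxOffsetsPerTrigger)
                    ? available // backlog fits within the limit: read everything available
                    : Math.min(available, maxOffsetsPerTrigger * available / total);
            shares.put(entry.getKey(), share);
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Long> backlog = new LinkedHashMap<>();
        backlog.put("orders-0", 300L);
        backlog.put("orders-1", 100L);
        // With a limit of 100, the partition holding 3/4 of the backlog gets 3/4 of the limit.
        System.out.println(splitProportionally(100L, backlog)); // prints {orders-0=75, orders-1=25}
    }
}
```

Under these assumptions, a partition with more pending records is allowed to advance further per trigger, which keeps partitions of different volumes progressing at a comparable relative rate.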
maxOffsetsPerTrigger - The rate limit on the maximum number of offsets that are processed per trigger interval. The specified
total number of offsets is proportionally split across topicPartitions of different
volumes. The default value is null, which means that the consumer reads all offsets until the known
latest offset.

KafkaStreamingSourceOptions.Builder minPartitions(Integer minPartitions)
The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of Spark partitions is equal to the number of Kafka partitions.
minPartitions - The desired minimum number of partitions to read from Kafka. The default value is null, which means
that the number of Spark partitions is equal to the number of Kafka partitions.

KafkaStreamingSourceOptions.Builder includeHeaders(Boolean includeHeaders)
Whether to include the Kafka headers. When the option is set to "true", the data output will contain an
additional column named "glue_streaming_kafka_headers" with type
Array[Struct(key: String, value: String)]. The default value is "false". This option is
available in Glue version 3.0 or later only.
includeHeaders - Whether to include the Kafka headers. When the option is set to "true", the data output will contain
an additional column named "glue_streaming_kafka_headers" with type
Array[Struct(key: String, value: String)]. The default value is "false". This option is
available in Glue version 3.0 or later only.

KafkaStreamingSourceOptions.Builder addRecordTimestamp(String addRecordTimestamp)
When this option is set to 'true', the data output will contain an additional column named "__src_timestamp" that indicates the time when the corresponding record was received by the topic. The default value is 'false'. This option is supported in Glue version 4.0 or later.
addRecordTimestamp - When this option is set to 'true', the data output will contain an additional column named
"__src_timestamp" that indicates the time when the corresponding record was received by the topic. The
default value is 'false'. This option is supported in Glue version 4.0 or later.

KafkaStreamingSourceOptions.Builder emitConsumerLagMetrics(String emitConsumerLagMetrics)
When this option is set to 'true', for each batch, it emits to CloudWatch a metric for the duration between the time the oldest record was received by the topic and the time it arrives in Glue. The metric's name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is supported in Glue version 4.0 or later.
emitConsumerLagMetrics - When this option is set to 'true', for each batch, it emits to CloudWatch a metric for the duration
between the time the oldest record was received by the topic and the time it arrives in Glue. The metric's
name is "glue.driver.streaming.maxConsumerLagInMs". The default value is 'false'. This option is
supported in Glue version 4.0 or later.

KafkaStreamingSourceOptions.Builder startingTimestamp(Instant startingTimestamp)
The timestamp of the record in the Kafka topic to start reading data from. The possible values are a
timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ, where Z represents a UTC
timezone offset with a +/- (for example, "2023-04-04T08:00:00+08:00").
Set only one of StartingTimestamp or StartingOffsets.
startingTimestamp - The timestamp of the record in the Kafka topic to start reading data from. The possible values are a
timestamp string in UTC format of the pattern yyyy-mm-ddTHH:MM:SSZ, where Z represents a
UTC timezone offset with a +/- (for example, "2023-04-04T08:00:00+08:00").
Set only one of StartingTimestamp or StartingOffsets.
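The description talks about a timestamp string with an offset, while the Java setter takes an Instant. One way to bridge the two, shown here as a sketch using only java.time, is to parse the offset timestamp and convert it to an Instant. The class and method names are hypothetical; only OffsetDateTime and Instant are standard library types.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

final class StartingTimestampExample {

    // Parses an ISO-8601 timestamp with an explicit offset and converts it to an Instant.
    // Throws java.time.format.DateTimeParseException if the text has no offset or is malformed.
    static Instant parseStartingTimestamp(String text) {
        return OffsetDateTime.parse(text).toInstant();
    }

    public static void main(String[] args) {
        Instant start = parseStartingTimestamp("2023-04-04T08:00:00+08:00");
        // 08:00 at +08:00 is midnight UTC on the same day.
        System.out.println(start); // prints 2023-04-04T00:00:00Z
    }
}
```

The resulting Instant is the type accepted by startingTimestamp(Instant); the offset in the original string is no longer needed once the conversion has been made.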
Copyright © 2023. All rights reserved.