
Reliability vs high throughput #2

Open
fnobilia opened this issue Sep 29, 2016 · 4 comments


@fnobilia (Member) commented Sep 29, 2016

Kafka messages can be consumed according to two different strategies:

  • at most once: messages can be lost due to consumer faults
  • at least once: messages can be processed more than once due to consumer faults

This design choice drives the consumer implementation.

We may implement both approaches and select the more suitable one after testing (see the sketch below).

EDIT @blootsvoets: reversed at most and at least
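
For reference, here is a minimal sketch of how the two strategies map onto commit ordering with the plain kafka-clients consumer API; the broker address, group id and topic name are placeholders, not actual project settings:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DeliverySemantics {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "radar-consumers");           // placeholder
        props.put("enable.auto.commit", "false");           // commit offsets manually
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("sensor-data")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));

                // At most once: commit BEFORE processing. A crash during
                // processing means the records are never redelivered (loss).
                consumer.commitSync();

                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                }

                // At least once: move commitSync() here instead. A crash
                // between processing and commit means redelivery (duplicates).
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("key=%s value=%s%n", record.key(), record.value());
    }
}
```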

@blootsvoets (Member) commented:

Since we add timestamps to all messages, I'd opt for at-least-once processing to start: we can always establish uniqueness by the timestamp, whereas it may be harder to retrieve lost data.
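
As an illustration of that idea, a minimal per-key deduplication sketch (class and field names are made up; whether the timestamp comes from the Kafka record or from the message payload is left open):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// With at-least-once delivery, skip any record whose timestamp is not newer
// than the last one seen for its key, filtering out redelivered duplicates.
public class TimestampDeduplicator {
    private final Map<String, Long> lastSeen = new HashMap<>();

    public boolean isNew(ConsumerRecord<String, ?> record) {
        Long previous = lastSeen.get(record.key());
        if (previous != null && record.timestamp() <= previous) {
            return false; // duplicate (or older) record: already processed
        }
        lastSeen.put(record.key(), record.timestamp());
        return true;
    }
}
```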

@fnobilia (Member, Author) commented Sep 29, 2016

Yes, but this approach opens another issue.

We would use consumer groups to maximise throughput. Groups imply that Kafka periodically and automatically reassigns partitions across the consumers inside a single group to balance the workload; each reassignment produces a new generation of the group. After a rebalance, consumer X may manage partitions previously handled by consumer Y, so the last-seen timestamp per key cannot be stored locally (i.e. inside each consumer): it has to be kept in a data structure shared among all group members. A NonBlockingHashMap could solve this issue, as in the sketch below.
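
A sketch of that shared map, using the JCTools port of Cliff Click's NonBlockingHashMap; this assumes all group members run as threads inside one JVM, since a group spread across processes would need an external store instead:

```java
import java.util.concurrent.ConcurrentMap;
import org.jctools.maps.NonBlockingHashMap;

// A single map instance shared by every consumer thread in the group, so a
// rebalance that moves a partition from consumer Y to consumer X does not
// lose the last-seen timestamps of that partition's keys.
public class SharedLastSeen {
    private final ConcurrentMap<String, Long> lastSeen = new NonBlockingHashMap<>();

    /** Returns true if the timestamp is newer than any recorded for the key. */
    public boolean updateIfNewer(String key, long timestamp) {
        while (true) {
            Long previous = lastSeen.putIfAbsent(key, timestamp);
            if (previous == null) {
                return true;  // first record seen for this key
            }
            if (timestamp <= previous) {
                return false; // duplicate or older record
            }
            if (lastSeen.replace(key, previous, timestamp)) {
                return true;  // atomically advanced the last-seen timestamp
            }
            // another thread updated the entry concurrently; retry
        }
    }
}
```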

@afolarin (Member) commented Sep 29, 2016

@fnobilia Presumably this is going to depend on the requirements of the consumer / use case? I think at-least-once is generally recommended, though.

As far as I'm aware, exactly-once consumption can be done, but consumers need to checkpoint locally and transactionally; in practice this overhead also places constraints on throughput. See https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIgetexactly-oncemessagingfromKafka?
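
A sketch of that local transactional checkpoint, assuming an embedded SQL store (H2-style MERGE syntax) with hypothetical results and offsets tables: the processed result and the consumed offset are committed in one transaction, and on restart the consumer seeks to the checkpointed offset rather than trusting Kafka's committed offsets:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Exactly-once on the consumer side: result and offset become durable
// together, so a record is never applied twice and never lost.
public class TransactionalCheckpoint {
    private final Connection db; // local transactional store, e.g. embedded H2

    public TransactionalCheckpoint(Connection db) throws Exception {
        this.db = db;
        db.setAutoCommit(false);
    }

    public void processAndCheckpoint(ConsumerRecord<String, String> record) throws Exception {
        try (PreparedStatement out = db.prepareStatement(
                     "INSERT INTO results(k, v) VALUES (?, ?)"); // hypothetical table
             PreparedStatement ckpt = db.prepareStatement(
                     "MERGE INTO offsets(topic, part, next_offset) KEY(topic, part) VALUES (?, ?, ?)")) {
            out.setString(1, record.key());
            out.setString(2, record.value());
            out.executeUpdate();
            ckpt.setString(1, record.topic());
            ckpt.setInt(2, record.partition());
            ckpt.setLong(3, record.offset() + 1); // next offset to consume
            ckpt.executeUpdate();
            db.commit();   // result and checkpoint are made durable atomically
        } catch (Exception e) {
            db.rollback(); // neither the result nor the offset is saved
            throw e;
        }
    }

    /** On restart, resume each assigned partition from its checkpoint. */
    public void restore(KafkaConsumer<?, ?> consumer, TopicPartition tp) throws Exception {
        try (PreparedStatement q = db.prepareStatement(
                "SELECT next_offset FROM offsets WHERE topic = ? AND part = ?")) {
            q.setString(1, tp.topic());
            q.setInt(2, tp.partition());
            try (ResultSet rs = q.executeQuery()) {
                if (rs.next()) {
                    consumer.seek(tp, rs.getLong(1));
                }
            }
        }
    }
}
```

The price is a synchronous local transaction per record (or batch), which is exactly the throughput constraint mentioned above.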

Also take a look at Flink checkpointing: http://data-artisans.com/kafka-flink-a-practical-how-to/

@fnobilia (Member, Author) commented:

Unfortunately, exactly-once semantics are still under active investigation (1 and 2).

Apache Flink may be useful in the near future to provide fault tolerance and scalability within the analysis layer.
