- Author(s): Xiaolong Ran
- Last updated: Dec 05, 2019
- Discussion at: pingcap#117
This proposal aims to use the key_shared subscription type of Apache Pulsar to enhance binlog data processing capabilities.
During Change Data Capture (CDC), we need to preserve the order of messages. We currently use Apache Kafka to process them, but Kafka guarantees message order only within a single partition. Scaling data processing for downstream consumers sometimes requires more partitions, and preserving message order across multiple partitions is difficult.
We will make these changes:
- Replace kafka-client-go with pulsar-client-go.
- Provide a plugin to support more message queues, including Apache Pulsar.
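
One way the plugin mechanism could look is sketched below. This is only an illustration under stated assumptions: the `Producer` interface, the `Factory` type, the `Register` and `Open` functions, and the in-memory `memProducer` are all hypothetical names, not part of tidb-binlog or pulsar-client-go. A real Pulsar or Kafka plugin would wrap the corresponding client library behind the same interface.

```go
package main

import (
	"fmt"
	"sync"
)

// Producer is a hypothetical minimal interface that every
// message-queue plugin would have to satisfy.
type Producer interface {
	Send(topic, key string, payload []byte) error
	Close() error
}

// Factory builds a Producer for a given broker address.
type Factory func(addr string) (Producer, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register makes a plugin available under a name such as "pulsar".
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// Open looks up a registered plugin and builds a Producer from it.
func Open(name, addr string) (Producer, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown message queue plugin %q", name)
	}
	return f(addr)
}

// memProducer is an in-memory stand-in for a real broker client.
type memProducer struct {
	sent map[string][]string // topic -> "key=payload" entries
}

// Send records the message instead of publishing it to a broker.
func (p *memProducer) Send(topic, key string, payload []byte) error {
	p.sent[topic] = append(p.sent[topic], key+"="+string(payload))
	return nil
}

// Close is a no-op because nothing is held open.
func (p *memProducer) Close() error { return nil }
```

With this shape, the caller would select a queue by name, for example `Open("pulsar", addr)`, and the rest of the pipeline would depend only on the `Producer` interface.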
tidb-binlog source collects messages from other databases and publishes them to Pulsar topics.
tidb-binlog sink consumes the messages from the Pulsar topics through the key_shared subscription mode and publishes them to TiDB.
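
The ordering property that the sink relies on can be illustrated with a small simulation. The sketch below is not Pulsar's implementation and does not reproduce its broker-side hashing. It only shows the idea that messages sharing a key always go to the same consumer, so per-key order survives when consumers are added. The `Message` type and the `pickConsumer` and `Dispatch` functions are hypothetical names introduced for this example.

```go
package main

import "hash/fnv"

// Message is a hypothetical binlog event. Key could be, for
// example, a table name or a row identifier.
type Message struct {
	Key     string
	Payload string
}

// pickConsumer maps a key to one of n consumers with an FNV-1a hash,
// so the same key always selects the same consumer.
func pickConsumer(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// Dispatch distributes msgs across n consumers. Messages are appended
// in arrival order, so each consumer sees every key it owns in the
// original order.
func Dispatch(msgs []Message, n int) [][]Message {
	out := make([][]Message, n)
	for _, m := range msgs {
		i := pickConsumer(m.Key, n)
		out[i] = append(out[i], m)
	}
	return out
}
```

Calling `Dispatch(msgs, 3)` returns one queue per consumer. Order across different keys is not preserved, which is acceptable when only changes under the same key must be applied in sequence.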
There are no compatibility issues. Apache Pulsar is compatible with the Apache Kafka protocol.
- Implement pulsar-client-go, and make sure it meets the required interface in tidb-binlog.
- @pingcap/ecosystem-tools