Xiaomi Galaxy Talos Book

Talos Frequently Asked Questions


Why use Talos? What benefits does migrating from Kafka bring to the user?

  • Talos provides a complete authentication/authorization mechanism and fine-grained access control. It guarantees data security and isolation under multi-tenancy, which addresses the lack of data security in Kafka
  • The bottom layer of Talos is built on HDFS, which guarantees that data is not lost and addresses the cases in which Kafka may lose data
  • Talos provides a good failover and load-balancing mechanism, which addresses the rebalance pain point of Kafka failover
  • Talos provides a REST interface, which serves both the company's internal users and Mi Ecosystem users
  • Talos is easy to operate, so problems can be responded to quickly; this avoids the complexity of operating and maintaining Kafka

About data duplication: Talos provides "at least once" semantics. Under what circumstances will data be delivered more than once?

During a Producer timeout, the following may occur:
  • When the user writes data to Talos and the request fails with a timeout, there is a relatively small probability that the server has in fact already written the data successfully. In this case, the client's retry causes the data to be written twice;
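
The timeout case above can be reproduced with a small simulation. This is only a sketch: `FakeServer` and `send_with_retry` are hypothetical names invented for illustration and are not part of the Talos SDK. The server stores the message but the acknowledgement is lost, so the client sees a timeout and retries.

```python
class FakeServer:
    """Hypothetical stand-in for a Talos partition (not the real API)."""

    def __init__(self):
        self.log = []            # messages the server has durably stored
        self._lose_next_ack = True

    def put(self, message):
        self.log.append(message)             # the write itself succeeds
        if self._lose_next_ack:
            self._lose_next_ack = False
            raise TimeoutError("ack lost")   # but the client never sees the ack
        return True


def send_with_retry(server, message, max_attempts=3):
    """Retry on timeout; returns the number of attempts that were needed."""
    for attempt in range(1, max_attempts + 1):
        try:
            server.put(message)
            return attempt
        except TimeoutError:
            continue             # the client cannot tell whether the write landed
    raise RuntimeError("gave up after %d attempts" % max_attempts)


server = FakeServer()
attempts = send_with_retry(server, "order-42")
print(attempts, server.log)
```

The second attempt succeeds from the client's point of view, yet the server log now holds the same message twice, which is exactly the duplicate described above.
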
During Consumer failover, when a commit has not happened or has failed:
  • Suppose the user runs the High Level Consumer with two TalosConsumer instances, A and B, and A is killed (or its machine goes down) while it is consuming a partition. At that moment, the checkpoint for the data A has consumed has not yet been committed. B then takes over the partition that A was consuming and consumes again the data that A had already consumed but not committed before it went down. That data is therefore consumed more than once;

  • With the High Level Consumer, whenever Consumers go online or offline, the Consumers perform a rebalance among themselves, that is, the partitions are reallocated. Suppose Consumer A originally consumes partitions [1, 2, 3] and, when other Consumers come online, A needs to release partition 1. At this point A commits the checkpoint for the data of partition 1 that it has already consumed. If that commit fails, the data will be consumed more than once. The commit may fail because: 1) Consumer A has lost the lock on partition 1 and cannot commit, for example because of a GC pause or a network problem; 2) an error occurs while writing to HBase during the commit; this is relatively unlikely, and the server has retry logic. When the commit fails, the Consumer that takes over the partition consumes again the data that Consumer A had already consumed;
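
Both consumer-side cases come down to the same thing: the new owner of a partition resumes from the last committed checkpoint, not from the last message that was actually processed. The sketch below illustrates this with plain Python data structures. The names `partition`, `committed` and `consume` are hypothetical, and the dict stands in for whatever checkpoint store the server uses; none of this is the Talos API.

```python
from collections import Counter

partition = ["m0", "m1", "m2", "m3", "m4", "m5"]   # messages indexed by offset
committed = {"next_offset": 0}                     # shared checkpoint store


def consume(name, processed, stop_after, commit_every=2, crash=False):
    """Consume from the committed checkpoint, committing every few messages.

    With crash=True the consumer stops without a final commit, as if it
    had been killed or had lost its partition lock.
    """
    offset = committed["next_offset"]
    handled = 0
    while offset < len(partition) and handled < stop_after:
        processed.append((name, offset))           # the message is processed here
        offset += 1
        handled += 1
        if handled % commit_every == 0:
            committed["next_offset"] = offset      # periodic checkpoint commit
    if not crash:
        committed["next_offset"] = offset          # clean shutdown commits


processed = []
consume("A", processed, stop_after=3, crash=True)  # A dies after offset 2
consume("B", processed, stop_after=10)             # B resumes from the checkpoint

counts = Counter(offset for _, offset in processed)
duplicates = sorted(o for o, n in counts.items() if n > 1)
print(processed)
print("consumed twice:", duplicates)
```

In this run A commits after offsets 0 and 1, processes offset 2, and then stops before committing it, so B starts again at offset 2. Processing that is idempotent, or that deduplicates on the partition and offset, is one common way to tolerate this kind of replay.
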

About data integrity: Will Talos lose data?

In simple terms, once Talos reports that data was written successfully, we promise not to lose it. Note, however, that in some scenarios data that has not yet been written to Talos may be lost.

  • When the user uses the Simple Producer (synchronous interface), data is guaranteed not to be lost once the call returns success;

  • When the user uses the High Level TalosProducer (asynchronous interface), data is guaranteed not to be lost once the user's onSuccess callback has been invoked for it;

  • When the user uses the High Level TalosProducer (asynchronous interface) and the process hangs or is killed, any data still in the buffer that has not yet been written will be lost;
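
The difference between "acknowledged" and "still buffered" in the asynchronous case can be sketched as follows. `BufferingProducer` and its methods are hypothetical and only imitate the general shape of an asynchronous producer with a success callback; the real TalosProducer interface may differ.

```python
class BufferingProducer:
    """Hypothetical async producer: buffers locally, writes on flush()."""

    def __init__(self, on_success):
        self._buffer = []            # messages held only in process memory
        self._on_success = on_success
        self.server_log = []         # stands in for data stored by the server

    def add(self, message):
        self._buffer.append(message)         # not yet written anywhere durable

    def flush(self):
        batch, self._buffer = self._buffer, []
        self.server_log.extend(batch)        # the write reaches the server
        self._on_success(batch)              # only now is the data guaranteed

    def kill(self):
        """Simulate the process dying: whatever is buffered is gone."""
        lost, self._buffer = self._buffer, []
        return lost


acked = []
producer = BufferingProducer(on_success=acked.extend)
producer.add("a")
producer.add("b")
producer.flush()          # the callback fires for "a" and "b"
producer.add("c")         # buffered, never flushed
lost = producer.kill()
print("acknowledged:", acked, "lost:", lost)
```

Only the messages reported through the callback are covered by the guarantee. An application that cannot afford to lose "c" would need to keep its own copy until the callback confirms the write.
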