Stream
SDS Stream provides an ordered stream of the updates made to records in an SDS table. When you enable Stream for a table, SDS captures each record update in the table and publishes the relevant information as a message to the user-defined Talos topic.
Stream characteristics
- At-least-once delivery: for every record that has been successfully written to SDS, the corresponding update message is delivered to Talos consumers at least once.
- Row-level ordering: update messages for the same row are delivered in order.
Stream type and message format
RECORD_IMAGE: the record view after the update
- Record: the record view after the update. If the operation deletes the entire row, it contains only the entity group keys and primary keys
- rowDeleted: whether the row was deleted
- Timestamp: the update timestamp
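As a rough illustration of how a consumer might use RECORD_IMAGE messages, the sketch below keeps a local copy of a table in a dict. The `RecordImage` class, the `apply_image` function, the `key_fields` parameter, and the `userId`/`name` attributes are hypothetical stand-ins rather than SDS or Talos SDK types; only the three message fields come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class RecordImage:
    """Hypothetical in-memory model of a RECORD_IMAGE message."""
    record: Dict[str, Any]  # row view after the update (keys only if the row was deleted)
    row_deleted: bool       # mirrors rowDeleted
    timestamp: int          # mirrors Timestamp


def apply_image(view: Dict[Tuple, Dict[str, Any]], msg: RecordImage,
                key_fields: Tuple[str, ...]) -> None:
    """Apply one message to a local copy of the table, keyed by the row's key fields."""
    key = tuple(msg.record[f] for f in key_fields)
    if msg.row_deleted:
        view.pop(key, None)            # the row is gone; drop the local copy
    else:
        view[key] = dict(msg.record)   # the message carries the full post-update view


view: Dict[Tuple, Dict[str, Any]] = {}
keys = ("userId",)  # assumed primary key for this example
apply_image(view, RecordImage({"userId": "u1", "name": "a"}, False, 1), keys)
apply_image(view, RecordImage({"userId": "u1", "name": "b"}, False, 2), keys)
apply_image(view, RecordImage({"userId": "u2", "name": "c"}, False, 3), keys)
apply_image(view, RecordImage({"userId": "u2"}, True, 4), keys)
```

Because each message carries the whole row, applying the same message twice in a row leaves the local copy unchanged.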
MUTATE_LOG: the update log
- Record: if the operation deletes the entire row, the content is the same as for RECORD_IMAGE. For a partial delete, the keys in the record identify the deleted attributes, and their values can be ignored
- Type: the operation type, one of put/delete/increment
- rowDeleted: whether the row was deleted
- timestamp: the update timestamp
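The sketch below shows one way a consumer could replay MUTATE_LOG messages against a local copy of a table. It is a minimal model under stated assumptions: `MutateLog`, `apply_mutation`, `key_fields`, the lowercase type strings, and the `userId`/`name`/`visits` attributes are hypothetical and are not taken from the SDS or Talos SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class MutateLog:
    """Hypothetical in-memory model of a MUTATE_LOG message."""
    record: Dict[str, Any]
    type: str          # assumed spelling: "put", "delete", or "increment"
    row_deleted: bool  # mirrors rowDeleted
    timestamp: int


def apply_mutation(view: Dict[Tuple, Dict[str, Any]], msg: MutateLog,
                   key_fields: Tuple[str, ...]) -> None:
    key = tuple(msg.record[f] for f in key_fields)
    if msg.row_deleted:
        view.pop(key, None)  # whole-row delete: record holds only the keys
        return
    attrs = {k: v for k, v in msg.record.items() if k not in key_fields}
    if msg.type == "delete":
        row = view.get(key)
        if row is not None:
            for name in attrs:  # partial delete: names matter, values are ignored
                row.pop(name, None)
        return
    row = view.setdefault(key, {f: msg.record[f] for f in key_fields})
    if msg.type == "put":
        row.update(attrs)
    elif msg.type == "increment":
        for name, delta in attrs.items():
            row[name] = row.get(name, 0) + delta
    else:
        raise ValueError(f"unknown operation type: {msg.type}")


view: Dict[Tuple, Dict[str, Any]] = {}
keys = ("userId",)  # assumed primary key for this example
apply_mutation(view, MutateLog({"userId": "u1", "name": "a", "visits": 1}, "put", False, 1), keys)
apply_mutation(view, MutateLog({"userId": "u1", "visits": 2}, "increment", False, 2), keys)
apply_mutation(view, MutateLog({"userId": "u1", "name": None}, "delete", False, 3), keys)
apply_mutation(view, MutateLog({"userId": "u2", "name": "c"}, "put", False, 4), keys)
apply_mutation(view, MutateLog({"userId": "u2"}, "delete", True, 5), keys)
```

Unlike the RECORD_IMAGE case, replaying an increment message twice changes the result, so a consumer of this stream type has to handle redelivered messages itself.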
Usage flow
- Create a topic in Talos
- When creating the table or updating the table schema, specify the topic name to enable Stream
- Write data to the table
- Create a Talos consumer and consume the Stream messages
Notes
- The timestamp in the message is the update timestamp on the SDS server
MUTATE_LOG is the log of operations that the server executed successfully. It is not equivalent to the original operations issued by the client. There are two main differences:
- A batch operation is broken down into individual unit operations, and each unit operation produces one message
- For a conditional operation, the condition is not recorded in the message if the operation succeeds; if it fails, no message is generated
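The two differences above can be modelled with a small function that maps client requests to the messages a consumer would see. This is only a sketch: the request shape (`ops`, `condition`), the `emitted_messages` name, and the static `current_row` used to evaluate conditions are assumptions made for illustration, and the timestamp field is left out for brevity.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def emitted_messages(client_requests: List[Dict[str, Any]], current_row: Row) -> List[Dict[str, Any]]:
    """Model which MUTATE_LOG messages result from a list of client requests."""
    messages: List[Dict[str, Any]] = []
    for request in client_requests:
        # A batch carries several unit operations; a single request is a batch of one.
        for op in request["ops"]:
            condition: Optional[Callable[[Row], bool]] = op.get("condition")
            if condition is not None and not condition(current_row):
                continue  # failed conditional operation: no message at all
            # The condition itself is never copied into the message.
            messages.append({"type": op["type"], "record": op["record"]})
    return messages


row_on_server = {"userId": "u1", "visits": 5}
requests = [
    {"ops": [  # one batch with two unit operations
        {"type": "put", "record": {"userId": "u1", "name": "a"}},
        {"type": "increment", "record": {"userId": "u1", "visits": 1}},
    ]},
    {"ops": [  # conditional put whose condition holds
        {"type": "put", "record": {"userId": "u1", "level": 2},
         "condition": lambda row: row["visits"] >= 5},
    ]},
    {"ops": [  # conditional put whose condition fails
        {"type": "put", "record": {"userId": "u1", "level": 3},
         "condition": lambda row: row["visits"] >= 100},
    ]},
]
messages = emitted_messages(requests, row_on_server)
```

Here three client requests yield three messages: two from the batch, one from the successful conditional put, and none from the failed one.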
Stream collects updates per table slice. To improve performance, it is therefore recommended to pre-slice the table when creating it.
For a MUTATE_LOG Stream, at-least-once delivery means that messages for non-idempotent operations such as Increment may arrive more than once. Consumers need to de-duplicate such messages by timestamp before replaying them.
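A minimal sketch of de-duplicating by timestamp before replaying increments is shown below. It assumes that two distinct updates to the same row never share a server timestamp, so the pair (row key, timestamp) identifies a message; that assumption, the dict-based message shape, and the names `dedupe` and `replay_increments` are illustrative and not part of SDS or Talos.

```python
from typing import Any, Dict, List, Tuple

Message = Dict[str, Any]


def dedupe(messages: List[Message], key_fields: Tuple[str, ...]) -> List[Message]:
    """Drop redelivered messages, keeping the first copy and the arrival order."""
    seen = set()
    unique: List[Message] = []
    for msg in messages:
        row_key = tuple(msg["record"][f] for f in key_fields)
        identity = (row_key, msg["timestamp"])  # assumed to be unique per update
        if identity in seen:
            continue
        seen.add(identity)
        unique.append(msg)
    return unique


def replay_increments(messages: List[Message], key_fields: Tuple[str, ...]) -> Dict[Tuple, int]:
    """Sum increment deltas per (row key, attribute)."""
    totals: Dict[Tuple, int] = {}
    for msg in messages:
        row_key = tuple(msg["record"][f] for f in key_fields)
        for name, delta in msg["record"].items():
            if name in key_fields:
                continue
            totals[(row_key, name)] = totals.get((row_key, name), 0) + delta
    return totals


keys = ("userId",)  # assumed primary key for this example
received = [
    {"type": "increment", "record": {"userId": "u1", "visits": 1}, "timestamp": 100},
    {"type": "increment", "record": {"userId": "u1", "visits": 1}, "timestamp": 100},  # redelivery
    {"type": "increment", "record": {"userId": "u1", "visits": 2}, "timestamp": 101},
]
naive_totals = replay_increments(received, keys)
correct_totals = replay_increments(dedupe(received, keys), keys)
```

Replaying the raw messages counts the redelivered increment twice, while replaying the de-duplicated list does not. A long-running consumer would also need to bound the `seen` set, for example by discarding entries older than the last committed position.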
Further work
- At present, Talos cannot preserve order when the number of topic partitions is increased. Therefore, if the application requires row-level ordering, the number of partitions must be fixed once, at the time the topic is created.
- The corresponding Talos endpoints are as follows:
- AWS Beijing => https://awsbj0.talos.api.xiaomi.com
- AWS US West => https://awsusor0.talos.api.xiaomi.com
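The partition-count restriction in the first item above can be illustrated with a generic model of key-based routing. The sketch assumes that a message is routed to a partition by a stable hash of its row key modulo the partition count; this is a common scheme, but it is an assumption here and not a statement about how Talos actually assigns partitions. The `partition_for` function and the `row-N` keys are hypothetical.

```python
import zlib
from typing import List


def partition_for(row_key: str, partition_count: int) -> int:
    """Assumed routing rule: stable hash of the row key modulo the partition count."""
    return zlib.crc32(row_key.encode("utf-8")) % partition_count


row_keys: List[str] = [f"row-{i}" for i in range(10)]

# Rows whose partition changes when the topic grows from 4 to 8 partitions.
moved = [k for k in row_keys if partition_for(k, 4) != partition_for(k, 8)]

for key in row_keys:
    print(key, partition_for(key, 4), "->", partition_for(key, 8))
```

Under this model, every key in `moved` would have its earlier messages in one partition and its later messages in another. Since ordering holds only within a partition, a consumer could then see updates to the same row out of order.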