To help operators and users understand system status and usage, Talos provides a fairly complete Counter system. It is designed to expose the following information:
- System performance and load; for example, latency and QPS
- Data storage; for example, a Topic's data volume and a Partition's current offset range: [start, end]
- Data consumption; for example, a consumer group's consumption checkpoint and its backlog of unconsumed data
To this end, the Metrics provided by Talos are roughly divided into the following categories:
- MessageNumber: the current number of Messages in a topic/partition
In addition, every Operation (API) has both Latency and QPS Metrics. The corresponding MetricNames are listed below, latency first and then QPS:
| API | MetricName Set |
|---|---|
| putMessage | putMessage.Time.75thPercentile |
| -- | putMessage.Time.95thPercentile |
| -- | putMessage.Time.98thPercentile |
| -- | putMessage.Time.999thPercentile |
| getMessage | getMessage.Time.75thPercentile |
| -- | getMessage.Time.95thPercentile |
| -- | getMessage.Time.98thPercentile |
| -- | getMessage.Time.999thPercentile |
Note: all percentiles are computed from samples collected in the last 5 minutes.
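To make the 5-minute sample window concrete, the following is a minimal sketch of a windowed percentile calculation. It is not Talos's implementation: the `LatencyWindow` class, the nearest-rank method, and reading `999thPercentile` as the 99.9th percentile are all assumptions made for illustration.

```python
import math
from collections import deque

WINDOW_SECONDS = 300  # the 5-minute sample window stated in the note above


class LatencyWindow:
    """Hypothetical helper: keeps (timestamp, latency_ms) samples from the last 5 minutes."""

    def __init__(self):
        self._samples = deque()

    def record(self, now, latency_ms):
        # Store the new sample, then drop anything that fell out of the window.
        self._samples.append((now, latency_ms))
        self._evict(now)

    def _evict(self, now):
        while self._samples and self._samples[0][0] <= now - WINDOW_SECONDS:
            self._samples.popleft()

    def percentile(self, now, pct):
        """Nearest-rank percentile (0 < pct <= 100) of the samples still in the window."""
        self._evict(now)
        values = sorted(latency for _, latency in self._samples)
        if not values:
            return None  # no samples in the last 5 minutes
        rank = math.ceil(pct * len(values) / 100)
        return values[max(rank, 1) - 1]


window = LatencyWindow()
for second in range(100):
    window.record(now=second, latency_ms=second + 1)

p95 = window.percentile(now=99, pct=95)     # e.g. putMessage.Time.95thPercentile
p999 = window.percentile(now=99, pct=99.9)  # assumed meaning of 999thPercentile
```

Because old samples are evicted, a burst of slow requests stops influencing the reported percentiles once it is more than 5 minutes old.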
| API | MetricName Set |
|---|---|
| putMessage | putMessage.60sRate |
| -- | putMessage.300sRate |
| -- | putMessage.900sRate |
| getMessage | getMessage.60sRate |
| -- | getMessage.300sRate |
| -- | getMessage.900sRate |
Here, 60s/300s/900s in a QPS MetricName is the time window over which QPS is calculated: QPS for the last minute, the last 5 minutes, and the last 15 minutes, respectively.
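The naming pattern in the two tables above can be sketched as a pair of small helpers. The function and constant names here are hypothetical and are not part of any Talos SDK; only the resulting MetricName strings follow the tables.

```python
# Percentile labels and QPS windows as they appear in the MetricName tables.
LATENCY_PERCENTILES = ("75th", "95th", "98th", "999th")
QPS_WINDOWS_SECONDS = (60, 300, 900)  # last 1, 5, and 15 minutes


def latency_metric_names(api):
    """Latency MetricNames for one API, e.g. putMessage.Time.95thPercentile."""
    return [f"{api}.Time.{p}Percentile" for p in LATENCY_PERCENTILES]


def qps_metric_names(api):
    """QPS MetricNames for one API, e.g. putMessage.60sRate."""
    return [f"{api}.{window}sRate" for window in QPS_WINDOWS_SECONDS]


for api in ("putMessage", "getMessage"):
    for name in latency_metric_names(api) + qps_metric_names(api):
        print(name)
```

A helper like this is handy when preparing alert requests for several APIs at once, since each API exposes the same seven MetricNames.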
We are in the process of turning the alarm system into a web service. For now, alerts are configured on the backend. If you want to set up monitoring alerts (email/SMS) for the Metrics above, please send an email to talos-help@xiaomi.com
Please include a completed application form in the email. The following is an example (for layout reasons, the form is split into two tables):
| Cluster | TopicName | MetricName | Alert-Value |
|---|---|---|---|
| azbjsrv-talos | testTopic | putMessage.Time.95thPercentile | 100ms |
| azbjsrv-talos | testTopic | ConsumerOffsetLag | 10000 |
| Email | Phone | ConsumerGroup |
|---|---|---|
| alert@xiaomi.com | 1877777777 | None |
| alert@xiaomi.com | 1877777777 | myConsumerGroupName |
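If you need to request several alerts, it can help to generate the form rows programmatically. The sketch below is only an illustration: the `AlertRequest` class and `render_form` function are hypothetical, and the plain-text layout is an assumption, since the exact format expected by talos-help@xiaomi.com is not specified beyond the example above.

```python
from dataclasses import dataclass
from typing import Optional

# Column names taken from the example form above.
FORM_COLUMNS = (
    "Cluster", "TopicName", "MetricName", "Alert-Value",
    "Email", "Phone", "ConsumerGroup",
)


@dataclass
class AlertRequest:
    """Hypothetical container for one row of the alert application form."""
    cluster: str
    topic_name: str
    metric_name: str
    alert_value: str
    email: str
    phone: str
    consumer_group: Optional[str] = None  # written as "None" when not applicable

    def to_row(self):
        cells = [
            self.cluster, self.topic_name, self.metric_name, self.alert_value,
            self.email, self.phone, self.consumer_group or "None",
        ]
        return " | ".join(cells)


def render_form(requests):
    """Render the requests as a single pipe-separated table (assumed layout)."""
    lines = [" | ".join(FORM_COLUMNS)]
    lines.extend(request.to_row() for request in requests)
    return "\n".join(lines)


# The two example rows from the form above.
form = render_form([
    AlertRequest("azbjsrv-talos", "testTopic", "putMessage.Time.95thPercentile",
                 "100ms", "alert@xiaomi.com", "1877777777"),
    AlertRequest("azbjsrv-talos", "testTopic", "ConsumerOffsetLag",
                 "10000", "alert@xiaomi.com", "1877777777", "myConsumerGroupName"),
])
print(form)
```

In the example form, the ConsumerGroup column is filled in only for the consumption metric (ConsumerOffsetLag); the sketch mirrors this by defaulting the field to "None".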