Use multiple copies and load balancing
Brief introduction
The Xiaomi Cloud-ML model service supports multiple copies and load balancing. Users specify the number of copies when creating a model service, and the platform creates that many replica instances and balances requests across them. Users can then access the entire cluster as if it were a single-node service.
Use the multiple-copy feature
Add the -r parameter when creating the model service, and the platform automatically creates multiple replica instances and sets up load balancing for them.
cloudml models create -n linear -v v1 -u fds://cloud-ml/linear -r 3
Parameter introduction
- The -r parameter specifies the number of copies to create. Note that the more copies you request, the more quota they occupy.
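Because each copy counts against quota, it can help to check the requested number before running the command. The following is a minimal sketch, not part of the Cloud-ML tooling: the helper name build_create_cmd and its validation are hypothetical, and only the -n, -v, -u and -r flags are taken from the command shown earlier. It only prints the command (a dry run) and creates nothing.

```shell
# Hypothetical helper (not part of cloudml): build the create command
# from the flags documented above. Only -n, -v, -u and -r come from the
# documentation; the function name and the check are assumptions.
build_create_cmd() {
  name="$1"; version="$2"; uri="$3"; replicas="$4"
  # Reject empty, non-numeric, or zero values for the number of copies.
  case "$replicas" in
    ''|*[!0-9]*|0)
      echo "number of copies must be a positive integer" >&2
      return 1
      ;;
  esac
  echo "cloudml models create -n $name -v $version -u $uri -r $replicas"
}

# Dry run: print the command for a service with three copies.
build_create_cmd linear v1 fds://cloud-ml/linear 3
```

With the arguments above, the helper prints the same command as the example in this section; passing 0 or a non-numeric value makes it return a non-zero status instead.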