Use multiple replicas and load balancing

Brief introduction

The Xiaomi Cloud-ML model service supports multiple replicas and load balancing. Users specify the number of replicas when creating a model service, and the platform creates that many replica instances and balances requests across them. Users can then access the entire cluster as if it were a single-node service.
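To illustrate what balancing requests across replicas means, here is a minimal sketch. It assumes a simple round-robin policy; the policy Cloud-ML actually uses is not stated here, and the function name pick_replica is hypothetical and not part of the cloudml CLI.

```shell
# Illustration only: round-robin is an ASSUMED policy, not the documented
# behavior of Cloud-ML. pick_replica is a hypothetical helper.
# Usage: pick_replica REQUEST_INDEX REPLICA_COUNT
# Prints the index of the replica that would serve the request.
pick_replica() {
  echo $(( $1 % $2 ))
}

# Spread six requests over 3 replicas; the caller only ever sees one service.
for i in 0 1 2 3 4 5; do
  echo "request $i -> replica $(pick_replica "$i" 3)"
done
```

Under this assumption each replica receives every third request, while the client keeps calling a single service address.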

Use the multi-replica feature

Add the -r parameter when creating the model service, and the platform automatically creates multiple replica instances and sets up load balancing for them.

cloudml models create -n linear -v v1 -u fds://cloud-ml/linear -r 3

Parameter description

  • The -r parameter sets the number of replicas to create. Note that the more replicas you request, the more quota they occupy.
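The quota note can be made concrete with a rough estimate. The sketch below assumes that every replica requests the same resources, so usage grows linearly with the -r value. The per-replica figure and the function name total_quota are hypothetical; check your own quota settings for real numbers.

```shell
# Rough estimate, ASSUMING each replica consumes the same amount of quota.
# total_quota is a hypothetical helper, not part of the cloudml CLI.
# Usage: total_quota REPLICAS UNITS_PER_REPLICA
total_quota() {
  echo $(( $1 * $2 ))
}

replicas=3          # the value passed to -r
cpu_per_replica=2   # hypothetical per-replica CPU request
echo "CPU quota needed: $(total_quota "$replicas" "$cpu_per_replica")"
```

Under these assumptions, three replicas occupy three times the quota of a single-instance service.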