Using GPUs for training

Brief introduction

Xiaomi Cloud-ML supports training with GPUs. You only need to specify the number of GPUs when submitting a task. The GPU resources requested by the user are mapped into the container and isolated from other containers, and the access interfaces are identical to those of the GPU devices on the host.

Note that you must have a sufficient GPU resource quota to submit tasks that use GPUs.

Using GPUs for training

Once you have written your TensorFlow model code, you can use GPU resources by adding the GPU parameter when submitting the training job.

cloudml jobs submit -n linear -m trainer.task -u fds://cloud-ml/linear/trainer-1.0.tar.gz -c 1 -M 1G -g 1

Parameter introduction

  • -g: Specifies the number of GPUs to use. The platform will schedule the task onto a machine with sufficient GPU resources. In addition, the container image used to start GPU jobs differs from the CPU container image.
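
As a minimal sketch (not part of the official documentation, assuming TensorFlow 1.x training code), you can verify inside your trainer that the GPU requested with -g is visible in the container by enabling device placement logging:

# gpu_check.py: confirm that ops are placed on the GPU inside the container
import tensorflow as tf

# Pin a small computation to the first GPU; this fails if no GPU is visible.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([3.0, 4.0], name='b')
    c = a + b

# log_device_placement=True prints the device chosen for each op,
# so the task logs show whether the GPU was actually used.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))

If the GPU is available, the task logs will show the add op placed on device /gpu:0; otherwise session creation or op placement will fail.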