Automatic hyperparameter tuning

Brief introduction

Hyperparameter auto-tuning is a Xiaomi Cloud-ML feature that lets you define several hyperparameter combinations in a single submission. After submission, the combinations are trained concurrently, and the combination that yields the best result is returned.

Code specifications

Users can choose the metric that defines the "best result". In the TensorFlow model code, this metric must be written as a scalar summary named training/hptuning/metric, as follows.

tf.summary.scalar("training/hptuning/metric", loss)

Usage example

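When submitting, we can start from the example code in Samples and write the hyperparameter combinations in the job.json file. In the example below, goal is MINIMIZE because the metric is a loss, and params lists the four combinations to try.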

{
  "job_name": "hpat",
  "module_name": "trainer.task",
  "trainer_uri": "fds://cloud-ml/linear/trainer-1.0.tar.gz",
  "job_args": "--max_epochs 1000",
  "cpu_limit": "0.25",
  "memory_limit": "250M",
  "hyperparameters": {
    "goal": "MINIMIZE",
    "output_path": "fds://cloud-ml/linear/linear_hpat",
    "params": [
      {"optimizer": "ftrl", "learning_rate": 0.1},
      {"optimizer": "ftrl", "learning_rate": 0.5},
      {"optimizer": "sgd", "learning_rate": 0.1},
      {"optimizer": "sgd", "learning_rate": 0.5}
    ]
  }
}
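When the list of combinations grows, writing params by hand becomes error-prone. The following is a minimal sketch, not part of Cloud-ML itself, that expands a small search space into the same four combinations and prints the resulting job description. The helper name build_params and the search_space dictionary are hypothetical; the field names and values are copied from the example above.

```python
import itertools
import json


def build_params(search_space):
    """Expand {name: [values, ...]} into one dict per combination."""
    names = list(search_space)
    value_lists = (search_space[name] for name in names)
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]


# Hypothetical search space that reproduces the four combinations above.
search_space = {
    "optimizer": ["ftrl", "sgd"],
    "learning_rate": [0.1, 0.5],
}

# Field names and values are taken from the job.json example.
job = {
    "job_name": "hpat",
    "module_name": "trainer.task",
    "trainer_uri": "fds://cloud-ml/linear/trainer-1.0.tar.gz",
    "job_args": "--max_epochs 1000",
    "cpu_limit": "0.25",
    "memory_limit": "250M",
    "hyperparameters": {
        "goal": "MINIMIZE",
        "output_path": "fds://cloud-ml/linear/linear_hpat",
        "params": build_params(search_space),
    },
}

job_json = json.dumps(job, indent=2)
print(job_json)
```

Saving the printed text as job.json gives a file equivalent to the hand-written example.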

Then submit the job with the cloudml command-line tool. Because the hyperparameter combinations are defined in the JSON file, the file must be passed with the -f option.

cloudml jobs submit -f job.json

Once training is over, we can use the command line to view the results of any individual task. The system automatically selects the hyperparameter combination that yields the best result and returns it to the user.

cloudml jobs list

cloudml jobs logs hpat-hp-0

cloudml jobs hp hpat-hp-0
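To make the idea of the "best" combination concrete, here is a small sketch of how a goal could be applied to the final metric of each task. It only illustrates the concept and is not the service's implementation. The metric values are made up, the task names other than hpat-hp-0 are assumed to follow the same pattern, and MAXIMIZE is an assumed counterpart of the documented MINIMIZE goal.

```python
def select_best(trial_metrics, goal="MINIMIZE"):
    """Return (task_name, metric) for the best task under the given goal."""
    if not trial_metrics:
        raise ValueError("no trial metrics supplied")
    if goal == "MINIMIZE":
        pick = min
    elif goal == "MAXIMIZE":  # assumption: not shown in the documentation
        pick = max
    else:
        raise ValueError("unsupported goal: %r" % (goal,))
    best_name = pick(trial_metrics, key=trial_metrics.get)
    return best_name, trial_metrics[best_name]


# Made-up final values of training/hptuning/metric, for illustration only.
# Task names other than hpat-hp-0 are assumed to follow the same pattern.
trial_metrics = {
    "hpat-hp-0": 0.42,
    "hpat-hp-1": 0.57,
    "hpat-hp-2": 0.31,
    "hpat-hp-3": 0.66,
}

best_name, best_metric = select_best(trial_metrics, goal="MINIMIZE")
print(best_name, best_metric)
```

With a loss as the metric, MINIMIZE picks the task with the smallest final value.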

Parameters introduction

  • -f is an optional parameter that specifies a JSON file describing the parameters of the submitted task. Note that it cannot be combined with other parameters.