Use a generic gRPC client

Brief introduction

TensorFlow Serving exposes a generic gRPC interface, so we also set out to implement a generic gRPC client. Because model input data differs from user to user, we define the input data in JSON and integrate the client into Xiaomi Cloud-ML.

Use Xiaomi Cloud-ML

After installing the Cloud-ML command-line tool, we can create a model service and save the request data to a local JSON file.

{
  "keys_dtype": "int32",
  "keys": [[1], [2]],
  "features_dtype": "float32",
  "features": [[1,2,3,4,5,6,7,8,9], [1,2,3,4,5,6,7,8,9]]
}

Note that every field in the JSON file corresponds to an input defined by the user's TensorFlow model, so the keys differ from model to model. To build a TensorProto, the user must also set the data type for each key: the field name is the key name plus "_dtype", and the value is a data type that TensorFlow supports, such as "int32" or "float32".
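The naming rule above can be sketched in a few lines of Python. This is only an illustration of the "key name plus _dtype" convention, not Cloud-ML's actual code; the helper name pair_inputs_with_dtypes is hypothetical. The resulting (dtype, values) pairs are what a client would then hand to a tensor-building call such as tf.make_tensor_proto.

```python
import json

# Suffix used by the request file to declare the type of each input.
DTYPE_SUFFIX = "_dtype"


def pair_inputs_with_dtypes(payload):
    """Return {input_name: (dtype_string, values)} for a parsed request.

    Hypothetical helper: it only applies the naming convention and does
    not check that the dtype string is one TensorFlow accepts.
    """
    inputs = {}
    for name, values in payload.items():
        if name.endswith(DTYPE_SUFFIX):
            continue  # dtype entries are looked up below, not treated as inputs
        dtype_key = name + DTYPE_SUFFIX
        if dtype_key not in payload:
            raise ValueError("missing %r for input %r" % (dtype_key, name))
        inputs[name] = (payload[dtype_key], values)
    return inputs


# Same content as the sample request file shown earlier.
sample = json.loads("""
{
  "keys_dtype": "int32",
  "keys": [[1], [2]],
  "features_dtype": "float32",
  "features": [[1,2,3,4,5,6,7,8,9], [1,2,3,4,5,6,7,8,9]]
}
""")

paired = pair_inputs_with_dtypes(sample)
for name in sorted(paired):
    dtype, values = paired[name]
    print(name, dtype, len(values))
```

Failing early on a missing "_dtype" entry is a design choice of this sketch: it reports the problem before any request is sent to the server.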

If the model was deployed through Cloud-ML, by default only Cloud-ML can obtain the service IP, so we run the prediction by specifying the model name and version directly.

cloudml models predict linear v1 -f ./data.json

To access a generic TensorFlow Serving service, we do not need to create a model; we can specify the server's IP and port directly.

cloudml models predict -n linear -s 127.0.0.1:9000 -f ./data.json

Implementation principles

Xiaomi Cloud-ML's generic client is written in Python and works much like an ordinary Python gRPC client. For implementation details, refer to the TensorFlow Serving Python client.

We wrap the Python gRPC client and add support for reading a JSON file to build the TensorProto. This approach is easier to use but less flexible. If you do not wish to use the Cloud-ML command line, you can run the following script directly instead.

./generic_predict_client.py --server 127.0.0.1:9000 --model linear --data ./linear.json
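The source of generic_predict_client.py is not reproduced here. As a rough sketch of how such a client could be put together, the following assumes the grpcio, tensorflow, and tensorflow-serving-api packages are installed and that the server speaks the standard TensorFlow Serving PredictionService. The function names parse_args, predict, and main are illustrative, and the 10-second timeout is an arbitrary assumption; only the three command-line flags come from the command above.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the three flags used in the command above."""
    parser = argparse.ArgumentParser(
        description="Generic TensorFlow Serving client (sketch)")
    parser.add_argument("--server", required=True, help="host:port of the server")
    parser.add_argument("--model", required=True, help="model name to query")
    parser.add_argument("--data", required=True, help="path to the JSON request file")
    return parser.parse_args(argv)


def predict(server, model, data_path, timeout=10.0):
    """Send the JSON-described inputs to the server and return the response."""
    # Imported here so that argument parsing works without these packages.
    import grpc
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_service_pb2_grpc

    with open(data_path) as f:
        payload = json.load(f)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = model
    for name, values in payload.items():
        if name.endswith("_dtype"):
            continue  # type declarations are consumed below
        dtype = payload[name + "_dtype"]  # e.g. "int32" or "float32"
        request.inputs[name].CopyFrom(tf.make_tensor_proto(values, dtype=dtype))

    channel = grpc.insecure_channel(server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    return stub.Predict(request, timeout=timeout)


def main(argv=None):
    args = parse_args(argv)
    print(predict(args.server, args.model, args.data))
```

Calling main() with no arguments reads the flags from the command line, so the sketch could be invoked the same way as the script above. It uses an insecure channel for brevity; a real deployment may need credentials.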