Committed by Helin Wang

# Design Doc: The Client Library of Parameter Server

For an overview of the trainer's role, please refer to the [distributed training design doc](README.md). This design doc discusses the parameter server's client library, which manages communication with parameter servers. The library will be implemented in [Go](https://golang.org/) and made available as a static or dynamic library with a C header file.

## C Interface

```c
#define PADDLE_ELEMENT_TYPE_INT32   0
#define PADDLE_ELEMENT_TYPE_UINT32  1
#define PADDLE_ELEMENT_TYPE_INT64   2
#define PADDLE_ELEMENT_TYPE_UINT64  3
#define PADDLE_ELEMENT_TYPE_FLOAT32 4
#define PADDLE_ELEMENT_TYPE_FLOAT64 5

typedef struct {
  char* name;
  int   element_type;
  void* content;
  int   content_len;
} paddle_parameter, paddle_gradient;

typedef struct paddle_pserver_client paddle_pserver_client;

paddle_pserver_client* paddle_new_pserver_client(void);
void paddle_pserver_client_release(paddle_pserver_client* client);

/**
 * @brief paddle_begin_init_params begins to initialize parameters on
 * parameter servers.
 *
 * paddle_begin_init_params may be called from multiple trainers, but
 * only one trainer will be selected to initialize the parameters on
 * parameter servers. Other trainers will be blocked until the
 * initialization is done, and they need to get the initialized
 * parameters from parameter servers using @paddle_get_params.
 *
 * @param config_proto serialized parameter server configuration in
 * Protocol Buffers format.
 * @return 1 if the trainer is selected to initialize parameter
 * servers, otherwise 0.
 */
int paddle_begin_init_params(paddle_pserver_client* client, const char* config_proto);

/**
 * @brief paddle_init_param initializes the parameter on parameter
 * servers.
 *
 * @param param the parameter to initialize.
 * @return 0 if successful, otherwise -1. On failure, the trainer
 * needs to restart the entire initialization process (starting from
 * @paddle_begin_init_params), or simply exit the program and wait for
 * the cluster management system to restart the trainer.
 */
int paddle_init_param(paddle_pserver_client* client, paddle_parameter param);

/**
 * @brief paddle_finish_init_params tells parameter servers that the client has
 * sent all parameters to parameter servers for initialization.
 *
 * @return 0 if successful, otherwise -1. On failure, the trainer
 * needs to restart the entire initialization process (starting from
 * @paddle_begin_init_params), or simply exit the program and wait for
 * the cluster management system to restart the trainer.
 */
int paddle_finish_init_params(paddle_pserver_client* client);

/**
 * @brief paddle_send_grads sends gradients to parameter servers for
 * updating parameters.
 *
 * @param grads the array of gradients to send.
 * @param total the total number of gradients in the gradient array.
 * @param learning_rate the learning rate for the gradients.
 * @return 0 if successful, otherwise -1.
 */
int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grads, int total, double learning_rate);

/**
 * @brief paddle_set_params sets parameters on parameter servers.
 *
 * @param params the array of parameters to set on parameter servers.
 * @param total the total number of parameters inside the parameter
 * array.
 * @return 0 if successful, otherwise -1.
 */
int paddle_set_params(paddle_pserver_client* client, const paddle_parameter* params, int total);

/**
 * @brief paddle_get_params gets parameters from parameter servers.
 *
 * @param names the array of names of the parameters to get.
 * @param dst the destination array of parameters to save to.
 * @param total the total number of parameters to get.
 * @return 0 if successful, otherwise -1.
 */
int paddle_get_params(paddle_pserver_client* client, const char** names, paddle_parameter* dst, int total);

/**
 * @brief paddle_save_model instructs parameter servers to save the
 * parameters to the given path.
 *
 * @param path the path to save the parameters to.
 * @return 0 if successful, otherwise -1.
 */
int paddle_save_model(paddle_pserver_client* client, const char* path);
```
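
The initialization functions in the interface imply a trainer-side protocol: call `paddle_begin_init_params`, upload the parameters only if this trainer was selected, and otherwise fetch the already-initialized values. The sketch below illustrates that flow under some loud assumptions. The real library is not part of this document, so the client struct and the four `paddle_*` function bodies here are hypothetical in-memory mocks that only count calls, and `init_or_fetch_params` is an illustrative helper name, not part of the proposed interface.

```c
#include <assert.h>
#include <stddef.h>

#define PADDLE_ELEMENT_TYPE_FLOAT32 4

typedef struct {
  char* name;
  int   element_type;
  void* content;
  int   content_len;
} paddle_parameter;

/* HYPOTHETICAL mock client: the real type is opaque and would talk to
 * parameter servers over the network. This one only records calls. */
typedef struct paddle_pserver_client {
  int selected;  /* mock: 1 if this trainer is chosen to initialize */
  int uploaded;  /* mock: number of parameters sent via paddle_init_param */
  int finished;  /* mock: 1 after paddle_finish_init_params */
  int fetched;   /* mock: number of parameters requested via paddle_get_params */
} paddle_pserver_client;

/* Mock: returns the preset selection flag instead of contacting servers. */
int paddle_begin_init_params(paddle_pserver_client* client, const char* config_proto) {
  (void)config_proto;
  return client->selected;
}

/* Mock: rejects obviously invalid input, otherwise counts the upload. */
int paddle_init_param(paddle_pserver_client* client, paddle_parameter param) {
  if (param.name == NULL || param.content == NULL) return -1;
  client->uploaded++;
  return 0;
}

/* Mock: marks initialization as complete. */
int paddle_finish_init_params(paddle_pserver_client* client) {
  client->finished = 1;
  return 0;
}

/* Mock: counts the request and leaves dst untouched. */
int paddle_get_params(paddle_pserver_client* client, const char** names,
                      paddle_parameter* dst, int total) {
  (void)names;
  (void)dst;
  client->fetched += total;
  return 0;
}

/* Illustrative helper (not part of the interface): runs the trainer-side
 * initialization flow. Returns 0 on success, -1 on failure. */
int init_or_fetch_params(paddle_pserver_client* client, const char* config_proto,
                         paddle_parameter* params, const char** names, int total) {
  assert(client != NULL && params != NULL && names != NULL);
  if (paddle_begin_init_params(client, config_proto) == 1) {
    /* Selected trainer: upload every parameter, then signal completion. */
    for (int i = 0; i < total; i++) {
      if (paddle_init_param(client, params[i]) != 0) return -1;
    }
    return paddle_finish_init_params(client);
  }
  /* Not selected: per the interface comments, the begin call returns only
   * after initialization is done, so fetch the initialized values. */
  return paddle_get_params(client, names, params, total);
}
```

On a `-1` return, the caller would either restart from `paddle_begin_init_params` or exit and let the cluster management system restart the trainer, as the interface comments describe.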