## Evaluator Design

### The Problem

During training or serving, we provide an evaluation function to measure model performance, e.g., accuracy or precision. In an operator-based framework, data flows through the network pipeline batch by batch, so an operator can only compute metrics for a single mini-batch. We need a mechanism to compute the metrics over every N batches or passes, as the user requires.
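To illustrate why per-batch metrics alone are not enough, here is a minimal sketch in plain Python. The batch counts are made up for illustration, and it assumes mini-batches may differ in size (for example, a short final batch). Under that assumption, averaging per-batch accuracies does not give the accuracy over the whole pass, which is one reason to keep accumulated state.

```python
# Hypothetical (correct, total) counts for three mini-batches of unequal size.
batches = [(9, 10), (45, 50), (1, 4)]

# Naive approach: average the accuracies reported by each mini-batch.
mean_of_batch_acc = sum(c / t for c, t in batches) / len(batches)

# Accumulating approach: keep running counts and divide once at the end.
total_correct = sum(c for c, _ in batches)
total_seen = sum(t for _, t in batches)
pass_acc = total_correct / total_seen

print("mean of per-batch accuracies:", mean_of_batch_acc)
print("accuracy over all samples:   ", pass_acc)
```

The two numbers differ because the small last batch is weighted as heavily as the large ones in the naive average.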

### Evaluator Design
Currently, every operation is expressed in the graph. We divide the evaluation process into three steps.

1. Initialize the necessary metric states and add them to the block.

2. Calculate the statistics of the metric state in every mini-batch. A single operator is only responsible for calculating the necessary statistics for one mini-batch. For example, each run of the accuracy operator computes the accuracy of one mini-batch only.


3. Merge the mini-batch statistics to form the evaluation result over multiple mini-batches. In distributed or multi-GPU training, also aggregate the values from the different devices.
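The three steps above can be sketched in plain Python, independent of any graph or operator machinery. The function names (`init_state`, `batch_statistics`, `accumulate`, `merge`) and the two-device data are hypothetical and chosen only for illustration. The real design would express each step as operators in the block.

```python
def init_state():
    # Step 1: create the metric state (running counts for accuracy).
    return {"correct": 0, "total": 0}


def batch_statistics(predictions, labels):
    # Step 2: compute the statistics of one mini-batch only.
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"correct": correct, "total": len(labels)}


def accumulate(state, stats):
    # Add one mini-batch's statistics to the running state.
    for key in state:
        state[key] += stats[key]
    return state


def merge(states):
    # Step 3: combine the states of all devices into one result.
    correct = sum(s["correct"] for s in states)
    total = sum(s["total"] for s in states)
    return correct / total if total else 0.0


# Two hypothetical devices, each processing its own (predictions, labels) batches.
device_batches = [
    [([1, 0, 1], [1, 1, 1]), ([0, 0], [0, 0])],
    [([1, 1, 0, 0], [1, 0, 0, 0])],
]

device_states = []
for batches in device_batches:
    state = init_state()
    for predictions, labels in batches:
        accumulate(state, batch_statistics(predictions, labels))
    device_states.append(state)

accuracy = merge(device_states)
print(device_states, accuracy)
```

Merging counts rather than per-device accuracies keeps the result correct when devices see different numbers of samples.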

### Implementation
This design is exposed through the Python API. There would be an abstract Python interface, with one subclass for each evaluation method.

```python
class Evaluator(object):
    """
    Evaluator base class.
    """
    def __init__(self):
        """
        Create the metric states and append them to the block.
        """
        pass

    def _clear_state(self):
        """
        Clear the metric states at the beginning of each pass.
        """
        pass

    def _append_evalutor_op(self):
        """
        Add the mini-batch calculation operators to the block.
        Add an increment operator to accumulate the metric state.
        """
        pass

    def _merge(self):
        """
        Merge the mini-batch statistics to form the evaluation result for multiple mini-batches.
        """
        pass

    def evaluate(self):
        """
        The only exported interface.
        The user calls it to compute the evaluation result.
        """
        pass

```
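As a rough illustration of how a concrete evaluation method might fill in this interface, the sketch below defines a hypothetical `Accuracy` subclass. It is a plain-Python stand-in, not the framework API: there is no block or graph, the base class is re-declared so the example runs on its own, and `_append_evalutor_op` takes `predictions` and `labels` and accumulates directly instead of appending operators. The calling code plays the role the executor would have in the real design.

```python
class Evaluator(object):
    """Plain-Python stand-in for the base class (no graph, no block)."""

    def __init__(self):
        # Create the metric states (assumption: a simple dict of counters).
        self._states = {}
        self._clear_state()

    def _clear_state(self):
        raise NotImplementedError

    def _append_evalutor_op(self, predictions, labels):
        raise NotImplementedError

    def _merge(self):
        raise NotImplementedError

    def evaluate(self):
        # The only exported interface: return the merged result.
        return self._merge()


class Accuracy(Evaluator):
    """Hypothetical accuracy evaluator built on the stand-in base class."""

    def _clear_state(self):
        # Reset the running counts at the beginning of each pass.
        self._states = {"correct": 0, "total": 0}

    def _append_evalutor_op(self, predictions, labels):
        # Stand-in for the mini-batch operator plus the increment operator:
        # compute one mini-batch's statistics and add them to the state.
        correct = sum(int(p == y) for p, y in zip(predictions, labels))
        self._states["correct"] += correct
        self._states["total"] += len(labels)

    def _merge(self):
        total = self._states["total"]
        return self._states["correct"] / total if total else 0.0


acc = Accuracy()
# The caller stands in for the executor, feeding one mini-batch at a time.
acc._append_evalutor_op([1, 0, 1], [1, 1, 1])
acc._append_evalutor_op([0, 0], [0, 0])
pass_accuracy = acc.evaluate()

# Start a new pass with fresh state.
acc._clear_state()
empty_accuracy = acc.evaluate()
print(pass_accuracy, empty_accuracy)
```

Keeping `evaluate` as the single public method matches the interface above. Everything else stays an internal hook that each subclass overrides.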