# Evaluator Design

## Problem Statement

During training or inference, we provide an evaluation function to measure the model performance, for example, accuracy and precision. In the operator-based framework design, the data passes through the network pipeline batch by batch. As a result, inside an operator we can only calculate the metrics for a single mini-batch. Thus, we need to provide a mechanism to accumulate the metrics over every N batches or passes the user specifies.
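As a small worked illustration (the counts below are made up), naively averaging per-mini-batch accuracies gives a different, wrong answer once the batch sizes differ; the per-batch statistics themselves have to be accumulated and merged:

```python
# Two mini-batches as (correct, total) pairs; batch sizes differ on purpose.
batches = [(45, 64), (20, 36)]

# Naive mean of per-batch accuracies: (45/64 + 20/36) / 2 ~= 0.6293
naive = sum(float(c) / t for c, t in batches) / len(batches)

# Accumulate the statistics first, then divide once: 65 / 100 = 0.65
merged = float(sum(c for c, _ in batches)) / sum(t for _, t in batches)

print("naive: %.4f, merged: %.4f" % (naive, merged))
```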

## Evaluator Design
Currently, every operation is expressed in the graph. We divide the evaluator process into three steps.

1. Initialize the metric state and add it to the block.

2. Calculate the concerned metrics for every mini-batch. A single evaluator operator is only responsible for calculating the necessary statistics for one mini-batch. For example, each run of the accuracy operator only calculates the accuracy of one mini-batch of data.

3. Merge the mini-batch statistics to form the evaluation result for multiple mini-batches. For distributed or multi-GPU training, aggregate the statistics from the different devices, as sketched below.

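The aggregation in step 3 operates on the raw states rather than on per-device metric values: the state variables are summed element-wise across devices, and the metric is computed once from the merged states. A framework-free sketch (the per-device numbers and the dict layout are assumptions for illustration):

```python
# Hypothetical accuracy states collected from three devices after one pass.
device_states = [
    {"correct": 450, "total": 640},  # GPU 0
    {"correct": 210, "total": 320},  # GPU 1
    {"correct": 330, "total": 640},  # GPU 2
]

# Sum the states across devices, then compute the metric from the merged states.
merged = {
    "correct": sum(s["correct"] for s in device_states),
    "total": sum(s["total"] for s in device_states),
}
print(float(merged["correct"]) / merged["total"])  # 990 / 1600 = 0.61875
```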
## Implementation
The design is expressed through the Python API below.
Each metric operator needs to calculate its metric statistics and return the per-batch states. The Python side is responsible for accumulating the states over each pass.

```python
class Evaluator(object):
    """
    Evaluator Base class.
    """
    def __init__(self, name, **kwargs):
       """
       Different evaluators may have different metric states. E.g., Accuracy needs two
       variables, the total and correct sample counts; Auc needs four variables,
       `true_positives`, `true_negatives`, `false_positives` and `false_negatives`.
       So every evaluator should create the state variables it needs and append them
       to main_program.

       The initialization of Evaluator is responsible for creating the metric states
       and appending them to the main_program.
       """
       pass

    def _update_ops(self, input, label, **kwargs):
       """
       Add the mini-batch evaluator calculation operators to the main_program.
       Add increment operators to accumulate the metric states.
       """

    def reset(self, executor, reset_program=None):
      """
      Reset the metric states at the beginning of each pass or after a user-specified number of batches.
      Execute the reset_program to reset the states.
      """

    def eval(self, executor, eval_program=None):
      """
      Merge the mini-batch statistics to form the evaluation result for multiple mini-batches.
      Execute the eval_program and return the result.
      """
      return eval_result
```
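To make the intended call pattern concrete, here is a minimal, self-contained sketch. The `MockAccuracy` class, its plain-Python state dictionary, and the synthetic training loop are illustrative assumptions only; in the real design the states are variables appended to `main_program`, `_update_ops` adds operators that run with every mini-batch, and `reset`/`eval` execute the corresponding programs through an executor.

```python
import numpy as np


class MockAccuracy(object):
    """Hypothetical stand-in for an Accuracy evaluator; it keeps its
    states in a plain dict instead of variables in a program."""

    def __init__(self):
        # Step 1: initialize the metric states.
        self.states = {"correct": 0, "total": 0}

    def update(self, predictions, labels):
        # Step 2: accumulate statistics for one mini-batch only.
        self.states["correct"] += int(np.sum(predictions == labels))
        self.states["total"] += len(labels)

    def reset(self):
        # Clear the states at the beginning of a pass.
        self.states = {"correct": 0, "total": 0}

    def eval(self):
        # Step 3: merge the accumulated statistics into the final metric.
        return float(self.states["correct"]) / self.states["total"]


accuracy = MockAccuracy()
for pass_id in range(2):
    accuracy.reset()                          # once per pass
    for batch_id in range(5):                 # synthetic mini-batches
        labels = np.random.randint(0, 10, size=32)
        predictions = np.random.randint(0, 10, size=32)
        accuracy.update(predictions, labels)
    print("pass %d accuracy: %.4f" % (pass_id, accuracy.eval()))
```

In the actual Evaluator there would be no Python-level `update` call: `_update_ops` appends the equivalent operators to the `main_program` once, and they then run as part of every training iteration.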