Committed by Dong Zhihong
## Evaluator Design

### The Problem

During training or serving, we provide evaluation functions to measure model performance, e.g., accuracy and precision. In the operator-based framework design, data flows through the network pipeline batch by batch, so an operator can only compute the metrics of a single mini-batch. We need a mechanism that computes the metrics over every N passes/batches, as the user requires.
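
The small example below illustrates why per-mini-batch metrics cannot simply be averaged afterwards. The batch sizes and counts are made-up numbers chosen for illustration; they do not come from any real run.

```python
# Two hypothetical mini-batches: (correct predictions, batch size).
batches = [(9, 10), (1, 4)]

# Averaging the per-batch accuracies weights both batches equally,
# even though the first one contains more samples.
mean_of_batch_accuracies = sum(c / n for c, n in batches) / len(batches)

# Accumulating the raw counts first gives the accuracy over all samples.
total_correct = sum(c for c, _ in batches)
total_samples = sum(n for _, n in batches)
overall_accuracy = total_correct / total_samples

print(mean_of_batch_accuracies, overall_accuracy)
```

The two values differ whenever batch sizes are unequal, which is why the design keeps accumulated metric states rather than per-batch results.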

### Evaluator Design
Currently, every operation is expressed in the graph. We divide the evaluation process into three steps.

1. Initialize the metric state and add it into the block.

2. Calculate the statistics of the metric state in every mini-batch. A single operator is only responsible for calculating the necessary statistics for one mini-batch. For example, each run of the accuracy operator covers only one mini-batch of data.


3. Merge the mini-batch statistics to form the evaluation result for multiple mini-batches. In distributed or multi-GPU training, aggregate the values from the different devices.
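
The three steps can be sketched in plain Python, using accuracy as the metric. This is only an eager-mode illustration under stated assumptions: the function names (`init_state`, `batch_statistics`, `accumulate`, `merge`) and the two-device data are hypothetical, and in the actual design each step would be expressed as variables and operators in the graph.

```python
def init_state():
    # Step 1: create the metric state (counters for accuracy).
    return {"correct": 0, "total": 0}

def batch_statistics(predictions, labels):
    # Step 2: compute the statistics of one mini-batch only.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return {"correct": correct, "total": len(labels)}

def accumulate(state, stats):
    # Add the mini-batch statistics to the running metric state.
    for key in state:
        state[key] += stats[key]
    return state

def merge(states):
    # Step 3: aggregate the states (e.g. one per device) into one result.
    correct = sum(s["correct"] for s in states)
    total = sum(s["total"] for s in states)
    return correct / total if total else 0.0

# Two hypothetical devices, each processing its own mini-batches
# given as (predictions, labels) pairs.
device_batches = [
    [([1, 0, 1], [1, 1, 1]), ([0, 0], [0, 0])],  # device 0
    [([1, 1, 0, 0], [1, 0, 0, 0])],              # device 1
]

device_states = []
for batches in device_batches:
    state = init_state()
    for predictions, labels in batches:
        accumulate(state, batch_statistics(predictions, labels))
    device_states.append(state)

accuracy = merge(device_states)
print(accuracy)
```

Keeping raw counters as the state, rather than a ratio, is what makes the final merge across batches and devices a simple sum.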

### Implementation
This design is exposed through the Python API. There would be an abstract Python interface, with one subclass for each evaluation method.

```python
class Evaluator(object):
    """
    Evaluator base class.
    """

    def __init__(self):
        """
        Different evaluators may have different metric states. E.g., Accuracy
        needs two variables: the total and the correct sample counts. AUC needs
        four variables: `true_positives`, `true_negatives`, `false_positives`
        and `false_negatives`. So every evaluator should create the variables
        it needs and append the related mini-batch operators to main_program.

        The initialization of an Evaluator is responsible for:
        - creating the metric states and appending them to main_program
        - adding the mini-batch evaluator calculation operators to main_program
        - adding increment operators to accumulate the metric states
        """
        pass

    def clear(self):
        """
        Clear the metric states at the beginning of each pass or
        user-specified batch.

        Subclasses return the program that does this (`init_program`).
        """
        raise NotImplementedError

    def evaluate(self):
        """
        Merge the mini-batch statistics to form the evaluation result for
        multiple mini-batches.

        Subclasses return the program that does this (`eval_program`).
        """
        raise NotImplementedError
```
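
To make the interface more concrete, here is a minimal sketch of how one evaluation method might look, using precision as the metric. It is an eager, pure-Python analogue rather than the graph-based implementation: the class name `PrecisionEvaluator` and the `update` method are hypothetical, and `update` stands in for the mini-batch and increment operators that the real design would append to `main_program`. For the same reason, `clear` and `evaluate` act directly on Python attributes instead of returning programs.

```python
class PrecisionEvaluator(object):
    """Hypothetical eager-mode analogue of an Evaluator subclass."""

    def __init__(self):
        # Metric states; in the graph design these would be variables
        # created in main_program.
        self.true_positives = 0
        self.false_positives = 0

    def clear(self):
        # Reset the metric states at the beginning of a pass.
        self.true_positives = 0
        self.false_positives = 0

    def update(self, predictions, labels):
        # Stand-in for the mini-batch operator plus the increment operator:
        # compute one mini-batch's statistics and accumulate them.
        for p, y in zip(predictions, labels):
            if p == 1 and y == 1:
                self.true_positives += 1
            elif p == 1 and y == 0:
                self.false_positives += 1

    def evaluate(self):
        # Merge the accumulated statistics into the final metric.
        predicted_positive = self.true_positives + self.false_positives
        if predicted_positive == 0:
            return 0.0
        return self.true_positives / predicted_positive


evaluator = PrecisionEvaluator()
for pass_id in range(2):
    evaluator.clear()                       # start each pass from zero
    evaluator.update([1, 1, 0], [1, 0, 0])  # mini-batch 1
    evaluator.update([1, 0], [1, 1])        # mini-batch 2
    precision = evaluator.evaluate()
    print(pass_id, precision)
```

Because `clear` runs at the start of each pass, both passes report the same precision instead of mixing counts across passes.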