Commit d0717eac authored by Steffy-zxf, committed by wuzewu

update autodl finetuner (#196)

* update autodl

* fix reader so that the dataset can be omitted in the predicting phase

* update the directory of tmp.txt
Parent bcc2dfe0
......@@ -4,7 +4,7 @@
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
[![Version](https://img.shields.io/github/release/PaddlePaddle/PaddleHub.svg)](https://github.com/PaddlePaddle/PaddleHub/releases)
PaddleHub is a pre-trained model management and transfer learning tool built on the PaddlePaddle ecosystem; combined with pre-trained models, it makes transfer learning more convenient. With PaddleHub, you can
PaddleHub is a pre-trained model management and transfer learning tool built on the PaddlePaddle ecosystem; combined with pre-trained models, it makes transfer learning more convenient. Key features of PaddleHub:
* Conveniently obtain all pre-trained models in the PaddlePaddle ecosystem, covering mainstream models for image classification, object detection, lexical analysis, semantic modeling, sentiment analysis, language modeling, video classification, image generation, image segmentation, and more.
* For more details, see the official site: https://www.paddlepaddle.org.cn/hub
......@@ -17,9 +17,9 @@ PaddleHub is a pre-trained model management and transfer learning tool built on
* [Regression task](https://github.com/PaddlePaddle/PaddleHub/tree/release/v1.2/demo/sentence_similarity)
* [Sentence semantic similarity](https://github.com/PaddlePaddle/PaddleHub/tree/release/v1.2/demo/sentence_similarity)
* [Reading comprehension task](https://github.com/PaddlePaddle/PaddleHub/tree/release/v1.2/demo/reading-comprehension)
* PaddleHub supports hyperparameter optimization (Auto Fine-tune): given a fine-tuning script and a hyperparameter search range, Auto Fine-tune produces a good hyperparameter combination for the task at hand.
* [Tutorial for PaddleHub's autofinetune hyperparameter optimization feature](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/autofinetune.md)
* PaddleHub introduces the "**model as software**" design philosophy: pre-trained model prediction in a single command through the Python API or the command-line tool, making the PaddlePaddle model library easier to apply.
* Supports hyperparameter optimization (AutoDL Finetuner), which tunes hyperparameters automatically and reports a well-performing combination.
* [Usage example of PaddleHub's AutoDL Finetuner hyperparameter optimization feature](https://github.com/PaddlePaddle/PaddleHub/tree/release/v1.2/demo/autofinetune)
* Introduces the "**model as software**" design philosophy: one-command prediction through the Python API or the command line makes the PaddlePaddle model library easier to apply.
* [Introduction to the PaddleHub command-line tool](https://github.com/PaddlePaddle/PaddleHub/wiki/PaddleHub%E5%91%BD%E4%BB%A4%E8%A1%8C%E5%B7%A5%E5%85%B7)
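For example, a one-line prediction with the `hub run` command looks like this (a typical invocation; it assumes the lac module, which PaddleHub fetches on first use):
```shell
$ hub run lac --input_text "今天是个好日子"
```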
......@@ -105,9 +105,9 @@ For details on how PaddleHub performs transfer learning, see the [wiki tutorial](https://github.com/
For details on how to customize a transfer task with PaddleHub, see the [wiki tutorial](https://github.com/PaddlePaddle/PaddleHub/wiki/PaddleHub:-%E8%87%AA%E5%AE%9A%E4%B9%89Task)
For details on how to use PaddleHub's hyperparameter optimization feature, see the [autofinetune tutorial](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/autofinetune.md)
For details on how PaddleHub optimizes hyperparameters automatically, see the [AutoDL Finetuner tutorial](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/autofinetune.md)
For details on fine-tuning PaddleHub pre-trained models with the ULMFiT strategy, see [PaddleHub Transfer Learning and the ULMFiT Fine-tuning Strategy](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/strategy_exp.md)
For details on how PaddleHub fine-tunes pre-trained models with the ULMFiT strategy, see [PaddleHub Transfer Learning and the ULMFiT Fine-tuning Strategy](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/strategy_exp.md)
## FAQ
......@@ -115,8 +115,6 @@ For details on how to customize a transfer task with PaddleHub, see the [wiki tu
**A:** This happens because the ernie/bert module was created under a PaddlePaddle version different from the one in the current environment. Upgrade PaddlePaddle and PaddleHub to the latest versions, and uninstall ernie:
```shell
# For a CPU-only environment, run: pip install --upgrade paddlepaddle
$ pip install --upgrade paddlepaddle-gpu
$ pip install --upgrade paddlehub
$ hub uninstall ernie
```
......@@ -149,7 +147,7 @@ print(res)
## User Groups
* PaddlePaddle discussion group: 432676488 (QQ group)
* PaddlePaddle discussion group: 796771754 (QQ group)
* ERNIE discussion group: 760439550 (QQ group)
......
# `v1.2.0`
# `v1.2.1`
* Added **hyperparameter optimization (Auto Fine-tune)**: given a hyperparameter search space, PaddleHub automatically finds a good hyperparameter combination
* Supports two optimization strategies: HAZero and PSHE2
* Supports two evaluation methods: FullTrail and ModelBased
* Supports two hyperparameter optimization algorithms: HAZero and PSHE2
* Supports two evaluation methods: FullTrail and PopulationBased
* Added the ULMFiT **fine-tuning strategy**, including the following three settings
* Slanted triangular learning rates: the learning rate first increases linearly, then decays slowly
* Discriminative fine-tuning: the computation graph is split into n segments, each with its own learning rate
......
# PaddleHub Hyperparameter Optimization: Image Classification
**Make sure PaddleHub 1.2.1 or later is installed. The PaddleHub AutoDL Finetuner also requires at least one available GPU.**
This example shows how to use the PaddleHub AutoDL Finetuner to obtain a well-performing hyperparameter combination.
Using the PaddleHub AutoDL Finetuner requires two files in a specified format: a yaml file, hparam.yaml, describing the hyperparameters to optimize, and a Python script, train.py, that performs the fine-tuning.
Taking fine-tuning an image classification task as an example:
## hparam.yaml
hparam.yaml specifies the name, type (int or float), and search range of each hyperparameter to search.
From this information PaddleHub builds a hyperparameter space and searches within it: each sampled combination is passed to train.py for evaluation, and the search direction is adjusted automatically from the results until the specified number of search rounds is reached.
In this example the hyperparameters to optimize are learning_rate and batch_size.
## img_cls.py
Fine-tunes mobilenet as the pre-trained model on the Flowers dataset.
## How to start hyperparameter optimization
After installing PaddlePaddle and PaddleHub, run `sh run_autofinetune.sh` to start hyperparameter optimization.
`NOTE`: For details on PaddleHub hyperparameter optimization, see the [tutorial](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.2/tutorial/autofinetune.md)
param_list:
- name : learning_rate
init_value : 0.001
type : float
lower_than : 0.05
greater_than : 0.00005
- name : batch_size
init_value : 12
type : int
lower_than : 20
greater_than : 10
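For illustration, a search space like the hparam.yaml above can be loaded into a plain dictionary with nothing more than pyyaml (a minimal sketch; everything except hparam.yaml itself is hypothetical):
```python
# sketch: turn hparam.yaml into a {name: bounds} search-space dict
import yaml

with open("hparam.yaml") as f:
    spec = yaml.safe_load(f)

space = {}
for p in spec["param_list"]:
    cast = int if p["type"] == "int" else float
    space[p["name"]] = {
        "init": cast(p["init_value"]),
        "low": cast(p["greater_than"]),   # lower bound of the search range
        "high": cast(p["lower_than"]),    # upper bound of the search range
    }
print(space)
```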
# coding:utf-8
import argparse
import os
import ast
import shutil
import paddle.fluid as fluid
import paddlehub as hub
from paddlehub.common.logger import logger
parser = argparse.ArgumentParser(__doc__)
parser.add_argument(
"--epochs", type=int, default=5, help="Number of epoches for fine-tuning.")
parser.add_argument(
"--checkpoint_dir", type=str, default=None, help="Path to save log data.")
parser.add_argument(
"--module",
type=str,
default="mobilenet",
help="Module used as feature extractor.")
# the names of the hyperparameters to be searched should match those in hparam.yaml
parser.add_argument(
"--batch_size",
type=int,
default=16,
help="Total examples' number in batch for training.")
parser.add_argument(
"--learning_rate", type=float, default=1e-4, help="learning_rate.")
# saved_params_dir and model_path are needed by auto finetune
parser.add_argument(
"--saved_params_dir",
type=str,
default="",
help="Directory for saving model")
parser.add_argument(
"--model_path", type=str, default="", help="load model path")
module_map = {
"resnet50": "resnet_v2_50_imagenet",
"resnet101": "resnet_v2_101_imagenet",
"resnet152": "resnet_v2_152_imagenet",
"mobilenet": "mobilenet_v2_imagenet",
"nasnet": "nasnet_imagenet",
"pnasnet": "pnasnet_imagenet"
}
def is_path_valid(path):
if path == "":
return False
path = os.path.abspath(path)
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.mkdir(dirname)
return True
def finetune(args):
# Load Paddlehub pretrained model, default as mobilenet
module = hub.Module(name=args.module)
input_dict, output_dict, program = module.context(trainable=True)
# Download dataset and use ImageClassificationReader to read dataset
dataset = hub.dataset.Flowers()
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
images_mean=module.get_pretrained_images_mean(),
images_std=module.get_pretrained_images_std(),
dataset=dataset)
# Take the feature map output of the pre-trained network
feature_map = output_dict["feature_map"]
img = input_dict["image"]
feed_list = [img.name]
# Select finetune strategy, setup config and finetune
strategy = hub.DefaultFinetuneStrategy(learning_rate=args.learning_rate)
config = hub.RunConfig(
use_cuda=True,
num_epoch=args.epochs,
batch_size=args.batch_size,
checkpoint_dir=args.checkpoint_dir,
strategy=strategy)
# Construct transfer learning network
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
feature=feature_map,
num_classes=dataset.num_labels,
config=config)
# Load model from the defined model path or not
if args.model_path != "":
with task.phase_guard(phase="train"):
task.init_if_necessary()
task.load_parameters(args.model_path)
logger.info("PaddleHub has loaded model from %s" % args.model_path)
# Finetune by PaddleHub's API
task.finetune()
# Evaluate by PaddleHub's API
run_states = task.eval()
# Get acc score on dev
eval_avg_score, eval_avg_loss, eval_run_speed = task._calculate_metrics(
run_states)
# Move ckpt/best_model to the defined saved parameters directory
best_model_dir = os.path.join(config.checkpoint_dir, "best_model")
if is_path_valid(args.saved_params_dir) and os.path.exists(best_model_dir):
shutil.copytree(best_model_dir, args.saved_params_dir)
shutil.rmtree(config.checkpoint_dir)
# acc on dev will be used by auto finetune
hub.report_final_result(eval_avg_score["acc"])
if __name__ == "__main__":
args = parser.parse_args()
if args.module not in module_map:
hub.logger.error("module should be in %s" % module_map.keys())
exit(1)
args.module = module_map[args.module]
finetune(args)
OUTPUT=result
hub autofinetune img_cls.py \
--param_file=hparam.yaml \
--gpu=0 \
--popsize=15 \
--round=10 \
--output_dir=${OUTPUT} \
--evaluator=fulltrail \
--tuning_strategy=pshe2
......@@ -119,7 +119,8 @@ seq_label_task = hub.SequenceLabelTask(
feed_list=feed_list,
max_seq_len=args.max_seq_len,
num_classes=dataset.num_labels,
config=config)
config=config,
add_crf=False)
seq_label_task.finetune_and_eval()
```
......@@ -128,6 +129,7 @@ seq_label_task.finetune_and_eval()
1. `outputs["sequence_output"]` returns the output corresponding to each input token of the ERNIE/BERT model and can be used as token-level feature representations.
2. The inputs in `feed_list` specify the order of the ERNIE/BERT input tensors, consistent with the results returned by SequenceLabelReader.
3. Given the input features and the number of target classes, `hub.SequenceLabelTask` generates a transfer task `SequenceLabelTask` suited to sequence labeling.
4. The add_crf argument of `hub.SequenceLabelTask` selects whether a CRF is used as the decoder (see the sketch below). With add_crf=True, an fc+crf layer is appended to the pre-trained model's computation graph; otherwise only an fc layer is appended.
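For reference, constructing the task with the CRF decoder enabled looks like this (a minimal sketch that reuses the names from the demo above):
```python
seq_label_task = hub.SequenceLabelTask(
    data_reader=reader,
    feature=sequence_output,
    feed_list=feed_list,
    max_seq_len=args.max_seq_len,
    num_classes=dataset.num_labels,
    config=config,
    add_crf=True)  # True: append fc + CRF decoding; False: append fc only
```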
## Visualization
......
......@@ -79,13 +79,15 @@ if __name__ == '__main__':
strategy=hub.finetune.strategy.DefaultFinetuneStrategy())
# Define a sequence labeling finetune task by PaddleHub's API
# if add_crf is True, the network uses a CRF as the decoder
seq_label_task = hub.SequenceLabelTask(
data_reader=reader,
feature=sequence_output,
feed_list=feed_list,
max_seq_len=args.max_seq_len,
num_classes=dataset.num_labels,
config=config)
config=config,
add_crf=True)
# test data
data = [
......
......@@ -78,13 +78,15 @@ if __name__ == '__main__':
strategy=strategy)
# Define a sequence labeling finetune task by PaddleHub's API
# if add_crf is True, the network uses a CRF as the decoder
seq_label_task = hub.SequenceLabelTask(
data_reader=reader,
feature=sequence_output,
feed_list=feed_list,
max_seq_len=args.max_seq_len,
num_classes=dataset.num_labels,
config=config)
config=config,
add_crf=True)
# Fine-tune and evaluate the model with PaddleHub's API;
# training, evaluation, testing, and model saving finish automatically
......
......@@ -9,6 +9,7 @@ CKPT_DIR="./ckpt_${DATASET}"
# ChnSentiCorp: batch_size=24, weight_decay=0.01, num_epoch=3, max_seq_len=128, lr=5e-5
# NLPCC_DBQA: batch_size=8, weight_decay=0.01, num_epoch=3, max_seq_len=512, lr=2e-5
# LCQMC: batch_size=32, weight_decay=0, num_epoch=3, max_seq_len=128, lr=2e-5
# TNews: batch_size=32, weight_decay=0, num_epoch=3, max_seq_len=128, lr=5e-5
# QQP: batch_size=32, weight_decay=0, num_epoch=3, max_seq_len=128, lr=5e-5
# QNLI: batch_size=32, weight_decay=0, num_epoch=3, max_seq_len=128, lr=5e-5
# SST-2: batch_size=32, weight_decay=0, num_epoch=3, max_seq_len=128, lr=5e-5
......
......@@ -45,6 +45,10 @@ if __name__ == '__main__':
dataset = hub.dataset.ChnSentiCorp()
module = hub.Module(name="ernie")
metrics_choices = ["acc"]
elif args.dataset.lower() == "tnews":
dataset = hub.dataset.TNews()
module = hub.Module(name="ernie")
metrics_choices = ["acc", "f1"]
elif args.dataset.lower() == "nlpcc_dbqa":
dataset = hub.dataset.NLPCC_DBQA()
module = hub.Module(name="ernie")
......
......@@ -59,3 +59,5 @@ from .finetune.strategy import DefaultFinetuneStrategy
from .finetune.strategy import L2SPFinetuneStrategy
from .finetune.strategy import ULMFiTStrategy
from .finetune.strategy import CombinedStrategy
from .autofinetune.evaluator import report_final_result
......@@ -18,13 +18,14 @@ import copy
import json
import math
import numpy as np
import os
import six
import time
from tb_paddle import SummaryWriter
from paddlehub.common.logger import logger
from paddlehub.common.utils import mkdir
from paddlehub.autofinetune.evaluator import REWARD_SUM
from paddlehub.autofinetune.evaluator import REWARD_SUM, TMP_HOME
if six.PY3:
INF = math.inf
......@@ -63,8 +64,17 @@ class BaseTuningStrategy(object):
self._output_dir = "output_" + time_str
else:
self._output_dir = output_dir
# record the information for the whole auto finetune
self.writer = SummaryWriter(logdir=self._output_dir + '/visualization')
# record the information for each population member across all rounds
self.writer_pop_trails = []
for i in range(self.popsize):
writer_pop_trail = SummaryWriter(
logdir=self._output_dir + '/visualization/pop_{}'.format(i))
self.writer_pop_trails.append(writer_pop_trail)
@property
def thread(self):
return self._num_thread
......@@ -150,7 +160,7 @@ class BaseTuningStrategy(object):
return self.current_hparams
def feedback(self, params_list, reward_list):
return NotImplementedError
raise NotImplementedError
def get_best_hparams(self):
return self.best_hparams_all_pop
......@@ -189,6 +199,9 @@ class BaseTuningStrategy(object):
params_cudas_dirs = []
self.feedback(solutions, solution_results)
# remove the tmp.txt which records the eval results for trials
tmp_file = os.path.join(TMP_HOME, "tmp.txt")
os.remove(tmp_file)
return solutions_modeldirs
......@@ -238,10 +251,9 @@ class HAZero(BaseTuningStrategy):
for index, hparam_name in enumerate(self.hparams_name_list):
print("%s=%s" % (hparam_name, local_hparams[index]))
for i in range(self.popsize):
if reward_list[i] < self.best_reward_all_pop:
self.best_hparams_all_pop = self.current_hparams[i]
self.best_reward_all_pop = reward_list[i]
if local_min_reward <= self.best_reward_all_pop:
self.best_reward_all_pop = local_min_reward
self.best_hparams_all_pop = params_list[local_min_reward_index]
best_hparams = self.evaluator.convert_params(self.best_hparams_all_pop)
for index, name in enumerate(self.hparams_name_list):
......@@ -253,6 +265,17 @@ class HAZero(BaseTuningStrategy):
tag="hyperparameter_tuning/best_eval_value",
scalar_value=self.get_best_eval_value(),
global_step=self.round)
for pop_num in range(self.popsize):
params = self.evaluator.convert_params(params_list[pop_num])
for index, name in enumerate(self.hparams_name_list):
self.writer_pop_trails[pop_num].add_scalar(
tag="population_transformation/" + name,
scalar_value=params[index],
global_step=self.round)
self.writer_pop_trails[pop_num].add_scalar(
tag="population_transformation/eval_value",
scalar_value=(REWARD_SUM - reward_list[pop_num]),
global_step=self.round)
self.evolution_stratefy.tell(params_list, reward_list)
self.evolution_stratefy.disp()
......@@ -349,6 +372,7 @@ class PSHE2(BaseTuningStrategy):
local_min_reward = min(reward_list)
local_min_reward_index = reward_list.index(local_min_reward)
local_hparams = self.evaluator.convert_params(
params_list[local_min_reward_index])
print("The local best eval value in the %s-th round is %s." %
......@@ -358,13 +382,15 @@ class PSHE2(BaseTuningStrategy):
print("%s=%s" % (hparam_name, local_hparams[index]))
for i in range(self.popsize):
if reward_list[i] < self.best_reward_per_pop[i]:
if reward_list[i] <= self.best_reward_per_pop[i]:
self.best_hparams_per_pop[i] = copy.deepcopy(
self.current_hparams[i])
self.best_reward_per_pop[i] = reward_list[i]
if reward_list[i] < self.best_reward_all_pop:
self.best_hparams_all_pop = self.current_hparams[i]
self.best_reward_all_pop = reward_list[i]
self.best_reward_per_pop[i] = copy.deepcopy(reward_list[i])
if local_min_reward <= self.best_reward_all_pop:
self.best_reward_all_pop = local_min_reward
self.best_hparams_all_pop = copy.deepcopy(
params_list[local_min_reward_index])
best_hparams = self.evaluator.convert_params(self.best_hparams_all_pop)
for index, name in enumerate(self.hparams_name_list):
......@@ -376,6 +402,17 @@ class PSHE2(BaseTuningStrategy):
tag="hyperparameter_tuning/best_eval_value",
scalar_value=self.get_best_eval_value(),
global_step=self.round)
for pop_num in range(self.popsize):
params = self.evaluator.convert_params(params_list[pop_num])
for index, name in enumerate(self.hparams_name_list):
self.writer_pop_trails[pop_num].add_scalar(
tag="population_transformation/" + name,
scalar_value=params[index],
global_step=self.round)
self.writer_pop_trails[pop_num].add_scalar(
tag="population_transformation/eval_value",
scalar_value=(REWARD_SUM - reward_list[pop_num]),
global_step=self.round)
self.estimate_momemtum()
for i in range(self.popsize):
......
......@@ -24,10 +24,12 @@ import random
import six
import yaml
from paddlehub.common.dir import HUB_HOME
from paddlehub.common.logger import logger
from paddlehub.common.utils import is_windows
from paddlehub.common.utils import is_windows, mkdir
REWARD_SUM = 10000
REWARD_SUM = 1
TMP_HOME = os.path.join(HUB_HOME, "tmp")
if six.PY3:
INF = math.inf
......@@ -35,6 +37,24 @@ else:
INF = float("inf")
def report_final_result(result):
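# append this trial's eval result to a shared tmp.txt so that the parent AutoDL Finetuner process can read it back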
trial_id = os.environ.get("PaddleHub_AutoDL_Trial_ID")
# tmp.txt is to record the eval results for trials
mkdir(TMP_HOME)
tmp_file = os.path.join(TMP_HOME, "tmp.txt")
with open(tmp_file, 'a') as file:
file.write(trial_id + "\t" + str(float(result)) + "\n")
def unique_name():
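# generate a random 4-character id used to tag this trial's result in tmp.txt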
seed = "1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()_+=-"
x = []
for idx in range(4):
x.append(random.choice(seed))
rand_str = "".join(x)
return rand_str
class BaseEvaluator(object):
def __init__(self, params_file, finetunee_script, options_str=""):
with io.open(params_file, 'r', encoding='utf8') as f:
......@@ -132,16 +152,21 @@ class FullTrailEvaluator(BaseEvaluator):
else:
run_cmd = "export FLAGS_eager_delete_tensor_gb=0.0; export CUDA_VISIBLE_DEVICES=%s; python -u %s --saved_params_dir=%s %s %s >%s 2>&1" % \
(num_cuda, self.finetunee_script, saved_params_dir, param_str, self.options_str, log_file)
try:
# set temp environment variable to record the eval results for trials
rand_str = unique_name()
os.environ['PaddleHub_AutoDL_Trial_ID'] = rand_str
os.system(run_cmd)
with open(log_file, "r") as f:
lines = f.readlines()
eval_result = []
for line in lines:
line = line.strip()
if line.startswith("AutoFinetuneEval"):
data = line.split("\t")
eval_result = float(data[-1])
tmp_file = os.path.join(TMP_HOME, 'tmp.txt')
with open(tmp_file, 'r') as file:
for line in file:
data = line.strip().split("\t")
if rand_str == data[0]:
eval_result = float(data[1])
if eval_result == []:
print(
"WARNING: Program which was ran with hyperparameters as %s was crashed!"
......@@ -152,14 +177,15 @@ class FullTrailEvaluator(BaseEvaluator):
"WARNING: Program which was ran with hyperparameters as %s was crashed!"
% param_str.replace("--", ""))
eval_result = 0.0
reward = self.get_reward(eval_result)
self.model_rewards[saved_params_dir] = reward
return reward
class ModelBasedEvaluator(BaseEvaluator):
class PopulationBasedEvaluator(BaseEvaluator):
def __init__(self, params_file, finetunee_script, options_str=""):
super(ModelBasedEvaluator, self).__init__(
super(PopulationBasedEvaluator, self).__init__(
params_file, finetunee_script, options_str=options_str)
self.half_best_model_path = []
self.run_count = 0
......@@ -196,16 +222,21 @@ class ModelBasedEvaluator(BaseEvaluator):
(num_cuda, self.finetunee_script, saved_params_dir, param_str, self.options_str, log_file)
self.run_count += 1
try:
# set temp environment variable to record the eval results for trials
rand_str = unique_name()
os.environ['PaddleHub_AutoDL_Trial_ID'] = rand_str
os.system(run_cmd)
with open(log_file, "r") as f:
lines = f.readlines()
eval_result = []
for line in lines:
line = line.strip()
if line.startswith("AutoFinetuneEval"):
data = line.split("\t")
eval_result = float(data[-1])
tmp_file = os.path.join(TMP_HOME, 'tmp.txt')
with open(tmp_file, 'r') as file:
for line in file:
data = line.strip().split("\t")
if rand_str == data[0]:
eval_result = float(data[1])
if eval_result == []:
print(
"WARNING: Program which was ran with hyperparameters as %s was crashed!"
......@@ -216,6 +247,7 @@ class ModelBasedEvaluator(BaseEvaluator):
"WARNING: Program which was ran with hyperparameters as %s was crashed!"
% param_str.replace("--", ""))
eval_result = 0.0
reward = self.get_reward(eval_result)
self.model_rewards[saved_params_dir] = reward
return reward
......
......@@ -25,6 +25,7 @@ import sys
import ast
import six
import shutil
import pandas
import numpy as np
......@@ -33,7 +34,7 @@ from paddlehub.common.arg_helper import add_argument, print_arguments
from paddlehub.autofinetune.autoft import PSHE2
from paddlehub.autofinetune.autoft import HAZero
from paddlehub.autofinetune.evaluator import FullTrailEvaluator
from paddlehub.autofinetune.evaluator import ModelBasedEvaluator
from paddlehub.autofinetune.evaluator import PopulationBasedEvaluator
from paddlehub.common.logger import logger
import paddlehub as hub
......@@ -46,7 +47,7 @@ class AutoFineTuneCommand(BaseCommand):
super(AutoFineTuneCommand, self).__init__(name)
self.show_in_help = True
self.name = name
self.description = "Paddlehub helps to finetune a task by searching hyperparameters automatically."
self.description = "PaddleHub helps to finetune a task by searching hyperparameters automatically."
self.parser = argparse.ArgumentParser(
description=self.__class__.__doc__,
prog='%s %s <task to be fine-tuned in python script>' % (ENTRY,
......@@ -69,9 +70,9 @@ class AutoFineTuneCommand(BaseCommand):
self.arg_config_group.add_argument(
"--popsize", type=int, default=5, help="Population size")
self.arg_config_group.add_argument(
"--cuda",
type=ast.literal_eval,
default=['0'],
"--gpu",
type=str,
default="0",
required=True,
help="The list of gpu devices to be used")
self.arg_config_group.add_argument(
......@@ -82,10 +83,10 @@ class AutoFineTuneCommand(BaseCommand):
default=None,
help="Directory to model checkpoint")
self.arg_config_group.add_argument(
"--evaluate_choice",
"--evaluator",
type=str,
default="modelbased",
help="Choices: fulltrail or modelbased.")
default="populationbased",
help="Choices: fulltrail or populationbased.")
self.arg_config_group.add_argument(
"--tuning_strategy",
type=str,
......@@ -142,30 +143,33 @@ class AutoFineTuneCommand(BaseCommand):
if self.args.opts is not None:
options_str = self.convert_to_other_options(self.args.opts)
if self.args.evaluate_choice.lower() == "fulltrail":
device_ids = self.args.gpu.strip().split(",")
device_ids = [int(device_id) for device_id in device_ids]
if self.args.evaluator.lower() == "fulltrail":
evaluator = FullTrailEvaluator(
self.args.param_file,
self.fintunee_script,
options_str=options_str)
elif self.args.evaluate_choice.lower() == "modelbased":
evaluator = ModelBasedEvaluator(
elif self.args.evaluator.lower() == "populationbased":
evaluator = PopulationBasedEvaluator(
self.args.param_file,
self.fintunee_script,
options_str=options_str)
else:
raise ValueError(
"The evaluate %s is not defined!" % self.args.evaluate_choice)
"The evaluate %s is not defined!" % self.args.evaluator)
if self.args.tuning_strategy.lower() == "hazero":
autoft = HAZero(
evaluator,
cudas=self.args.cuda,
cudas=device_ids,
popsize=self.args.popsize,
output_dir=self.args.output_dir)
elif self.args.tuning_strategy.lower() == "pshe2":
autoft = PSHE2(
evaluator,
cudas=self.args.cuda,
cudas=device_ids,
popsize=self.args.popsize,
output_dir=self.args.output_dir)
else:
......@@ -191,16 +195,30 @@ class AutoFineTuneCommand(BaseCommand):
for index, hparam_name in enumerate(autoft.hparams_name_list):
print("%s=%s" % (hparam_name, best_hparams[index]))
f.write(hparam_name + "\t:\t" + str(best_hparams[index]) + "\n")
f.write("\n\n\n")
print("The final best eval score is %s." %
autoft.get_best_eval_value())
print("The final best model parameters are saved as " +
autoft._output_dir + "/best_model .")
f.write("The final best eval score is %s.\n" %
autoft.get_best_eval_value())
f.write(
"The final best model parameters are saved as ./best_model .")
best_model_dir = autoft._output_dir + "/best_model"
shutil.copytree(
solutions_modeldirs[tuple(autoft.get_best_hparams())],
best_model_dir)
f.write("\t".join(autoft.hparams_name_list) +
"\tsaved_params_dir\n\n")
"\tsaved_params_dir\n")
print(
"The related infomation about hyperparamemters searched are saved as %s/log_file.txt ."
% autoft._output_dir)
for solution, modeldir in solutions_modeldirs.items():
param = evaluator.convert_params(solution)
param = [str(p) for p in param]
f.write("\t".join(param) + "\t" + modeldir + "\n\n")
f.write("\t".join(param) + "\t" + modeldir + "\n")
return True
......
......@@ -23,6 +23,7 @@ from .toxic import Toxic
from .squad import SQUAD
from .xnli import XNLI
from .glue import GLUE
from .tnews import TNews
# CV Dataset
from .dogcat import DogCatDataset as DogCat
......
# coding:utf-8
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from collections import namedtuple
import io
import os
import csv
from paddlehub.dataset import InputExample, HubDataset
from paddlehub.common.downloader import default_downloader
from paddlehub.common.dir import DATA_HOME
from paddlehub.common.logger import logger
_DATA_URL = "https://bj.bcebos.com/paddlehub-dataset/tnews.tar.gz"
class TNews(HubDataset):
"""
TNews is a Chinese news classification dataset from the Jinri Toutiao app.
"""
def __init__(self):
self.dataset_dir = os.path.join(DATA_HOME, "tnews")
if not os.path.exists(self.dataset_dir):
ret, tips, self.dataset_dir = default_downloader.download_file_and_uncompress(
url=_DATA_URL, save_path=DATA_HOME, print_progress=True)
else:
logger.info("Dataset {} already cached.".format(self.dataset_dir))
self._load_train_examples()
self._load_test_examples()
self._load_dev_examples()
def _load_train_examples(self):
self.train_file = os.path.join(self.dataset_dir,
"toutiao_category_train.txt")
self.train_examples = self._read_file(self.train_file)
def _load_dev_examples(self):
self.dev_file = os.path.join(self.dataset_dir,
"toutiao_category_dev.txt")
self.dev_examples = self._read_file(self.dev_file)
def _load_test_examples(self):
self.test_file = os.path.join(self.dataset_dir,
"toutiao_category_test.txt")
self.test_examples = self._read_file(self.test_file)
def get_train_examples(self):
return self.train_examples
def get_dev_examples(self):
return self.dev_examples
def get_test_examples(self):
return self.test_examples
def get_labels(self):
return [
'news_game', 'news_sports', 'news_finance', 'news_entertainment',
'news_tech', 'news_house', 'news_car', 'news_culture', 'news_world',
'news_travel', 'news_agriculture', 'news_military', 'news_edu',
'news_story', 'stock'
]
@property
def num_labels(self):
"""
Return the number of labels in the dataset.
"""
return len(self.get_labels())
def _read_file(self, input_file):
"""Reads a tab separated value file."""
with io.open(input_file, "r", encoding="UTF-8") as file:
examples = []
for line in file:
data = line.strip().split("_!_")
example = InputExample(
guid=data[0], label=data[2], text_a=data[3])
examples.append(example)
return examples
if __name__ == "__main__":
ds = TNews()
for e in ds.get_train_examples()[:10]:
print("{}\t{}\t{}\t{}".format(e.guid, e.text_a, e.text_b, e.label))
......@@ -33,8 +33,8 @@ _DATA_URL = "https://bj.bcebos.com/paddlehub-dataset/toxic.tar.gz"
class Toxic(HubDataset):
"""
ChnSentiCorp (by Tan Songbo at ICT of Chinese Academy of Sciences, and for
opinion mining)
The Kaggle Toxic dataset:
https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
"""
def __init__(self):
......
......@@ -22,6 +22,7 @@ import contextlib
import time
import copy
import logging
import numpy as np
import paddle.fluid as fluid
from tb_paddle import SummaryWriter
......@@ -305,6 +306,10 @@ class BasicTask(object):
return [_places[0]]
return _places
@property
def return_numpy(self):
return True
@property
def is_train_phase(self):
return self.phase in ["train"]
......@@ -653,10 +658,18 @@ class BasicTask(object):
step_run_state.run_step = 1
num_batch_examples = len(batch)
if self.return_numpy:
fetch_result = self.exe.run(
self.main_program_to_be_run,
feed=data_feeder.feed(batch),
fetch_list=self.fetch_list)
else:
fetch_result = self.exe.run(
self.main_program_to_be_run,
feed=data_feeder.feed(batch),
fetch_list=self.fetch_list,
return_numpy=False)
fetch_result = [np.array(x) for x in fetch_result]
for index, result in enumerate(fetch_result):
step_run_state.run_results[index] = result
......@@ -694,8 +707,17 @@ class BasicTask(object):
num_batch_examples = self.config.batch_size * self.device_count
step_run_state = RunState(len(self.fetch_list))
step_run_state.run_step = 1
if self.return_numpy:
fetch_result = self.exe.run(
self.main_program_to_be_run,
fetch_list=self.fetch_list)
else:
fetch_result = self.exe.run(
self.main_program_to_be_run, fetch_list=self.fetch_list)
self.main_program_to_be_run,
fetch_list=self.fetch_list,
return_numpy=False)
fetch_result = [np.array(x) for x in fetch_result]
for index, result in enumerate(fetch_result):
step_run_state.run_results[index] = result
......
......@@ -34,10 +34,13 @@ class SequenceLabelTask(BasicTask):
data_reader,
startup_program=None,
config=None,
metrics_choices="default"):
metrics_choices="default",
add_crf=False):
if metrics_choices == "default":
metrics_choices = ["f1", "precision", "recall"]
self.add_crf = add_crf
main_program = feature.block.program
super(SequenceLabelTask, self).__init__(
data_reader=data_reader,
......@@ -50,7 +53,36 @@ class SequenceLabelTask(BasicTask):
self.max_seq_len = max_seq_len
self.num_classes = num_classes
@property
def return_numpy(self):
if self.add_crf:
return False
else:
return True
def _build_net(self):
self.seq_len = fluid.layers.data(
name="seq_len", shape=[1], dtype='int64')
seq_len = fluid.layers.assign(self.seq_len)
if self.add_crf:
unpad_feature = fluid.layers.sequence_unpad(
self.feature, length=self.seq_len)
self.emission = fluid.layers.fc(
size=self.num_classes,
input=unpad_feature,
param_attr=fluid.ParamAttr(
initializer=fluid.initializer.Uniform(low=-0.1, high=0.1),
regularizer=fluid.regularizer.L2DecayRegularizer(
regularization_coeff=1e-4)))
size = self.emission.shape[1]
fluid.layers.create_parameter(
shape=[size + 2, size], dtype=self.emission.dtype, name='crfw')
self.ret_infers = fluid.layers.crf_decoding(
input=self.emission, param_attr=fluid.ParamAttr(name='crfw'))
ret_infers = fluid.layers.assign(self.ret_infers)
return [ret_infers]
else:
self.logits = fluid.layers.fc(
input=self.feature,
size=self.num_classes,
......@@ -66,10 +98,6 @@ class SequenceLabelTask(BasicTask):
x=fluid.layers.argmax(self.logits, axis=2), shape=[-1, 1])
ret_infers = fluid.layers.assign(self.ret_infers)
self.seq_len = fluid.layers.data(
name="seq_len", shape=[1], dtype='int64')
seq_len = fluid.layers.assign(self.seq_len)
logits = self.logits
logits = fluid.layers.flatten(logits, axis=2)
logits = fluid.layers.softmax(logits)
......@@ -82,6 +110,14 @@ class SequenceLabelTask(BasicTask):
return [label]
def _add_loss(self):
if self.add_crf:
labels = fluid.layers.sequence_unpad(self.labels[0], self.seq_len)
crf_cost = fluid.layers.linear_chain_crf(
input=self.emission,
label=labels,
param_attr=fluid.ParamAttr(name='crfw'))
loss = fluid.layers.mean(x=crf_cost)
else:
labels = fluid.layers.flatten(self.labels[0], axis=2)
ce_loss = fluid.layers.cross_entropy(
input=self.outputs[0], label=labels)
......@@ -89,14 +125,36 @@ class SequenceLabelTask(BasicTask):
return loss
def _add_metrics(self):
self.ret_labels = fluid.layers.reshape(x=self.labels[0], shape=[-1, 1])
if self.add_crf:
labels = fluid.layers.sequence_unpad(self.labels[0], self.seq_len)
(precision, recall, f1_score, num_infer_chunks, num_label_chunks,
num_correct_chunks) = fluid.layers.chunk_eval(
input=self.outputs[0],
label=labels,
chunk_scheme="IOB",
num_chunk_types=int(np.ceil((self.num_classes - 1) / 2.0)))
chunk_evaluator = fluid.metrics.ChunkEvaluator()
chunk_evaluator.reset()
return [precision, recall, f1_score]
else:
self.ret_labels = fluid.layers.reshape(
x=self.labels[0], shape=[-1, 1])
return [self.ret_labels, self.ret_infers, self.seq_len]
def _calculate_metrics(self, run_states):
total_infer = total_label = total_correct = loss_sum = 0
run_step = run_time_used = run_examples = 0
precision_sum = recall_sum = f1_score_sum = 0
for run_state in run_states:
loss_sum += np.mean(run_state.run_results[-1])
if self.add_crf:
precision_sum += np.mean(
run_state.run_results[0]) * run_state.run_examples
recall_sum += np.mean(
run_state.run_results[1]) * run_state.run_examples
f1_score_sum += np.mean(
run_state.run_results[2]) * run_state.run_examples
else:
np_labels = run_state.run_results[0]
np_infers = run_state.run_results[1]
np_lens = run_state.run_results[2]
......@@ -113,6 +171,11 @@ class SequenceLabelTask(BasicTask):
run_speed = run_step / run_time_used
avg_loss = loss_sum / run_examples
if self.add_crf:
precision = precision_sum / run_examples
recall = recall_sum / run_examples
f1 = f1_score_sum / run_examples
else:
precision, recall, f1 = calculate_f1(total_label, total_infer,
total_correct)
# The first key will be used as main metrics to update the best model
......
......@@ -37,7 +37,7 @@ class ImageClassificationReader(object):
def __init__(self,
image_width,
image_height,
dataset,
dataset=None,
channel_order="RGB",
images_mean=None,
images_std=None,
......@@ -76,6 +76,8 @@ class ImageClassificationReader(object):
phase="train",
shuffle=False,
data=None):
if phase != 'predict' and not self.dataset:
raise ValueError("The dataset is none and it's not allowed!")
if phase == "train":
data = self.dataset.train_data(shuffle)
elif phase == "test":
......
......@@ -37,8 +37,8 @@ import paddlehub as hub
class BaseReader(object):
def __init__(self,
dataset,
vocab_path,
dataset=None,
label_map_config=None,
max_seq_len=512,
do_lower_case=True,
......@@ -63,9 +63,13 @@ class BaseReader(object):
# generate label map
self.label_map = {}
if self.dataset:
for index, label in enumerate(self.dataset.get_labels()):
self.label_map[label] = index
logger.info("Dataset label map = {}".format(self.label_map))
else:
logger.info("Dataset is None! label map = {}".format(
self.label_map))
self.current_example = 0
self.current_epoch = 0
......@@ -241,6 +245,8 @@ class BaseReader(object):
phase='train',
shuffle=True,
data=None):
if phase != 'predict' and not self.dataset:
raise ValueError("The dataset is None ! It isn't allowed.")
if phase == 'train':
shuffle = True
examples = self.get_train_examples()
......@@ -260,7 +266,10 @@ class BaseReader(object):
for item in data:
# set label in order to run the program
if self.dataset:
label = list(self.label_map.keys())[0]
else:
label = 0
if len(item) == 1:
item_i = InputExample(
guid=seq_id, text_a=item[0], label=label)
......@@ -498,7 +507,7 @@ class SequenceLabelReader(BaseReader):
class LACClassifyReader(object):
def __init__(self, dataset, vocab_path, in_tokens=False):
def __init__(self, vocab_path, dataset=None, in_tokens=False):
self.dataset = dataset
self.lac = hub.Module(name="lac")
self.tokenizer = tokenization.FullTokenizer(
......@@ -544,6 +553,8 @@ class LACClassifyReader(object):
phase="train",
shuffle=False,
data=None):
if phase != "predict" and not self.dataset:
raise ValueError("The dataset is None and it isn't allowed.")
if phase == "train":
shuffle = True
data = self.dataset.get_train_examples()
......@@ -756,8 +767,8 @@ class SquadInputFeatures(object):
class RegressionReader(BaseReader):
def __init__(self,
dataset,
vocab_path,
dataset=None,
label_map_config=None,
max_seq_len=128,
do_lower_case=True,
......@@ -845,6 +856,8 @@ class RegressionReader(BaseReader):
phase='train',
shuffle=True,
data=None):
if phase != 'predict' and not self.dataset:
raise ValueError("The dataset is none and it's not allowed.")
if phase == 'train':
shuffle = True
examples = self.get_train_examples()
......
......@@ -13,5 +13,5 @@
# See the License for the specific language governing permissions and
# limitations under the License.
""" PaddleHub version string """
hub_version = "1.2.0"
hub_version = "1.2.1"
module_proto_version = "1.0.0"
# for Python 2, use requirements_py2.txt instead
pre-commit
protobuf >= 3.1.0
protobuf >= 3.6.0
yapf == 0.26.0
pyyaml
numpy
......
......@@ -31,7 +31,7 @@ def python_version():
max_version, mid_version, min_version = python_version()
REQUIRED_PACKAGES = [
'six >= 1.10.0', 'protobuf >= 3.1.0', 'pyyaml', 'Pillow', 'requests',
'six >= 1.10.0', 'protobuf >= 3.6.0', 'pyyaml', 'Pillow', 'requests',
'tb-paddle', 'tb-nightly', 'cma == 2.7.0', 'flask >= 1.1.0'
]
......
# PaddleHub Hyperparameter Optimization (Auto Fine-tune): CV Image Classification
# PaddleHub AutoDL Finetuner: Image Classification
Using PaddleHub Auto Fine-tune requires preparing two files, written in a specified format: a Python script, finetunee.py, that performs the fine-tuning, and a yaml file, hparam.yaml, with the hyperparameters to optimize.
Using the PaddleHub AutoDL Finetuner requires two files in a specified format: a yaml file, hparam.yaml, with the hyperparameters to optimize, and a Python script, train.py, that performs the fine-tuning.
Taking fine-tuning an image classification task as an example, we show how to use PaddleHub Auto Finetune for hyperparameter optimization.
Taking fine-tuning an image classification task as an example, this tutorial shows how to use the PaddleHub AutoDL Finetuner for hyperparameter optimization.
Below is hparam.yaml, the yaml file of hyperparameters to optimize, giving the name, type, and search range of each hyperparameter. Only the float and int types are supported
Below is hparam.yaml, the yaml file of hyperparameters to optimize, giving the name, type, and search range of each hyperparameter. Currently only the float and int types are supported for searched parameters
```
param_list:
- name : learning_rate
......@@ -20,7 +20,7 @@ param_list:
greater_than : 10
```
Below is finetunee.py for image classification
Below is `train.py` for image classification
```python
# coding:utf-8
......@@ -31,18 +31,20 @@ import shutil
import paddle.fluid as fluid
import paddlehub as hub
import numpy as np
from paddlehub.common.logger import logger
# yapf: disable
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--num_epoch", type=int, default=1, help="Number of epoches for fine-tuning.")
parser.add_argument("--epochs", type=int, default=1, help="Number of epoches for fine-tuning.")
parser.add_argument("--use_gpu", type=ast.literal_eval, default=True, help="Whether use GPU for fine-tuning.")
parser.add_argument("--checkpoint_dir", type=str, default=None, help="Path to save log data.")
# the names of the hyperparameters to be searched should match those in hparam.yaml
parser.add_argument("--batch_size", type=int, default=16, help="Number of examples in one training batch.")
parser.add_argument("--saved_params_dir", type=str, default="", help="Directory for saving model")
parser.add_argument("--learning_rate", type=float, default=1e-4, help="learning_rate.")
# saved_params_dir and model_path are needed by auto finetune
parser.add_argument("--saved_params_dir", type=str, default="", help="Directory for saving model")
parser.add_argument("--model_path", type=str, default="", help="load model path")
# yapf: enable.
def is_path_valid(path):
......@@ -55,11 +57,12 @@ def is_path_valid(path):
return True
def finetune(args):
# Load Paddlehub resnet50 pretrained model
module = hub.Module(name="resnet_v2_50_imagenet")
input_dict, output_dict, program = module.context(trainable=True)
# Download dataset and use ImageClassificationReader to read dataset
dataset = hub.dataset.Flowers()
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
......@@ -78,11 +81,12 @@ def finetune(args):
config = hub.RunConfig(
use_cuda=True,
num_epoch=args.num_epoch,
num_epoch=args.epochs,
batch_size=args.batch_size,
checkpoint_dir=args.checkpoint_dir,
strategy=strategy)
# Construct transfer learning network
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
......@@ -108,6 +112,7 @@ def finetune(args):
shutil.copytree(best_model_dir, args.saved_params_dir)
shutil.rmtree(config.checkpoint_dir)
# acc on dev will be used by auto finetune
print("AutoFinetuneEval"+"\t"+str(float(eval_avg_score["acc"])))
......
# PaddleHub Hyperparameter Optimization (Auto Fine-tune): NLP Sentiment Classification
# PaddleHub Hyperparameter Optimization (AutoDL Finetuner): NLP Sentiment Classification
Using PaddleHub Auto Fine-tune requires preparing two files, written in a specified format: a Python script, finetunee.py, that performs the fine-tuning, and a yaml file, hparam.yaml, with the hyperparameters to optimize.
Using the PaddleHub AutoDL Finetuner requires two files in a specified format: a yaml file, hparam.yaml, with the hyperparameters to optimize, and a Python script, train.py, that performs the fine-tuning.
Taking fine-tuning a Chinese sentiment classification task as an example, we show how to use PaddleHub Auto Finetune for hyperparameter optimization.
Taking fine-tuning a Chinese sentiment classification task as an example, this tutorial shows how to use the PaddleHub AutoDL Finetuner for hyperparameter optimization.
Below is hparam.yaml, the yaml file of hyperparameters to optimize, giving the name, type, and search range of each hyperparameter. Only the float and int types are supported
Below is hparam.yaml, the yaml file of hyperparameters to optimize, giving the name, type, and search range of each hyperparameter. Only float and int are supported
```
param_list:
- name : learning_rate
......@@ -29,7 +29,7 @@ param_list:
greater_than : 0.0
```
Below is finetunee.py for Chinese sentiment classification
Below is `train.py` for Chinese sentiment classification
```python
from __future__ import absolute_import
......@@ -44,19 +44,22 @@ import paddlehub as hub
import os
from paddlehub.common.logger import logger
# yapf: disable
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--epochs", type=int, default=3, help="epochs.")
# the names of the hyperparameters to be searched should match those in hparam.yaml
parser.add_argument("--batch_size", type=int, default=32, help="batch_size.")
parser.add_argument("--learning_rate", type=float, default=5e-5, help="learning_rate.")
parser.add_argument("--warmup_prop", type=float, default=0.1, help="warmup_prop.")
parser.add_argument("--weight_decay", type=float, default=0.01, help="weight_decay.")
parser.add_argument("--max_seq_len", type=int, default=128, help="Number of words of the longest seqence.")
parser.add_argument("--checkpoint_dir", type=str, default=None, help="Directory to model checkpoint")
# saved_params_dir and model_path are needed by auto finetune
parser.add_argument("--saved_params_dir", type=str, default="", help="Directory for saving model during ")
parser.add_argument("--model_path", type=str, default="", help="load model path")
args = parser.parse_args()
# yapf: enable.
def is_path_valid(path):
......@@ -138,5 +141,6 @@ if __name__ == '__main__':
shutil.copytree(best_model_dir, args.saved_params_dir)
shutil.rmtree(config.checkpoint_dir)
# acc on dev will be used by auto finetune
print("AutoFinetuneEval"+"\t"+str(float(eval_avg_score["acc"])))
```
# PaddleHub Hyperparameter Optimization (Auto Fine-tune)
# PaddleHub Hyperparameter Optimization (AutoDL Finetuner)
## 1. Introduction
Tuning is an unavoidable part of training machine learning models. A model's parameters fall into two classes: parameters, which the model learns from its own training, and hyperparameters (such as the learning rate, dropout_rate, and batch_size), which must be set from human experience to improve training results. Because today's models have large parameter spaces, manual tuning is time-consuming and expensive to attempt. PaddleHub Auto Fine-tune adjusts hyperparameters automatically.
Deep learning model parameters fall into two classes: *model parameters* and *hyperparameters*. The former are learned by the model from large amounts of sample data; the latter (such as the learning rate, dropout_rate, and batch_size) must be set from human experience or repeated trials to improve training results, and good hyperparameter settings are key to obtaining a well-performing network. Because the parameter space is large, hyperparameters are currently tuned by hand, relying on experience or trial and error; the best values differ across models, datasets, and scenarios, so many attempts are needed and a great deal of time and compute is wasted. The PaddleHub AutoDL Finetuner adjusts hyperparameters automatically.
PaddleHub Auto Fine-tune provides two hyperparameter optimization strategies:
The PaddleHub AutoDL Finetuner provides two hyperparameter optimization algorithms:
* HAZero: the core idea is to handle dependencies between variables and scaling by adjusting the covariance matrix of a normal distribution. The algorithm has three basic steps: sample new solutions; evaluate the objective function; update the parameters of the normal distribution. Parameters are adjusted so that the probability of producing better solutions gradually increases. The optimization process is shown below:
* **HAZero**: the core idea is to handle dependencies between variables and scaling by adjusting the covariance matrix of a normal distribution. The algorithm has three basic steps (see the sketch after the figure below):
1. Sample new solutions
2. Evaluate the objective function
3. Update the parameters of the normal distribution.
Parameters are adjusted so that the probability of producing better solutions gradually increases. The optimization process is shown below:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleHub/release/v1.2/docs/imgs/bayesian_optimization.gif" hspace='10'/> <br />
</p>
*Image source: https://www.kaggle.com/clair14/tutorial-bayesian-optimization*
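The loop below is a conceptual sketch of those three steps, not the HAZero implementation itself: it only illustrates how repeatedly sampling from a normal distribution and re-fitting its parameters steers the search toward better solutions (all names are hypothetical, and a toy objective stands in for one fine-tune plus evaluation run):
```python
import numpy as np

def toy_objective(x):  # stand-in for one fine-tune + evaluation run (lower is better here)
    return np.sum((x - 0.3) ** 2)

mean, cov = np.zeros(2), np.eye(2)
for round_id in range(10):
    pop = np.random.multivariate_normal(mean, cov, size=8)  # 1. sample new solutions
    scores = np.array([toy_objective(p) for p in pop])      # 2. evaluate the objective
    elite = pop[np.argsort(scores)[:4]]                     # keep the better half
    mean = elite.mean(axis=0)                               # 3. update the distribution
    cov = np.cov(elite.T) + 1e-6 * np.eye(2)                #    parameters
print("search converged near:", mean)
```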
* PSHE2: uses a particle swarm algorithm, and the optimal hyperparameter combination is the solution being sought. The problem becomes how to update the hyperparameter combination so that the algorithm converges to the optimum faster and better. PSHE2 decides the next update direction from each hyperparameter's own historical best, under a certain amount of random perturbation.
* PSHE2: uses a Hamiltonian dynamical system to search the parameter space for the point of lowest "potential energy", which corresponds to the optimal hyperparameter combination. The problem becomes how to update the hyperparameter combination so that the algorithm converges to the optimum faster and better. PSHE2 decides the next update direction from each hyperparameter's own historical best, under a certain amount of random perturbation.
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleHub/release/v1.2/docs/imgs/thermodynamics.gif" hspace='10'/> <br />
</p>
PaddleHub Auto Fine-tune provides two hyperparameter evaluation strategies:
To evaluate how the searched hyperparameters perform on the task, the PaddleHub AutoDL Finetuner provides two evaluation strategies:
* FullTrail: given a set of hyperparameters, fine-tune a new model from scratch with them, then evaluate the model on the dev split of the dataset
* **Full-Trail**: given a set of hyperparameters, fine-tune a new model from scratch with them, then evaluate it on the validation set
* ModelBased: given a set of hyperparameters, if they come from the first optimization round, fine-tune a new model from scratch; otherwise, load the better model saved after earlier rounds of fine-tuning and continue fine-tuning it with the current hyperparameters. The saved model is evaluated by its performance on the dev split of the dataset
* **Population-Based**: given a set of hyperparameters, if they are the first round's combination, fine-tune a new model from scratch; otherwise, continue fine-tuning from the better models saved in earlier rounds under the current hyperparameters, then evaluate
## 2. Preparation
Using PaddleHub Auto Fine-tune requires preparing two files, written in a specified format: a Python script, finetunee.py, that performs the fine-tuning, and a yaml file, hparam.yaml, with the hyperparameters to optimize
Using the PaddleHub AutoDL Finetuner requires two files in a specified format: a yaml file, hparam.yaml, with the hyperparameters to optimize, and a Python script, train.py, that performs the fine-tuning
### About hparam.yaml
### 1. hparam.yaml
hparam specifies the name of each hyperparameter to search, its type (int or float, representing discrete and continuous hyperparameters), and its search range. From this information PaddleHub builds a hyperparameter space and searches within it, passing each sampled combination to finetunee.py for evaluation; the evaluation result guides the next search direction until the specified number of search rounds is reached
hparam specifies the name, type (int or float), and search range of each hyperparameter. From this information PaddleHub builds a hyperparameter space and searches within it, passing each sampled combination to train.py for evaluation; the search direction is adjusted automatically from the evaluation results until the specified number of search rounds is reached.
`Note`:
**Note**:
* The top-level key of the yaml file must be param_list
```
param_list:
......@@ -44,64 +50,71 @@ hparam specifies the name of each hyperparameter to search, its type (int or float,
greater_than : 0.00005
...
```
* Hyperparameter names may be chosen freely; PaddleHub passes the searched values to finetunee.py under the specified names
* Hyperparameter names may be chosen freely; PaddleHub passes the searched values to train.py under the specified names
* When HAZero is chosen as the optimization strategy, more than one hyperparameter to optimize must be provided.
* When HAZero is chosen as the PaddleHub Auto Fine-tune optimization strategy, more than one hyperparameter to optimize must be provided.
### 2. train.py
### About finetunee.py
train.py receives the hyperparameters searched by PaddleHub, runs one optimization pass, and returns the resulting score
finetunee.py receives the hyperparameters searched by PaddleHub, runs one optimization pass, and returns the resulting score
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleHub/release/v1.2/docs/imgs/demo.png" hspace='10'/> <br />
</p>
`Note`
**NOTE**:
* finetunee.py must accept options for the hyperparameters being optimized, and the option names must match the hyperparameter names in the yaml file.
* train.py's options must include the hyperparameters being optimized, written with argparse, and the option names must match the hyperparameter names in the yaml file.
* finetunee.py must provide a saved_params_dir option and, after the optimization pass, save the parameters under that path.
* train.py must provide a saved_params_dir option; the optimized parameters are saved under that path.
* If ModelBased is chosen as the Auto Fine-tune evaluation strategy, finetunee.py must provide a model_path option and restore the model from the path it specifies
* When PopulationBased is chosen as the evaluation strategy, train.py must provide a model_path option and automatically restore the model from the path model_path specifies
* finetunee.py must print the model's evaluation score (the dev or test set is recommended), on a line starting with "AutoFinetuneEval" and separated from the score by "\t", e.g.
* train.py must print the model's evaluation score (the score on the validation or test set is recommended), on a line starting with "AutoFinetuneEval" and separated from the score by "\t", e.g.
```python
print("AutoFinetuneEval"+"\t" + str(eval_acc))
```
* The printed evaluation score must take values in `(-∞, 1]`; higher means better.
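Putting these requirements together, a minimal train.py skeleton could look like the following (a sketch under the assumptions above; the actual fine-tuning code is elided and `eval_acc` is a placeholder):
```python
import argparse

parser = argparse.ArgumentParser(__doc__)
# option names must match the hyperparameter names in hparam.yaml
parser.add_argument("--learning_rate", type=float, default=1e-4)
parser.add_argument("--batch_size", type=int, default=16)
# options required by the AutoDL Finetuner itself
parser.add_argument("--saved_params_dir", type=str, default="")
parser.add_argument("--model_path", type=str, default="")
args = parser.parse_args()

# ... fine-tune with args.learning_rate / args.batch_size, restore from
# args.model_path when it is non-empty (PopulationBased), and save the
# best parameters under args.saved_params_dir ...
eval_acc = 0.0  # placeholder for the real validation score

# the line the AutoDL Finetuner parses; score in (-inf, 1], higher is better
print("AutoFinetuneEval" + "\t" + str(float(eval_acc)))
```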
### Examples
[PaddleHub Auto Fine-tune hyperparameter optimization: NLP sentiment classification](./autofinetune-nlp.md)
[PaddleHub AutoDL Finetuner hyperparameter optimization: NLP sentiment classification](./autofinetune-nlp.md)
[PaddleHub Auto Fine-tune hyperparameter optimization: CV image classification](./autofinetune-cv.md)
[PaddleHub AutoDL Finetuner hyperparameter optimization: CV image classification](./autofinetune-cv.md)
## 3. How to Launch
**Make sure PaddleHub 1.2.0 or later is installed. The PaddleHub Auto Fine-tune feature also requires at least one available GPU.**
**Make sure PaddleHub 1.2.1 or later is installed. The PaddleHub AutoDL Finetuner also requires at least one available GPU.**
Launch via the following command:
```shell
$ OUTPUT=result/
$ hub autofinetune finetunee.py --param_file=hparam.yaml --cuda=['1','2'] --popsize=5 --round=10
$ hub autofinetune train.py --param_file=hparam.yaml --cuda=['1','2'] --popsize=5 --round=10
--output_dir=${OUTPUT} --evaluate_choice=fulltrail --tuning_strategy=pshe2
```
Options:
> `--param_file`: the yaml file with the hyperparameters to optimize, i.e. [hparam.yaml](#hparam.yaml) above.
> `--param_file`: required; the yaml file with the hyperparameters to optimize, i.e. [hparam.yaml](#hparam.yaml) above.
> `--cuda`: the GPU cards available to the program, a list separated by commas with no spaces; default ['0']
> `--cuda`: required; the GPU cards available to the program, a list separated by commas with no spaces; default ['0']
> `--popsize`: the number of hyperparameter combinations generated per round; default 5
> `--popsize`: optional; the number of hyperparameter combinations generated per round; default 5
> `--round`: the number of rounds to run; default 10
> `--round`: optional; the number of rounds to run; default 10
> `--output_dir`: the directory for the run's output; optional; if unspecified, a folder is created under the current working directory
> `--output_dir`: optional; the directory for the run's output; if unspecified, a folder is created under the current working directory
> `--evaluate_choice`: the evaluation method for automatic hyperparameter optimization, fulltrail or modelbased; default modelbased
> `--evaluate_choice`: optional; the evaluation method for automatic hyperparameter optimization, fulltrail or populationbased; default populationbased
> `--tuning_strategy`: the automatic hyperparameter optimization strategy, hazero or pshe2; default pshe2
> `--tuning_strategy`: optional; the hyperparameter optimization algorithm, hazero or pshe2; default pshe2
`NOTE`
* The search runs for n rounds (set by --round), each round producing m hyperparameter combinations (set by --popsize). Each round's hyperparameters are decided by the previous round's results. When the specified GPUs are not enough to run a whole round at once, Auto Fine-tune queues the jobs automatically. To raise GPU utilization, choose a card count that exactly divides popsize: with popsize=6 and cuda=['0','1','2','3'], Auto Fine-tune starts four training processes per search round, so the 5th/6th combinations queue once and two cards sit idle while they run; with 3 available cards this situation is avoided.
**NOTE**:
* The search runs for n rounds (set by --round), each round producing m hyperparameter combinations (set by --popsize). The previous round's results decide how the next round's hyperparameters are adjusted.
* When the specified GPUs are not enough to run a whole round at once, the AutoDL Finetuner queues the jobs automatically. To raise GPU utilization, choose a card count that exactly divides popsize: with popsize=6 and cuda=['0','1','2','3'], the AutoDL Finetuner starts four training processes per search round, so the 5th/6th combinations queue once and two cards sit idle while they run; with 3 available cards this situation is avoided.
## 4. Directory Structure
......@@ -125,42 +138,43 @@ $ hub autofinetune finetunee.py --param_file=hparam.yaml --cuda=['1','2'] --pops
```
Here output_dir is the root directory specified when launching the autofinetune command. It contains:
* log_file.txt, which records every round's hyperparameters and the best hyperparameters found over the whole search.
* visualization, which records the log files used for visualization.
* round0 ~ roundn, which record each round's data; each round directory also contains:
* log-0.info ~ log-m.info, which record the log of each search direction.
* model-0 ~ model-m, which record the parameters of the corresponding search.
## 5. Visualization
While optimizing hyperparameters, the Auto Finetune API automatically logs the key training metrics. After launching the program, run:
While optimizing hyperparameters, the AutoDL Finetuner API automatically logs the key training metrics. After launching the program, run:
```shell
$ tensorboard --logdir ${OUTPUT}/visualization --host ${HOST_IP} --port ${PORT_NUM}
```
Here ${OUTPUT} is the AutoDL root directory, ${HOST_IP} the local IP address, and ${PORT_NUM} an available port. For example, with local IP 192.168.0.1 and port 8040,
open 192.168.0.1:8040 in a browser to watch how each hyperparameter and metric changes during the search
## 6. Miscellaneous
1. If, while using the Auto Fine-tune feature, the output contains the following:
Here ${OUTPUT} is the AutoDL Finetuner output directory, ${HOST_IP} the local IP address, and ${PORT_NUM} an available port. For example, with local IP 192.168.0.1 and port 8040,
open 192.168.0.1:8040 in a browser to watch how each hyperparameter and metric changes during the search.
**WARNING: Program run with hyperparameters ... crashed!**
First use the console output to determine which round the message came from (e.g. round 3), then check the log file log.info under ${OUTPUT}/round3/ for the exact cause.
## 6. Passing Extra args
2. The PaddleHub AutoFinetune command line supports passing options that finetunee.py needs but that are not searched, straight from the hub autofinetune launch command; for example,
the max_seq_len option in the [PaddleHub Auto Fine-tune hyperparameter optimization: NLP sentiment classification](./autofinetune-nlp.md) example can be passed as follows.
The PaddleHub AutoDL Finetuner supports passing the remaining non-searched args of train.py as trailing arguments to the hub autofinetune command. The names of these non-searched options must match the names passed on the hub autofinetune command line. For example, the max_seq_len option in the [PaddleHub AutoDL Finetuner hyperparameter optimization: NLP sentiment classification](./autofinetune-nlp.md) example can be passed as follows.
```shell
$ OUTPUT=result/
$ hub autofinetune finetunee.py --param_file=hparam.yaml --cuda=['1','2'] --popsize=5 --round=10
$ hub autofinetune train.py --param_file=hparam.yaml --cuda=['1','2'] --popsize=5 --round=10
--output_dir=${OUTPUT} --evaluate_choice=fulltrail --tuning_strategy=pshe2 max_seq_len 128
```
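On the train.py side, such a trailing option is received as an ordinary argparse option with the same name (a sketch):
```python
import argparse

parser = argparse.ArgumentParser(__doc__)
# matches the trailing "max_seq_len 128" passed to hub autofinetune
parser.add_argument("--max_seq_len", type=int, default=128)
args = parser.parse_args()
```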
3. While using the Auto Fine-tune feature, make sure the GPUs it uses are reserved for PaddleHub, with no other tasks running on them.
## 7. Miscellaneous
1. If, while using the AutoDL Finetuner, the output contains the following:
**WARNING: Program run with hyperparameters ... crashed!**
First use the console output to determine which round the message came from (e.g. round 3), then check the log file log.info under ${OUTPUT}/round3/ for the exact cause.
2. While using the AutoDL Finetuner, it is recommended that the GPUs it uses be reserved for PaddleHub, with no other tasks running on them.