Commit 32aceef1 authored by Tensor Tang

Merge branch 'develop' into 'develop'

add missing files

See merge request paddlepaddle/paddlelitearmbackend!12
---
name: Feature request
about: You could use this template for proposing a suggestion or feature.
---
Welcome to propose suggestions for PaddlePaddle, and thank you for contributing!
When leaving your suggestion, please also provide the following information:
- Version and environment information
1) PaddlePaddle version: please provide your PaddlePaddle version number, e.g. 1.1
2) CPU/GPU: whether you use a GPU for training; if so, please provide your CUDA and cuDNN version numbers
3) System environment: please describe the system type and version, e.g. Mac OS 10.14
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Suggestion description: please describe in detail the feature you think should be improved
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
Please make sure that this is a feature request.
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform (e.g. Mac OS 10.14)
**To Reproduce**
Steps to reproduce the behavior
**Describe the feature and the current behavior/state.**
**Any other info.**
---
name: Inference Issue
about: You could use this template for reporting an inference issue (errors, usage questions, etc.).
---
To get your question resolved quickly, before opening an issue please search for similar ones via: [search issue keywords] [filter by labels] [official docs]
If you do not find a similar issue, please provide the following details when opening one:
- Title: describe your problem concisely and precisely, e.g. "Where is the API documentation for the latest inference library?"
- Version and environment information:
   1) PaddlePaddle version: please provide your PaddlePaddle version number (e.g. 1.1) or CommitID
   2) CPU: if inference runs on CPU, please provide the CPU model and the math library in use (MKL/OpenBlas/MKLDNN, etc.)
   3) GPU: if inference runs on GPU, please provide the GPU model and the CUDA and cuDNN version numbers
   4) System environment: please describe the system type and version (e.g. Mac OS 10.14) and the Python version
- Inference information
   1) C++ inference: please provide the version of the inference library package, including its version.txt file
   2) The full CMake command, including include paths
   3) API information (if APIs are called, please provide them)
   4) Source of the inference library: official download / special environment (e.g. built with BCLOUD)
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Problem description: please describe your problem in detail and paste the error message and key log/code snippets
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
If there is no solution, please make sure that this is an inference issue including the following details:
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform (e.g. Mac OS 10.14)
- Python version
- CMake commands
- version.txt of the C++ inference library
- API information
**To Reproduce**
Steps to reproduce the behavior
**Describe your current behavior**
**Code to reproduce the issue**
**Other info / logs**
---
name: Installation Issue
about: You could use this template for reporting an installation or build issue.
---
To get your question resolved quickly, before opening an issue please search for similar ones via: [search issue keywords] [filter by labels] [official docs]
When opening an issue, please provide the following information according to your situation:
- Title: include the keyword "installation error" / "build error", e.g. "Build error on Mac"
- Version and environment information:
   1) PaddlePaddle version: please provide your PaddlePaddle version number (e.g. 1.1) or CommitID
   2) CPU: please provide the CPU model and the math library in use (MKL/OpenBlas/MKLDNN, etc.)
   3) GPU: please provide the GPU model and the CUDA and cuDNN version numbers
   4) System environment: please state the system type and version (e.g. Mac OS 10.14) and the Python version
- Installation method:
1) pip install / docker install
2) Local build: please provide the cmake command and the build command
3) Docker build: please provide the docker image and the build command
Please note any special environment, e.g. offline installation
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Problem description: please describe your problem in detail and paste the error message and key log/code snippets
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
If there is no solution, please make sure that this is an installation issue including the following details:
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform (e.g. Mac OS 10.14)
- Python version
- Install method: pip install / install with docker / build from source (without docker) / build within docker
- Other special cases that you think may be related to this problem, e.g. offline install, special internet condition
**To Reproduce**
Steps to reproduce the behavior
**Describe your current behavior**
**Code to reproduce the issue**
**Other info / logs**
---
name: Model Issue
about: You could use this template for reporting a model/algorithm/dataset issue.
---
To get your question resolved quickly, before opening an issue please search for similar ones via: [search issue keywords] [filter by labels] [official docs]
When opening an issue, please provide the following information according to your situation:
- Title: describe your problem concisely and precisely, e.g. "Error in the LSTM preceding the SSD model"
- Version and environment information:
   1) PaddlePaddle version: please provide the PaddlePaddle version number, e.g. 1.1, or CommitID
   2) CPU: please provide the CPU model and the math library in use (MKL/OpenBlas/MKLDNN, etc.)
   3) GPU: please provide the GPU model and the CUDA and cuDNN version numbers
   4) System environment: please state the system type and version (e.g. Mac OS 10.14) and the Python version
- Model information
   1) model name 2) dataset name 3) algorithm name 4) model link
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Problem description: please describe your problem in detail and paste the error message and key log/code snippets
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
If there is no solution, please make sure that this is an issue about models including the following details:
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform (e.g. Mac OS 10.14)
- Python version
- Name of models & dataset / details of operator
**To Reproduce**
Steps to reproduce the behavior
**Describe your current behavior**
**Code to reproduce the issue**
**Other info / logs**
---
name: Others
about: You could use this template for reporting issues not covered by the categories above.
---
To get your question resolved quickly, before opening an issue please search for similar ones via: [search issue keywords] [filter by labels] [official docs]
If you do not find a similar issue, please provide the following details when opening one:
- Title: summarize your problem concisely and precisely
- Version and environment information:
   1) PaddlePaddle version: please provide your PaddlePaddle version number, e.g. 1.1, or CommitID
   2) CPU/GPU: if you train on GPU, please provide the GPU driver version and the CUDA and cuDNN version numbers
   3) System environment: please describe the system type and version, e.g. Mac OS 10.14
   4) Python version
   5) GPU memory information
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Problem description: please describe your problem in detail and paste the error message and key log/code snippets
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
If there is no solution, please provide us with the following details:
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform and distribution (e.g. Mac OS 10.14)
- Python version
**To Reproduce**
Steps to reproduce the behavior
**Describe your current behavior**
**Code to reproduce the issue**
**Other info / logs**
---
name: Training Issue
about: You could use this template for reporting a training issue (errors, usage questions, core dumps, etc.).
---
To get your question resolved quickly, before opening an issue please search for similar ones via: [search issue keywords] [filter by labels] [official docs]
If you do not find a similar issue, please provide the following details when opening one:
- Title: summarize your problem concisely and precisely, e.g. "Insufficient Memory xxx"
- Version and environment information:
   1) PaddlePaddle version: please provide your PaddlePaddle version number, e.g. 1.1, or CommitID
   2) CPU: if training runs on CPU, please provide the CPU model and the math library in use (MKL/OpenBlas/MKLDNN, etc.)
   3) GPU: if training runs on GPU, please provide the GPU model and the CUDA and cuDNN version numbers
   4) System environment: please describe the system type and version, e.g. Mac OS 10.14, and the Python version
- Training information
   1) single/multiple machines, single/multiple GPUs
   2) GPU memory information
   3) operator information
- Reproduction information: for an error, please give the environment and steps to reproduce it
- Problem description: please describe your problem in detail and paste the error message, logs, and a reproducible code snippet
Thank you for contributing to PaddlePaddle.
Before submitting the issue, you could search the GitHub issues in case a similar issue was submitted or resolved before.
If there is no solution, please make sure that this is a training issue including the following details:
**System information**
- PaddlePaddle version (e.g. 1.1) or CommitID
- CPU: including MKL/OpenBlas/MKLDNN version
- GPU: including CUDA/cuDNN version
- OS platform (e.g. Mac OS 10.14)
- Other information: distributed training / operator information / GPU memory
**To Reproduce**
Steps to reproduce the behavior
**Describe your current behavior**
**Code to reproduce the issue**
**Other info / logs**
@@ -10,6 +10,7 @@ paddle/fluid/operators/distributed/send_recv.proto
 *.vs
 build/
 build_doc/
+build.*
 *.user
 *.sh
 *.bkp
......
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/core/mir/pattern_matcher_high_api.h"
#include <glog/logging.h>
namespace paddle {
namespace lite {
namespace mir {
void FuseBase::PerformPatternMatcher(SSAGraph *graph) {
LOG(INFO) << "\n" << matcher_.pattern().DotString();
// Get subgraphs and record the mir::Node pointers for each PMNode.
auto handler = [&](const PatternMatcher::subgraph_t &subgraph, SSAGraph *g) {
// Get all the registered nodes.
key2nodes_.emplace_back();
for (auto &item : nodes_) {
key2nodes_.back()[item.first] = subgraph.at(item.second);
}
};
matcher_(graph, handler);
}
void FuseBase::DeleteInterNodes(SSAGraph *graph) {
std::set<std::string> keys;
for (auto &node : nodes_) {
if (node.second->IsIntermediate()) {
keys.insert(node.first);
}
}
LOG(INFO) << "keys.size " << keys.size();
std::unordered_set<const Node *> nodes2rm;
for (auto &matched : key2nodes_) {
LOG(INFO) << "get matched " << matched.size();
for (const auto &key : keys) {
nodes2rm.insert(matched.at(key));
}
}
LOG(INFO) << "clean nodes " << nodes2rm.size();
GraphSafeRemoveNodes(graph, nodes2rm);
}
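// Memoize pattern nodes by key, so repeated OpNode()/VarNode() calls with the
// same key always refer to the same underlying PMNode.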
PMNode *FuseBase::GetOrCreateNode(const std::string &key) {
auto it = nodes_.find(key);
if (it != nodes_.end()) {
return it->second;
}
nodes_.emplace(key,
matcher_.mutable_pattern()->NewNode(patterns::UniqueKey(key)));
it = nodes_.find(key);
return it->second;
}
PMNode *FuseBase::OpNode(const std::string &key, const std::string &op_type) {
GetOrCreateNode(key)->set_op_type(op_type);
GetOrCreateNode(key)->AsOp(op_type);
return GetOrCreateNode(key);
}
PMNode *FuseBase::VarNode(const std::string &key) {
GetOrCreateNode(key)->AsVar();
return GetOrCreateNode(key);
}
} // namespace mir
} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <map>
#include <set>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>
#include "paddle/fluid/lite/core/mir/node.h"
#include "paddle/fluid/lite/core/mir/pattern_matcher.h"
#include "paddle/fluid/lite/core/mir/ssa_graph.h"
namespace paddle {
namespace lite {
namespace mir {
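// FuseBase drives a whole fusion pass: a subclass describes the sub-graph to
// match in BuildPattern() and builds the replacement op in InsertNewNode();
// the base class then removes the nodes marked as intermediate.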
class FuseBase {
public:
using key2nodes_t = std::map<std::string, Node*>;
virtual ~FuseBase() = default;
void operator()(SSAGraph* graph) {
BuildPattern();
PerformPatternMatcher(graph);
for (const auto& matched : key2nodes_) {
InsertNewNode(graph, matched);
}
DeleteInterNodes(graph);
}
// Build a PMPattern using PMNode.
virtual void BuildPattern() = 0;
// Generate an operator desc with a matched subgraph.
virtual cpp::OpDesc GenOpDesc(const key2nodes_t& matched) = 0;
PMNode* OpNode(const std::string& key, const std::string& op_type);
PMNode* VarNode(const std::string& key);
protected:
virtual void InsertNewNode(SSAGraph* graph, const key2nodes_t& matched) = 0;
private:
void PerformPatternMatcher(SSAGraph* graph);
// Delete nodes that are marked as Intermediate
void DeleteInterNodes(SSAGraph* graph);
private:
PMNode* GetOrCreateNode(const std::string& key);
protected:
PatternMatcher matcher_;
std::map<std::string, PMNode*> nodes_;
std::vector<key2nodes_t> key2nodes_;
};
} // namespace mir
} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/core/mir/pattern_matcher_high_api.h"
#include <gtest/gtest.h>
#include <memory>
#include "paddle/fluid/framework/program_desc.h"
#include "paddle/fluid/lite/core/compatible_tensor.h"
#include "paddle/fluid/lite/core/mir/graph_visualize_pass.h"
#include "paddle/fluid/lite/core/program.h"
namespace paddle {
namespace lite {
namespace mir {
// A demo fuser: fold mul + elementwise_add into a single fc op.
class FcFuser : public FuseBase {
public:
void BuildPattern() override {
// create nodes.
auto* x = VarNode("x");
auto* W = VarNode("W");
auto* b = VarNode("b");
auto* mul = OpNode("mul", "mul");
auto* mul_out = VarNode("mul_out");
auto* add = OpNode("add", "elementwise_add");
auto* Out = VarNode("Out");
// create topology.
// std::vector<PMNode*>({W, x}) >> *mul >> *mul_out;
// std::vector<PMNode*>({mul_out, b}) >> *add >> *Out;
*W >> *mul;
*x >> *mul >> *mul_out;
*b >> *add;
*mul_out >> *add >> *Out;
// Mark the nodes that will be removed after fusion as intermediate.
mul_out->AsIntermediate();
mul->AsIntermediate();
add->AsIntermediate();
}
void InsertNewNode(SSAGraph* graph, const key2nodes_t& matched) override {
auto op_desc = GenOpDesc(matched);
auto fc_op = LiteOpRegistry::Global().Create("fc");
auto mul = matched.at("mul")->stmt()->op;
auto* scope = mul->scope();
auto& valid_places = mul->valid_places();
fc_op->Attach(op_desc, scope);
auto* new_op_node = graph->GraphCreateInstructNode(fc_op, valid_places);
IR_NODE_LINK_TO(matched.at("W"), new_op_node);
IR_NODE_LINK_TO(matched.at("x"), new_op_node);
IR_NODE_LINK_TO(matched.at("b"), new_op_node);
IR_NODE_LINK_TO(new_op_node, matched.at("Out"));
}
private:
cpp::OpDesc GenOpDesc(const key2nodes_t& matched) override {
cpp::OpDesc op_desc;
op_desc.SetType("fc");
op_desc.SetInput("Input", {matched.at("x")->arg()->name});
op_desc.SetInput("W", {matched.at("W")->arg()->name});
op_desc.SetInput("Bias", {matched.at("b")->arg()->name});
op_desc.SetOutput("Out", {matched.at("Out")->arg()->name});
op_desc.SetAttr("in_num_col_dims", 1);
return op_desc;
}
};
std::unique_ptr<SSAGraph> BuildGraph(framework::ProgramDesc* program_desc,
const std::shared_ptr<Scope>& scope,
const std::vector<Place>& valid_places) {
auto* main_block = program_desc->MutableBlock(0);
auto* mul = main_block->AppendOp();
auto* add = main_block->AppendOp();
main_block->Var("x");
main_block->Var("b");
main_block->Var("mul_out");
main_block->Var("w");
main_block->Var("out");
main_block->Var("out1");
scope->Var("w")->GetMutable<lite::Tensor>();
scope->Var("b")->GetMutable<lite::Tensor>();
scope->Var("mul_out")->GetMutable<lite::Tensor>();
scope->Var("w")->GetMutable<lite::Tensor>();
scope->Var("out")->GetMutable<lite::Tensor>();
scope->Var("out1")->GetMutable<lite::Tensor>();
mul->SetInput("X", {"x"});
mul->SetInput("Y", {"w"});
mul->SetOutput("Out", {"mul_out"});
mul->SetType("mul");
mul->SetAttr("x_num_col_dims", 1);
mul->SetAttr("y_num_col_dims", 1);
add->SetInput("X", {"mul_out"});
add->SetInput("Y", {"b"});
add->SetOutput("Out", {"out"});
add->SetType("elementwise_add");
add->SetAttr("axis", 1);
program_desc->Flush();
lite::Program program(*program_desc->Proto(), scope, valid_places);
auto graph = std::unique_ptr<SSAGraph>(new SSAGraph());
graph->Build(program, valid_places);
return graph;
}
TEST(pattern_matcher2, graph_test) {
framework::ProgramDesc program_desc;
std::vector<Place> places{{TARGET(kHost), PRECISION(kFloat)}};
auto scope = std::make_shared<Scope>();
auto graph = BuildGraph(&program_desc, scope, places);
ASSERT_EQ(graph->nodes().size(),
8UL /*real nodes*/ + 2UL /*feed op + fetch op*/);
Visualize(graph.get());
}
TEST(pattern_matcher2, test) {
framework::ProgramDesc program_desc;
std::vector<Place> places{{TARGET(kHost), PRECISION(kFloat)}};
auto scope = std::make_shared<Scope>();
auto graph = BuildGraph(&program_desc, scope, places);
const int num_nodes = graph->nodes().size();
FcFuser fuser;
fuser(graph.get());
ASSERT_EQ(graph->nodes().size(),
num_nodes - 3UL /*nodes removed */ + 1UL /* fused fc node*/);
}
} // namespace mir
} // namespace lite
} // namespace paddle
USE_LITE_OP(fc);
USE_LITE_OP(mul);
USE_LITE_OP(elementwise_add);
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/core/mir/pattern_matcher.h"
#include <gtest/gtest.h>
namespace paddle {
namespace lite {
namespace mir {
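// Build a small test graph of five ops and four vars:
//   o1 -> v1 -> o2 -> v2 -> {o3, o4}
//               o2 -> v3 -> o5
//                     o3 -> v4 -> o5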
void BuildGraph(SSAGraph* g) {
g->mutable_nodes().emplace_back();
Node& o1 = g->mutable_nodes().back();
o1.AsStmt().op_type = "op1";
g->mutable_nodes().emplace_back();
Node& o2 = g->mutable_nodes().back();
o2.AsStmt().op_type = "op2";
g->mutable_nodes().emplace_back();
Node& o3 = g->mutable_nodes().back();
o3.AsStmt().op_type = "op3";
g->mutable_nodes().emplace_back();
Node& o4 = g->mutable_nodes().back();
o4.AsStmt().op_type = "op4";
g->mutable_nodes().emplace_back();
Node& o5 = g->mutable_nodes().back();
o5.AsStmt().op_type = "op5";
g->mutable_nodes().emplace_back();
Node& v1 = g->mutable_nodes().back();
v1.AsArg("var1");
g->mutable_nodes().emplace_back();
Node& v2 = g->mutable_nodes().back();
v2.AsArg("var2");
g->mutable_nodes().emplace_back();
Node& v3 = g->mutable_nodes().back();
v3.AsArg("var3");
g->mutable_nodes().emplace_back();
Node& v4 = g->mutable_nodes().back();
v4.AsArg("var4");
// o1->v1->o2
o1.outlinks.push_back(&v1);
o2.inlinks.push_back(&v1);
v1.inlinks.push_back(&o1);
v1.outlinks.push_back(&o2);
// o2->v2->o3
// o2->v2->o4
o2.outlinks.push_back(&v2);
o3.inlinks.push_back(&v2);
o4.inlinks.push_back(&v2);
v2.inlinks.push_back(&o2);
v2.outlinks.push_back(&o3);
v2.outlinks.push_back(&o4);
// o2->v3->o5
o2.outlinks.push_back(&v3);
o5.inlinks.push_back(&v3);
v3.inlinks.push_back(&o2);
v3.outlinks.push_back(&o5);
// o3-v4->o5
o3.outlinks.push_back(&v4);
o5.inlinks.push_back(&v4);
v4.inlinks.push_back(&o3);
v4.outlinks.push_back(&o5);
}
TEST(PMPattern, NewNode) {
PMPattern x;
auto* n = x.NewNode([](const Node* x) { return true; });
ASSERT_TRUE(n);
ASSERT_EQ(x.nodes_.size(), 1UL);
}
TEST(PMPattern, AddEdge) {
PMPattern x;
auto* a = x.NewNode([](const Node* x) { return true; });
auto* b = x.NewNode([](const Node* x) { return true; });
ASSERT_TRUE(a);
ASSERT_TRUE(b);
x.AddEdge(a, b);
ASSERT_EQ(x.nodes_.size(), 2UL);
ASSERT_EQ(x.edges_.size(), 1UL);
ASSERT_EQ(x.edges_.front().first, a);
ASSERT_EQ(x.edges_.front().second, b);
ASSERT_EQ(x.nodes().size(), 2UL);
ASSERT_EQ(x.edges().size(), 1UL);
ASSERT_EQ(x.edges().front().first, a);
ASSERT_EQ(x.edges().front().second, b);
}
TEST(PatternMatcher, MarkPMNodesInGraph) {
PatternMatcher x;
// mark o2, o3, v2
// The pattern is a graph:
// o2(a node named o2) -> v2(a node named v2)
// v2 -> o3(a node named o3)
auto* o2 = x.pattern_.NewNode([](const Node* node) {
// The teller can be any condition, such as op type, or variable's shape.
return node && node->IsStmt() && node->stmt()->op_type == "op2";
});
auto* o3 = x.pattern_.NewNode([](const Node* node) {
// The teller can be any condition, such as op type, or variable's shape.
return node && node->IsStmt() && node->stmt()->op_type == "op3";
});
auto* v2 = x.pattern_.NewNode([](const Node* node) {
// The teller can be any condition, such as op type, or variable's shape.
return node && node->IsArg() && node->arg()->name == "var2";
});
ASSERT_FALSE(o2->Tell(nullptr));
ASSERT_FALSE(o3->Tell(nullptr));
ASSERT_FALSE(v2->Tell(nullptr));
x.pattern_.AddEdge(o2, v2);
x.pattern_.AddEdge(v2, o3);
ASSERT_EQ(x.pattern_.edges().size(), 2UL);
ASSERT_EQ(x.pattern_.edges()[0].first, o2);
ASSERT_EQ(x.pattern_.edges()[0].second, v2);
ASSERT_EQ(x.pattern_.edges()[1].first, v2);
ASSERT_EQ(x.pattern_.edges()[1].second, o3);
SSAGraph graph;
BuildGraph(&graph);
x.MarkPMNodesInGraph(&graph);
ASSERT_EQ(x.pmnodes2nodes_.size(), 3UL);
auto subgraphs = x.DetectPatterns();
ASSERT_EQ(subgraphs.size(), 1UL);
}
TEST(PatternMatcher, MultiSubgraph) {
SSAGraph graph;
BuildGraph(&graph);
PatternMatcher x;
// The pattern is a graph:
// op -> var
auto* any_op = x.mutable_pattern()->NewNode(
[](const Node* node) {
return node->IsStmt() && (node->stmt()->op_type == "op2" ||
node->stmt()->op_type == "op3");
},
"OP0");
auto* any_var =
x.mutable_pattern()
->NewNode([](const Node* node) { return node->IsArg(); }, "VAR")
->AsIntermediate();
auto* any_op1 = x.mutable_pattern()->NewNode(
[](const Node* node) { return node->IsStmt(); }, "OP1");
x.mutable_pattern()->AddEdge(any_op, any_var);
x.mutable_pattern()->AddEdge(any_var, any_op1);
int count = 0;
PatternMatcher::handle_t handle = [&](const PatternMatcher::subgraph_t& s,
SSAGraph* g) {
LOG(INFO) << "Detect " << s.at(any_op)->stmt()->op_type << " -> "
<< s.at(any_var)->arg()->name << " -> "
<< s.at(any_op1)->stmt()->op_type;
count++;
};
x(&graph, handle);
// 1. Detect op3 -> var4 -> op5
// 2. Detect op2 -> var2 -> op3
// 3. Detect op2 -> var2 -> op4
// 4. Detect op2 -> var3 -> op5
// But 2, 3 and 4 overlap, so only one of them is kept; the final
// detections are 1 and 2.
ASSERT_GE(count, 1);
ASSERT_LE(count, 2);
}
TEST(PatternMatcher, IntermediateCheck) {
SSAGraph graph;
BuildGraph(&graph);
// o2->v2->o3
// o2->v2->o4
// check o2+o3 fuse, should fail because v2 also link to o4.
PatternMatcher matcher;
auto* op2 = matcher.mutable_pattern()->NewNode(
[](const Node* x) {
return x && x->IsStmt() && x->stmt()->op_type == "op2";
},
"op2");
auto* op3 = matcher.mutable_pattern()->NewNode(
[](const Node* x) {
return x && x->IsStmt() && x->stmt()->op_type == "op3";
},
"op3");
auto* v2 = matcher.mutable_pattern()
->NewNode(
[](const Node* x) {
return x && x->IsArg() && x->arg()->name == "var2";
},
"var2")
->AsIntermediate();
v2->LinksFrom({op2}).LinksTo({op3});
int count = 0;
matcher(&graph, [&](const PatternMatcher::subgraph_t& g, SSAGraph* graph) {
++count;
});
EXPECT_EQ(count, 0);
count = 0;
v2->AsInput();
matcher(&graph, [&](const PatternMatcher::subgraph_t& g, SSAGraph* graph) {
++count;
});
ASSERT_EQ(count, 1);
}
} // namespace mir
} // namespace lite
} // namespace paddle
lite_cc_library(gen_code_lite SRCS gen_code.cc
DEPS program_lite op_lite scope_lite
cpp_op_desc_lite
HVY_DEPS operator)
lite_cc_library(paddle_infer_gencode SRCS paddle_infer.cc DEPS program_lite utils_lite)
if (NOT LITE_WITH_LIGHT_WEIGHT_FRAMEWORK)
lite_cc_test(test_gen_code_lite SRCS gen_code_test.cc DEPS gen_code_lite ${tensor_lite}
mul_op_lite
compatible_pb_lite
model_parser_lite
X86_DEPS mul_compute_x86
ARM_DEPS mul_compute_arm
ARGS --optimized_model=${LITE_MODEL_DIR}/lite_naive_model_opt SERIAL)
lite_cc_library(__generated_code__
SRCS ${CMAKE_BINARY_DIR}/paddle/fluid/lite/gen_code/__generated_code__.cc
DEPS scope_lite op_lite kernel_lite paddle_infer_gencode
)
lite_cc_test(test_generated_code SRCS generated_code_test.cc DEPS __generated_code__
${ops_lite} ${host_kernels}
X86_DEPS ${x86_kernels}
)
add_dependencies(__generated_code__ test_gen_code_lite)
endif()
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/gen_code/gen_code.h"
#include <algorithm>
#include <string>
#include <vector>
namespace paddle {
namespace lite {
namespace gencode {
void Module::AddWeight(const std::string &name, const TensorRepr &tensor) {
auto w_name = WeightUniqueName();
Line(string_format("// Create weight: %s", name.c_str()));
// auto* w0 = scope.Var("w0")->GetMutable<lite::Tensor>();
Line(string_format("auto* %s = scope->Var(%s)->GetMutable<lite::Tensor>();",
w_name.c_str(), Repr(name).c_str()));
// lite::DDim w_ddim({1, 2})
Line(string_format("lite::DDim %s_ddim(std::vector<int64_t>(%s));",
w_name.c_str(), tensor.ddim.repr().c_str()));
// std::vector<float> w_data({});
auto w_data_repr = DataRepr(
std::string(static_cast<const char *>(tensor.raw_data), tensor.num_bytes),
tensor.dtype);
Line(string_format("std::vector<%s> %s_data({%s});",
PrecisionToStr(tensor.dtype).c_str(), w_name.c_str(),
w_data_repr.c_str()));
// w0->Assign<float, lite::DDim, TARGET(kX86)>(w0_data.data(), w0_ddim);
Line(string_format(
"%s->Assign<%s, lite::DDim, TARGET(kX86)>(%s_data.data(), %s_ddim);",
w_name.c_str(), PrecisionToStr(tensor.dtype).c_str(), w_name.c_str(),
w_name.c_str()));
Line("");
}
void Module::AddHeaderIncludeGenCode() {
Line("");
Line("#include <string>");
Line("#include <vector>");
Line("#include \"paddle/fluid/lite/core/compatible_tensor.h\"");
Line("#include \"paddle/fluid/lite/core/context.h\"");
Line("#include \"paddle/fluid/lite/gen_code/paddle_infer.h\"");
Line("#include \"paddle/fluid/lite/core/op_registry.h\"");
Line("#include \"paddle/fluid/lite/core/scope.h\"");
Line("#include \"paddle/fluid/lite/model_parser/cpp/op_desc.h\"");
Line("");
Line("");
}
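// Serialize a tensor's raw bytes as a comma-separated C++ initializer list;
// only float payloads are supported for now.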
std::string Module::DataRepr(const std::string &raw_data, PrecisionType dtype) {
std::stringstream ss;
switch (dtype) {
case PRECISION(kFloat): {
const float *raw = reinterpret_cast<const float *>(raw_data.c_str());
int num_elems = raw_data.size() / sizeof(float);
if (num_elems) {
for (int i = 0; i < num_elems - 1; i++) {
ss << raw[i] << ",";
}
ss << raw[num_elems - 1];
}
} break;
default:
LOG(FATAL) << "Unsupported type " << PrecisionToStr(dtype);
}
return ss.str();
}
void Module::AddOpDescHelper(const std::string &op_id,
const cpp::OpDesc &desc) {
std::string desc_var = op_id + "_desc";
Line(string_format("lite::cpp::OpDesc %s;", desc_var.c_str()));
auto vec_str_repr = [](const std::vector<std::string> &vec) {
return Repr(vec);
};
for (auto &item : desc.inputs()) {
Line(string_format("%s.SetInput(%s, %s);", desc_var.c_str(),
Repr(item.first).c_str(),
vec_str_repr(item.second).c_str()));
}
for (auto &item : desc.outputs()) {
Line(string_format("%s.SetOutput(%s, %s);", desc_var.c_str(),
Repr(item.first).c_str(),
vec_str_repr(item.second).c_str()));
}
auto attr_repr = [&](const std::string &name) -> std::string {
using AttrType = OpDescAPI::AttrType;
auto type = desc.GetAttrType(name);
switch (type) {
case AttrType::INT:
return std::to_string(desc.GetAttr<int>(name));
case AttrType::FLOAT:
return std::to_string(desc.GetAttr<float>(name));
case AttrType::BOOLEAN:
return std::to_string(desc.GetAttr<bool>(name));
case AttrType::STRING:
return "\"" + desc.GetAttr<std::string>(name) + "\"";
case AttrType::STRINGS: {
std::vector<std::string> tmp;
auto vals = desc.GetAttr<std::vector<std::string>>(name);
std::transform(vals.begin(), vals.end(), std::back_inserter(tmp),
[](const std::string &x) { return Repr(x); });
return "{" + Join(tmp, ",") + "}";
}
default:
LOG(FATAL) << "Unsupported attribute type: " << static_cast<int>(type);
}
return "";
};
auto attr_type_repr = [&](const std::string &name) -> std::string {
using AttrType = OpDescAPI::AttrType;
auto type = desc.GetAttrType(name);
switch (type) {
case AttrType::INT:
return "int";
case AttrType::FLOAT:
return "float";
case AttrType::BOOLEAN:
return "bool";
case AttrType::STRING:
return "std::string";
case AttrType::STRINGS:
return "std::vector<std::string>";
default:
LOG(FATAL) << "Unsupported attribute type: " << static_cast<int>(type);
}
return "unk_t";
};
for (auto &item : desc.AttrNames()) {
// Drop the python information.
if (item == "op_callstack") continue;
auto attr_type = attr_type_repr(item);
auto attr_val = attr_repr(item);
Line(string_format("%s.SetAttr<%s>(%s, %s);", //
desc_var.c_str(), attr_type.c_str(), Repr(item).c_str(),
attr_val.c_str()));
}
}
void Module::AddOp(const cpp::OpDesc &op) {
auto op_name = OpUniqueName();
AddOpDescHelper(op_name, op);
Line(string_format("// Create Op: %s", op.Type().c_str()));
Line(string_format("auto %s = lite::LiteOpRegistry::Global().Create(\"%s\");",
op_name.c_str(), op.Type().c_str()));
CHECK(op.HasAttr(kKernelTypeAttr))
<< "the kernel type should be specified before generating code.";
auto kernel_type = op.GetAttr<std::string>(kKernelTypeAttr);
Line(string_format("%s->Attach(%s, exec_scope);", op_name.c_str(),
(op_name + "_desc").c_str()));
// Create kernel
auto kernel_name = KernelUniqueName();
Line(string_format(
"auto %s = std::move(%s->CreateKernels(valid_places, \"%s\").front());",
kernel_name.c_str(), op_name.c_str(), kernel_type.c_str()));
// Set Context for kernel
// clang-format off
Line(string_format("%s->SetContext(lite::ContextScheduler::Global().NewContext(%s->target()));", kernel_name.c_str(), kernel_name.c_str())); // NOLINT
// clang-format on
Line(string_format("ops.push_back(%s);", op_name.c_str()));
Line(string_format("kernels.push_back(std::move(%s));", kernel_name.c_str()));
op_kinds_.insert(op.Type());
kernel_kinds_.insert(kernel_type);
}
} // namespace gencode
} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <set>
#include <string>
#include <vector>
#include "paddle/fluid/lite/core/compatible_tensor.h"
#include "paddle/fluid/lite/core/framework.pb.h"
#include "paddle/fluid/lite/core/program.h"
#include "paddle/fluid/lite/core/target_wrapper.h"
#include "paddle/fluid/lite/model_parser/cpp/op_desc.h"
#include "paddle/fluid/lite/model_parser/desc_apis.h"
#include "paddle/fluid/lite/utils/string.h"
namespace paddle {
namespace lite {
namespace gencode {
struct TensorRepr {
TensorRepr() = default;
TensorRepr(PrecisionType dtype, const std::vector<int64_t> &ddim,
void *raw_data, size_t num_bytes)
: dtype(dtype), ddim(ddim), raw_data(raw_data), num_bytes(num_bytes) {}
PrecisionType dtype;
lite::DDim ddim;
const void *raw_data;
size_t num_bytes{};
};
class Module {
std::vector<cpp::OpDesc> ops;
std::vector<TensorRepr> weights;
std::vector<std::string> tmp_vars_;
std::stringstream stream_;
std::set<std::string> kernel_kinds_;
std::set<std::string> op_kinds_;
int line_indent_{};
const int indent_unit_{2};
public:
void NewOp(const cpp::OpDesc &desc) { ops.push_back(desc); }
void NewWeight(const TensorRepr &x) { weights.push_back(x); }
void NewTmpVar(const std::string &x) { tmp_vars_.push_back(x); }
std::stringstream &stream() { return stream_; }
void AddHeaderIncludeGenCode();
void AddNamespaceBegin() {
Line("namespace paddle {");
Line("namespace gencode{");
Line("");
}
void AddNamespaceEnd() {
Line("");
Line("} // namespace gencode");
Line("} // namespace paddle");
}
void AddInitFuncBegin() {
Line("void PaddlePredictor::Init() {");
Line("");
IncIndent();
}
void AddInitFuncEnd() {
DecIndent();
Line("");
Line("}");
}
void AddScopeDecl() {
Line("lite::Scope* scope = static_cast<lite::Scope*>(raw_scope_);");
// clang-format off
Line("lite::Scope* exec_scope = static_cast<lite::Scope*>(raw_exe_scope_);"); // NOLINT
// clang-format on
// Create feed and fetch in exec_scope.
Line(string_format("exec_scope->Var(%s);", Repr("feed").c_str()));
Line(string_format("exec_scope->Var(%s);", Repr("fetch").c_str()));
}
void AddValidPlaceDecl() {
// clang-format off
Line("std::vector<lite::Place> valid_places({lite::Place({TARGET(kX86), PRECISION(kFloat), DATALAYOUT(kNCHW)}), lite::Place({TARGET(kHost), PRECISION(kAny), DATALAYOUT(kAny)})});"); // NOLINT
// clang-format on
}
void AddMemberCast() {
Line("// Cast the raw members");
// clang-format off
Line(string_format("auto& ops = *static_cast<std::vector<std::shared_ptr<lite::OpLite>>*>(raw_ops_);")); // NOLINT
Line(string_format("auto& kernels = *static_cast<std::vector<std::unique_ptr<lite::KernelBase>>*>(raw_kernels_);")); // NOLINT
// clang-format on
Line("");
}
void AddWeight(const std::string &name, const TensorRepr &tensor);
void AddTmpVar(const std::string &x) {
Line(string_format("// Create temporary variable: %s", x.c_str()));
Line(string_format("exec_scope->Var(%s);", Repr(x).c_str()));
Line("");
}
void AddOp(const cpp::OpDesc &op);
void AddOpDescHelper(const std::string &op_id, const cpp::OpDesc &desc);
void AddOpCompileDeps() {
Line("");
Line("// Add Operator compile deps");
for (auto &op_type : op_kinds_) {
Line(string_format("USE_LITE_OP(%s)", op_type.c_str()));
}
Line("");
}
void AddKernelCompileDeps() {
Line("// Add Kernel compile deps");
std::string op_type, alias;
Place place;
for (auto &kernel_type : kernel_kinds_) {
KernelBase::ParseKernelType(kernel_type, &op_type, &alias, &place);
Line(string_format("USE_LITE_KERNEL(%s, %s, %s, %s, %s)", //
op_type.c_str(), //
TargetRepr(place.target).c_str(),
PrecisionRepr(place.precision).c_str(),
DataLayoutRepr(place.layout).c_str(), alias.c_str()));
}
}
private:
std::string WeightUniqueName() const {
return "w_" + std::to_string(weight_counter_++);
}
std::string TmpVarUniqueName() const {
return "tmp_" + std::to_string(tmp_var_counter_++);
}
std::string OpUniqueName() const {
return "op_" + std::to_string(op_counter_++);
}
std::string KernelUniqueName() const {
return "kernel_" + std::to_string(kernel_counter_++);
}
std::string DataRepr(const std::string &raw_data, PrecisionType dtype);
void IncIndent() { line_indent_++; }
void DecIndent() { line_indent_--; }
void Line(const std::string &x) {
std::string indent_str(line_indent_ * indent_unit_, ' ');
stream() << indent_str << x << "\n";
}
private:
mutable int weight_counter_{};
mutable int tmp_var_counter_{};
mutable int op_counter_{};
mutable int kernel_counter_{};
};
class ProgramCodeGenerator {
public:
ProgramCodeGenerator(const framework::proto::ProgramDesc &program,
const lite::Scope &exec_scope)
: program_(program), exec_scope_(exec_scope) {
LOG(INFO) << program.DebugString();
}
std::string GenCode() {
Module m;
m.AddHeaderIncludeGenCode();
m.AddNamespaceBegin();
m.AddInitFuncBegin();
m.AddMemberCast();
m.AddScopeDecl();
m.AddValidPlaceDecl();
AddWeights(&m);
AddTmpVars(&m);
AddOps(&m);
m.AddInitFuncEnd();
m.AddNamespaceEnd();
m.AddOpCompileDeps();
m.AddKernelCompileDeps();
return m.stream().str();
}
void AddWeights(Module *m) {
for (auto &var : program_.blocks(0).vars()) {
if (var.persistable()) {
auto name = var.name();
if (name == "feed" || name == "fetch") continue;
const auto &tensor = exec_scope_.FindVar(name)->Get<lite::Tensor>();
TensorRepr repr;
TensorToRepr(tensor, &repr);
m->AddWeight(name, repr);
}
}
}
void AddTmpVars(Module *m) {
for (auto &var : program_.blocks(0).vars()) {
if (!var.persistable()) {
m->AddTmpVar(var.name());
}
}
}
void AddOps(Module *m) {
for (auto &op : program_.blocks(0).ops()) {
pb::OpDesc pb_desc(op);
cpp::OpDesc cpp_desc;
TransformOpDescPbToCpp(pb_desc, &cpp_desc);
m->AddOp(cpp_desc);
}
}
private:
void TensorToRepr(const lite::Tensor &tensor, TensorRepr *repr) {
repr->ddim = tensor.dims();
// TODO(Superjomn) support other types.
repr->dtype = PRECISION(kFloat);
repr->raw_data = tensor.data<float>();
repr->num_bytes = repr->ddim.production() * sizeof(float);
}
private:
const framework::proto::ProgramDesc &program_;
const lite::Scope &exec_scope_;
};
} // namespace gencode
} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/gen_code/gen_code.h"
#include <gflags/gflags.h>
#include <gtest/gtest.h>
#include <fstream>
#include <string>
#include <utility>
#include <vector>
#include "paddle/fluid/lite/core/compatible_tensor.h"
#include "paddle/fluid/lite/core/context.h"
#include "paddle/fluid/lite/core/op_registry.h"
#include "paddle/fluid/lite/core/scope.h"
#include "paddle/fluid/lite/model_parser/cpp/op_desc.h"
#include "paddle/fluid/lite/model_parser/model_parser.h"
DEFINE_string(optimized_model, "", "");
DEFINE_string(generated_code_file, "__generated_code__.cc", "");
namespace paddle {
namespace lite {
namespace gencode {
// Manually construct a program.
TEST(gen_code, manual) {
// For holding the weights.
lite::Scope scope;
// For holding the temporary variables.
auto &tmp_scope = scope.NewScope();
// Create weight variables.
auto *w0 = scope.Var("w0")->GetMutable<lite::Tensor>();
// Create temporary variables.
auto *a = tmp_scope.Var("x")->GetMutable<lite::Tensor>();
tmp_scope.Var("out")->GetMutable<lite::Tensor>();
// Set weights.
std::vector<float> w0_data({0, 1, 2, 3});
w0->Assign<float, lite::DDim, TARGET(kX86)>(
w0_data.data(), lite::DDim{std::vector<int64_t>({2, 2})});
std::vector<float> a_data({0, 1, 2, 3});
a->Assign<float, lite::DDim, TARGET(kX86)>(
a_data.data(), lite::DDim{std::vector<int64_t>({2, 2})});
std::vector<Place> valid_places({
Place{TARGET(kX86), PRECISION(kFloat)},
Place{TARGET(kHost), PRECISION(kFloat)},
Place{TARGET(kHost), PRECISION(kAny)},
});
auto mul_op = LiteOpRegistry::Global().Create("mul");
cpp::OpDesc mul_op_desc;
mul_op_desc.SetType("mul");
mul_op_desc.SetInput("X", {"x"});
mul_op_desc.SetInput("Y", {"w0"});
mul_op_desc.SetAttr("x_num_col_dims", 1);
mul_op_desc.SetAttr("y_num_col_dims", 1);
mul_op_desc.SetOutput("Out", {"out"});
mul_op->Attach(mul_op_desc, &tmp_scope);
auto mul_kernel = std::move(mul_op->CreateKernels(valid_places).front());
auto fc_ctx = ContextScheduler::Global().NewContext(TARGET(kX86));
mul_op->CheckShape();
mul_op->InferShape();
mul_kernel->SetContext(std::move(fc_ctx));
mul_kernel->Launch();
}
TEST(gen_code, auto_gen) {
std::vector<float> w0_data({0, 1, 2, 3});
TensorRepr w0(PRECISION(kFloat), std::vector<int64_t>({2, 2}), w0_data.data(),
w0_data.size() * sizeof(float));
std::vector<float> w1_data({0.01, 1.2, 2.3, 3.4, 1.1, 2.2});
TensorRepr w1(PRECISION(kFloat), std::vector<int64_t>({3, 2}), w1_data.data(),
w1_data.size() * sizeof(float));
cpp::OpDesc op0;
op0.SetType("mul");
op0.SetInput("X", {"a", "b"});
op0.SetOutput("Out", {"out0"});
op0.SetAttr<std::string>("desc", "this is a desc");
op0.SetAttr<int>("x_col", 1);
op0.SetAttr<int>("y_col", 2);
op0.SetAttr<std::string>(kKernelTypeAttr, "x86");
gencode::Module module;
module.AddHeaderIncludeGenCode();
module.AddNamespaceBegin();
module.AddInitFuncBegin();
module.AddMemberCast();
module.AddWeight("w0", w0);
module.AddWeight("w1", w1);
module.AddTmpVar("a");
module.AddTmpVar("b");
module.AddOp(op0);
module.AddInitFuncEnd();
module.AddNamespaceEnd();
LOG(INFO) << module.stream().str();
}
TEST(gen_code, optimized_program) {
lite::Scope scope;
framework::proto::ProgramDesc desc;
LoadModel(FLAGS_optimized_model, &scope, &desc);
ProgramCodeGenerator codegen(desc, scope);
std::ofstream file(FLAGS_generated_code_file);
file << codegen.GenCode();
file.close();
}
} // namespace gencode
} // namespace lite
} // namespace paddle
USE_LITE_OP(mul);
#ifdef LITE_WITH_X86
USE_LITE_KERNEL(mul, kX86, kFloat, kNCHW, def);
#endif
#ifdef LITE_WITH_ARM
USE_LITE_KERNEL(mul, kARM, kFloat, kNCHW, def);
#endif
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <glog/logging.h>
#include <gtest/gtest.h>
#include "paddle/fluid/lite/gen_code/paddle_infer.h"
namespace paddle {
namespace lite {
TEST(PaddlePredictor, Init) {
gencode::PaddlePredictor predictor;
predictor.Init();
}
TEST(PaddlePredictor, Run) {
gencode::PaddlePredictor predictor;
predictor.Init();
LOG(INFO) << "run the generated code";
auto input_tensor = predictor.GetInput(0);
input_tensor->Resize(std::vector<int64_t>({100, 100}));
auto* data = input_tensor->mutable_data<float>();
for (int i = 0; i < 100 * 100; i++) {
data[i] = i;
}
predictor.Run();
auto output_tensor = predictor.GetOutput(0);
LOG(INFO) << "output: " << output_tensor->data<float>()[0];
}
} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/gen_code/paddle_infer.h"
#include "paddle/fluid/lite/core/compatible_tensor.h"
#include "paddle/fluid/lite/core/op_lite.h"
namespace paddle {
namespace gencode {
void Tensor::Resize(const Tensor::ddim_t &shape) {
CHECK(raw_mutable_tensor_);
auto *tensor = static_cast<lite::Tensor *>(raw_mutable_tensor_);
tensor->Resize(shape);
}
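// X-macro: FOR_EACH_TYPE(HANDLE) expands HANDLE(T) for every supported
// element type, stamping out the explicit specializations below.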
#define FOR_EACH_TYPE(HANDLE) \
HANDLE(int); \
HANDLE(float); \
HANDLE(int8_t); \
HANDLE(int64_t);
#define IMPL_DATA(T) \
template <> \
const T *Tensor::data<T>() const { \
CHECK(raw_tensor_); \
const auto *tensor = static_cast<const lite::Tensor *>(raw_tensor_); \
return tensor->data<T>(); \
}
FOR_EACH_TYPE(IMPL_DATA);
#undef IMPL_DATA
#define IMPL_MUTABLE_DATA(T) \
template <> \
T *Tensor::mutable_data<T>() { \
CHECK(raw_mutable_tensor_); \
auto *tensor = static_cast<lite::Tensor *>(raw_mutable_tensor_); \
return tensor->mutable_data<T>(); \
}
FOR_EACH_TYPE(IMPL_MUTABLE_DATA);
#undef IMPL_MUTABLE_DATA
PaddlePredictor::PaddlePredictor() {
raw_ops_ = new std::vector<std::shared_ptr<lite::OpLite>>;
raw_kernels_ = new std::vector<std::unique_ptr<lite::KernelBase>>;
raw_scope_ = new lite::Scope;
raw_exe_scope_ = &(static_cast<lite::Scope *>(raw_scope_)->NewScope());
}
std::unique_ptr<Tensor> PaddlePredictor::GetTensor(
const std::string &id) const {
auto *exe_scope = static_cast<lite::Scope *>(raw_exe_scope_);
const auto *var = exe_scope->FindVar(id);
const auto &tensor = var->Get<lite::Tensor>();
return std::unique_ptr<Tensor>(new Tensor(&tensor, nullptr));
}
std::unique_ptr<Tensor> PaddlePredictor::GetMutableTensor(
const std::string &id) {
auto *exe_scope = static_cast<lite::Scope *>(raw_exe_scope_);
auto *var = exe_scope->FindVar(id);
auto *tensor = var->GetMutable<lite::Tensor>();
return std::unique_ptr<Tensor>(new Tensor(nullptr, tensor));
}
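// Helpers that cast the type-erased void* members of PaddlePredictor back to
// their concrete types; used by the destructor and Run() below.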
#define CAST_OPS \
auto *ops = \
static_cast<std::vector<std::shared_ptr<lite::OpLite>> *>(raw_ops_);
#define CAST_KERNELS \
auto *kernels = \
static_cast<std::vector<std::unique_ptr<lite::KernelBase>> *>( \
raw_kernels_);
#define CAST_SCOPE auto *scope = static_cast<lite::Scope *>(raw_scope_);
PaddlePredictor::~PaddlePredictor() {
CAST_OPS
CAST_KERNELS
CAST_SCOPE
if (ops) {
delete ops;
}
if (kernels) {
delete kernels;
}
if (scope) {
delete scope;
}
}
void PaddlePredictor::Run() {
CAST_OPS
CAST_KERNELS
CHECK(ops);
CHECK(kernels);
CHECK_EQ(ops->size(), kernels->size());
for (size_t i = 0; i < ops->size(); i++) {
LOG(INFO) << "Running the " << i << "-th operator";
ops->at(i)->InferShape();
kernels->at(i)->Launch();
}
}
std::unique_ptr<Tensor> PaddlePredictor::GetInput(size_t offset) {
auto *exec_scope = static_cast<lite::Scope *>(raw_exe_scope_);
auto *_feed_list = exec_scope->FindVar("feed");
CHECK(_feed_list) << "no feed variable in exec_scope";
auto *feed_list = _feed_list->GetMutable<std::vector<lite::Tensor>>();
if (offset >= feed_list->size()) {
feed_list->resize(offset + 1);
}
return std::unique_ptr<Tensor>(new Tensor(nullptr, &feed_list->at(offset)));
}
std::unique_ptr<Tensor> PaddlePredictor::GetOutput(size_t offset) {
auto *exec_scope = static_cast<lite::Scope *>(raw_exe_scope_);
auto *_fetch_list = exec_scope->FindVar("fetch");
CHECK(_fetch_list) << "no fetch variable in exec_scope";
auto &fetch_list = *_fetch_list->GetMutable<std::vector<lite::Tensor>>();
CHECK_LT(offset, fetch_list.size()) << "offset " << offset << " overflow";
return std::unique_ptr<Tensor>(new Tensor(&fetch_list.at(offset), nullptr));
}
} // namespace gencode
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <memory>
#include <string>
#include <vector>
namespace paddle {
namespace gencode {
/// Zero Copy Tensor.
class Tensor {
public:
using ddim_t = std::vector<int64_t>;
Tensor(const void *raw_tensor, void *raw_mutable_tensor)
: raw_tensor_(raw_tensor), raw_mutable_tensor_(raw_mutable_tensor) {}
void Resize(const ddim_t &shape);
template <typename T>
const T *data() const;
template <typename T>
T *mutable_data();
private:
const void *raw_tensor_;
void *raw_mutable_tensor_{};
};
/*
* Predictor for the generated code.
*/
class PaddlePredictor {
public:
void Init();
std::unique_ptr<Tensor> GetTensor(const std::string &id) const;
std::unique_ptr<Tensor> GetMutableTensor(const std::string &id);
// Get offset-th col of feed.
std::unique_ptr<Tensor> GetInput(size_t offset);
std::unique_ptr<Tensor> GetOutput(size_t offset);
void Run();
PaddlePredictor();
~PaddlePredictor();
private:
void *raw_ops_;
void *raw_kernels_;
void *raw_scope_{};
void *raw_exe_scope_{}; // raw_exe_scope is not owned.
};
} // namespace gencode
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <Eigen/Core>
#include <string>
#include <vector>
#include "paddle/fluid/framework/eigen.h"
#include "paddle/fluid/lite/core/kernel.h"
#include "paddle/fluid/lite/core/op_registry.h"
#include "paddle/fluid/lite/core/types.h"
#include "paddle/fluid/lite/operators/conv_op.h"
#include "paddle/fluid/operators/math/blas.h"
#include "paddle/fluid/operators/math/depthwise_conv.h"
#include "paddle/fluid/operators/math/im2col.h"
#include "paddle/fluid/operators/math/vol2col.h"
namespace paddle {
namespace lite {
namespace kernels {
namespace x86 {
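// Whether im2col/vol2col expansion is needed: a convolution with 1x1 filters,
// stride 1, zero padding and dilation 1 can read the input directly, so no
// expansion buffer is required.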
inline bool IsExpand(const std::vector<int64_t>& filter_dim,
const std::vector<int>& strides,
const std::vector<int>& paddings,
const std::vector<int>& dilations) {
bool filter_1 = true, strides_1 = true, padding_0 = true, dilation_1 = true;
for (size_t j = 0; j < strides.size(); ++j) {
filter_1 = filter_1 && (static_cast<int>(filter_dim[j + 2]) == 1);
strides_1 = strides_1 && (strides[j] == 1);
padding_0 = padding_0 && (paddings[j] == 0);
dilation_1 = dilation_1 && (dilations[j] == 1);
}
return !(filter_1 && strides_1 && padding_0 && dilation_1);
}
template <typename T>
class Conv2dCompute : public KernelLite<TARGET(kX86), PRECISION(kFloat)> {
public:
using param_t = operators::ConvParam;
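// Convolution implemented as (optional) im2col/vol2col expansion followed by
// a grouped GEMM, processing one batch item and one group at a time.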
void Run() override {
auto& param = *param_.get_mutable<operators::ConvParam>();
lite::Tensor filter = *param.filter;
param.output->template mutable_data<T>();
const int batch_size = static_cast<int>(param.x->dims()[0]);
std::vector<int64_t> filter_shape_vec(filter.dims().Vectorize());
std::vector<int64_t> output_shape_vec(param.output->dims().Vectorize());
size_t data_dim = filter_shape_vec.size() - 2;
std::vector<int64_t> col_shape_vec(1 + 2 * data_dim);
col_shape_vec[0] = param.x->dims()[1] / param.groups;
for (size_t j = 0; j < data_dim; ++j) {
col_shape_vec[j + 1] = filter_shape_vec[j + 2];
col_shape_vec[j + 1 + data_dim] = output_shape_vec[j + 2];
}
lite::DDim col_shape(col_shape_vec);
lite::DDim col_matrix_shape = col_shape.Flattern2D(data_dim + 1);
bool is_expand = IsExpand(filter_shape_vec, param.strides, param.paddings,
param.dilations);
lite::Tensor col;
lite::Tensor col_matrix;
if (is_expand) {
col.Resize(col_shape);
col_matrix.ShareDataWith(col);
col_matrix.Resize(col_matrix_shape);
}
lite::DDim input_shape = param.x->dims().Slice(1, param.x->dims().size());
lite::DDim filter_matrix_shape(std::vector<int64_t>{
filter.dims()[0], filter.dims().production() / filter.dims()[0]});
filter.Resize(filter_matrix_shape);
lite::DDim output_matrix_shape(std::vector<int64_t>{
param.output->dims()[1],
param.output->dims().production() /
(param.output->dims()[0] * param.output->dims()[1])});
int in_step = static_cast<int>(param.x->dims()[1]) / param.groups;
int out_step = static_cast<int>(param.output->dims()[1]) / param.groups;
paddle::operators::math::Vol2ColFunctor<platform::CPUDeviceContext, T>
vol2col;
paddle::operators::math::Im2ColFunctor<
paddle::operators::math::ColFormat::kCFO, platform::CPUDeviceContext, T>
im2col;
auto blas = paddle::operators::math::GetBlas<platform::CPUDeviceContext, T>(
platform::CPUDeviceContext());
for (int i = 0; i < batch_size; i++) {
lite::Tensor in_batch;
in_batch.ShareDataWith(
param.x->raw_tensor().Slice(i, i + 1).Resize(input_shape.data()));
lite::Tensor out_batch;
// Reshape each output batch item to a 2-D matrix for the grouped GEMM below.
out_batch.ShareDataWith(param.output->raw_tensor().Slice(i, i + 1).Resize(
output_matrix_shape.data()));
for (int g = 0; g < param.groups; g++) {
lite::Tensor in_slice;
in_slice.ShareDataWith(
in_batch.raw_tensor().Slice(g * in_step, (g + 1) * in_step));
if (!is_expand) {
col.ShareDataWith(in_slice);
col_matrix.ShareDataWith(col);
col_matrix.Resize(col_matrix_shape);
} else if (data_dim == 2U) {
// im2col
im2col(platform::CPUDeviceContext(), in_slice.raw_tensor(),
param.dilations, param.strides,
std::vector<int>{param.paddings[0], param.paddings[1],
param.paddings[0], param.paddings[1]},
&(col.raw_tensor()));
} else if (data_dim == 3U) {
// vol2col
vol2col(platform::CPUDeviceContext(), in_slice.raw_tensor(),
param.dilations, param.strides, param.paddings,
&(col.raw_tensor()));
}
// gemm
lite::Tensor out_slice;
out_slice.ShareDataWith(
out_batch.raw_tensor().Slice(g * out_step, (g + 1) * out_step));
lite::Tensor filter_slice;
filter_slice.ShareDataWith(
filter.raw_tensor().Slice(g * out_step, (g + 1) * out_step));
blas.MatMul(filter_slice.raw_tensor(), false, col_matrix.raw_tensor(),
false, T(1.0), &(out_slice.raw_tensor()), T(0.0));
}
}
}
virtual ~Conv2dCompute() = default;
};
} // namespace x86
} // namespace kernels
} // namespace lite
} // namespace paddle
REGISTER_LITE_KERNEL(conv2d, kX86, kFloat, kNCHW,
paddle::lite::kernels::x86::Conv2dCompute<float>, def)
.BindInput("Input", {LiteType::GetTensorTy(TARGET(kX86))})
.BindInput("Filter", {LiteType::GetTensorTy(TARGET(kX86))})
.BindInput("Bias", {LiteType::GetTensorTy(TARGET(kX86))})
.BindOutput("Output", {LiteType::GetTensorTy(TARGET(kX86))})
.Finalize();
REGISTER_LITE_KERNEL(depthwise_conv2d, kX86, kFloat, kNCHW,
paddle::lite::kernels::x86::Conv2dCompute<float>, def)
.BindInput("Input", {LiteType::GetTensorTy(TARGET(kX86))})
.BindInput("Filter", {LiteType::GetTensorTy(TARGET(kX86))})
.BindInput("Bias", {LiteType::GetTensorTy(TARGET(kX86))})
.BindOutput("Output", {LiteType::GetTensorTy(TARGET(kX86))})
.Finalize();
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <Eigen/Core>
#include "paddle/fluid/framework/eigen.h"
#include "paddle/fluid/lite/core/kernel.h"
#include "paddle/fluid/lite/core/op_registry.h"
#include "paddle/fluid/lite/core/types.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/operators/math/pooling.h"
namespace paddle {
namespace lite {
namespace kernels {
namespace x86 {
template <typename T>
class PoolCompute : public KernelLite<TARGET(kX86), PRECISION(kFloat)> {
public:
using param_t = operators::PoolParam;
void Run() override {
auto& param = *param_.get_mutable<param_t>();
if (param.global_pooling) {
for (size_t i = 0; i < param.ksize.size(); ++i) {
param.paddings[i] = 0;
param.ksize[i] = static_cast<int>(param.x->dims()[i + 2]);
}
}
switch (param.ksize.size()) {
case 2: {
if (param.pooling_type == "max") {
paddle::operators::math::Pool2dFunctor<
platform::CPUDeviceContext, paddle::operators::math::MaxPool<T>,
T>
pool2d_forward;
paddle::operators::math::MaxPool<T> pool_process;
pool2d_forward(platform::CPUDeviceContext(), param.x->raw_tensor(),
param.ksize, param.strides, param.paddings,
pool_process, true, false,
&(param.output->raw_tensor()));
} else if (param.pooling_type == "avg") {
paddle::operators::math::Pool2dFunctor<
platform::CPUDeviceContext, paddle::operators::math::AvgPool<T>,
T>
pool2d_forward;
paddle::operators::math::AvgPool<T> pool_process;
pool2d_forward(platform::CPUDeviceContext(), param.x->raw_tensor(),
param.ksize, param.strides, param.paddings,
pool_process, param.exclusive, param.adaptive,
&(param.output->raw_tensor()));
}
} break;
case 3: {
} break;
}
}
virtual ~PoolCompute() = default;
};
} // namespace x86
} // namespace kernels
} // namespace lite
} // namespace paddle
REGISTER_LITE_KERNEL(pool2d, kX86, kFloat, kNCHW,
paddle::lite::kernels::x86::PoolCompute<float>, def)
.BindInput("X", {LiteType::GetTensorTy(TARGET(kX86))})
.BindOutput("Out", {LiteType::GetTensorTy(TARGET(kX86))})
.Finalize();
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "paddle/fluid/lite/utils/string.h"
namespace paddle {
namespace lite {} // namespace lite
} // namespace paddle
// Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <stdarg.h> // For va_start, etc.
#include <algorithm>
#include <cstring>
#include <memory> // For std::unique_ptr
#include <sstream>
#include <string>
#include <vector>
namespace paddle {
namespace lite {
static std::string string_format(const std::string fmt_str, ...) {
/* Reserve two times as much as the length of the fmt_str */
int final_n, n = (static_cast<int>(fmt_str.size())) * 2;
std::unique_ptr<char[]> formatted;
va_list ap;
while (1) {
formatted.reset(
new char[n]); /* Wrap the plain char array into the unique_ptr */
std::strcpy(&formatted[0], fmt_str.c_str()); // NOLINT
va_start(ap, fmt_str);
final_n = vsnprintf(&formatted[0], n, fmt_str.c_str(), ap);
va_end(ap);
if (final_n < 0 || final_n >= n)
n += abs(final_n - n + 1);
else
break;
}
return std::string(formatted.get());
}
template <typename T>
static std::string to_string_with_precision(const T& v, const int n = 6) {
std::stringstream ss;
ss.precision(n);
ss << std::fixed << v;
return ss.str();
}
static std::string Join(const std::vector<std::string>& vec,
const std::string& delim) {
if (vec.empty()) return "";
std::stringstream ss;
for (size_t i = 0; i < vec.size() - 1; i++) ss << vec[i] << delim;
if (!vec.empty()) {
ss << vec.back();
}
return ss.str();
}
static std::string Repr(const std::string& x) { return "\"" + x + "\""; }
static std::string Repr(const std::vector<std::string>& v) {
std::vector<std::string> tmp;
std::transform(v.begin(), v.end(), std::back_inserter(tmp),
[](const std::string& x) { return Repr(x); });
return "{" + Join(tmp, ",") + "}";
}
} // namespace lite
} // namespace paddle
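// A minimal usage sketch of the helpers above (hypothetical values, not taken
// from the repository):
//   using namespace paddle::lite;
//   string_format("op_%d", 3);                 // -> "op_3"
//   Join({"a", "b", "c"}, ", ");               // -> "a, b, c"
//   Repr("x");                                 // -> "\"x\""
//   Repr(std::vector<std::string>{"x", "y"});  // -> "{\"x\",\"y\"}"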