# Python v2 API to new operator framework

Created by: wangkuiyi

Design Doc: RNNOp
## A Plain Network
```python
predict = paddle.layer.fc(
    paddle.layer.data(name="x"),
    output_size=100)
cost = paddle.layer.mse(
    predict,
    paddle.layer.data(name="y"))
parameters = paddle.train(cost)
paddle.save_model(cost, paddle.datasets.mnist.train(), parameters, "filename")
p = paddle.load_model(predict, "filename")
paddle.infer(predict, ...)
```
## Layers, Variables, and Default Scope
```python
# in package paddle.layer

def data(name):
    return paddle.cpp.variable(paddle.cpp.default_scope(), name)

def fc(input, output_size):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    W = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    b = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    paddle.cpp.operator("FC", read={input, W, b}, write={output},
                        output_size=output_size)
    return output

def mse(input1, input2):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    paddle.cpp.operator("MSE", read={input1, input2}, write={output})
    return output
```
where

- `paddle.cpp.variable` is a Python binding of the C++ method `Scope::NewVar()`; a sketch of such a binding follows this list.
- `paddle.cpp.operator` creates an operator and marks it as a reader of some variables and a writer of some others. We will cover this in more detail later.
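Here is a minimal sketch of what the `paddle.cpp` binding layer could look like, assuming pybind11-style bindings; the `Scope` interface, `DefaultScope()`, and the anonymous-variable naming scheme are simplified assumptions for illustration, not Paddle's actual code:

```cpp
// A simplified sketch of a paddle.cpp binding layer, assuming pybind11-style
// bindings.  Scope, DefaultScope(), and the anonymous-variable naming are
// illustrative assumptions, not Paddle's actual code.
#include <pybind11/pybind11.h>

#include <map>
#include <string>

namespace py = pybind11;

class Variable {};

class Scope {
 public:
  // Creates (or returns) the variable with the given name; anonymous
  // variables get a generated name so that they still live in the scope.
  Variable* NewVar(const std::string& name) {
    std::string key =
        name.empty() ? "@anonymous_" + std::to_string(vars_.size()) : name;
    return &vars_[key];
  }

 private:
  std::map<std::string, Variable> vars_;
};

Scope* DefaultScope() {
  static Scope scope;  // one process-wide default scope
  return &scope;
}

PYBIND11_MODULE(cpp, m) {
  py::class_<Variable>(m, "Variable");
  py::class_<Scope>(m, "Scope");
  m.def("default_scope", &DefaultScope, py::return_value_policy::reference);
  m.def("variable",
        [](Scope* scope, const std::string& name) { return scope->NewVar(name); },
        py::arg("scope"), py::arg("name") = "",
        py::return_value_policy::reference);
}
```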
## paddle::operator::Net
`paddle.train` receives a variable created by `paddle.layer.mse` and needs to trace all related operators and sort them in topological order.
Please be aware that all operators are derived from class `OperatorBase`, which refers to Variables by their names:

```cpp
class OperatorBase {
 public:
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
};
```
and Variables don't have names if they are not in a Scope. Also, each Variable maintains:

```cpp
class Variable {
 public:
  std::list<Operator*> readers_;
  std::list<Operator*> writers_;
};
```
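As an illustration, here is a minimal sketch of how `paddle.cpp.operator` could fill in this bookkeeping when it creates an operator; `Connect` and `NamedVar` are hypothetical helpers, not Paddle's actual API:

```cpp
// A sketch of the reader/writer bookkeeping: the operator records variable
// *names*, while each variable records the operators that read or write it.
// Connect and NamedVar are hypothetical helpers for illustration.
#include <list>
#include <string>
#include <utility>
#include <vector>

struct OperatorBase {
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
};

struct Variable {
  std::list<OperatorBase*> readers_;
  std::list<OperatorBase*> writers_;
};

using NamedVar = std::pair<std::string, Variable*>;

void Connect(OperatorBase* op,
             const std::vector<NamedVar>& reads,
             const std::vector<NamedVar>& writes) {
  for (const auto& [name, var] : reads) {
    op->inputs_.push_back(name);  // the operator refers to the variable by name
    var->readers_.push_back(op);  // the variable marks the operator as a reader
  }
  for (const auto& [name, var] : writes) {
    op->outputs_.push_back(name);
    var->writers_.push_back(op);
  }
}
```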
Please be aware that the trace from an operator to its input variables depends on the default scope. The tracing is done in C++ space, so `paddle.cpp.default_scope` is a binding to C++ code.
```cpp
class Net {
 public:
  // Traces backward from `output` through writer operators and builds a
  // net whose operators are sorted in topological order.
  static Net* TraceAndBuild(Variable* output, Scope* scope) {
    std::list<std::pair<Operator*, int /*distance to output*/>> dists;
    std::list<std::pair<Variable*, int /*distance to output*/>> frontier;
    frontier.push_back(std::make_pair(output, 0));
    while (!frontier.empty()) {
      Variable* v = frontier.front().first;
      int dist = frontier.front().second;
      frontier.pop_front();
      for (Operator* o : v->writers_) {
        dists.push_back(std::make_pair(o, dist));
        // Follow the operator's inputs further away from the output.
        for (const std::string& s : o->inputs_) {
          frontier.push_back(std::make_pair(scope->FindVar(s), dist + 1));
        }
      }
    }
    dists.sort(/* by the descending order of dist */);
    return new Net(dists);
  }
};
```
We can call

```cpp
Net::TraceAndBuild(output_variable, DefaultScope())->Run(DefaultScope());
```

to extract the network using the default scope and run it.
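For intuition, here is how this trace would proceed on the plain network above; `cost_var` is a hypothetical handle to the variable returned by `paddle.layer.mse`:

```cpp
// A worked trace, starting from the cost variable:
//
//   pop (cost, 0):    cost.writers_ = {MSE}   ->  dists += (MSE, 0)
//                     MSE reads predict, y    ->  frontier += (predict, 1), (y, 1)
//   pop (predict, 1): predict.writers_ = {FC} ->  dists += (FC, 1)
//                     FC reads x, W, b        ->  frontier += (x, 2), (W, 2), (b, 2)
//   x, y, W, and b have no writers, so the trace stops.
//
// Sorting by descending distance yields [FC, MSE]: FC runs before MSE.
Net* net = Net::TraceAndBuild(cost_var, DefaultScope());
net->Run(DefaultScope());
```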
## Scope Hierarchy
An RNN operator may refer to three kinds of variables:

- global variables -- in the outer scope
- memory variables -- in the RNNOp-local scope
- local variables -- in a step-local scope
```
          outer scope
             /|\
              |
         RNNOp scope
   (the memory over steps)
   /|\       /|\       /|\
    |         |         |
  step-0    step-1    step-2
  scope     scope     scope
```
Just like what a programming language compiler/interpreter would do, for each step there is a step-local scope, but there is only one copy of the compiled code (binary code), or of the step-net in our case.

The above three tiers can be simplified to two tiers by moving memory variables to the outer scope, but this is not necessary:
```
  outer scope (including all memory variables of an RNNOp)
   /|\       /|\       /|\
    |         |         |
  step-0    step-1    step-2
  scope     scope     scope
```
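The key mechanism is that a scope can delegate name lookup to its parent. Below is a minimal sketch of such a hierarchical `Scope` (a simplification for illustration, not Paddle's actual class): `FindVar` falls back to the parent, so a step-local scope resolves step-local variables first, then memory variables in the RNNOp scope, then global variables in the outer scope:

```cpp
// A minimal hierarchical Scope sketch: name lookup falls back to the
// parent scope, which is how step-local scopes see the RNNOp-local
// memory variables and the outer scope's global variables.
#include <map>
#include <string>

class Variable {};

class Scope {
 public:
  explicit Scope(Scope* parent = nullptr) : parent_(parent) {}

  // Creates a variable in *this* scope.
  Variable* NewVar(const std::string& name) { return &vars_[name]; }

  // Looks up locally first, then walks up the scope hierarchy.
  Variable* FindVar(const std::string& name) {
    auto it = vars_.find(name);
    if (it != vars_.end()) return &it->second;
    return parent_ != nullptr ? parent_->FindVar(name) : nullptr;
  }

  // Creates a child scope, e.g. a step-local scope for one RNN step.
  Scope* NewScope() { return new Scope(this); }

 private:
  Scope* parent_;
  std::map<std::string, Variable> vars_;
};
```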
## A Recurrent Network
```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
accum = paddle.framework.tensor()
cost = paddle.layer.mse(
    paddle.layer.fc(
        paddle.layer.rnn(
            input=paddle.layer.fc(x),
            step_net=paddle.layer.fc(
                paddle.layer.add_to(accum, None),
                output_size=100),
            concat_output=True)),
    y)
paddle.train(cost, ...)
```
Here we use `None` as the placeholder for the input of the step net.
Please notice that we don't have to consume the output of an RNNOp. For example, we can use the memory as the RNNOp's output:
```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
memory = paddle.framework.tensor()
paddle.layer.rnn(
    input=paddle.layer.fc(x),
    step_net=paddle.layer.fc(
        paddle.layer.add_to(memory, None),
        output_size=100),
    concat_output=True)
cost = paddle.layer.mse(paddle.layer.fc(memory), y)
paddle.train(cost, ...)
```
## Step-Net
The above example shows that the `step_net` parameter of `paddle.layer.rnn` accepts a variable returned by `paddle.layer.fc`. We need to trace the step-net from this variable. This can be done by calling the aforementioned `paddle::operator::Net::TraceAndBuild`:
```cpp
namespace paddle {
namespace operator {

class RNN : public OperatorBase {
 public:
  void Run(Scope* scope) {
    // Operators refer to their inputs by name, so look the whole input
    // sequence up in the scope.
    RNNInput* whole_input = scope->FindVar(inputs_[0])->Get<RNNInput>();
    int sequence_len = whole_input->Len(0);
    for (int i = 0; i < sequence_len; ++i) {
      // Each step gets its own step-local scope.
      Scope* step_scope = scope->NewScope();
      step_scope->NewVar("step_input")->GetMutable<Tensor>()->Slice(whole_input, i);
      // Trace and run the step-net within the step-local scope.
      Net* net = Net::TraceAndBuild(GetAttr<Variable*>("step_net"), step_scope);
      net->Run(step_scope);
    }
  }
};

}  // namespace operator
}  // namespace paddle
```
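Note that `TraceAndBuild` runs within each step-local scope, so name lookups from the step-net resolve step-local variables such as `step_input` first, then fall back to the memory variables in the RNNOp scope and the global variables in the outer scope, following the scope hierarchy described above.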