# Python v2 API to new operator framework

Created by: wangkuiyi

Design Doc: RNNOp
## A Plain Network
```python
predict = paddle.layer.fc(
    paddle.layer.data(name="x"),
    output_size=100)
cost = paddle.layer.mse(
    predict,
    paddle.layer.data(name="y"))
parameters = paddle.train(cost)
paddle.save_model(cost, paddle.datasets.mnist.train(), parameters, "filename")
p = paddle.load_model(predict, "filename")
paddle.infer(predict, ...)
```
## Layers, Variables, and Default Scope
```python
# in package paddle.layer

def data(name):
    return paddle.cpp.variable(paddle.cpp.default_scope(), name)

def fc(input, output_size):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    W = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    b = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    paddle.cpp.operator("FC", read={input, W, b}, write={output},
                        output_size=output_size)
    return output

def mse(input1, input2):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    paddle.cpp.operator("MSE", read={input1, input2}, write={output})
    return output
```
where

- `paddle.cpp.variable` is a Python binding of the C++ method `Scope::NewVar()`; a sketch of such a binding follows this list.
- `paddle.cpp.operator` creates an operator and marks it as a reader of some variables and a writer of some others. We will cover this in more detail later.
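Here is a minimal sketch of what the `paddle.cpp` binding layer could look like, assuming pybind11-style bindings; the `Scope` interface, `DefaultScope()`, and the anonymous-variable naming scheme are simplified assumptions for illustration, not Paddle's actual code:

```cpp
// A simplified sketch of a paddle.cpp binding layer, assuming pybind11-style
// bindings.  Scope, DefaultScope(), and the anonymous-variable naming are
// illustrative assumptions, not Paddle's actual code.
#include <pybind11/pybind11.h>

#include <map>
#include <string>

namespace py = pybind11;

class Variable {};

class Scope {
 public:
  // Creates (or returns) the variable with the given name; anonymous
  // variables get a generated name so that they still live in the scope.
  Variable* NewVar(const std::string& name) {
    std::string key =
        name.empty() ? "@anonymous_" + std::to_string(vars_.size()) : name;
    return &vars_[key];
  }

 private:
  std::map<std::string, Variable> vars_;
};

Scope* DefaultScope() {
  static Scope scope;  // one process-wide default scope
  return &scope;
}

PYBIND11_MODULE(cpp, m) {
  py::class_<Variable>(m, "Variable");
  py::class_<Scope>(m, "Scope");
  m.def("default_scope", &DefaultScope, py::return_value_policy::reference);
  m.def("variable",
        [](Scope* scope, const std::string& name) { return scope->NewVar(name); },
        py::arg("scope"), py::arg("name") = "",
        py::return_value_policy::reference);
}
```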
## paddle::operator::Net
`paddle.train` receives a variable created by `paddle.layer.mse` and needs to trace all related operators and sort them in topological order.
Please be aware that all operators are derived from class `OperatorBase`, which refers to Variables by their names:

```cpp
class OperatorBase {
 public:
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
};
```
and Variables don't have names if they are not in a Scope. Also, each Variable maintains:

```cpp
class Variable {
 public:
  std::list<Operator*> readers_;
  std::list<Operator*> writers_;
};
```
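As an illustration, here is a minimal sketch of how `paddle.cpp.operator` could fill in this bookkeeping when it creates an operator; `Connect` and `NamedVar` are hypothetical helpers, not Paddle's actual API:

```cpp
// A sketch of the reader/writer bookkeeping: the operator records variable
// *names*, while each variable records the operators that read or write it.
// Connect and NamedVar are hypothetical helpers for illustration.
#include <list>
#include <string>
#include <utility>
#include <vector>

struct OperatorBase {
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
};

struct Variable {
  std::list<OperatorBase*> readers_;
  std::list<OperatorBase*> writers_;
};

using NamedVar = std::pair<std::string, Variable*>;

void Connect(OperatorBase* op,
             const std::vector<NamedVar>& reads,
             const std::vector<NamedVar>& writes) {
  for (const auto& [name, var] : reads) {
    op->inputs_.push_back(name);  // the operator refers to the variable by name
    var->readers_.push_back(op);  // the variable marks the operator as a reader
  }
  for (const auto& [name, var] : writes) {
    op->outputs_.push_back(name);
    var->writers_.push_back(op);
  }
}
```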
Please be aware that the trace from an operator to its input variables depends on the default scope. The tracing is done in C++ space, so `paddle.cpp.default_scope` is a binding to C++ code.
```cpp
class Net {
 public:
  // Traces backward from `output` through writer operators and builds a
  // net whose operators are sorted in topological order.
  static Net* TraceAndBuild(Variable* output, Scope* scope) {
    std::list<std::pair<Operator*, int /*distance to output*/>> dists;
    std::list<std::pair<Variable*, int /*distance to output*/>> frontier;
    frontier.push_back(std::make_pair(output, 0));
    while (!frontier.empty()) {
      Variable* v = frontier.front().first;
      int dist = frontier.front().second;
      frontier.pop_front();
      for (Operator* o : v->writers_) {
        dists.push_back(std::make_pair(o, dist));
        // Follow the operator's inputs further away from the output.
        for (const std::string& s : o->inputs_) {
          frontier.push_back(std::make_pair(scope->FindVar(s), dist + 1));
        }
      }
    }
    dists.sort(/* by the descending order of dist */);
    return new Net(dists);
  }
};
```
We can call

```cpp
Net::TraceAndBuild(output_variable, DefaultScope())->Run(DefaultScope());
```

to extract the network using the default scope and run it.
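For intuition, here is how this trace would proceed on the plain network above; `cost_var` is a hypothetical handle to the variable returned by `paddle.layer.mse`:

```cpp
// A worked trace, starting from the cost variable:
//
//   pop (cost, 0):    cost.writers_ = {MSE}   ->  dists += (MSE, 0)
//                     MSE reads predict, y    ->  frontier += (predict, 1), (y, 1)
//   pop (predict, 1): predict.writers_ = {FC} ->  dists += (FC, 1)
//                     FC reads x, W, b        ->  frontier += (x, 2), (W, 2), (b, 2)
//   x, y, W, and b have no writers, so the trace stops.
//
// Sorting by descending distance yields [FC, MSE]: FC runs before MSE.
Net* net = Net::TraceAndBuild(cost_var, DefaultScope());
net->Run(DefaultScope());
```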
## Scope Hierarchy
An RNN operator may refer to three kinds of variables:

- global variables -- in the outer scope
- memory variables -- in the RNNOp-local scope
- local variables -- in a step-local scope
```
          outer scope
             /|\
              |
         RNNOp scope
   (the memory over steps)
   /|\       /|\       /|\
    |         |         |
  step-0    step-1    step-2
  scope     scope     scope
```
Just like what a programming language compiler/interpreter would do, for each step there is a step-local scope, but there is only one copy of the compiled code (binary code), or of the step-net in our case.

The above three tiers can be simplified to two tiers by moving memory variables to the outer scope, but this is not necessary:
```
  outer scope (including all memory variables of an RNNOp)
   /|\       /|\       /|\
    |         |         |
  step-0    step-1    step-2
  scope     scope     scope
```
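The key mechanism is that a scope can delegate name lookup to its parent. Below is a minimal sketch of such a hierarchical `Scope` (a simplification for illustration, not Paddle's actual class): `FindVar` falls back to the parent, so a step-local scope resolves step-local variables first, then memory variables in the RNNOp scope, then global variables in the outer scope:

```cpp
// A minimal hierarchical Scope sketch: name lookup falls back to the
// parent scope, which is how step-local scopes see the RNNOp-local
// memory variables and the outer scope's global variables.
#include <map>
#include <string>

class Variable {};

class Scope {
 public:
  explicit Scope(Scope* parent = nullptr) : parent_(parent) {}

  // Creates a variable in *this* scope.
  Variable* NewVar(const std::string& name) { return &vars_[name]; }

  // Looks up locally first, then walks up the scope hierarchy.
  Variable* FindVar(const std::string& name) {
    auto it = vars_.find(name);
    if (it != vars_.end()) return &it->second;
    return parent_ != nullptr ? parent_->FindVar(name) : nullptr;
  }

  // Creates a child scope, e.g. a step-local scope for one RNN step.
  Scope* NewScope() { return new Scope(this); }

 private:
  Scope* parent_;
  std::map<std::string, Variable> vars_;
};
```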
## A Recurrent Network
```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
accum = paddle.framework.tensor()
cost = paddle.layer.mse(
    paddle.layer.fc(
        paddle.layer.rnn(
            input=paddle.layer.fc(x),
            step_net=paddle.layer.fc(
                paddle.layer.add_to(accum, None),
                output_size=100),
            concat_output=True)),
    y)
paddle.train(cost, ...)
```
Here we use `None` as the placeholder for the input of the step net.
Please notice that we don't have to consume the output of an RNNOp. For example, we can use the memory as the RNNOp's output:
```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
memory = paddle.framework.tensor()
paddle.layer.rnn(
    input=paddle.layer.fc(x),
    step_net=paddle.layer.fc(
        paddle.layer.add_to(memory, None),
        output_size=100),
    concat_output=True)
cost = paddle.layer.mse(paddle.layer.fc(memory), y)
paddle.train(cost, ...)
```
## Step-Net
The above example shows that the `step_net` parameter of `paddle.layer.rnn` accepts a variable returned by `paddle.layer.fc`. We need to trace the step-net from this variable. This can be done by calling the aforementioned `paddle::operator::Net::TraceAndBuild`:
```cpp
namespace paddle {
namespace operator {

class RNN : public OperatorBase {
 public:
  void Run(Scope* scope) {
    // Operators refer to their inputs by name, so look the whole input
    // sequence up in the scope.
    RNNInput* whole_input = scope->FindVar(inputs_[0])->Get<RNNInput>();
    int sequence_len = whole_input->Len(0);
    for (int i = 0; i < sequence_len; ++i) {
      // Each step gets its own step-local scope.
      Scope* step_scope = scope->NewScope();
      step_scope->NewVar("step_input")->GetMutable<Tensor>()->Slice(whole_input, i);
      // Trace and run the step-net within the step-local scope.
      Net* net = Net::TraceAndBuild(GetAttr<Variable*>("step_net"), step_scope);
      net->Run(step_scope);
    }
  }
};

}  // namespace operator
}  // namespace paddle
```
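Note that `TraceAndBuild` runs within each step-local scope, so name lookups from the step-net resolve step-local variables such as `step_input` first, then fall back to the memory variables in the RNNOp scope and the global variables in the outer scope, following the scope hierarchy described above.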