提交 e90ec778 编写于 作者: Q qijun

translate to english

上级 e6eac856
## Survey on Graph ## Survey on Graph
神经网络框架通常提供Symbolic的接口给用户,来方便的书写网络配置。这里主要调研一下不同神经网络中框架中,用户书写的配置(等号左边)与最终解析得到的Graph之间的关系。 Neural network framework often provides Symbolic api for users to write network topology conveniently. This doc manily focus on Symbolic api in most popular neural network frameworks, and try to find out how to parse Symbolic configuration to a portable file, such as protobuf or json.
### Mxnet ### Mxnet
用户配置网络的核心概念是`Symbol`,Mxnet在C++端实现了`Symbol`,并通过CAPI暴露到Python端。在这里可以参考Mxnet中对`Symbol`的注释: The core concept of Symbolic api is `Symbol`. Mxnet implements `Symbol` class in C++, and export to Python using CAPI. Please refer to the comments in Mxnet:
`Symbol` is help class used to represent the operator node in Graph. `Symbol` is help class used to represent the operator node in Graph.
`Symbol` acts as an interface for building graphs from different components like Variable, Functor and Group. `Symbol` is also exported to python front-end (while Graph is not) to enable quick test and deployment. Conceptually, symbol is the final operation of a graph and thus including all the information required (the graph) to evaluate its output value. `Symbol` acts as an interface for building graphs from different components like Variable, Functor and Group. `Symbol` is also exported to python front-end (while Graph is not) to enable quick test and deployment. Conceptually, symbol is the final operation of a graph and thus including all the information required (the graph) to evaluate its output value.
一个简单的网络定义如下: A simple network topology wrote by Symbol is as follows:
```python ```python
def get_symbol(num_classes=10, **kwargs): def get_symbol(num_classes=10, **kwargs):
...@@ -26,23 +27,62 @@ def get_symbol(num_classes=10, **kwargs): ...@@ -26,23 +27,62 @@ def get_symbol(num_classes=10, **kwargs):
``` ```
需要注意的是,这里的Variable实际上也是一个Symbol。每个基本Symbol最终会对应到一个Node,每个Node都有对应的属性attr,attr中有一个字段为op。当这个Symbol表示Varaible时(通常是输入数据),attr中的op字段为空。
Symbol包含的成员变量为std::vector<NodeEntry> outputs,NodeEntry中包含一个指向Node的指针。 Varible here is actually a Symbol. Every basic Symbol will correspond to one Node, and every Node has its own NodeAttr. There is a op field in NodeAttr class, when a Symbol represents Variable(often input data), the op field is null.
Symbol contains a data member, std::vector<NodeEntry> outputs, and NodeEntry cantains a poniter to Node. We can follow the Node pointer to get all the Graph.
And Symbol can be saved to a Json file.
Here is a detailed example:
```
>>> import mxnet as mx
>>> data = mx.symbol.Variable('data')
>>> print data.debug_str()
Variable:data
>>> data = mx.symbol.Flatten(data=data)
>>> print data.debug_str()
Symbol Outputs:
output[0]=flatten0(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
Mxnet的Symbol可以绑定到一个Executor上,在解析为Graph之后,得以执行。 >>> fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
>>> print fc1.debug_str()
Symbol Outputs:
output[0]=fc1(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
Variable:fc1_weight
Variable:fc1_bias
--------------------
Op:FullyConnected, Name=fc1
Inputs:
arg[0]=flatten0(0)
arg[1]=fc1_weight(0) version=0
arg[2]=fc1_bias(0) version=0
Attrs:
num_hidden=128
```
### TensorFlow ### TensorFlow
用户配置网络的核心概念是`Tensor`,在Python端定义了`Tensor`,在这里可以直接参考TensorFlow对Tensor的注释:
The core concept of Symbolic api is `Tensor`. Tensorflow defines `Tensor` in Python. Please refer to the comments in TensorFlow:
A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow @{tf.Session}. A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow @{tf.Session}.
一个简单的使用样例如下: A simple example is as follows:
```python ```python
# Build a dataflow graph. # Build a dataflow graph.
...@@ -58,8 +98,9 @@ A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does ...@@ -58,8 +98,9 @@ A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does
``` ```
Tensor的一些主要成员变量和接口可以参考如下: The main method of `Tensor` is as follows:
```python ```python
@property @property
def op(self): def op(self):
...@@ -89,82 +130,13 @@ def device(self): ...@@ -89,82 +130,13 @@ def device(self):
return self._op.device return self._op.device
``` ```
TensorFlow的Tensor可以作为target被session来run,实际上是Tensor已经包含了所有的Graph信息,可以track data dependency。
### Dynet
用户配置网络的核心概念是`Expression`,在C++端定义了`Expression`。用户通过书写Expression来完成Graph的构建。
一个简单的使用样例如下:
```cpp
ComputationGraph cg;
Expression W = parameter(cg, pW);
Expression in = input(cg, xs[i]);
Expression label = input(cg, ys[i]);
Expression pred = W * in;
Expression loss = square(pred - label);
```
需要注意的是,输入数据以及参数也同样使用Expression来书写。每个Expression对应一个Node,输入数据也对应一个Node。
Expression的主要成员为ComputationGraph,可以在用户配置网络的过程中修改Graph。Expression同样可以被作为目标来执行,因为Expression中已经包含了所有的依赖关系。
### 总结
实际上Mxnet/TensorFlow/Dynet中的Symbol/Tensor/Expression是同一个层级的概念,我们暂时统一这个概念的名称为Expression,这层概念有如下几个特点:
- 用户使用Symbolic的语法来书写网络配置,所有的返回值都是Expression,包括最初的输入数据,及参数等 Tensor can be taken as target to run by session. Tensor contains all the information of Graph, and tracks data dependency.
- 每个Expression都对应着同一个Graph,已经包含了所有的依赖关系,可以被当做执行的target
下面我们来看几个实例:
- Mxnet Here is a detailed example:
```
>>> import mxnet as mx
>>> data = mx.symbol.Variable('data')
>>> print data.debug_str()
Variable:data
>>> data = mx.symbol.Flatten(data=data)
>>> print data.debug_str()
Symbol Outputs:
output[0]=flatten0(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
>>> fc1 = mx.symbol.FullyConnected(data = data, name='fc1', num_hidden=128)
>>> print fc1.debug_str()
Symbol Outputs:
output[0]=fc1(0)
Variable:data
--------------------
Op:Flatten, Name=flatten0
Inputs:
arg[0]=data(0) version=0
Variable:fc1_weight
Variable:fc1_bias
--------------------
Op:FullyConnected, Name=fc1
Inputs:
arg[0]=flatten0(0)
arg[1]=fc1_weight(0) version=0
arg[2]=fc1_bias(0) version=0
Attrs:
num_hidden=128
```
- TensorFlow
``` ```
>>> import tensorflow as tf >>> import tensorflow as tf
>>> c = tf.constant([[1.0, 2.0], [3.0, 4.0]]) >>> c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
...@@ -178,12 +150,32 @@ Attrs: ...@@ -178,12 +150,32 @@ Attrs:
<tensorflow.python.framework.ops.Graph object at 0x10f256d50> <tensorflow.python.framework.ops.Graph object at 0x10f256d50>
``` ```
没有找到Graph的debug string接口,但是可以明确知道配置过程中只存在一个Graph。 ### Dynet
The core concept of Symbolic api is `Expression`, and Dynet defines `Expression` class in C++.
A simple example is as follows:
```cpp
ComputationGraph cg;
Expression W = parameter(cg, pW);
Expression in = input(cg, xs[i]);
Expression label = input(cg, ys[i]);
Expression pred = W * in;
Expression loss = square(pred - label);
```
The input data and parameter are also represented by Expression. Every basci Expression corresponds to a Node. And input data is also a Node.
Expression has a data member ComputationGraph, and ComputationGraph will be modified in users' configuring process. Expression can be a running target, beacuse Expression contains all dependency.
- dynet
dynet可以在C++中书写配置 Here is a detailed example:
write topology in C++
``` ```
ComputationGraph cg; ComputationGraph cg;
...@@ -197,7 +189,7 @@ Expression loss = square(pred - ys[i]); ...@@ -197,7 +189,7 @@ Expression loss = square(pred - ys[i]);
cg.print_graphviz(); cg.print_graphviz();
``` ```
编译运行后,得到打印结果: compile and print
``` ```
# first print # first print
...@@ -229,3 +221,12 @@ digraph G { ...@@ -229,3 +221,12 @@ digraph G {
N3 -> N4; N3 -> N4;
} }
``` ```
### Conclusion
Actually, Symbol/Tensor/Expression in Mxnet/TensorFlow/Dynet are the same level concepts. We use a unified name Expression here, this level concept has following features:
- Users wirte topoloy with Symbolic api, and all return value is Expression, including input data and parameter.
- Expression corresponds with a global Graph, and Expression can also be composed.
- Expression tracks all dependency and can be taken as a run target
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册