# Tensor: A Unified Data Type in PaddlePaddle

## Pain Point

This week, we discussed several weaknesses of PaddlePaddle that stem from four years of rapid iteration to ship new business products. For instance, the current Matrix/Vector implementation in PaddlePaddle is long and tedious to read, which seriously hinders contributions from both new and experienced engineers. Worse, it will become increasingly difficult to maintain over time.


## Learn from Majel

Consequently, we decided to refactor PaddlePaddle step by step. The first step is to replace Matrix/Vector with Tensor, the modern term used in deep learning systems. Fortunately, we can learn from Majel how to define a Tensor.

To simplify heterogeneous resource allocation for any number of dimensions (1-9) and element types (double, float, float16), Majel provides several primitives such as `Dim`, `Place` and `Array`, all of which are standard C++ class templates.

1. `Place`: memory location (e.g. CPU/GPU).
2. `Allocation`: heterogeneous resource allocator (e.g. 20MB on GPU).
3. `Dim`: size of each dimension (e.g. `Dim<4>({10, 2, 5, 1})`).
4. `Array`: dynamic array consisting of `Place`, `Dim`, and a pointer to memory.

If you dig deeper into the Majel source code, you will find that Majel makes heavy use of `boost.variant`. The variant class template is a safe, generic, stack-based discriminated union container, **offering a simple solution for manipulating an object from a heterogeneous set of types in a uniform manner**. Whereas standard containers such as `std::vector` may be thought of as "multi-value, single type," variant is "multi-type, single value."

As a simple example, consider the following:

```c++
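#include <iostream>
#include <string>

#include "boost/variant.hpp"

// A visitor returns one result type (int) for every alternative in the variant.
class my_visitor : public boost::static_visitor<int>
{
public:
    int operator()(int i) const
    {
        return i;
    }

    int operator()(const std::string & str) const
    {
        // length() returns std::size_t, so convert explicitly.
        return static_cast<int>(str.length());
    }
};

int main()
{
    // The string literal selects the std::string alternative.
    boost::variant< int, std::string > u("hello world");
    std::cout << u << '\n'; // output: hello world

    int result = boost::apply_visitor( my_visitor(), u );
    std::cout << result << '\n'; // output: 11 (i.e., length of "hello world")
}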
```

In Majel, `DDimVar` is a variant over `Dim`, and `DArrayVar` is a variant over `Array`.


```c++
template<int i>
struct Dim {
    ...
    int head;       // size of the first dimension
    Dim<i-1> tail;  // sizes of the remaining dimensions
};
```
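
The recursive `head`/`tail` layout needs a base case to terminate. The following is a minimal sketch, not Majel's actual code: the `Dim<1>` specialization and the `product` helper are assumptions added for illustration.

```c++
#include <cstdint>

// Hypothetical sketch of a recursive dimension type; not Majel's actual code.
template <int N>
struct Dim {
    int head;         // size of the first dimension
    Dim<N - 1> tail;  // sizes of the remaining N-1 dimensions
};

// Base case (assumed): a one-dimensional Dim has no tail, ending the recursion.
template <>
struct Dim<1> {
    int head;
};

// Number of elements described by a Dim: the product of all sizes.
inline std::int64_t product(const Dim<1>& d) { return d.head; }

template <int N>
std::int64_t product(const Dim<N>& d) {
    return static_cast<std::int64_t>(d.head) * product(d.tail);
}
```

With these definitions, `Dim<4> d{10, {2, {5, {1}}}}` describes the 10 x 2 x 5 x 1 shape mentioned earlier, and `product(d)` evaluates to 100. Because the rank is a template parameter, the recursion is resolved at compile time.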

```c++
template<typename T, int D>
class Array : public Buffer {
    ...
private:
    Dim<D> size_;
    Dim<D> stride_;
    T* ptr_;
};
``` 
 
```c++
typedef boost::variant<GpuPlace, CpuPlace> Place;
typedef boost::variant<Dim<1>, Dim<2>, Dim<3>, Dim<4>, Dim<5>,
                       Dim<6>, Dim<7>, Dim<8>, Dim<9>> DDimVar;
typedef boost::variant<
    Array<float, 1>,
    Array<float, 2>,
    Array<float, 3>,
    Array<float, 4>,

    Array<double, 1>,
    Array<double, 2>,
    Array<double, 3>,
    Array<double, 4>,

    Array<float16, 1>,
    Array<float16, 2>,
    Array<float16, 3>,
    Array<float16, 4> > DArrayVar;
```

Because `variant` may be thought of as "multi-type, single value", we can utilize it to implement unified interfaces for PaddlePaddle.
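
A minimal sketch of this idea follows. It assumes C++17 and uses `std::variant` and `std::visit` from the standard library as stand-ins for `boost::variant` and `boost::apply_visitor`. The flat `Dim<N>` and the `arity` and `product` helpers are hypothetical simplifications, and only ranks 1 to 3 are listed to keep it short.

```c++
#include <array>
#include <cstddef>
#include <cstdint>
#include <variant>

// Hypothetical fixed-rank shape; a flat stand-in for Majel's recursive Dim<N>.
template <std::size_t N>
struct Dim {
    std::array<int, N> sizes;
};

// One runtime type that can hold a shape of rank 1, 2, or 3.
using DDim = std::variant<Dim<1>, Dim<2>, Dim<3>>;

// Rank of whichever Dim<N> is currently stored.
inline int arity(const DDim& ddim) {
    return std::visit(
        [](const auto& d) { return static_cast<int>(d.sizes.size()); }, ddim);
}

// Total number of elements, written once for every rank.
inline std::int64_t product(const DDim& ddim) {
    return std::visit(
        [](const auto& d) {
            std::int64_t n = 1;
            for (int s : d.sizes) n *= s;
            return n;
        },
        ddim);
}
```

Callers work with a single `DDim` type regardless of rank. For example, after `DDim shape = Dim<2>{{2, 3}};`, `arity(shape)` is 2 and `product(shape)` is 6, with the rank-specific code selected by the visitor.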

`DDim` plays two roles in Majel. First, it describes the size of a tensor. For example, we can construct a new `DArray` as follows:
 
 ```c++
 DArray arr = make_darray(make_ddim({2,3}), 0.0f);
 ```
This means that `arr` is a two-dimensional tensor, i.e. a matrix. Its first dimension has size 2 and its second has size 3. All elements of `arr` are initialized to 0.0.
 
Second, `DDim` serves as a tensor index. For example, to access the value in the 1st row and 2nd column of `arr` and set it to 1.0, we can write:

 ```c++
 arr[make_ddim({0, 1})] = 1.0;
 ```
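
One way to picture the index role is as an offset into the underlying buffer. The sketch below assumes a contiguous row-major layout, which this document does not state for Majel, and uses plain `std::vector<int>` in place of `DDim`. `flat_offset` is a hypothetical helper, not a Majel function.

```c++
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: map a multi-dimensional index to a row-major offset.
// `size` and `index` correspond to the two roles of DDim described above.
inline std::size_t flat_offset(const std::vector<int>& size,
                               const std::vector<int>& index) {
    if (size.size() != index.size()) {
        throw std::invalid_argument("size and index must have the same rank");
    }
    std::size_t offset = 0;
    for (std::size_t i = 0; i < size.size(); ++i) {
        if (index[i] < 0 || index[i] >= size[i]) {
            throw std::out_of_range("index outside the tensor");
        }
        // Horner-style accumulation: scale by this dimension's size, then add.
        offset = offset * static_cast<std::size_t>(size[i]) +
                 static_cast<std::size_t>(index[i]);
    }
    return offset;
}
```

Under that assumption, the index `{0, 1}` into a tensor of size `{2, 3}` maps to offset 1, and the last element `{1, 2}` maps to offset 5.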

## Implement Tensor in Paddle

Before writing code, please make sure you have looked through the Majel source code and grasped the design philosophy of `DArray` in Majel.

To assign subtasks to our colleagues, we need to discuss how to divide the work into independent subtasks.

- [ ] 1. First, we need to consider the third-party dependencies in Majel.

    Majel makes heavy use of `boost.variant`, but we don't want to integrate `boost` into PaddlePaddle. It is better to replace it with a lightweight implementation such as Mapbox variant (https://github.com/mapbox/variant), which matches the performance of `boost::variant` but is faster to compile, results in smaller binaries, and has no dependencies.

> @gangliao

- [ ] 2. Re-implement `Place` and `Allocation/Memory`

    @wangkuiyi has submitted a pull request that includes `Place`. @gangliao and @qijun could re-implement `Allocation`, because we had GPU development experience before joining the Paddle team.

> @wangkuiyi @gangliao @qijun

- [ ] 3. Re-implement `Dim`.

    `Dim` is an excellent implementation in Majel.
    
> ???

- [ ] 4. Re-implement `Array/Tensor`.

> Prerequisites: 1 - 3

- [ ] 5. Re-implement fundamental operators for `Array/Tensor`.

> Prerequisites: 1 - 4