Discussion of some key design questions for a DL framework
Created by: wangkuiyi
This issue comes from https://github.com/PaddlePaddle/Paddle/pull/2445#issuecomment-308007213
I think it would be easier to subdivide this document into the following topics and try to figure out an optimal design for each:
- memory management
  - class Place, inspired by Majel
  - Allocation and Allocators
  - a unified malloc/free API for GPU and CPU -- `p = malloc(pl, ...); use(p); free(p);` (see the sketch after this item)
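A minimal sketch of what such a unified API could look like, loosely following Majel's `Place` idea; the names here (`CpuPlace`, `GpuPlace`, `Alloc`, `Free`) are hypothetical, and the GPU branch is stubbed out so the snippet compiles without CUDA:

```cpp
#include <cstddef>
#include <cstdlib>
#include <variant>

struct CpuPlace {};                  // host memory
struct GpuPlace { int device_id; };  // one CUDA device

using Place = std::variant<CpuPlace, GpuPlace>;

// Allocate `size` bytes on the device described by `place`.
void* Alloc(const Place& place, std::size_t size) {
  if (std::holds_alternative<CpuPlace>(place)) {
    return std::malloc(size);
  }
  // A GpuPlace would call cudaSetDevice(device_id) and cudaMalloc here.
  return nullptr;
}

void Free(const Place& place, void* p) {
  if (std::holds_alternative<CpuPlace>(place)) {
    std::free(p);
  }
  // cudaFree(p) for a GpuPlace.
}
```

With this shape, the same calling code works for both devices, and the `Place` argument alone decides where the memory lives.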
- Tensor
  - consider re-using an existing implementation, such as mshadow or Eigen, porting Majel's, or writing a thin wrapper over one of these libraries (see the sketch after this item)
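For illustration, a minimal sketch of the wrapper option, assuming Eigen's unsupported Tensor module is available; the class shape here is hypothetical, merely in the spirit of TensorFlow's Tensor, which also wraps Eigen buffers:

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>
#include <unsupported/Eigen/CXX11/Tensor>

class Tensor {
 public:
  explicit Tensor(std::vector<long> dims) : dims_(std::move(dims)) {
    std::size_t n = 1;
    for (long d : dims_) n *= static_cast<std::size_t>(d);
    data_.reset(new float[n]);  // a real version would allocate via Place
  }

  // Expose the raw buffer as a rank-2 Eigen view for computation.
  Eigen::TensorMap<Eigen::Tensor<float, 2>> matrix() {
    return Eigen::TensorMap<Eigen::Tensor<float, 2>>(
        data_.get(), dims_[0], dims_[1]);
  }

 private:
  std::vector<long> dims_;
  std::unique_ptr<float[]> data_;
};
```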
- Expression Template
  - Expression Template is important for performance optimization: an expression like `d = a + b + c` is evaluated in a single fused loop, without temporaries for intermediate results (see the sketch after this list).
  - What are the differences between the above libraries, mshadow and Eigen? (Majel's implementation of Expression Template is incomplete.)
  - Which is better?
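Both mshadow and Eigen build on the same underlying idea; here is a minimal, self-contained sketch of it, with hypothetical `Vec`/`AddExpr` names, in the CRTP style mshadow uses:

```cpp
#include <cstddef>
#include <vector>

// CRTP base: every expression node knows its concrete type.
template <typename E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

// Node representing `lhs + rhs`; it stores references and copies no data.
template <typename L, typename R>
struct AddExpr : Expr<AddExpr<L, R>> {
  const L& lhs;
  const R& rhs;
  AddExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
  float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

struct Vec : Expr<Vec> {
  std::vector<float> data;
  explicit Vec(std::size_t n) : data(n) {}
  float operator[](std::size_t i) const { return data[i]; }

  // Assigning an expression runs ONE fused loop over the whole tree;
  // no temporary vectors are materialized for intermediate sums.
  template <typename E>
  Vec& operator=(const Expr<E>& e) {
    const E& v = e.self();
    for (std::size_t i = 0; i < data.size(); ++i) data[i] = v[i];
    return *this;
  }
};

template <typename L, typename R>
AddExpr<L, R> operator+(const Expr<L>& a, const Expr<R>& b) {
  return AddExpr<L, R>(a.self(), b.self());
}
```

With this, `d = a + b + c;` builds only tiny `AddExpr` objects at compile time and is evaluated element-wise in a single pass over `d`.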
- Ops and Variables
  - TensorFlow's Ops take Tensors, a wrapper of Eigen, as their inputs and outputs.
  - Caffe2 and PyTorch take Variables as Ops' inputs and outputs.
  - A Variable bundles a tensor for the forward algorithm with a gradient tensor for the backward algorithm (see the sketch after this list). The difference seems to be that TensorFlow is a general-purpose framework, whereas Caffe2 and PyTorch are DL-specific.
  - Which approach should PaddlePaddle take?
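A sketch of the Variable option as described above; the field names are hypothetical:

```cpp
#include <memory>

class Tensor;  // whichever tensor type is chosen in the Tensor section

struct Variable {
  std::shared_ptr<Tensor> data;  // read and written by the forward algorithm
  std::shared_ptr<Tensor> grad;  // accumulated by the backward algorithm
};
```

Under the TensorFlow approach, Ops would instead exchange plain `Tensor`s, and gradients would simply be more tensors flowing through more Ops.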
- Ops and Gradient Ops
  - In TensorFlow, an Op is a general concept -- all computations, gradients included, are represented by Ops.
  - In Caffe2, each Op has one or more corresponding GradientOps (see the registry sketch after this list).
  - Which approach should PaddlePaddle take?
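The Caffe2 approach implies some registry mapping each Op type to a factory for its GradientOps. A hypothetical sketch, with all names invented here:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct OpBase {
  virtual ~OpBase() = default;
  virtual void Run() = 0;
};

// Given a forward Op, produce the Op(s) computing its gradients.
using GradientFactory =
    std::function<std::vector<std::unique_ptr<OpBase>>(const OpBase&)>;

std::map<std::string, GradientFactory>& GradientRegistry() {
  static std::map<std::string, GradientFactory> registry;
  return registry;
}

// Called while building the backward pass: look up the forward Op's
// type name and instantiate its gradient Ops.
std::vector<std::unique_ptr<OpBase>> MakeGradient(
    const std::string& op_type, const OpBase& forward_op) {
  return GradientRegistry().at(op_type)(forward_op);
}
```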
- Ops and Kernels
  - TensorFlow separates an Op's signature (as OpDef) from its implementations (as OpKernels), so one Op can have several kernels, e.g., one per device (see the sketch after this list).
  - Others might not have such a clear separation.
  - Which approach should PaddlePaddle take?
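A hypothetical sketch of what this separation buys: the signature is declared once, and each device registers its own kernel under the same Op name (all names here are invented):

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

struct OpDef {        // device-independent signature
  std::string name;   // e.g. "MatMul"
  int num_inputs;
  int num_outputs;
};

struct KernelContext;  // would carry inputs, outputs, device handles

using Kernel = std::function<void(KernelContext*)>;

// Keyed by (op name, device), e.g. ("MatMul", "GPU").
std::map<std::pair<std::string, std::string>, Kernel>& KernelRegistry() {
  static std::map<std::pair<std::string, std::string>, Kernel> registry;
  return registry;
}

void RunOp(const OpDef& def, const std::string& device, KernelContext* ctx) {
  KernelRegistry().at({def.name, device})(ctx);
}
```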
- Execution engine
  - How do TensorFlow and the other solutions parse the network definition and create the network?
  - How do TensorFlow and the other solutions execute the training algorithm over the network while creating/managing the memory of Variables? (A minimal sketch follows.)
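To make the second question concrete, a minimal, hypothetical sketch of an execution engine: it walks an already-sorted list of op descriptions and creates each output Variable in a scope on first use, which is where the memory-management questions above come in:

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Variable { /* data and grad tensors, as sketched above */ };

struct OpDesc {                      // one entry of the network definition
  std::string type;                  // e.g. "mul", "add"
  std::vector<std::string> inputs;   // input variable names
  std::vector<std::string> outputs;  // output variable names
};

class Scope {  // owns Variables by name
 public:
  Variable* GetOrCreate(const std::string& name) {
    auto& slot = vars_[name];
    if (!slot) slot.reset(new Variable);
    return slot.get();
  }

 private:
  std::map<std::string, std::unique_ptr<Variable>> vars_;
};

void Run(const std::vector<OpDesc>& net, Scope* scope) {
  for (const OpDesc& op : net) {  // assume topological order
    for (const std::string& out : op.outputs) {
      scope->GetOrCreate(out);  // allocate output Variables lazily
    }
    // dispatch op.type to a kernel here (see the Ops-and-Kernels sketch)
  }
}
```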