- 07 Jun, 2018 1 commit
-
-
Committed by mozga-intel
* Add MKLDNN layout support in Paddle

  Add an MKLDNN layout to Paddle so that an MKLDNN-friendly memory layout can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW was hard-coded in all MKLDNN OP kernels. As a result, the MKLDNN primitives selected a non-optimized execution path, which hurt performance. Besides the framework change, three MKLDNN OP kernels were updated to use the new MKLDNN layout: conv, pool2d, and batch_norm. Other MKLDNN OP kernels need to be updated in a similar way to achieve the best performance.
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout
-
- 12 Feb, 2018 1 commit
-
-
Committed by qingqing01
-
- 10 Feb, 2018 2 commits
- 21 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* "fix decode bug"
* "follow comment"
* "fix error"
* "fix hook bug"
* fix based on comment
* fix copyright
* fix based on comment
-
- 15 Jan, 2018 1 commit
-
-
Committed by dzhwinter
* add copyright hook
* add copyright hook
* refine copyright hook
* "test copyright hook"
* fix check style
* fix ci
-
- 28 Dec, 2017 1 commit
-
-
Committed by Yancey
* implement selectedrows serialize and deserialize
* make serialize/deserialize global functions
* recover send_imp.cc
* delete unused brackets
* fix compile error
* serialize version in LodTensor and SelectedRows
* fix ci
* fix ci
-
- 25 Dec, 2017 2 commits
- 26 Nov, 2017 1 commit
-
-
Committed by dzhwinter
* "make global tensor function independently"
* "replace functor"
* "fix inline template error"
* "fix tensor array with CopyFrom"
* "fix other case use CopyFrom"
* "move the op interface hardly"
* "fix operators"
* "fix typo"
* "delete dynamic recurrent rnn and fix gru_unit in debugmode"
* "fix unique_ptr copy"
* "fix cuda copy"
* "fix namespace error"
* "removed nccl python test"
* "fix include error"
* "fix typo"
* fix copy util test
-
- 20 Oct, 2017 1 commit
-
-
Committed by Yu Yang
* Remove template parameter for Tensor methods
* Also check that the type is correct when calling data()
* Simplify holder_
* Fix accuracy_op
* Register Code
-
- 12 Oct, 2017 1 commit
-
-
Committed by QI JUN
* init
* unify CopyFrom interface
* fix gpu build error
* fix bug in tensor_py.h
* refine code comments and add TODO list
* fix conflicts in FeedOp and FetchOp
-
- 10 Oct, 2017 1 commit
-
-
Committed by Abhinav Arora
* Adding implementation for copying a vector to tensor
* Changing Tensor test to access gpu memory indirectly
-
- 05 Oct, 2017 2 commits
-
-
Committed by Yi Wang
-
Committed by Yu Yang
Done with the following shell commands:
```bash
sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'`
```
-
- 15 Sep, 2017 1 commit
-
-
Committed by zchen0211
-
- 08 Sep, 2017 1 commit
-
-
Committed by Yu Yang
-
- 07 Sep, 2017 2 commits
- 06 Sep, 2017 1 commit
-
-
Committed by Zhuoyuan
-
- 05 Sep, 2017 1 commit
-
-
Committed by fengjiayi
-
- 09 Aug, 2017 1 commit
-
-
Committed by Yan Chunwei
* add lodtensor
* add reshape of lod
* add details
* rename Elements/Levels
* size_t and vector reserve
* add details
* add const& std::shared_ptr
* add lod_tensor_impl.h
* remove a shared_ptr
-
- 08 Aug, 2017 1 commit
-
-
Committed by Yan Chunwei
* fix some enforce
* remove compatible_type to avoid compile error
* remove shared_ptr
* fix tensor error msg
-
- 28 Jul, 2017 1 commit
-
-
Committed by qijun
-
- 26 Jul, 2017 1 commit
-
-
Committed by liaogang
-
- 25 Jul, 2017 2 commits
- 19 Jul, 2017 1 commit
-
-
Committed by fengjiayi
ATTENTION: some interfaces changed:
1. `void Tensor::set_dims(const DDim& dims)` ==> `void Tensor::Resize(const DDim& dims)`
2. `void Tensor::ShareDataFrom(const Tensor& src)` ==> `void Tensor::ShareDataWith(const Tensor& src)`
3. `DDim Tensor::dims() const` ==> `const DDim& Tensor::dims() const`
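The renamed interface can be pictured with a minimal sketch. Everything below is an illustrative stand-in, not Paddle's code: `DDim` is assumed to be a plain `std::vector<int64_t>`, the class is named `SketchTensor` to make that explicit, the float-only storage and the `mutable_data()`/`data()` helpers are hypothetical, and whether the real `ShareDataWith` also copies the shape is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for Paddle's DDim (the real type is not a std::vector).
using DDim = std::vector<int64_t>;

class SketchTensor {
 public:
  // Formerly set_dims(): updates the logical shape only, no allocation.
  void Resize(const DDim& dims) { dims_ = dims; }

  // Formerly ShareDataFrom(): afterwards both tensors refer to one buffer.
  // Copying the shape as well is an assumption of this sketch.
  void ShareDataWith(const SketchTensor& src) {
    holder_ = src.holder_;
    dims_ = src.dims_;
  }

  // Now returns a const reference instead of a copy of the shape.
  const DDim& dims() const { return dims_; }

  // Illustrative helper (not part of the commit): allocate float storage
  // large enough for the current shape.
  float* mutable_data() {
    int64_t n = 1;
    for (int64_t d : dims_) n *= d;
    holder_ = std::make_shared<std::vector<float>>(static_cast<std::size_t>(n));
    return holder_->data();
  }

  const float* data() const { return holder_ ? holder_->data() : nullptr; }

 private:
  DDim dims_;
  std::shared_ptr<std::vector<float>> holder_;
};
```

Returning `const DDim&` avoids copying the shape on every call, at the cost that callers must not keep the reference beyond the tensor's lifetime.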
-
- 15 Jul, 2017 4 commits
- 14 Jul, 2017 2 commits
-
-
Committed by fengjiayi
-
Committed by fengjiayi
1. Add a template parameter `T`, which indicates the data type, to the `CopyFrom()`, `Slice()` and `ShareData()` functions. This makes the `CopyData()` code much clearer.
2. Add `set_dim()`.
3. `product(DDim)` first transforms the `DDim` to a `vector<int>` and then calculates its product, which might be quite slow. Because `product(dims_)` is used frequently in Tensor, we add a member variable `numel_` as a cache of the product result. TODO: refactor `product()` to make it more efficient.
4. Disable `Tensor::operator=`.
5. Remove the restriction to POD types, because `float16` and `int8` are not POD types.
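The caching idea in item 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the commit's code: `DDim` is modelled as a `std::vector<int64_t>`, and the class name `CachedNumelTensor` and the `numel()` accessor are hypothetical. Only `set_dim()`, `product()` and `numel_` are names taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Paddle's DDim.
using DDim = std::vector<int64_t>;

// Multiplies all extents together; an empty shape yields 1.
inline int64_t product(const DDim& dims) {
  int64_t result = 1;
  for (int64_t d : dims) result *= d;
  return result;
}

class CachedNumelTensor {
 public:
  // The element count is recomputed once per shape change...
  void set_dim(const DDim& dims) {
    dims_ = dims;
    numel_ = product(dims_);
  }

  // ...so reading it later costs no multiplication at all.
  int64_t numel() const { return numel_; }

  const DDim& dims() const { return dims_; }

 private:
  DDim dims_;
  int64_t numel_ = 0;  // 0 means "no shape set yet" in this sketch
};
```

The cache stays valid only if every path that changes `dims_` also updates `numel_`, which is why the sketch funnels shape changes through a single setter.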
-
- 12 Jul, 2017 1 commit
-
-
Committed by fengjiayi
1. Add `Tensor::CopyFrom`. The current version supports only CPU memory copy. GPU support will be provided later by `paddle::memory`. The current implementation of `Tensor::CopyFrom` is a little inefficient: every time `CopyFrom` is called, the tensor re-allocates its memory. However, if we try to check and reuse `placeholder_`, we have to provide a template parameter for `CopyFrom` to indicate the data type, which seems strange for a simple copy function.
2. Add `Tensor::mutable_data(Place place)`, which directly uses the member variable `dims_` as its dim parameter. This interface is required by `Op::InferShape`.
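The re-allocation behaviour described in item 1 can be illustrated with a small CPU-only sketch. It is not the commit's implementation: the class name `CpuTensor`, the untyped byte buffer, and the `Assign()`, `raw()` and `bytes()` helpers are all hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, CPU-only stand-in for the Tensor described above.
class CpuTensor {
 public:
  // Illustrative helper (not from the commit): fill the tensor from raw bytes.
  // A fresh buffer is allocated on every call.
  void Assign(const void* src, std::size_t bytes) {
    auto fresh = std::make_shared<std::vector<unsigned char>>(bytes);
    if (bytes > 0) std::memcpy(fresh->data(), src, bytes);
    holder_ = std::move(fresh);
  }

  // Untyped copy: needs no template parameter, but re-allocates each time
  // instead of checking whether the existing buffer could be reused.
  void CopyFrom(const CpuTensor& src) { Assign(src.raw(), src.bytes()); }

  const void* raw() const { return holder_ ? holder_->data() : nullptr; }
  std::size_t bytes() const { return holder_ ? holder_->size() : 0; }

 private:
  std::shared_ptr<std::vector<unsigned char>> holder_;
};
```

Reusing the destination buffer would save the allocation when sizes match. In the design the commit describes, that check would require knowing the element type, hence the template parameter the author wanted to avoid.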
-
- 11 Jul, 2017 1 commit
-
-
Committed by fengjiayi
-
- 03 Jul, 2017 3 commits