- 22 Feb, 2021 2 commits
-
-
Committed by Huihuang Zheng
**Problem** In the old shape transformer logic, if the user writes:

```
s = tensor.shape
...
y = paddle.some_api(s)
```

Dy2stat changes it to:

```
...
y = paddle.some_api(convert_var_shape(tensor))
```

However, this causes a fatal bug if the user changes the shape of `tensor` after the assignment. For example:

```
s = tensor.shape
...
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(s)
```

Dy2stat then produces a wrong result, because the code is translated into:

```
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(convert_var_shape(tensor)) # tensor shape has been changed, not the original `s` value
```

**Solution Logic** This cannot be solved within the old logic, so I refactored the tensor_shape_transformer logic. Now we use `s` to store the shape attribute and generate a variable `s__STATIC_CONVERT_VAR_SHAPE_SUFFIX` to store the result of the static shape API `shape(tensor)`. Given:

```
s = tensor.shape
...
y = paddle.some_api(s)
```

Dy2stat changes it to:

```
s = tensor.shape
s__STATIC_CONVERT_VAR_SHAPE_SUFFIX = shape(tensor)
...
y = paddle.some_api(choose_shape_attr_or_api(s, s__STATIC_CONVERT_VAR_SHAPE_SUFFIX))
```

In this case the translated code is consistent with the original dygraph meaning, which fixes the change-after-assign bug.

**Code Key Note** To help reviewers: the key change of this PR is changing `self.name_to_var_shape` from "mapping name to shape node" to "mapping name to its STATIC_CONVERT_VAR_SHAPE_SUFFIX name". If a variable name has the SUFFIX, we can then choose between the shape attribute and the shape API. The other changes follow from this key change.

**Consideration** The concern with this PR is that we store an extra static `shape` API result. Will it harm the speed of Dy2stat? In some cases it will, but we argue that the benefit outweighs the cost.

1. The extra calls to the static `shape` API happen when the coder assigns among shape variables. Take the following dygraph code as an example:

```
s1 = tensor.shape
s2 = s1
s3 = s2
...
```

Here we call the extra static `shape` API again and again. However, users seldom write code like this.

2. If the shape variable is used many times, for example:

```
s = tensor.shape
y1 = paddle.some_api1(s)
y2 = paddle.some_api2(s)
y3 = paddle.some_api3(s)
```

the old logic creates 3 shape API calls, but the new logic creates just 1. This is the more common user code pattern. In fact, the current unit tests in this PR show that the op count decreases after this PR. So we argue that this PR can also improve speed for this code pattern.
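The change-after-assign bug described above can be reproduced without Paddle. The following is a minimal sketch in plain Python, not the actual Dy2stat implementation: `FakeTensor`, `change_shape`, `static_shape`, and `choose_shape` are hypothetical stand-ins, and the selection rule inside `choose_shape` (prefer the attribute when every dimension is known) is an assumption made only for illustration.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only tracks a shape list."""

    def __init__(self, shape):
        self.shape = list(shape)


def change_shape(tensor, new_shape):
    # Hypothetical stand-in for paddle.some_change_shape_api: returns a new tensor.
    return FakeTensor(new_shape)


def static_shape(tensor):
    # Hypothetical stand-in for the static shape API `shape(tensor)`.
    return list(tensor.shape)


def choose_shape(attr_shape, api_shape):
    # Assumed rule: use the attribute if all dims are known, else the API result.
    return attr_shape if all(d >= 0 for d in attr_shape) else api_shape


def dygraph_code(tensor):
    s = tensor.shape                     # shape captured before the change
    tensor = change_shape(tensor, [6])
    return s


def old_translation(tensor):
    tensor = change_shape(tensor, [6])
    return static_shape(tensor)          # shape read too late: sees the new shape


def new_translation(tensor):
    s = tensor.shape
    s_static = static_shape(tensor)      # captured at the assignment site
    tensor = change_shape(tensor, [6])
    return choose_shape(s, s_static)


print(dygraph_code(FakeTensor([2, 3])))      # [2, 3]
print(old_translation(FakeTensor([2, 3])))   # [6]
print(new_translation(FakeTensor([2, 3])))   # [2, 3]
```

The old translation defers the shape lookup to the use site, so it observes the reshaped tensor. Capturing both values at the assignment site keeps the result aligned with the dygraph code.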
-
Committed by tangwei12
* fix dist fleet ctr ut
  Change-Id: I59bf5123c7bd47bd0e8f1ca2a26295257597c0f5
* fix dist fleet ctr ut
  Change-Id: Iafcdd172364be47fe67b753774ce09af050bcbce
* Update CMakeLists.txt
-
- 20 Feb, 2021 5 commits
-
-
Committed by TTerror
add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor; optimize lookup_table op (#31056)
* add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun
* update squeeze/unsqueeze op
-
Committed by 123malin
* test=develop, save/load, shrink
Co-authored-by: NseiriosPlus <tangwei12@baidu.com>
-
Committed by Shibo Tao
* export paddle.static.normalize_program method. test=develop
* fix ut coverage. test=develop
-
Committed by liym27
* [static setitem] support index step > 1. E.g.: tensor_a[::3] = value
* [static setitem] support index step < 0. E.g.: tensor_a[::-3] = value
* [static setitem] support a Tensor as the index. E.g.: tensor_a[tensor_3:0:-1] = value
* Add op version.
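To make the three index forms concrete, here is a small sketch in plain Python that lists which positions a strided slice assignment writes to. It uses the built-in `slice.indices` method and ordinary lists rather than Paddle tensors. `touched_indices` is a hypothetical helper, and the integer `3` stands in for the value held by `tensor_3`.

```python
def touched_indices(index, length):
    """Return the positions that `seq[index] = value` writes to for a sequence of `length`."""
    # slice.indices normalises start/stop/step for the given length.
    return list(range(*index.indices(length)))


length = 10

# Step > 1, as in tensor_a[::3] = value
print(touched_indices(slice(None, None, 3), length))    # [0, 3, 6, 9]

# Step < 0, as in tensor_a[::-3] = value
print(touched_indices(slice(None, None, -3), length))   # [9, 6, 3, 0]

# Start taken from a value, as in tensor_a[tensor_3:0:-1] = value
start = 3  # stands in for the content of tensor_3
print(touched_indices(slice(start, 0, -1), length))     # [3, 2, 1]
```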
-
Committed by Huihuang Zheng
As the title says: when a slice_node such as 1:3 is passed as the idx argument of convert_var_shape, it causes a syntax error, because a slice expression cannot be passed as a function argument. This PR fixes it.
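The syntax error can be checked with Python's `ast` module. This sketch only demonstrates the language rule; `some_func` is a hypothetical name, and wrapping the bounds in the built-in `slice(...)` is one possible way to pass the same information, not necessarily what this PR does.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# A bare slice expression is only valid inside a subscript, not as an argument.
print(parses("y = some_func(x, 1:3)"))         # False
print(parses("y = x[1:3]"))                    # True

# An explicit slice object is an ordinary expression and can be passed around.
print(parses("y = some_func(x, slice(1, 3))")) # True

shape = [2, 3, 4, 5]
print(shape[1:3] == shape[slice(1, 3)])        # True
```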
-
- 19 Feb, 2021 4 commits
-
-
Committed by Jacek Czaja
* - added Reshape grad bf16
* - Added reshape grad bf16
* - cosmetics in py
-
Committed by ShenLiang
-
Committed by Kaipeng Deng
* fix dataloader collate return list mix tensor and numpy array. test=develop
-
Committed by Guanghua Yu
* add parameter in roi_align op
-
- 18 Feb, 2021 5 commits
-
-
Committed by Pei Yang
-
Committed by joanna.wozna.intel
* Add conv transpose BF16
* Share function GetWeightsTz
* Adjust to review and fix op compatibility
* Add bias to unique handler name
* Remove errors related to paddle enforce
* Add conv2d_transpose to bf16 list and kernel refactor
-
Committed by Huihuang Zheng
Refine fake_interface Error Message
-
Committed by Huihuang Zheng
Dy2stat did not support tuples as iteration variables in the past. This PR adds three main cases:

1). Non-enumerate case: for var1, var2 in var|var.numpy() will be rewritten as:

    for FOR_ITER_TUPLE_PREFIX_x in var | var.numpy():
        var1 = FOR_ITER_TUPLE_PREFIX_x[0]
        var2 = FOR_ITER_TUPLE_PREFIX_x[1]

2). Enumerate outer tuple case: for t in enumerate(var|var.numpy()) will be rewritten as:

    for FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
        t = (FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x)

3). Enumerate inner tuple case: for i, (var1, (var2, var3)) in enumerate(var|var.numpy()) will be rewritten as:

    for i, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
        var1 = FOR_ITER_TUPLE_PREFIX_x[0]
        var2 = FOR_ITER_TUPLE_PREFIX_x[1][0]
        var3 = FOR_ITER_TUPLE_PREFIX_x[1][1]
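The rewrites described above preserve plain Python semantics, which can be checked on ordinary lists. The sketch below covers the outer and inner tuple enumerate cases. It is an illustration rather than the transformer itself: the function names and the short variable names `idx` and `item` are hypothetical replacements for the generated `FOR_ITER_TUPLE_*` names, and the sketch keeps `enumerate` in the rewritten loops so that the index is still produced.

```python
def inner_tuple_original(data):
    out = []
    for i, (var1, (var2, var3)) in enumerate(data):
        out.append((i, var1, var2, var3))
    return out


def inner_tuple_rewritten(data):
    out = []
    for i, item in enumerate(data):
        # Nested unpacking is replaced by explicit index expressions.
        var1 = item[0]
        var2 = item[1][0]
        var3 = item[1][1]
        out.append((i, var1, var2, var3))
    return out


def outer_tuple_original(data):
    return [t for t in enumerate(data)]


def outer_tuple_rewritten(data):
    out = []
    for idx, item in enumerate(data):
        t = (idx, item)  # rebuild the tuple that enumerate would have yielded
        out.append(t)
    return out


data = [(1, (2, 3)), (4, (5, 6))]
print(inner_tuple_original(data) == inner_tuple_rewritten(data))  # True
print(outer_tuple_original(data) == outer_tuple_rewritten(data))  # True
```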
-
Committed by Wojciech Uss
-
- 10 Feb, 2021 1 commit
-
-
Committed by WeiXin
-
- 09 Feb, 2021 1 commit
-
-
Committed by Chen Weihang
-
- 08 Feb, 2021 2 commits
- 06 Feb, 2021 1 commit
-
-
Committed by Jacek Czaja
-
- 05 Feb, 2021 1 commit
-
-
Committed by liuyuhui
-
- 04 Feb, 2021 1 commit
-
-
Committed by Jacek Czaja
-
- 03 Feb, 2021 7 commits
-
-
Committed by cucuzg
-
Committed by wawltor
fix the broadcast for the large second input
-
Committed by JamesLim
-
Committed by AshburnLee
-
Committed by joejiong
As the title
-
Committed by Adam Osewski
-
Committed by WangXi
-
- 02 Feb, 2021 1 commit
-
-
Committed by Shang Zhizhou
* fix trt plugin clone and initialize bugs
* fix unit test error
* enable trt in ci py3
* update unittest timeout
-
- 01 Feb, 2021 3 commits
-
-
Committed by Shang Zhizhou
-
Committed by xiemoyuan
* Add cache for Transformer encoder.
* Bug fixed.
* add unittests for transformer encoder.
-
Committed by WangXi
-
- 28 Jan, 2021 2 commits
-
-
Committed by Wojciech Uss
-
Committed by WeiXin
-
- 27 Jan, 2021 3 commits
-
-
Committed by liu zhengxi
* upgrade gather_tree to core.ops
* update gather_tree unittests
-
Committed by jakpiase
* added external reorder to profiler
* resolved conflict
* added enable_static
* initial version of lstm, not working yet
* added lstm to operators.cmake
* added vanilla lstm mkldnn op
* added peephole weights integration
* minor changes
* added formatting
* added fusion_lstm_mkldnn to static_whitelist
* added formatting
* removed comment
* moved use_peepholes attribute inside is_cached block
* reverted wrong changes
* minor formatting change
* minor changes
* changed stream handling
* minor change
* added datatype to GetExpectedKernelType()
* added reading stream from TLS
-
Committed by liym27
-
- 26 Jan, 2021 1 commit
-