- 20 Feb, 2022 1 commit
  - Committed by Chen Weihang
    * rename pten dir to phi
    * rename namespace to phi
    * rename infrt pten dir to phi
    * resolve conflict
    * rename pten to phi in cmake
    * revert all infrt change
    * change needed files
    * fix infrt failed
    * fix inference failed
- 19 Feb, 2022 1 commit
  - Committed by Aurelius84
    * Unify paddle/pten::framework::ddim into pten::ddim
    * fix paddle namespace
    * compile successfully
    * fix npu src file
    * fix conflict
    * fix conflict
    * fix tensorrt compiler error
    * fix conflict
    * fix conflict
    * fix test file conflict
    * fix conflict
    * fix mlu file conflict
    * fix mlu file conflict
    * fix cinn header file conflict
    * fix conflict
    * fix conflict
    * fix conflict
    * fix conflict
- 06 Feb, 2022 1 commit
  - Committed by Wilber
- 03 Dec, 2021 1 commit
  - Committed by ronnywang
    * refine structure for cuda and rocm
    * update
    * update
    * update
    * update
- 03 Mar, 2021 1 commit
  - Committed by Qi Li
- 21 Sep, 2020 1 commit
  - Committed by LutaoChu
    * accelerate the argsort op on GPU when the input size is equal to the length of the ‘axis’ dimension
- 20 Apr, 2020 1 commit
  - Committed by Zhou Wei
    * Optimize the error messages of paddle CUDA API, test=develop
    * fix the error messages of paddle CUDA API, test=develop
    * Refactor PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL, test=develop
    * remove build_ex_string, test=develop
    * merge conflict, test=develop
- 10 Jan, 2020 1 commit
  - Committed by FlyingQianMM
    * add backward gradient computation for op argsort test=develop
    * use pre-commit test=develop
- 25 Dec, 2019 1 commit
  - Committed by Aurelius84
    * add register op_data_type test=develop
    * fix register bug in isfinite op test=develop
    * rm int int64_t in pad2d gradKernel test=develop
- 29 Nov, 2019 1 commit
  - Committed by zhaoyuchen2018
    * Add ascending for argsort
    * Refine api doc description.
    * Refine descending description
    * Add int32 logic to speed up sorting when the data size is small.
    * Remove int32 optimization since it is not supported in Python
- 25 Nov, 2019 1 commit
  - Committed by zhaoyuchen2018
    * Improve argsort performance.
      - Running argsort on 200000 elements on a V100 is ~190x faster: 0.53s before optimization, 0.0027s after
      - Add fp16 support
    * Refine error message
    * Refine code

    test=develop

    Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
- 30 Aug, 2019 1 commit
  - Committed by Tao Luo

    test=develop
- 17 Jun, 2018 1 commit
  - Committed by Yibing Liu
- 12 Jun, 2018 3 commits
  - Committed by Yibing Liu
  - Committed by Yibing Liu
  - Committed by Yibing Liu