1. 29 Aug, 2022 4 commits
  2. 26 Aug, 2022 22 commits
  3. 25 Aug, 2022 14 commits
    • add support for double attributes (#45390) · efab2eb4
      Feiyu Chan committed
    • Enable OMP multithreading in lookup_table_v2 (#45249) · 0c363de8
      piotrekobi committed
      * Add omp parallel for directives
      
      * Revert "Add omp parallel for directives"
      
      This reverts commit f4e4f8ddb12454018d9c1e49c074af2543659de6.
      
      * Add #pragma omp parallel for to correct file
      
      * Add check for _OPENMP definition
      
      * Disable omp on gpu
      
      * Trigger CI
      
      * Readd check for _OPENMP definition
      
      * Change macro disabling changes on GPU
      
      * Improve macro readability
    • [OpAttr]axis of Reverse Support Tensor type (#45391) · 91110661
      Aurelius84 committed
      * [OpAttr]axis of Reverse Support Tensor type
      
      * fix coverage
      
      * fix unittest
    • update brpc version to 1.2.0 (#45351) · 9b5b005e
      danleifeng committed
      * update brpc version;test=develop
    • fix auto tune unitest assert (#45421) · cb0b53cb
      hong committed
    • [OpAttr]min/max of uniform_random support Tensor type (#45417) · c8955d0d
      Aurelius84 committed
      * [OpAttr]min/max of Uniform_rand support Tensor type
      
      * fix typo
    • Fix record operator input shapes segment fault in new dygraph (#45360) · 4d78390e
      chenjian committed
      * fix segment fault
      
      * fix
    • Transfer memcpy d2h from fluid to phi (#45150) · 0d14e74a
      kangguangli committed
      * transfer memcpy_d2h from fluid to phi
      
      * refine arg check and add comment
      
      * fix cannot fallback to phi kernel
      
      * fix gpu_context host alloc when tensor size = 0
      
      * add kernel for std::vector<DenseTensor> args
      
      * fix bugs in MemcpyD2HMultiIOKernel
      
      * remove useless header file
      
      * polish format
      
      * fix typo
      
      * add testcase for cudapinned place
      
      * refine check condition in test
      
      * polish error message
      
      * polish error message
      
      * remove header in fluid  directory
      
      * merge memcpy_h2d and memcpy_d2h into one file, change register method to simplify implementation
      
      * fix code style check
    • [NPU] add run_program_op_npu (#45349) · 64afa638
      ronnywang committed
      * [NPU] add run_program_op_npu
      
      * add run_program_op_npu ut
    • make full_like support double_max in dygraph (#45385) · edd66f2e
      Sing_chan committed
      * make full_like support double_max in dygraph
      
      * fix bug
    • [Eager] sync_batch_norm_grad delete mean and variance (#45411) · 5df464fe
      wanghuancoder committed
      * sync_batch_norm_grad delete mean and variance
    • optimize conv algo cache (#41891) · 1cd7e68b
      hong committed
      * optimize conv algo speed
      
      * code polish
      
      * remove useless code
      
      * fix compile error
      
      * fix cpu compile error
      
      * not use cudnn alog t
      
      * add search cache max number
      
      * polish code
      
      * fix cache test bug
      
      * add groups data format to conv args
      
      * fix cache test bug
      
      * fix cudnn_deterministic bug
      
      * fix test switch auto tune bug
      
      * fix test switch autotune bug
      
      * fix conv cache bug
      
      * fix cache test error
      
      * fix cache test bug
      
      * fix windows mac compile error
      
      * fix workspace search error
      
      * update cudnn cache
      
      * fix cache test bug; test=develop
      
      * fix autotune switch test error
      
      * polish code
      
      * polish code
    • Fl-PS bug fix (#45413) · f2f3f6e7
      ziyoujiyi committed
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * fl-ps v1.0
      
      * .
      
      * support N + N mode
      
      * .
      
      * .
      
      * .
      
      * .
      
      * delete print
      
      * .
      
      * .
      
      * .
      
      * .
      
      * fix bug
      
      * .
      
      * .
      
      * fl-ps with coordinator ready
      
      * merge dev
      
      * update message parse only
      
      * update fl client scheduler
      
      * fix bug
      
      * update multithreads sync
      
      * fix ci errors
      
      * update role_maker.py
      
      * update role_maker.py
      
      * fix ci error: windows py import error
      
      * fix ci error: windows py import error
      
      * fix windows ci pylib import error
      
      * add dump fields & params
      
      * try to fix windows import fleet error
      
      * fix ps FLAGS error
      
      * fix logging risk
      
      * fix logging possible risk
      
      * write trainer_desc file
      
      * support split sparse params in local & remote
      
      * fix import paddle.fluid.core.PSGPU
      
      * fix import paddle.fluid.core.PSGPU
      
      * add remote_sparse & local_sparse config
      
      * fix unittest
      
      * fix test_dist_fleet_geo table error
      
      * fix PADDLE_ENFORCE error
      
      * fix other's pr conflict
      
      * forbidden ssd table
      
      * .
      
      * recover ssd table code
      
      * recover file mode
    • [triu_indices] add triu_indices_op (#45168) · a410c397
      Rayman committed