1. 18 May, 2022 1 commit
    • W
      Add support for forward and reverse high-order automatic differentiation mechanism (#41919) · f6ee202f
      WangZhen committed
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warning in backward.py
      
      * format python code
      
      * support multi input in triple gradient checker
      
      * Add matmul triple grad kernel
      
      * Updated comments of TODO
      
      * Supported some special tests
      
      * Change code-format to follow CI std
      
      * Updated gradient_checker.py
      
      * Fix conflicts
      
      * Removed unnecessary printing log
      
      * Change code style to follow CI std
      
      * merge upstream
      
      * add priops.py
      
      * add_p
      
      * rm useless files
      
      * add sub_p mul_p div_p
      
      * add sqrt_p and tanh_p
      
      * add reshape_p
      
      * add broadcast_p
      
      * Add python primitive wrappers.
      
      * Jvp rules updated.
      
      * JVP rules done for all the 17 primops.
      
      * quick check and fixes.
      
      * add jvp(op, *args)
      
      * add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p
      
      * add split_p and concat_p
      
      * add gather_p and scatter_add_p
      
      * add slice_select_p and slice_assign_p
      
      * Add transpose rules.
      
      * add multi input check for add_p, sub_p, mul_p, div_p
      
      * update concat_p
      
      * Linearize and transpose in progress..
      
      * refine gather_p and scatter_add_p
      
      * updated.
      
      * update transpose.
      
      * refine slice_assign_p and slice_select_p
      
      * init commit for lower
      
      * Merged with primitive ops.
      
      * small update
      
      * add rules for orig2prim and prim2orig
      
      * add 9 test for prim ops
      
      * add more test and fix some bug
      
      * add more test
      
      * register proto
      
      * Adding primops test.
      
      * add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto
      
      * support multi input and multi output for split_p and concat_p
      
      * Test updated.
      
      * update
      
      * fix slice bug for slice_select_p and slice_assign_p
      
      * updated.
      
      * Ops updated.
      
      * Refactor and bug fixes.
      
      * updated.
      
      * finish orig2prim and prim2orig rules
      
      * dtype for axis attr should be long int
      
      * update dtype for axis attr int64_t
      
      * update for iscan CI
      
      * Update primx.
      
      * Refactor vars in primx.
      
      * update for lower transform
      
      * add more shape and dtype check
      
      * update primx.py
      
      * change IndexTensor into int32 dtype
      
      * update
      
      * Fix linearize and transpose.
      
      * Update is_dot
      
      * Update is_dot
      
      * Update is_dot
      
      * add gradient aggregation, fix add_transpose.
      
      * pass first linearize+transpose test.
      
      * update test
      
      * refactor op registration and primx.
      
      * update rule for slice_assign
      
      * try test lower
      
      * update orig2prim and prim2orig
      
      * pass simple lower pass
      
      * update
      
      * Update input types in the unit test.
      
      * orig2prim segfault.
      
      * 50% for adam.minimize
      
      * test updated.
      
      * temp fix errors in removing vars.
      
      * primx updated.
      
      * update for matmul_v2 and reshape2 orig2prim
      
      * update for minimize
      
      * Refine primrules
      
      * Remove some code
      
      * supporting unused and unreachable vars.
      
      * update for use prim2orig in minimize
      
      * fix gather and scatter_add transpose.
      
      * Add rules UT
      
      * update scatter_add
      
      * Refine UT code
      
      * fix nonetype check in topo
      
      * Update gather_p pywrapper.
      
      * remove useless print
      
      * Merge tongxin PR and refine code
      
      * readd some test
      
      * rm useless print
      
      * polish code.
      
      * fix bug in minimize
      
      * add get_input_var_list and get_output_var_list and use it in lower
      
      * Fix scatter_add_p prim2orig
      
      * Update code and fix orig2prim/prim2orig UT
      
      * delete vars after block.desc._remove
      
      * Improve ops and vars clean up logics.
      
      * fix some bug in linearize and lower
      
      * update tanh transpose.
      
      * use set instead of list for var2remove
      
      * test updated.
      
      * polish code.
      
      * fix dot2bar delete.
      
      * merge tx/ad
      
      * add indextensor_dot for gather and scatter_add
      
      * add sorted for set
      
      * Fix scale_orig2prim params
      
      * fix some syntax bug
      
      * add global_lower_update list
      
      * Better handling of unused vars.
      
      * update tests.
      
      * Fix elementwise_sub orig2prim
      
      * support none for transpose rule
      
      * Merge and add transform UT
      
      * fix a bug in transpose
      
      * Fix transpose and UT
      
      * a hacky fix for concat op
      
      * Fix executor place
      
      * Refine variable name
      
      * Add elementwise_mul orig2prim and support p_norm when p=1
      
      * Add sqrt orig2prim rule and UT
      
      * merge wz test
      
      * rename files, add enable_prim, disable_prim, prim_enabled, delete global_lower_update
      
      * fix a bug in test_ad_transform_trans
      
      * revert modify in framework.py
      
      * add paddle.fluid.incubate.ad_transform to  python/setup.py.in
      
      * Fix remove vars error
      
      * Fix p_norm_orig2prim
      
      * merge wz
      
      * Modify the code directory
      
      * Add utils.py and remove get_input/output_vars functions
      
      * Update maolin code
      
      * Rename UT and refine test_ad_transform_primops
      
      * Fix div_p jvp rule
      
      * Add higher derivatives UT
      
      * Remove UT to autograd dir
      
      * Fix comments
      
      * import paddle in primops.py
      
      * Add some error message for assert
      
      * Refine UT class name and refine some comments in primreg.py
      
      * update minimize of paddle/optimizer for supporting new autograd
      
      * resolve circular importing between backward.py and optimizer.py
      
      * fill gradients and minimize unittest
      
      * Replace `assert isinstance` with `raise TypeError`
      
      * Add some assert message for primx.py
      
      * Polish variable name
      
      * Add some assert message
      
      * add some docstring
      
      * refine some name
      
      * update the format of english documents
      
      * Split test_transform.py to two files to avoid ci error
      
      * fix the document format of enable_prim/disable_prim/prim2orig/prim_enabled
      
      * polish test_gradients_and_minimize
      
      * add default value for prim_enabled api doc
      
      * Remove some UT to avoid windows ci error
      
      * Enlarge test_gradients_and_minimize limit time
      
      * Fix ut limit time
      Co-authored-by: veyron95 <veyron_wu@163.com>
      Co-authored-by: Jiabin Yang <360788950@qq.com>
      Co-authored-by: levi131 <limaolin01@baidu.com>
      Co-authored-by: Tongxin Bai <waffle.bai@gmail.com>
      Co-authored-by: Xiaoxu Chen <chenxx_id@163.com>
      Co-authored-by: levi131 <83750468+levi131@users.noreply.github.com>
  2. 12 May, 2022 1 commit
  3. 10 May, 2022 1 commit
  4. 28 April, 2022 1 commit
  5. 27 April, 2022 3 commits
  6. 26 April, 2022 1 commit
  7. 25 April, 2022 1 commit
  8. 16 April, 2022 1 commit
    • R
      Moe ref (#41864) · e9a63237
      Roc committed
      * moe ref
      
      * ref commit; test=document_fix
      
      * update; test=document_fix
      
      * update test=document_fix
      
      * update; test=document_fix
  9. 15 April, 2022 1 commit
    • R
      Moe ref (#41836) · c37af19c
      Roc committed
      * moe ref
      
      * ref commit; test=document_fix
      
      * update; test=document_fix
      
      * update test=document_fix
  10. 14 April, 2022 1 commit
    • S
      fix bfgs_doc (#41505) · 7f73ef2c
      Sing_chan committed
      * fix bfgs_doc; test=document_fix
      
      * add parameter name; test=document_fix
      
      * modify according to chenlong's comments;test=document_fix
  11. 13 April, 2022 1 commit
  12. 08 April, 2022 2 commits
  13. 07 April, 2022 2 commits
  14. 06 April, 2022 1 commit
  15. 04 April, 2022 1 commit
  16. 02 April, 2022 2 commits
    • S
      Add graph apis (#40809) · b0398c8e
      Siming Dai committed
      * Add graph_reindex API
      
      * add graph_sample_neighbors api
      
      * Add buffer
      
      * delete VLOG
      
      * delete thrust::copy for output
      
      * add ShareDataWith
      
      * delete graph_reindex hashtable output
      
      * add graph_reindex dispensable
      
      * add reindex unittest, move memset to cuda kernel, change api
      
      * fix conflict
      
      * add reindex buffer for gpu version note
      
      * fix conflicts for op_func_generator
      
      * Add fisher_yates sampling, add dispensable, change infermeta
      
      * add dtype for edge_id
      
      * fix rocm ci and static check ci
      
      * add unittest
      
      * fix unittest
      
      * fix unittest
      
      * fix bug
    • X
      Enhance vjp/jvp/Jacobian/Hessian API for supporting dynamic, static graph and batched, unbatched mode (#40692) · 9e764d82
      Xiaoxu Chen committed
      
      * modify vjp/jvp for both dynamic and static graph
      
      * enforce jacobian class for supporting first/last batch
      
      * add unittest for jvp, jacobian with last batch, jacobian with first batch
      
      * fix the incorrect shape when multi-index Jacobian
      
      * enforce Hessian class for supporting dynamic graph
      
      * add Hessian class unittest
      
      * bugfix, jvp double_backward_trick zeros_like return stop_gradient=True in static graph
      
      * add API beta warnings
      
      * add white_list for cuda11.x ci windows.
      
      * optimize some code snippets and documents
      
      * set unittest timeout to 100 seconds
      
      * move vjp,jvp,Jacobian,Hessian to incubate
      
      * fix vjp,vjp import path of sample code
      
      * fix code style error of autograd/__init__ file
  17. 01 April, 2022 3 commits
  18. 31 March, 2022 1 commit
    • S
      [New API]: minimize_bfgs and minimize_lbfgs (#40710) · e7928a06
      Sing_chan committed
      * [New API]: minimize_bfgs and minimize_lbfgs
      
      * modify for python module call correctly
      
      * add functional package, add error raise in static_graph, change assign to set_value
      
      * unify static_graph and dygraph, fix bug when x or H0 is float64
      
      * now only accept input is tensor, put check args in utils.py, put exception test together
      
      * temp
      
      * add more detailed algorithm illustration and comment, reduce test case to limit test time in 15s
      
      * change in_dygraph_mode to in_dynamic_mode
      
      * fix bug of sample code; reduce test case to reduce test time
      
      * change dir to incubate
  19. 30 March, 2022 1 commit
    • R
      [MoE] Moe apis (#41092) · aac7879a
      Roc committed
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * add op about moe gate
      
      update utils
      
      add limit by capacity op
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      * fix for win
      
      * fix bugs in test_limit_by_capacity_op
      
      * update ut
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * update(fix) ut for win
      
      * moe apis in incubate
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      
      * add apis and utils
      
      * add gate apis
      
      * add moe and grad clip apis
      
      * update moe apis
      
      * add ops for moe gate
      
      * fix
      
      * update for base moe layer api
      
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * fix for dygraph
      
      * update with random routing
      
      * update
      
      * fix ut for limit by capacity
      
      * update
      
      * update limit by capacity for easily to switch to single thread mode
      
      * update api docs
      Co-authored-by: hlygit66666 <2570058140@qq.com>
  20. 29 March, 2022 1 commit
    • R
      [MoE] Moe apis (#40895) · aeade538
      Roc committed
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * add op about moe gate
      
      update utils
      
      add limit by capacity op
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      add ut for limit_by_capacity
      
      add ut for prune_gate_by_capacity
      
      * fix for win
      
      * fix bugs in test_limit_by_capacity_op
      
      * update ut
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * update(fix) ut for win
      
      * moe apis in incubate
      
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      
      * add apis and utils
      
      * add gate apis
      
      * add moe and grad clip apis
      
      * update moe apis
      
      * add ops for moe gate
      
      * fix
      
      * update for base moe layer api
      
      * add random routing op
      
      add _random_routing api in utils
      
      add random routing ut
      
      * fix for dygraph
      
      * update with random routing
      
      * update
      
      * fix ut for limit by capacity
      
      * update
      Co-authored-by: hlygit66666 <2570058140@qq.com>
  21. 25 March, 2022 1 commit
    • J
      Refactor Dygraph Flags (#40786) · 3085d5e4
      Jiabin Yang committed
      * refactor eager flags
      
      * fix flags error when we switch from eager to dygraph
      
      * fix ci problem
      
      * fix ci
      
      * fix ci
      
      * merge develop and fix code style
      
      * merge develop and fix code style
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * fix op test error
      
      * merge develop
  22. 22 March, 2022 1 commit
    • S
      [phi] Update graph_send_recv OP (#40509) · 67b46e45
      Siming Dai committed
      * add out_size shape for graph_send_recv
      
      * fix bug in register kernel: no const int& support
      
      * add out_size in infermeta
      
      * change unittest
      
      * fix unittest
      
      * fix out_size default value
      
      * fix doc
      
      * delete arg mapping
      
      * add sig
      
      * move -1 to 0
      
      * move -1 to 0
  23. 16 March, 2022 1 commit
  24. 14 March, 2022 1 commit
  25. 11 March, 2022 1 commit
  26. 01 March, 2022 1 commit
  27. 25 February, 2022 1 commit
  28. 24 February, 2022 1 commit
  29. 19 February, 2022 1 commit
    • S
      Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61
      sneaxiy committed
      * add DistributedFusedLamb op
      
      * polish code
      
      * fix compile error
      
      * compatible with pten changement
      
      * fix rocm compile error
      
      * improve coverage
      
      * update upstream/develop
      
      * fix cast_with_ptr.h
      
      * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1
      
      * fix clip before allreduce
      
      * add use_master_param_norm
      
      * code polish
      
      * fix bug
      
      * fix ROCM ci
  30. 28 January, 2022 1 commit
  31. 27 January, 2022 2 commits
    • S
      Add Khop Graph Sampler API (#39146) · 35f949b5
      Siming Dai committed
      * add the test case for the UVA
      
      * add the context load for the uva
      
      * Add graph_sample kernel
      
      * Add graph_sample commit
      
      * add new commit for graph_sample
      
      * add unsigned long long int
      
      * delete some remarks
      
      * add cpu version
      
      * add cuda eids
      
      * add cpu eids
      
      * delete _uva
      
      * optimize speed: emplace_back, last_layer
      
      * add to_uva_tensor
      
      * add cpu return_eids choice
      
      * add gpu return_eids choice
      
      * add cpu reindex_nodes
      
      * add gpu reindex_nodes
      
      * rename op and add OMP for cpu
      
      * add incubate api
      
      * fix the compile problem for the PADDLE_ENFORCE and different device
      
      * fix the rocm and windows compile problem
      
      * add unittest for graph_sample_neighbors
      
      * fix cpu unittest and unique problem
      
      * fix uva unittest, fix cuda unique problem
      
      * fix the windows compile problem
      
      * fix the windows rand_r compile problem
      
      * add correct unittest, add src_eids dispensable
      
      * delete black
      
      * combine uva unittest
      
      * mv Sample_index to Sample_Index; check input shape; fix random sample func
      
      * delete memset & cudaMemset
      
      * fix according to PR comments
      
      * fix rocm ci
      
      * modify function names according to the specification
      
      * fix windows_openblas ci
      
      * refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors
      
      * fix rocm ci
      
      * rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc
      
      * add data type
      
      * fix conflict
      Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
    • Z
      Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3
      zhangkaihuo committed
      * fix bug:
      1. atten: set the default value of attn_dropout_rate to None
      2. ffn: add activation parameter
      
      * for pure fp16
      
      * Add a SparseCsrTensor
      
      * remove unused functional
      
      * remove const
      
      * remove SetMemoberTensor
      
      * remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows
      
      * SparseCooTensor
      
      * add SetMember
      
      * merge upstream; add SetMember
      
      * merge upstream
      
      * merge upstream; add newline at end of file
      
      * add newline at end of file
      
      * remove newline at end of file
      
      * remove newline at end of file
      
      * stash
      
      * use pten::framework::make_ddim
      
      * use pten::framework::make_ddim
      
      * merge upstream; use the latest mutable_data
      
      * merge upstream; use the latest mutable_data
      
      * return mutable dense tensor
  32. 22 December, 2021 1 commit