1. 22 Sep, 2021 24 commits
  2. 21 Sep, 2021 2 commits
    • G
      support fp16 (#35888) · 087c23a9
      Guoxia Wang committed
      087c23a9
    • A
      Reuse OneDNN handler for SGD and SUM for SelectedRows input tensors. (#35510) · 799f3861
      Adam Osewski committed
      * Create stateful OneDNNAXPYHandler object.
      
      This makes it possible to call it multiple times without recreating the
      oneDNN primitives every time.
      
      * Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.
      
      * OneDNN SGD kernel.
      
      * Update call to use new OneDNNAXPYHandler object api.
      
      * Setup seed in proper place.
      
      * Enable OneDNN kernel only for single case.
      
      * For dense param and sparse grad.
      
      * Small refactor.
      
      * Enable oneDNN by op attr or by cmd line flag.
      
      * Use int64_t type for number of elements.
      
      * Support dense param and grad from OneDNN kernel.
      
      * Enable SGD OneDNN kernel when using MP BF16 optimizer.
      
      * Force non-copyable/movable OneDNNAXPYHandler.
      
      * Reuse OneDNNAXPYHandler for sparse tensors in SUM op.
      
      * Fix SFINAE rules.
      
      * Remove recording event inside AXPY.
      
      * Get rid of internal primitive caching.
      
      * Stop using the PP cache mechanism to store mem and primitive obj.
      * Handler obj stores and reuses needed desc & prim
      
      * Do not derive from MKLDNNHandlerT
      799f3861
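The handler pattern described in the commit message above (build state once, call it many times, forbid copy and move) can be sketched roughly as follows. This is a minimal illustration, not Paddle's `OneDNNAXPYHandler`: the names `AxpyHandler` and `SparseSgdUpdate` are hypothetical, and a plain loop stands in for the oneDNN primitives the real handler would hold. It shows the dense-param, sparse-grad SGD case, where only the selected rows are updated.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stateful AXPY handler (y += alpha * x).
// A real oneDNN-backed handler would create its memory descriptors and
// primitives once in the constructor; here a plain loop replaces them.
class AxpyHandler {
 public:
  AxpyHandler(int64_t n, float alpha) : n_(n), alpha_(alpha) {
    assert(n >= 0);  // element count uses int64_t, as in the commit message
  }

  // Non-copyable and non-movable: the object owns state that must not be
  // duplicated or relocated.
  AxpyHandler(const AxpyHandler&) = delete;
  AxpyHandler& operator=(const AxpyHandler&) = delete;
  AxpyHandler(AxpyHandler&&) = delete;
  AxpyHandler& operator=(AxpyHandler&&) = delete;

  // Can be called repeatedly without rebuilding any state.
  void operator()(const float* x, float* y) const {
    for (int64_t i = 0; i < n_; ++i) y[i] += alpha_ * x[i];
  }

 private:
  int64_t n_;
  float alpha_;
};

// SGD step for a dense parameter and a sparse (row-selected) gradient:
// param[rows[i], :] -= lr * grad_rows[i, :].
inline void SparseSgdUpdate(std::vector<float>& param, int64_t row_width,
                            const std::vector<int64_t>& rows,
                            const std::vector<float>& grad_rows, float lr) {
  AxpyHandler axpy(row_width, -lr);  // built once, reused for every row
  for (int64_t i = 0; i < static_cast<int64_t>(rows.size()); ++i) {
    axpy(grad_rows.data() + i * row_width,
         param.data() + rows[i] * row_width);
  }
}
```

Constructing the handler outside the loop is the point of the design: per-row calls then reuse the same state instead of recreating it each time.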
  3. 19 Sep, 2021 2 commits
    • L
      Optimization of pool2d grad (#35389) · 86685190
      limingshu committed
      * Optimization of pool2d grad, first commit.
      
      * remove useless print codes
      
      * refine codes
      
      * refine codes
      
      * seal more operation into template specialization
      
      * fix template struct error in MaxPool2dGrad.
      
      * Fix header including error
      
      * refine code with comment
      
      * Seal the param-preparation codes into function for common use.
      
      * Seal the param-preparation codes into function for common use.
      
      * Seal the param-preparation into function and make it common for other kernels
      
      * polish code and erase useless template specialization
      
      * Rerun trigger
      
      * rerun trigger
      86685190
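For readers unfamiliar with what a pool2d gradient kernel computes, a CPU reference for the max-pooling case can be sketched as below. This is only an illustration of the math, not the optimized kernel from the commit: the function name `MaxPool2dGradRef`, the single-channel layout, and the no-padding assumption are all simplifications introduced here.

```cpp
#include <cassert>
#include <vector>

// Reference max-pool 2D backward for a single h x w channel, no padding.
// Each output gradient is routed to the input position that held the
// window maximum (the first maximum wins on ties).
inline std::vector<float> MaxPool2dGradRef(const std::vector<float>& x,
                                           int h, int w, int k, int stride,
                                           const std::vector<float>& dout) {
  const int oh = (h - k) / stride + 1;
  const int ow = (w - k) / stride + 1;
  assert(static_cast<int>(x.size()) == h * w);
  assert(static_cast<int>(dout.size()) == oh * ow);

  std::vector<float> dx(x.size(), 0.0f);
  for (int i = 0; i < oh; ++i) {
    for (int j = 0; j < ow; ++j) {
      int best = (i * stride) * w + j * stride;  // top-left of the window
      for (int di = 0; di < k; ++di) {
        for (int dj = 0; dj < k; ++dj) {
          const int idx = (i * stride + di) * w + (j * stride + dj);
          if (x[idx] > x[best]) best = idx;
        }
      }
      dx[best] += dout[i * ow + j];  // accumulate: windows may overlap
    }
  }
  return dx;
}
```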
    • B
      add hard_sigmoid trt converter test cases (#35876) · 9f88d327
      baoachun committed
      9f88d327
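As background for the converter tests named above, the hard sigmoid activation is a piecewise-linear approximation of the sigmoid. A minimal sketch, assuming the common form `clip(slope * x + offset, 0, 1)`; the function name `HardSigmoid` is hypothetical and the `slope` and `offset` values are left to the caller rather than taken from the commit:

```cpp
#include <algorithm>
#include <cassert>

// Hard sigmoid: clamp a linear function of x into [0, 1].
// slope and offset are caller-supplied; no defaults are assumed here.
inline float HardSigmoid(float x, float slope, float offset) {
  return std::min(1.0f, std::max(0.0f, slope * x + offset));
}
```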
  4. 18 Sep, 2021 12 commits