    Added reshape, reshape2, squeeze and squeeze2 BF16/FP32 FWD/BWD kernels (#34219) · 22c4c189
    jakpiase committed
    * test version of matmul_v2
    
    * added matmul_v2 grad kernel
    
    * minor changes
    
    * minor changes
    
    * minor change for CI approval
    
    * CI fix
    
    * CI fix
    
    * added squeeze and squeeze2 kernels
    
    * CI fix
    
    * CI fix
    
    * CI fix
    
    * disabled tests when compiled with cuda
    
    * added setting format_tag by strides
    
    * added sigmoid BF16 FWD/BWD and gelu BF16 BWD
    
    * changes after review
    
    * Revert "added sigmoid BF16 FWD/BWD and gelu BF16 BWD"
    
    This reverts commit 6e3f76720b545abfcff9f6052b46b73a1e745cae.
    
    * Revert "Merge branch 'matmul_v2_grad' into squeeze2_op"
    
    This reverts commit 06fcf67843a4a7884eccdf67a02a03575e1d4cb8, reversing
    changes made to 6e3f76720b545abfcff9f6052b46b73a1e745cae.
    
    * minor change
    
    * added reshape1/2 kernels
    
    * moved some functions into private block
    
    * CI fix
    
    * CI fix
    
    * CI fix
graph_pattern_detector.cc 110.5 KB