    fix the bug of layer_norm when batch_size=1 (#35480) · ad5f7494
    Committed by zhangkaihuo
    The bug is an out-of-bounds access to mean and var. Both arrays have shape [batch_size], but the thread index idx ranges over 0~feature_size, so mean[idx] and var[idx] index past the end of the arrays.
    
    When batch_size=1, the correct access is mean[0] and var[0]. A unit test with batch_size=1 is added.
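
    The indexing rule described above can be illustrated with a minimal CPU sketch. It is not the code from layer_norm_kernel.cu.h: the function name normalize_rows, its parameters, and the plain loop standing in for the CUDA thread index are all assumptions made for illustration. The point it shows is that per-row statistics are indexed by row (always 0 when batch_size=1), never by the feature index.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for the per-thread work; not the Paddle kernel.
// x:    [batch_size * feature_size], row-major
// mean: [batch_size]
// var:  [batch_size]
std::vector<float> normalize_rows(const std::vector<float>& x,
                                  const std::vector<float>& mean,
                                  const std::vector<float>& var,
                                  std::size_t batch_size,
                                  std::size_t feature_size,
                                  float epsilon) {
  assert(x.size() == batch_size * feature_size);
  assert(mean.size() == batch_size && var.size() == batch_size);

  std::vector<float> y(x.size());
  // idx plays the role of the thread index over all elements.
  for (std::size_t idx = 0; idx < x.size(); ++idx) {
    // Statistics are per row, so index them by row, not by idx.
    // With batch_size == 1 this is always 0, i.e. mean[0] and var[0].
    std::size_t row = idx / feature_size;
    y[idx] = (x[idx] - mean[row]) / std::sqrt(var[row] + epsilon);
  }
  return y;
}
```

    Using mean[idx] in the loop body instead of mean[row] would read past the end of a one-element array as soon as idx reaches 1, which is the failure mode this commit describes.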
layer_norm_kernel.cu.h 33.3 KB