GPU implementation of smooth L1 loss
Created by: pengwangucla
Matrix::resizeOrCreate(
    targetCpu, target.getHeight(), target.getWidth(), false, false);
Matrix::resizeOrCreate(
    outputCpu, output.getHeight(), output.getWidth(), false, false);
Matrix::resizeOrCreate(labelCpu,
                       label.value->getHeight(),
                       label.value->getWidth(),
                       false,
                       false);
targetCpu->copyFrom(target);
outputCpu->copyFrom(output);
labelCpu->copyFrom(*label.value);
targetCpu->smoothL1(*outputCpu, *(labelCpu));
target.copyFrom(*targetCpu);
It is not right to do GPU computation by copying GPU memory to the CPU and computing there: every call pays for three device-to-host copies and one host-to-device copy. Moreover, a GPU implementation is not difficult.
Someone should change this.
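To show why a GPU version should be simple, here is a rough reference for the per-row computation, written as plain C++ rather than as an actual kernel. It assumes the standard smooth L1 definition (0.5*d^2 when |d| < 1, otherwise |d| - 0.5) and assumes the costs are summed per row into a one-column target; I have not checked either assumption against the existing CPU `smoothL1`. The names `smoothL1Element` and `smoothL1Rows` are hypothetical and are not part of the Matrix API.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standard smooth L1 on a single difference d (assumed threshold of 1):
//   0.5 * d^2   if |d| < 1
//   |d| - 0.5   otherwise
inline float smoothL1Element(float d) {
  float a = std::fabs(d);
  return a < 1.0f ? 0.5f * a * a : a - 0.5f;
}

// Hypothetical helper: per-row cost for row-major `output` and `label`,
// both of shape rows x dim. Rows are independent of each other, so in a
// GPU kernel the outer loop could become the thread (or block) index.
std::vector<float> smoothL1Rows(const std::vector<float>& output,
                                const std::vector<float>& label,
                                std::size_t rows,
                                std::size_t dim) {
  std::vector<float> cost(rows, 0.0f);
  for (std::size_t i = 0; i < rows; ++i) {
    float sum = 0.0f;
    for (std::size_t j = 0; j < dim; ++j) {
      sum += smoothL1Element(output[i * dim + j] - label[i * dim + j]);
    }
    cost[i] = sum;
  }
  return cost;
}
```

Each row only reads its own slice of `output` and `label` and writes a single value, so no synchronization between rows is needed, and the data never has to leave the device.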