Created by: zhaoyuchen2018
The elementwise function was used before its definition, which fails with CUDA 8. This change moves the definition ahead of its first use.
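
For reviewers, here is a minimal sketch of the ordering this change aims for. It is plain host-side C++ rather than the actual CUDA code, and the names `ElementwiseAdd` and `AddVectors` are hypothetical stand-ins, not identifiers from this PR. It does not reproduce the CUDA 8 failure; it only shows the helper defined above its first use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the elementwise helper mentioned in this PR.
// It is defined first, so every later use sees a complete definition.
template <typename T>
inline T ElementwiseAdd(T a, T b) {
  return a + b;
}

// Hypothetical caller. Before a reorder like this one, the caller would
// sit above the helper it depends on.
// Assumes y has at least as many elements as x.
template <typename T>
std::vector<T> AddVectors(const std::vector<T>& x, const std::vector<T>& y) {
  std::vector<T> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = ElementwiseAdd(x[i], y[i]);
  }
  return out;
}
```

Keeping the helper above its callers avoids relying on how a particular compiler version resolves names that are declared later in the file.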