* add huber_loss for kunlun
* update xpu.cmake
* update unit tests
* update elementwise_add