* add trace op * bug fix * bug fix; test=develop * thrust bug fix; test=develop * remove useless register; test=develop * fix bug; test=develop * update trace kernel; test=develop * move kernel args to trace_sig; test=develop