* move embeding to phi; * update sig; test=develop * move reset impl to phi; test=develop * remove old register; test=develop * fix cpu bf16 bug; test=develop * fix lookup speed error * polish code * fix paddle throw type