* [Auto Parallel] Redesign the tuner for Auto Parallel
* add tcp_socket and tcp_store
* fix c_split bug
* fix utest
* add c_embedding for tensor parallel