Created by: sneaxiy
bert_encoder_functor.h includes cub/cub.cuh, but does not depend on target cub, which causes random failures in CI compilation. This PR fixes the compilation dependency and makes all math targets depend on cub to avoid similar missing-dependency cases in the future.
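
For illustration only, the kind of fix described above might look like the following CMake sketch. It assumes that cub is an external-project style target that fetches the CUB headers, and the `MATH_TARGETS` list and its contents are hypothetical names, not taken from this PR. Without such a dependency, a parallel build may compile a file that includes cub/cub.cuh before the headers are available, which would explain why the failure is random.

```cmake
# Hypothetical sketch: names below are illustrative, not copied from this PR.
# Assumption: `cub` is an existing target (e.g. an external project) that
# makes the CUB headers available.
set(MATH_TARGETS bert_encoder_functor)  # hypothetical list of math targets

foreach(target IN LISTS MATH_TARGETS)
  # Build-order dependency: make sure `cub` is built before `target` compiles,
  # so that `#include "cub/cub.cuh"` always resolves.
  add_dependencies(${target} cub)
endforeach()
```

Applying the dependency to every math target in one loop, rather than only to the target that failed, mirrors the intent stated above: a later file that starts including cub/cub.cuh would not reintroduce the same race.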