Created by: sneaxiy
This PR changes the type of the Mask output of dropout_op to uint8_t. (As a further step, Mask could be changed to something like std::vector<bool>.)
With this change, the maximum batch size of the Transformer model in the benchmark repo stably reaches 12000.
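To illustrate why a one-byte Mask helps memory usage, here is a minimal CPU sketch. It is not the actual dropout_op kernel: DropoutForward and MaskBytes are hypothetical names, the comparison assumes the mask would otherwise be stored as float, and the output is not rescaled, to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not the Paddle kernel: zeroes each element of `x`
// with probability `dropout_prob` and records the kept elements in a
// one-byte-per-element mask (1 = kept, 0 = dropped).
std::vector<float> DropoutForward(const std::vector<float>& x,
                                  float dropout_prob,
                                  unsigned seed,
                                  std::vector<uint8_t>* mask) {
  assert(mask != nullptr);
  std::mt19937 engine(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  std::vector<float> out(x.size());
  mask->resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const bool keep = dist(engine) >= dropout_prob;
    (*mask)[i] = keep ? 1 : 0;
    out[i] = keep ? x[i] : 0.0f;
  }
  return out;
}

// Bytes needed to store a mask of `n` elements of type T.
template <typename T>
std::size_t MaskBytes(std::size_t n) {
  return n * sizeof(T);
}
```

Under these assumptions, MaskBytes<uint8_t>(n) is n bytes while MaskBytes<float>(n) is n * sizeof(float) bytes, so the mask saved for the backward pass shrinks by a factor of sizeof(float). A std::vector<bool> is typically bit-packed and could reduce this further.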