Question on SetTensor BF16 (#27305) · Issue · PaddlePaddle / Paddle

Question on SetTensor BF16

Created by: jczaja

We are having problems implementing the setting of tensor data from a NumPy array.

NumPy has no BF16 data type, so at the Python level we use the uint16 data type. The Python object that arrives in C++ therefore has data type uint16, with each element two bytes in size. This NumPy object is then to be converted into an object of the bfloat16 data type and used to set the target tensor: https://github.com/PaddlePaddle/Paddle/blob/6947a58a1f52bed347341d373ce86452b17c3366/paddle/fluid/pybind/tensor_py.h#L269-L273
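To make the uint16 representation concrete, here is a minimal standard-library sketch of how bfloat16 bit patterns can be carried in 2-byte unsigned elements. The helper names `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical and not part of Paddle, the conversion uses plain truncation (an assumption; real implementations may round), and `array("H")` only stands in for a NumPy uint16 array.

```python
import struct
from array import array


def float_to_bf16_bits(value: float) -> int:
    """Hypothetical helper: keep the upper 16 bits of the float32 encoding (truncation)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", value))
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Hypothetical helper: expand a bfloat16 bit pattern by zero-filling the low 16 bits."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return value


# "H" is an unsigned short element, used here in place of a NumPy uint16 array.
packed = array("H", [float_to_bf16_bits(v) for v in (1.0, -2.0, 0.5)])

print(packed.itemsize)                          # element size in bytes
print([hex(b) for b in packed])                 # raw bfloat16 bit patterns
print([bf16_bits_to_float(b) for b in packed])  # values recovered from the bits
```

The point of the sketch is that the 16-bit elements are bit patterns, not integer values, so the C++ side needs to reinterpret them rather than convert them numerically.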

The array that comes from Python has the proper data type. We can check this with array.itemsize(), which returns 2. This array is of type pybind11::array and is then converted into const py::array_t<T, py::array::c_style | py::array::forcecast> &array to match the requested parameter, here: https://github.com/PaddlePaddle/Paddle/blob/6947a58a1f52bed347341d373ce86452b17c3366/paddle/fluid/pybind/tensor_py.h#L197

The conversion is done inside NumPy, but the problem is that the returned object has a data type of size 4, while we expect it to be of size 2.

To conclude: when converting from pybind11::array to pybind11::array_t<paddle::platform::bfloat16, 17>, the itemsize changes from 2 to 4.
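As an illustration only, and not a diagnosis of the pybind11 behavior, the following standard-library sketch contrasts reinterpreting a 2-byte buffer with converting its values to a different element type. It assumes that a value conversion to a 4-byte type is the kind of operation that would produce the observed size change; `array("H")` and `array("f")` stand in for the uint16 source and for whatever 4-byte target type the conversion selects.

```python
from array import array

# Two bfloat16 bit patterns stored as 2-byte unsigned elements.
bits = array("H", [0x3F80, 0xC000])

# Reinterpretation: the same bytes viewed through the buffer protocol.
view = memoryview(bits)
print(view.itemsize, view.nbytes)   # element size and total bytes are unchanged

# Value conversion: every element is turned into a new 4-byte float element.
converted = array("f", list(bits))
print(converted.itemsize)           # element size after conversion
print(converted[0])                 # the integer value of the bits, not the bfloat16 value
```

If the conversion path treats the uint16 elements as numbers to cast rather than as bits to reinterpret, both the element size and the stored values change, which is why comparing itemsize before and after the conversion is a useful check.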

Could you please advise where the mistake could be?

@zhiqiu, @luotao1 Could you please advise?