MKLDNN: Fully Connected layer.
Created by: mozga-intel
I will use the Fully Connected (FC) layer as an example to describe this problem. While implementing the FC layer with MKLDNN, I ran into a few difficulties. In the current version of Paddle, the fully connected layer is split into two operations: multiplication and addition. MKLDNN, however, gives us the opportunity to combine these two operations into one. To take advantage of that, I would need to write a new kernel for this layer, that is, a stand-alone version of the FC algorithm. While implementing the new kernel, I ran into several questions:
- First of all, am I forced to provide three versions of the same algorithm (CPU, GPU, and MKLDNN) in order to register the new MKLDNN op kernel?
- Can I use the new FC kernel when I have no full implementation of the FC kernel for the CPU and GPU places, only two fake kernels there? By a fake kernel I mean a kernel that is registered in the system but, when called, only reports that the kernel is not available at this time. I came up with the fake kernels because the PaddlePaddle platform needs to have all kernels on all platforms.
- Referring to the point above, can I integrate a single FC kernel plus the fake CPU and GPU kernels into the current platform while the old version of the algorithm (matrix multiplication followed by addition) still exists?
- Also, how should these algorithms be merged into one? Should we remove the old version of the algorithm (multiplication and sum), replace it with the new algorithm (fully connected on MKLDNN), or leave it untouched and add a new op kernel alongside the current solution?
- Can we have a special kernel for only one specific platform, i.e. MKLDNN, without needing to register a new kernel for the other platforms, i.e. CPU (naive) and GPU?
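To make the questions above more concrete, here is a minimal sketch of the situation in plain Python. Everything in it is an assumption made for illustration: the `KERNELS` registry, `register_kernel`, `run`, `make_fake_kernel`, and the place names are hypothetical and are not Paddle's or MKLDNN's actual API. It only shows the old two-op path (multiplication, then addition), a single fused FC kernel registered for one place, and fake kernels for the other places.

```python
# Hypothetical sketch: this registry and these names are illustrative
# stand-ins, not Paddle's or MKLDNN's actual API.

class KernelNotAvailable(RuntimeError):
    """Raised by a fake kernel: registered, but with no implementation."""

KERNELS = {}  # maps (op_name, place) -> callable

def register_kernel(op_name, place, fn):
    KERNELS[(op_name, place)] = fn

def run(op_name, place, *args):
    return KERNELS[(op_name, place)](*args)

def make_fake_kernel(op_name, place):
    """Return a placeholder that is registered but fails when called."""
    def fake(*args):
        raise KernelNotAvailable(f"{op_name}: no kernel for {place}")
    return fake

# Old path: two separate ops.
def mul(x, w):
    """Matrix product of x (m x k) and w (k x n)."""
    return [[sum(row[p] * w[p][j] for p in range(len(w)))
             for j in range(len(w[0]))] for row in x]

def add(y, b):
    """Add the bias vector b (length n) to every row of y (m x n)."""
    return [[v + b[j] for j, v in enumerate(row)] for row in y]

# New path: one fused op (a stand-in for the MKLDNN computation).
def fc_fused(x, w, b):
    """out[i][j] = b[j] + sum_p x[i][p] * w[p][j], computed in one pass."""
    return [[b[j] + sum(row[p] * w[p][j] for p in range(len(w)))
             for j in range(len(b))] for row in x]

register_kernel("mul", "CPU", mul)
register_kernel("elementwise_add", "CPU", add)
register_kernel("fc", "MKLDNN", fc_fused)   # the only real FC kernel
for place in ("CPU", "GPU"):                # fake kernels for the rest
    register_kernel("fc", place, make_fake_kernel("fc", place))

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
b = [0.5, -0.5, 1.0]

two_ops = run("elementwise_add", "CPU", run("mul", "CPU", x, w), b)
one_op = run("fc", "MKLDNN", x, w, b)
```

In this sketch both paths give the same result, and calling `run("fc", "CPU", ...)` raises `KernelNotAvailable`. The last question above amounts to asking whether the loop that registers the fake kernels could simply be dropped.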
Thank you.