target-i386: add SSE4a instruction support
This adds support for the AMD Phenom/Barcelona's SSE4a instructions. Those include insertq and extrq, which are doing shift and mask on XMM registers, in two versions (immediate shift/length values and stored in another XMM register). Additionally it implements movntss, movntsd, which are scalar non-temporal stores (avoiding cache trashing). These are implemented as normal stores, though. SSE4a is guarded by the SSE4A CPUID bit (Fn8000_0001:ECX[6]). Signed-off-by: NAndre Przywara <andre.przywara@amd.com> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net>
Showing
想要评论请 注册 或 登录