OpenHarmony / Third Party Openssl
Commit 558ff0f0
Authored April 24, 2014 by Andy Polyakov
aes/asm/bsaes-x86_64.pl: Atom-specific optimization.
Parent: 94d1f4b0
Changed files: 1
Showing 1 changed file with 32 additions and 40 deletions (+32 -40)
crypto/aes/asm/bsaes-x86_64.pl
...
...
@@ -38,8 +38,8 @@
 #              Emilia's    this(*)     difference
 #
 # Core 2       9.30        8.69        +7%
-# Nehalem(**)  7.63        6.98        +9%
-# Atom         17.1        17.4        -2%(***)
+# Nehalem(**)  7.63        6.88        +11%
+# Atom         17.1        16.4        +4%
 #
 # (*)   Comparison is not completely fair, because "this" is ECB,
 #       i.e. no extra processing such as counter values calculation
...
...
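The "difference" column in the table above can be checked with a few lines of arithmetic. The sketch below assumes the column is computed as Emilia's cycles per byte divided by this implementation's cycles per byte, minus one; the source does not state the formula, but this reading reproduces every value in both the old and the new rows. The function and dictionary names are illustrative only.

```python
# Cycles per byte copied from the table; each entry is (Emilia's, this).
BEFORE = {"Nehalem": (7.63, 6.98), "Atom": (17.1, 17.4)}
AFTER = {"Core 2": (9.30, 8.69), "Nehalem": (7.63, 6.88), "Atom": (17.1, 16.4)}


def difference_percent(emilia: float, this: float) -> int:
    """Assumed definition: speedup of "this" relative to its own cost, in percent."""
    return round((emilia / this - 1.0) * 100.0)


def format_row(cpu: str, emilia: float, this: float) -> str:
    """Render one table row with a signed percentage."""
    return f"{cpu:<8} {emilia:>5} {this:>5} {difference_percent(emilia, this):+d}%"


if __name__ == "__main__":
    for label, table in (("before", BEFORE), ("after", AFTER)):
        print(label)
        for cpu, (emilia, this) in table.items():
            print("  " + format_row(cpu, emilia, this))
```

Under that assumption, the Atom row moves from a small slowdown to a small speedup, which is why the commit can drop the (***) footnote.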
@@ -50,14 +50,6 @@
 # (**)  Results were collected on Westmere, which is considered to
 #       be equivalent to Nehalem for this code.
 #
-# (***) Slowdown on Atom is rather strange per se, because original
-#       implementation has a number of 9+-bytes instructions, which
-#       are bad for Atom front-end, and which I eliminated completely.
-#       In attempt to address deterioration sbox() was tested in FP
-#       SIMD "domain" (movaps instead of movdqa, xorps instead of
-#       pxor, etc.). While it resulted in nominal 4% improvement on
-#       Atom, it hurted Westmere by more than 2x factor.
-#
 # As for key schedule conversion subroutine. Interface to OpenSSL
 # relies on per-invocation on-the-fly conversion. This naturally
 # has impact on performance, especially for short inputs. Conversion
...
...
@@ -67,7 +59,7 @@
 #              conversion      conversion/8x block
 # Core 2       240             0.22
 # Nehalem      180             0.20
-# Atom         430             0.19
+# Atom         430             0.20
 #
 # The ratio values mean that 128-byte blocks will be processed
 # 16-18% slower, 256-byte blocks - 9-10%, 384-byte blocks - 6-7%,
...
...
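The slowdown percentages quoted in the comment above follow from the conversion/8x-block ratios. As a rough sketch, assume an "8x block" is eight 16-byte AES blocks (128 bytes) and that "processed N% slower" means the share of total time spent on key schedule conversion, that is ratio / (batches + ratio). The source does not spell this formula out; it is simply the reading that reproduces the quoted ranges. All names are illustrative.

```python
BATCH_BYTES = 128  # assumed: one 8x batch of 16-byte AES blocks

# conversion / 8x-block ratios from the updated table
RATIOS = {"Core 2": 0.22, "Nehalem": 0.20, "Atom": 0.20}


def conversion_share(ratio: float, input_bytes: int) -> float:
    """Fraction of total time spent converting the key schedule."""
    batches = input_bytes / BATCH_BYTES
    return ratio / (batches + ratio)


def share_range_percent(input_bytes: int) -> tuple:
    """Smallest and largest conversion share across the listed CPUs, in percent."""
    shares = [conversion_share(r, input_bytes) * 100.0 for r in RATIOS.values()]
    return min(shares), max(shares)


if __name__ == "__main__":
    for size in (128, 256, 384):
        low, high = share_range_percent(size)
        print(f"{size:>4} bytes: {low:.1f}% to {high:.1f}%")
```

Because the conversion cost is paid once per invocation, its share shrinks as the input grows, which matches the trend in the comment.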
@@ -83,9 +75,9 @@
 # Add decryption procedure. Performance in CPU cycles spent to decrypt
 # one byte out of 4096-byte buffer with 128-bit key is:
 #
-# Core 2       9.83
-# Nehalem      7.74
-# Atom         19.0
+# Core 2       9.98
+# Nehalem      7.80
+# Atom         17.9
 #
 # November 2011.
 #
...
...
@@ -434,21 +426,21 @@ my $mask=pop;
 $code.=<<___;
 	pxor	0x00($key),@x[0]
 	pxor	0x10($key),@x[1]
-	pshufb	$mask,@x[0]
 	pxor	0x20($key),@x[2]
-	pshufb	$mask,@x[1]
 	pxor	0x30($key),@x[3]
-	pshufb	$mask,@x[2]
+	pshufb	$mask,@x[0]
+	pshufb	$mask,@x[1]
 	pxor	0x40($key),@x[4]
-	pshufb	$mask,@x[3]
 	pxor	0x50($key),@x[5]
-	pshufb	$mask,@x[4]
+	pshufb	$mask,@x[2]
+	pshufb	$mask,@x[3]
 	pxor	0x60($key),@x[6]
-	pshufb	$mask,@x[5]
 	pxor	0x70($key),@x[7]
+	pshufb	$mask,@x[4]
+	pshufb	$mask,@x[5]
 	pshufb	$mask,@x[6]
-	lea	0x80($key),$key
 	pshufb	$mask,@x[7]
+	lea	0x80($key),$key
 ___
 }
...
...
@@ -820,18 +812,18 @@ _bsaes_encrypt8:
 	movdqa	0x50($const), @XMM[8]	# .LM0SR
 	pxor	@XMM[9], @XMM[0]	# xor with round0 key
 	pxor	@XMM[9], @XMM[1]
-	pshufb	@XMM[8], @XMM[0]
 	pxor	@XMM[9], @XMM[2]
-	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[3]
-	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[0]
+	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[4]
-	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[5]
-	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[6]
-	pshufb	@XMM[8], @XMM[5]
 	pxor	@XMM[9], @XMM[7]
+	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[5]
 	pshufb	@XMM[8], @XMM[6]
 	pshufb	@XMM[8], @XMM[7]
 _bsaes_encrypt8_bitslice:
...
...
@@ -884,18 +876,18 @@ _bsaes_decrypt8:
 	movdqa	-0x30($const), @XMM[8]	# .LM0ISR
 	pxor	@XMM[9], @XMM[0]	# xor with round0 key
 	pxor	@XMM[9], @XMM[1]
-	pshufb	@XMM[8], @XMM[0]
 	pxor	@XMM[9], @XMM[2]
-	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[3]
-	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[0]
+	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[4]
-	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[5]
-	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[6]
-	pshufb	@XMM[8], @XMM[5]
 	pxor	@XMM[9], @XMM[7]
+	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[5]
 	pshufb	@XMM[8], @XMM[6]
 	pshufb	@XMM[8], @XMM[7]
 ___
...
...
@@ -1937,21 +1929,21 @@ $code.=<<___;
 	movdqa	-0x10(%r11), @XMM[8]	# .LSWPUPM0SR
 	pxor	@XMM[9], @XMM[0]	# xor with round0 key
 	pxor	@XMM[9], @XMM[1]
-	pshufb	@XMM[8], @XMM[0]
 	pxor	@XMM[9], @XMM[2]
-	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[3]
-	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[0]
+	pshufb	@XMM[8], @XMM[1]
 	pxor	@XMM[9], @XMM[4]
-	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[5]
-	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[2]
+	pshufb	@XMM[8], @XMM[3]
 	pxor	@XMM[9], @XMM[6]
-	pshufb	@XMM[8], @XMM[5]
 	pxor	@XMM[9], @XMM[7]
+	pshufb	@XMM[8], @XMM[4]
+	pshufb	@XMM[8], @XMM[5]
 	pshufb	@XMM[8], @XMM[6]
-	lea	.LBS0(%rip), %r11	# constants table
 	pshufb	@XMM[8], @XMM[7]
+	lea	.LBS0(%rip), %r11	# constants table
 	mov	%ebx,%r10d		# pass rounds
 	call	_bsaes_encrypt8_bitslice
...
...
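The code hunks in this commit only reschedule instructions: every register still receives its pxor with the round key before its pshufb with the mask, so the result should be unchanged. The Python model below is a sketch of that argument, not the author's test. It emulates the 128-bit forms of pxor and pshufb on byte strings and replays the old and new orderings of the `_bsaes_encrypt8` prologue. The key and mask values are made-up placeholders, not the real round key or the `.LM0SR` constant.

```python
import random


def pxor(dst: bytes, src: bytes) -> bytes:
    """Bytewise XOR of two 16-byte registers."""
    return bytes(a ^ b for a, b in zip(dst, src))


def pshufb(dst: bytes, mask: bytes) -> bytes:
    """128-bit pshufb: select dst[mask & 0x0f], or zero if bit 7 of the mask byte is set."""
    return bytes(0 if m & 0x80 else dst[m & 0x0F] for m in mask)


# (operation, register index) in program order, transcribed from the hunk.
OLD_ORDER = [("pxor", 0), ("pxor", 1), ("pshufb", 0), ("pxor", 2), ("pshufb", 1),
             ("pxor", 3), ("pshufb", 2), ("pxor", 4), ("pshufb", 3), ("pxor", 5),
             ("pshufb", 4), ("pxor", 6), ("pshufb", 5), ("pxor", 7),
             ("pshufb", 6), ("pshufb", 7)]
NEW_ORDER = [("pxor", 0), ("pxor", 1), ("pxor", 2), ("pxor", 3),
             ("pshufb", 0), ("pshufb", 1), ("pxor", 4), ("pxor", 5),
             ("pshufb", 2), ("pshufb", 3), ("pxor", 6), ("pxor", 7),
             ("pshufb", 4), ("pshufb", 5), ("pshufb", 6), ("pshufb", 7)]


def run(order, regs, key, mask):
    """Apply the schedule to a copy of the eight registers and return the result."""
    regs = list(regs)
    for op, i in order:
        regs[i] = pxor(regs[i], key) if op == "pxor" else pshufb(regs[i], mask)
    return regs


def per_register_order(order):
    """For each register, the sequence of operations applied to it."""
    return {i: [op for op, j in order if j == i] for i in range(8)}


rng = random.Random(0)  # fixed seed so the demo is repeatable
REGS = [bytes(rng.randrange(256) for _ in range(16)) for _ in range(8)]
KEY = bytes(rng.randrange(256) for _ in range(16))   # placeholder round key
MASK = bytes(rng.sample(range(16), 16))              # placeholder byte permutation

if __name__ == "__main__":
    print(run(OLD_ORDER, REGS, KEY, MASK) == run(NEW_ORDER, REGS, KEY, MASK))
```

The model says nothing about timing; it only checks that the two schedules are interchangeable because no instruction reads a register that another instruction in the sequence writes, apart from its own.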