OpenHarmony
Third Party Openssl
Commit bac252a5
Authored January 20, 2005 by Andy Polyakov

Bug-fix in CBC encrypt tail processing and commentary section update.

Parent: a963395a
Showing 1 changed file with 29 additions and 16 deletions.

crypto/aes/asm/aes-586.pl (+29 -16)
@@ -6,7 +6,7 @@
 # forms are granted according to the OpenSSL license.
 # ====================================================================
 #
-# Version 3.0.
+# Version 3.1.
 #
 # You might fail to appreciate this module performance from the first
 # try. If compared to "vanilla" linux-ia32-icc target, i.e. considered
@@ -46,23 +46,27 @@
 # Instruction Level Parallelism, and it indeed resulted in up to 15%
 # better performance on most recent µ-archs...
 #
-# Current ECB performance numbers for 128-bit key in cycles per byte
-# [measure commonly used by AES benchmarkers] are:
+# Current ECB performance numbers for 128-bit key in CPU cycles per
+# processed byte [measure commonly used by AES benchmarkers] are:
 #
 #              small footprint         fully unrolled
 # P4[-3]       23[24]                  22[23]
 # AMD K8       19                      18
-# PIII         26(*)                   23
+# PIII         26                      23
 # Pentium      63(*)                   52
 #
 # (*) Performance difference between small footprint code and fully
-#     unrolled in more commonly used CBC mode is not as big, 7% for
-#     PIII and 15% for Pentium, which I consider tolerable.
+#     unrolled in more commonly used CBC mode is not as big, 4% for
+#     for Pentium. PIII's ~13% difference [in both cases in 3rd
+#     version] is considered tolerable...
 #
 # Third version adds AES_cbc_encrypt implementation, which resulted in
-# up to 40% performance imrovement of CBC benchmark results [on most
-# recent µ-archs]. CBC performance is virtually as good as ECB now and
-# sometimes even better, because function prologues and epilogues are
+# up to 40% performance imrovement of CBC benchmark results. 40% was
+# observed on P4 core, where "overall" imrovement coefficient, i.e. if
+# compared to PIC generated by GCC and in CBC mode, was observed to be
+# as large as 4x:-) CBC performance is virtually identical to ECB now
+# and on some platforms even better, e.g. 56 "small" cycles/byte on
+# senior Pentium, because certain function prologues and epilogues are
 # effectively taken out of the loop...
 
 push(@INC,"perlasm","../../perlasm");
@@ -79,8 +83,9 @@ $acc="esi";
 
 $small_footprint=1;    # $small_footprint=1 code is ~5% slower [on
                        # recent µ-archs], but ~5 times smaller!
-                       # I favor compact code, because it minimizes
-                       # cache contention...
+                       # I favor compact code to minimize cache
+                       # contention and in hope to "collect" 5% back
+                       # in real-life applications...
 $vertical_spin=0;      # shift "verticaly" defaults to 0, because of
                        # its proof-of-concept status...
 
@@ -1296,12 +1301,18 @@ sub declast()
 	&push($key eq "edi" ? $key : "");	# push ivp
 	&pushf();
 	&mov($key,&wparam(1));		# load out
-	&xor($s0,$s0);
-	&mov(&DWP(0,$key),$s0);		# zero output
-	&mov(&DWP(4,$key),$s0);
-	&mov(&DWP(8,$key),$s0);
-	&mov(&DWP(12,$key),$s0);
+	&mov($s1,16);
+	&sub($s1,$s2);
+	&cmp($key,$acc);		# compare with inp
+	&je(&label("enc_in_place"));
 	&data_word(0x90A4F3FC);		# cld; rep movsb; nop	# copy input
+	&jmp(&label("enc_skip_in_place"));
+	&set_label("enc_in_place");
+	&lea($key,&DWP(0,$key,$s2));
+	&set_label("enc_skip_in_place");
+	&mov($s2,$s1);
+	&xor($s0,$s0);
+	&data_word(0x90AAF3FC);		# cld; rep stosb; nop	# zero tail
 	&popf();
 	&pop($key);			# pop ivp
 
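Review note: as I read this hunk, the old tail code zeroed all 16 output bytes before copying the leftover input, which wipes the input when encryption is done in place (out == inp). The new code copies first (or skips the copy when the buffers coincide) and then zeroes only the remaining 16 - len bytes. The C sketch below restates both versions under that reading. The function names are hypothetical, `len` is assumed to be less than 16, and `out` is assumed to have room for a full 16-byte block.

```c
#include <stddef.h>
#include <string.h>

#define AES_BLOCK_SIZE 16

/* Old behaviour (sketch): zero the output block, then copy the tail.
 * When in == out the tail has already been wiped by the time it is copied. */
void cbc_pad_tail_old(const unsigned char *in, unsigned char *out, size_t len)
{
    memset(out, 0, AES_BLOCK_SIZE);      /* zero output */
    memmove(out, in, len);               /* rep movsb: copy input */
}

/* New behaviour (sketch): copy the tail unless it is already in place,
 * then zero only the padding bytes that follow it. */
void cbc_pad_tail(const unsigned char *in, unsigned char *out, size_t len)
{
    size_t pad = AES_BLOCK_SIZE - len;   /* mov $s1,16 ; sub $s1,$s2 */

    if (out != in)                       /* cmp $key,$acc ; je enc_in_place */
        memmove(out, in, len);           /* rep movsb: copy input */
    /* in place: the lea merely steps past the len bytes already there */

    memset(out + len, 0, pad);           /* rep stosb: zero tail */
}
```

Calling `cbc_pad_tail(buf, buf, 5)` on a 16-byte buffer keeps the first five bytes and zeroes the other eleven, whereas `cbc_pad_tail_old(buf, buf, 5)` leaves the whole block zeroed.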
@@ -1456,6 +1467,8 @@ sub declast()
 	&pushf();
 	&data_word(0x90A4F3FC);		# cld; rep movsb; nop	# restore tail
 	&popf();
+
+	&align(4);
 	&set_label("dec_out");
 	&stack_pop(5);
 	&function_end("AES_cbc_encrypt");
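Review note: the `&data_word` constants in this diff are raw x86 opcodes packed into one 32-bit word. Because x86 is little-endian, the lowest byte of the word is emitted first, so 0x90A4F3FC becomes the bytes FC F3 A4 90 (cld; rep movsb; nop) and 0x90AAF3FC becomes FC F3 AA 90 (cld; rep stosb; nop). The short C sketch below makes the byte order explicit; the helper name is hypothetical.

```c
#include <stdint.h>

/* Split a 32-bit data word into the four bytes a little-endian target
 * stores in memory, lowest-order byte first. */
void dword_to_bytes(uint32_t word, unsigned char bytes[4])
{
    for (int i = 0; i < 4; i++)
        bytes[i] = (unsigned char)(word >> (8 * i));
}
```

Feeding it 0x90A4F3FC yields FC (cld), F3 (rep prefix), A4 (movsb), 90 (nop), which is the order given in the diff's own comments.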