陶辉 / geektime_distrib_perf
Commit 5bde8dc4, authored Nov 07, 2019 by russelltao
add 148,684,923 cache-references (83.32%)
Parent: b67b32a6
Showing 5 changed files with 161 additions and 12 deletions
1-cpu_cache/traverse_1d_array/README.md (+108 -4)
1-cpu_cache/traverse_1d_array/traverse_1d_array.cpp (+7 -3)
1-cpu_cache/traverse_1d_array/traverse_1d_array.java (+40 -0)
1-cpu_cache/traverse_2d_array/README.md (+1 -2)
1-cpu_cache/traverse_2d_array/traverse_2d_array.py (+5 -3)
1-cpu_cache/traverse_1d_array/README.md
-## 1. C++ program traverse_1d_array.cpp
+## 1. Test environment
* Operating system: CentOS 7.0
* CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
* GCC-C++: 4.8.5
* Java: 1.8.0
* Python: 2.7.5
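## 2. C++ program traverse_1d_array.cpp
### a. Build the program
#### Install build dependencies
On Linux, a C++ compiler must be installed first. On CentOS, use `yum install gcc-c++`; on Ubuntu, use `apt-get install g++`.
#### Compile
`g++ traverse_1d_array.cpp -o traverse_1d_array`
### b. Run and verify
#### Traverse the array with a stride of 1
`./traverse_1d_array -s 1`
Elapsed time (ms): 20
#### Traverse the array with a stride of 128
`./traverse_1d_array -s 128`
Elapsed time (ms): 280
#### Traverse the array with a stride of 1024
`./traverse_1d_array -s 1024`
Elapsed time (ms): 1850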
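One way to see why larger strides cost more is to count how many distinct cache lines a fixed number of accesses touches. The sketch below is only an illustration: it assumes a 64-byte cache line (not stated in this README) and one-byte `char` elements, and `lines_touched` is a hypothetical helper that does not exist in the repository.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines, not stated in the README


def lines_touched(step, accesses, elem_size=1):
    """Count the distinct cache lines hit by `accesses` reads at stride `step`."""
    return len({(k * step * elem_size) // CACHE_LINE_BYTES for k in range(accesses)})


for step in (1, 128, 1024):
    # Same number of accesses for every stride, as in the benchmark.
    print(step, lines_touched(step, accesses=4096))
```

Under these assumptions, stride 1 reuses each line 64 times, while strides 128 and 1024 both land on a new line for every access. The line count alone therefore does not account for the gap between the 128 and 1024 timings.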
### c. Verify the cache hit rate with perf
#### Traverse the array with a stride of 1
`perf stat -e cache-references,cache-misses,instructions,cycles,L1-dcache-load-misses,L1-dcache-loads ./traverse_1d_array -s 1`
* Output:
```
Performance counter stats for './traverse_1d_array -s 1':
332,787 cache-references (82.95%)
26,230 cache-misses # 7.882 % of all cache refs (67.17%)
111,702,471 instructions # 1.61 insn per cycle (83.59%)
69,498,357 cycles (83.57%)
** 250,109 L1-dcache-load-misses # 0.43% of all L1-dcache hits (83.58%) **
58,115,659 L1-dcache-loads (82.72%)
0.030938059 seconds time elapsed
0.026916000 seconds user
0.004098000 seconds sys
```
#### Traverse the array with a stride of 128
`perf stat -e cache-references,cache-misses,instructions,cycles,L1-dcache-load-misses,L1-dcache-loads ./traverse_1d_array -s 128`
* Output:
```
34,246,770 cache-references (83.16%)
912,881 cache-misses # 2.666 % of all cache refs (66.44%)
137,729,629 instructions # 0.16 insn per cycle (83.27%)
844,462,327 cycles (83.51%)
25,917,035 L1-dcache-load-misses # 38.92% of all L1-dcache hits (83.51%)
66,593,669 L1-dcache-loads (83.39%)
0.291569229 seconds time elapsed
0.066179000 seconds user
0.225442000 seconds sys
```
#### Traverse the array with a stride of 1024
`perf stat -e cache-references,cache-misses,instructions,cycles,L1-dcache-load-misses,L1-dcache-loads ./traverse_1d_array -s 1024`
* Output:
```
148,684,923 cache-references (83.32%)
8,213,600 cache-misses # 5.524 % of all cache refs (66.64%)
312,534,826 instructions # 0.06 insn per cycle (83.32%)
5,593,728,896 cycles (83.32%)
148,953,141 L1-dcache-load-misses # 133.42% of all L1-dcache hits (83.37%)
111,642,681 L1-dcache-loads (83.35%)
1.894789074 seconds time elapsed
0.158064000 seconds user
1.736704000 seconds sys
```
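The derived columns in the perf output above can be recomputed from the raw counters. The sketch below copies the counter values from the three runs; the `RUNS` table and the helper names are introduced here for illustration only.

```python
# stride -> (instructions, cycles, L1-dcache-load-misses, L1-dcache-loads),
# copied from the three perf outputs above.
RUNS = {
    1: (111_702_471, 69_498_357, 250_109, 58_115_659),
    128: (137_729_629, 844_462_327, 25_917_035, 66_593_669),
    1024: (312_534_826, 5_593_728_896, 148_953_141, 111_642_681),
}


def ipc(instructions, cycles):
    """Instructions retired per cycle."""
    return instructions / cycles


def l1_miss_pct(misses, loads):
    """L1 data-cache load misses as a percentage of L1 loads."""
    return 100.0 * misses / loads


for stride, (ins, cyc, miss, loads) in RUNS.items():
    print(stride, round(ipc(ins, cyc), 2), round(l1_miss_pct(miss, loads), 2))
```

The stride-1024 ratio exceeds 100%. That is most likely a measurement artifact: the percentages in parentheses indicate that each event was counted for only part of the run and then scaled, so the two counters are estimates rather than exact, matched totals.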
## 3. Java program
### a. Compile
`javac traverse_1d_array.java`
### b. Run and verify
#### Traverse the array using array[i][j]
-`./traverse_1d_array -f`
+`java traverse_1d_array -f`
Elapsed time (ms): 20
#### Traverse the array using array[j][i]
-`./traverse_1d_array -s`
+`java traverse_1d_array -s`
Elapsed time (ms): 100
### c. Verify the cache hit rate with perf
#### Traverse the array using array[i][j]
`perf stat -e cache-references,cache-misses,instructions,cycles,L1-dcache-load-misses,L1-dcache-loads java traverse_1d_array -f`
* Output:
```
Performance counter stats for 'java traverse_2d_array -f':
6,379,138 cache-references (80.62%)
866,578 cache-misses # 13.585 % of all cache refs (68.93%)
459,726,039 instructions # 1.51 insn per cycle (85.22%)
303,673,757 cycles (85.69%)
5,270,707 L1-dcache-load-misses # 3.96% of all L1-dcache hits (81.64%)
133,211,743 L1-dcache-loads (83.13%)
0.126089887 seconds time elapsed
0.122353000 seconds user
0.047877000 seconds sys
```
#### Traverse the array using array[j][i]
`perf stat -e cache-references,cache-misses,instructions,cycles,L1-dcache-load-misses,L1-dcache-loads java traverse_1d_array -s`
-## 2. Python program traverse_1d_array.py
\ No newline at end of file
* Output:
```
Performance counter stats for 'java traverse_2d_array -s':
42,441,956 cache-references (80.21%)
872,336 cache-misses # 2.055 % of all cache refs (66.61%)
386,326,280 instructions # 0.71 insn per cycle (84.29%)
544,411,061 cycles (85.01%)
38,884,991 L1-dcache-load-misses # 32.48% of all L1-dcache hits (85.24%)
119,711,464 L1-dcache-loads (82.94%)
0.192838747 seconds time elapsed
0.200693000 seconds user
0.052919000 seconds sys
```
\ No newline at end of file
1-cpu_cache/traverse_1d_array/traverse_1d_array.cpp
...
...
@@ -19,14 +19,18 @@ int main(int argc, char** argv) {
    while ((ch = getopt(argc, argv, "s:")) != -1) {
        switch (ch) {
-            case 's':
-                step = atoi(optarg);
-                break;
+            // The stride s is capped at 1024
+            case 's':
+                step = atoi(optarg);
+                if (step > 1024) step = 1024;
+                break;
        }
    }
    char* arr = new char[TESTN];
    // clock() shows the CPU time consumed more accurately than reading the system time
    clock_t start, end;
    // Use different strides but perform only `total` operations, so the runs can be compared directly
    long total = TESTN/1024, cnt = 0;
    long i = 0;
    start = clock();
...
...
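The cap of 1024 appears to be tied to `total = TESTN/1024`. If the elided loop advances the index by `step` for `total` iterations (an assumption, since the loop body is not shown in this hunk), the largest index reached stays inside the array only when `step <= 1024`. A small check in Python, using a hypothetical `TESTN` because the real value is not visible here:

```python
TESTN = 64 * 1024  # hypothetical array size; the real value is not shown in the hunk


def last_index(step, testn=TESTN):
    """Largest index touched by `total` accesses at stride `step` (assumed loop shape)."""
    step = min(step, 1024)   # mirrors the clamp added in this commit
    total = testn // 1024    # same number of accesses for every stride
    return (total - 1) * step


for step in (1, 128, 1024, 4096):
    print(step, last_index(step), last_index(step) < TESTN)
```

With the clamp in place, every stride keeps the last access below `TESTN`, and all runs perform the same number of accesses.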
1-cpu_cache/traverse_1d_array/traverse_1d_array.java
0 → 100644
import java.util.Date;

public class traverse_1d_array {
    public static void main(String args[]){
        int ch;
        int TESTN = 4096;
        boolean slowMode = false;

        for (String arg : args) {
            if ("-f".equals(arg)) {
                slowMode = false;
                break;
            } else if ("-s".equals(arg)) {
                slowMode = true;
                break;
            }
        }

        char[][] arr = new char[TESTN][TESTN];

        Date start = new Date();
        if (!slowMode) {
            for (int i = 0; i < TESTN; i++) {
                for (int j = 0; j < TESTN; j++) {
                    // arr[i][j] is accessed contiguously
                    arr[i][j] = 0;
                }
            }
        } else {
            for (int i = 0; i < TESTN; i++) {
                for (int j = 0; j < TESTN; j++) {
                    // arr[j][i] is accessed non-contiguously
                    arr[j][i] = 0;
                }
            }
        }
        System.out.println(new Date().getTime()-start.getTime());
    }
}
1-cpu_cache/traverse_2d_array/README.md
...
...
@@ -100,5 +100,4 @@
0.200693000 seconds user
0.052919000 seconds sys
-```
-## 3. Python program
\ No newline at end of file
+```
\ No newline at end of file
1-cpu_cache/traverse_2d_array/traverse_2d_array.py
import time
import sys, getopt
import numpy as np
try:
    opts, args = getopt.getopt(sys.argv, "fs")
...
...
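A note on the option parsing in this file: `getopt.getopt` expects the argument list without the program name, and it stops at the first non-option argument. Passing the full `sys.argv` therefore yields no options, because the script name comes first. Separately, `("-s")` is a plain string rather than a one-element tuple, so `opt in ("-s")` is a substring test. The sketch below shows one possible way to write the parsing; `parse_mode` is a hypothetical helper, not code from the repository.

```python
import getopt


def parse_mode(argv):
    """Return True for slow mode (-s), False for fast mode (-f or no flag)."""
    # Skip argv[0] (the script name); otherwise getopt stops before any option.
    opts, _args = getopt.getopt(argv[1:], "fs")
    slow_mode = False
    for opt, _value in opts:
        if opt == "-s":      # exact comparison instead of a substring test
            slow_mode = True
        elif opt == "-f":
            slow_mode = False
    return slow_mode
```

In a script this would be called as `parse_mode(sys.argv)`.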
@@ -15,14 +15,16 @@ for opt, arg in opts:
    elif opt in ("-s"):
        slowMode = True
-TESTN = 10240
-arr = [[0 for col in range(TESTN)] for row in range(TESTN)]
+TESTN = 1024*10
+arr = np.empty((TESTN,TESTN))
t1 = time.time()
if slowMode:
    sum = np.sum(arr,axis=1)
    for i in range(TESTN):
        for j in range(TESTN):
            arr[j][i] = 1
else:
    sum = np.sum(arr,axis=0)
    for i in range(TESTN):
        for j in range(TESTN):
            arr[i][j] = 1
...
...
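The difference between the two loop orders above comes down to the byte distance between consecutive accesses. The sketch below assumes the array created by `np.empty((TESTN,TESTN))` uses NumPy's defaults (float64 elements, row-major C order); `byte_offset` is a hypothetical helper used only to do the arithmetic.

```python
ITEMSIZE = 8        # assumption: np.empty defaults to float64, 8 bytes per element
TESTN = 1024 * 10   # same size as in the script


def byte_offset(i, j, n=TESTN, itemsize=ITEMSIZE):
    """Byte offset of arr[i][j] in a C-ordered (row-major) n x n array."""
    return (i * n + j) * itemsize


# Inner loop over j with arr[i][j]: the second index changes on each access.
fast_gap = byte_offset(0, 1) - byte_offset(0, 0)
# Inner loop over j with arr[j][i]: the first index changes on each access.
slow_gap = byte_offset(1, 0) - byte_offset(0, 0)

print(fast_gap, slow_gap)
```

Under these assumptions the fast order moves 8 bytes per access and the slow order moves a whole row (81,920 bytes), so the slow order cannot reuse a cache line between consecutive accesses.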