# Benchmarks

<a href="https://gitee.com/mindspore/docs/blob/r0.5/docs/source_en/benchmark.md" target="_blank"><img src="./_static/logo_source.png"></a>

This document describes the MindSpore benchmarks.
For details about MindSpore pre-trained models, see [Model Zoo](https://gitee.com/mindspore/mindspore/tree/r0.5/model_zoo).

## Training Performance

### ResNet

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 v1.5 | CNN | ImageNet2012 | 0.2.0-alpha | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 32 | 1787 images/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 32 | 13689 images/sec | 0.95 |
|  |  |  |  | Ascend: 16 * Ascend 910 <br> CPU: 384 Cores | Mixed | 32 | 27090 images/sec | 0.94 |

1. The preceding performance was obtained on ModelArts, the HUAWEI CLOUD AI development platform, and is the average throughput of the Ascend 910 AI processor over the entire training process.
2. For details about other open source frameworks, see [ResNet-50 v1.5 for TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/RN50v1.5#nvidia-dgx-2-16x-v100-32g).
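The Speedup column is not defined explicitly above, but the numbers are consistent with per-device scaling efficiency: the multi-device throughput divided by the device count times the single-device throughput (ideal linear scaling gives 1.0, and the reported values look truncated to two decimals). A minimal sketch of that reading, using the table values; the helper name and the interpretation itself are inferences, not part of the original benchmark:

```python
def scaling_efficiency(multi_throughput, num_devices, single_throughput):
    """Per-device scaling efficiency: ideal linear scaling gives 1.0."""
    return multi_throughput / (num_devices * single_throughput)

# ResNet-50 v1.5 rows from the table above (images/sec).
print(scaling_efficiency(13689, 8, 1787))   # ~0.957 -> reported as 0.95
print(scaling_efficiency(27090, 16, 1787))  # ~0.947 -> reported as 0.94

# BERT-Large row from the table below (sentences/sec).
print(scaling_efficiency(2069, 8, 269))     # ~0.961 -> reported as 0.96
```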

### BERT

| Network | Network Type | Dataset | MindSpore Version | Resource | Precision | Batch Size | Throughput | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BERT-Large | Attention | zhwiki | 0.5.0-beta | Ascend: 1 * Ascend 910 <br> CPU: 24 Cores | Mixed | 96 | 269 sentences/sec | - |
|  |  |  |  | Ascend: 8 * Ascend 910 <br> CPU: 192 Cores | Mixed | 96 | 2069 sentences/sec | 0.96 |

1. The preceding performance was obtained on ModelArts, the HUAWEI CLOUD AI development platform. The network has 24 hidden layers, a sequence length of 128 tokens, and a vocabulary of 21128 tokens.
2. For details about other open source frameworks, see [BERT For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT).
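For reference, throughput in sentences/sec on a multi-device run is the batch size times the device count times the number of steps, divided by wall-clock time. Below is a minimal measurement sketch under that definition; the helper, the timing loop, and the assumption that the Batch Size column is per-device (the table does not say) are all illustrative, not MindSpore's benchmarking code:

```python
import time

def measure_throughput(run_step, batch_size, num_devices, num_steps):
    """Average sentences/sec across all devices over num_steps training steps."""
    start = time.perf_counter()
    for _ in range(num_steps):
        run_step()  # one training step on one batch per device
    elapsed = time.perf_counter() - start
    return batch_size * num_devices * num_steps / elapsed

# Example with a dummy step standing in for a real BERT training step:
throughput = measure_throughput(lambda: time.sleep(0.01),
                                batch_size=96, num_devices=8, num_steps=10)
print(f"{throughput:.0f} sentences/sec")
```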