diff --git a/docs/source_en/benchmark.md b/docs/source_en/benchmark.md new file mode 100644 index 0000000000000000000000000000000000000000..b1de5b03aaaf5e872964ae9843718e666fdae505 --- /dev/null +++ b/docs/source_en/benchmark.md @@ -0,0 +1,27 @@ +# Benchmarks + +This document describes the MindSpore benchmarks. +For details about MindSpore pre-trained models, see [Model Zoo](https://gitee.com/mindspore/mindspore/tree/master/mindspore/model_zoo). + +## Training Performance + +### ResNet + +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| ResNet-50 v1.5 | CNN | ImageNet2012 | 0.2.0-alpha | Ascend: 1 * Ascend 910
CPU:24 Cores | Mixed | 32 | 1787 images/sec | - | +| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 32 | 13689 images/sec | 0.95 | +| | | | | Ascend: 16 * Ascend 910
CPU:384 Cores | Mixed | 32 | 27090 images/sec | 0.94 | + +1. The preceding performance figures were measured on ModelArts, the HUAWEI CLOUD AI development platform, and represent the average performance with the entire training process executed on the Ascend 910 AI processor. +2. For comparable data from other open-source frameworks, see [ResNet-50 v1.5 for TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/RN50v1.5#nvidia-dgx-2-16x-v100-32g). + +### BERT + +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| BERT-Large | Attention | zhwiki | 0.2.0-alpha | Ascend: 1 * Ascend 910
CPU:24 Cores | Mixed | 96 | 210 sentences/sec | - | +| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 96 | 1613 sentences/sec | 0.96 | + +1. The preceding performance figures were measured on ModelArts, the HUAWEI CLOUD AI development platform. The network contains 24 hidden layers, the sequence length is 128 tokens, and the vocabulary contains 21128 tokens. +2. For comparable data from other open-source frameworks, see [BERT For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT). \ No newline at end of file diff --git a/docs/source_en/index.rst b/docs/source_en/index.rst index 379ddea08c2fabc0e4cbcf604fad65d38ed6964f..16131f7e380f6d64aa61e1607348360575dcfd4e 100644 --- a/docs/source_en/index.rst +++ b/docs/source_en/index.rst @@ -12,6 +12,7 @@ MindSpore Documentation architecture roadmap + benchmark constraints_on_network_construction operator_list glossary diff --git a/docs/source_zh_cn/benchmark.md b/docs/source_zh_cn/benchmark.md new file mode 100644 index 0000000000000000000000000000000000000000..0b486617ec6bf0e6ffe5bb34af2a61fdcfb71748 --- /dev/null +++ b/docs/source_zh_cn/benchmark.md @@ -0,0 +1,26 @@ +# Benchmarks + +This document describes MindSpore benchmark performance. For MindSpore pre-trained models, see [Model Zoo](https://gitee.com/mindspore/mindspore/tree/master/mindspore/model_zoo). + +## Training Performance + +### ResNet + +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| ResNet-50 v1.5 | CNN | ImageNet2012 | 0.2.0-alpha | Ascend: 1 * Ascend 910
CPU:24 Cores | Mixed | 32 | 1787 images/sec | - | +| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 32 | 13689 images/sec | 0.95 | +| | | | | Ascend: 16 * Ascend 910
CPU:384 Cores | Mixed | 32 | 27090 images/sec | 0.94 | + +1. The preceding performance figures were measured on ModelArts, the HUAWEI CLOUD AI development platform, and represent the average performance with the entire training process executed on the Ascend 910 AI processor. +2. For comparable data from other open-source frameworks, see [ResNet-50 v1.5 for TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/RN50v1.5#nvidia-dgx-2-16x-v100-32g). + +### BERT + +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| BERT-Large | Attention | zhwiki | 0.2.0-alpha | Ascend: 1 * Ascend 910
CPU:24 Cores | Mixed | 96 | 210 sentences/sec | - | +| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 96 | 1613 sentences/sec | 0.96 | + +1. The preceding performance figures were measured on ModelArts, the HUAWEI CLOUD AI development platform. The network contains 24 hidden layers, the sequence length is 128 tokens, and the vocabulary contains 21128 tokens. +2. For comparable data from other open-source frameworks, see [BERT For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT). \ No newline at end of file diff --git a/docs/source_zh_cn/index.rst b/docs/source_zh_cn/index.rst index 0c25192799567148908dd9e709fc73da67a1b1ef..98a53318d71eda9434f5e401e33632ba1b953a64 100644 --- a/docs/source_zh_cn/index.rst +++ b/docs/source_zh_cn/index.rst @@ -12,6 +12,7 @@ MindSpore Documentation architecture roadmap + benchmark constraints_on_network_construction operator_list glossary diff --git a/tutorials/source_zh_cn/advanced_use/use_on_the_cloud.md b/tutorials/source_zh_cn/advanced_use/use_on_the_cloud.md index 4a54faf046155aaf3eef3a917b12281d65ca82de..8acca15e188620fe7cd70261e478d979b6b669cb 100644 --- a/tutorials/source_zh_cn/advanced_use/use_on_the_cloud.md +++ b/tutorials/source_zh_cn/advanced_use/use_on_the_cloud.md @@ -25,12 +25,7 @@ ## Overview -ModelArts is a one-stop AI development platform that HUAWEI CLOUD provides for developers. It integrates a pool of Ascend AI processor resources, on which users can experience MindSpore. The following table lists the training performance of MindSpore 0.2.0-alpha on ModelArts. - -| Model | Dataset | MindSpore Version | Resource | Throughput (images/sec) | -| --- | --- | --- | --- | --- | -| ResNet-50 v1.5 | CIFAR-10 | 0.2.0-alpha | Ascend: 1 * Ascend 910
CPU:24 Cores 96GiB | 1,759.0 | -| ResNet-50 v1.5 | CIFAR-10 | 0.2.0-alpha | Ascend: 8 * Ascend 910
CPU:192 Cores 768GiB | 13,391.6 | +ModelArts is a one-stop AI development platform that HUAWEI CLOUD provides for developers. It integrates a pool of Ascend AI processor resources, on which users can experience MindSpore. This tutorial uses ResNet-50 as an example to briefly describe how to complete a training task with MindSpore on ModelArts.
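The benchmark tables in this change report a Speedup column without defining it. One plausible reading, offered here as an assumption rather than something stated in the source, is per-device scaling efficiency: multi-device throughput divided by the single-device throughput times the number of devices. The minimal Python sketch below applies that reading to the published figures; `scaling_efficiency` is a hypothetical helper, and the small gaps against the published ResNet-50 values (0.95 and 0.94) would come from rounding or from a slightly different single-device baseline.

```python
def scaling_efficiency(multi_device_throughput, num_devices, single_device_throughput):
    """Throughput achieved relative to perfect linear scaling of a single device."""
    return multi_device_throughput / (num_devices * single_device_throughput)

# ResNet-50 v1.5 figures from the benchmark tables above (images/sec).
print(round(scaling_efficiency(13689, 8, 1787), 2))   # 0.96 vs. the published 0.95
print(round(scaling_efficiency(27090, 16, 1787), 2))  # 0.95 vs. the published 0.94

# BERT-Large figures from the benchmark tables above (sentences/sec).
print(round(scaling_efficiency(1613, 8, 210), 2))     # 0.96, matching the published value
```

Read this way, the 8-device and 16-device runs retain roughly 94 to 96 percent of ideal linear throughput.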