Unverified commit 11a426ac
Authored Feb 07, 2020 by Shaden Smith; committed via GitHub on Feb 07, 2020

Pointing docs to hosted HTML files for core API. (#41)

Parent: 246a2844
Showing 2 changed files with 21 additions and 19 deletions (+21 −19)
Files changed:
- README.md (+8 −7)
- docs/features.md (+13 −12)
README.md

````diff
@@ -215,7 +215,7 @@ pre-defined learning rate schedule:
 * **Gradient Averaging**: in distributed data parallel training, `backward`
   ensures that gradients are averaged across data parallel processes after
-  training on an `effective_batch_size`.
+  training on an `train_batch_size`.
 * **Loss Scaling**: in FP16/mixed precision training, the DeepSpeed
   engine automatically handles scaling the loss to avoid precision loss in the
````
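The two bullets in this hunk describe what the engine does around `backward` and `step`. Below is a minimal sketch, not part of the diff, of the loop they refer to: `deepspeed.add_config_arguments`, `deepspeed.initialize`, `engine.backward`, and `engine.step` are DeepSpeed's public entry points, while the toy model, dataset, and `ds_config.json` path are illustrative stand-ins.

```python
# Minimal sketch of the training loop the bullets above describe. The toy
# model/dataset and the "ds_config.json" path are stand-ins; the deepspeed
# calls are the engine's real entry points.
import argparse
import torch
import deepspeed

model = torch.nn.Linear(10, 1)
dataset = [(torch.randn(10), torch.randn(1)) for _ in range(64)]

parser = argparse.ArgumentParser()
parser = deepspeed.add_config_arguments(parser)
args = parser.parse_args()  # e.g. --deepspeed_config ds_config.json

engine, _, dataloader, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters(),
    training_data=dataset,
)

for x, y in dataloader:
    x, y = x.to(engine.device), y.to(engine.device)
    loss = torch.nn.functional.mse_loss(engine(x), y)
    engine.backward(loss)  # averages gradients across data-parallel ranks
    engine.step()          # optimizer step, LR schedule, loss (un)scaling
```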
````diff
@@ -274,7 +274,7 @@ the `step` value is stored as part of the `client_sd`.
 DeepSpeed featureds can be enabled, disabled, or configured using a config JSON
 file that should be specified as `args.deepspeed_config`. A sample config file
 is shown below. For a full set of features see [core API
-doc](../../API/core_api/core_api.md).
+doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html).
 
 ```json
 {
````
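The sample config itself is cut off in this view. Purely as an illustration of what such a file contains, one could generate it as below; the key names follow DeepSpeed's documented config schema, and the values are invented for the example.

```python
# Illustrative stand-in for the truncated sample config above; key names
# follow DeepSpeed's documented config schema, values are invented.
import json

ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 0.00015}},
    "fp16": {"enabled": True},
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```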
````diff
@@ -363,11 +363,12 @@ deepspeed --include="worker-2:0,1" \
 ## Further Reading
 
-| Article | Description |
-| ---------------------------------------------------------------- | -------------------------------------------- |
-| [DeepSpeed Features](./docs/features.md) | DeepSpeed features |
-| [CIFAR-10 Tutorial](./docs/tutorials/CIFAR-10.md) | Getting started with CIFAR-10 and DeepSpeed |
-| [Megatron-LM Tutorial](./docs/tutorials/MegatronGPT2Tutorial.md) | Train GPT2 with DeepSpeed and Megatron-LM |
+| Article | Description |
+| ---------------------------------------------------------------------------------------------- | -------------------------------------------- |
+| [DeepSpeed Features](./docs/features.md) | DeepSpeed features |
+| [CIFAR-10 Tutorial](./docs/tutorials/CIFAR-10.md) | Getting started with CIFAR-10 and DeepSpeed |
+| [Megatron-LM Tutorial](./docs/tutorials/MegatronGPT2Tutorial.md) | Train GPT2 with DeepSpeed and Megatron-LM |
+| [API Documentation](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) | Generated DeepSpeed API documentation |
````
docs/features.md

````diff
@@ -124,19 +124,19 @@ The DeepSpeed core API consists of just a handful of methods:
 * checkpointing : `load_checkpoint` and `store_checkpoint`
 
 DeepSpeed supports all the features described in this document, via the use of these API,
-along with a `deepspeed_config` JSON file for enabling and disabling the features.
-Please see [core API doc](../../API/core_api/core_api.md) for more details.
+along with a `deepspeed_config` JSON file for enabling and disabling the features.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
 
 ### Gradient Clipping
 DeepSpeed handles gradient clipping under the hood based on the max gradient norm
-specified by the user.
-See [core API doc](../../API/core_api/core_api.md) for more
-details.
+specified by the user.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more
+details.
 
 ### Automatic loss scaling with mixed precision
 DeepSpeed internally handles loss scaling for mixed precision training. The parameters
-for loss scaling can be specified in the `deepspeed_config` JSON file.
-See [core API doc](../../API/core_api/core_api.md) for more details.
+for loss scaling can be specified in the `deepspeed_config` JSON file.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
 
 ## Training Optimizers
````
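Both sections in this hunk describe features driven entirely by the `deepspeed_config` file rather than by code. A sketch of the relevant config fragment, with key names taken from DeepSpeed's config schema and illustrative values:

```python
# Config fragment behind the two sections above (sketch; key names follow
# DeepSpeed's config schema, values are illustrative).
fragment = {
    "gradient_clipping": 1.0,  # clip to a max global gradient norm of 1.0
    "fp16": {
        "enabled": True,
        "loss_scale": 0,            # 0 selects dynamic loss scaling
        "loss_scale_window": 1000,  # steps between dynamic-scale adjustments
        "min_loss_scale": 1,
    },
}
```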
````diff
@@ -169,12 +169,12 @@ more details see [ZeRO paper](https://arxiv.org/abs/1910.02054) .
 ## Training Agnostic Checkpointing
 **TODO: API documentation**
 
 DeepSpeed can simplify checkpointing for you regardless of whether you are using data
 parallel training, model parallel training, mixed-precision training, a mix of these
-three, or using the zero optimizer to enable larger model sizes. See the
-[getting started](../../Onboard/onboard/onboard.md) or
-[core API doc](../../API/core_api/core_api.md) for details.
+three, or using the zero optimizer to enable larger model sizes.
+Please see the [Getting Started](../README.md#getting-started) guide and the
+[core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
 
 ## Advanced parameter search
 DeepSpeed supports multiple Learning Rate Schedules to enable faster convergence for
````
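A minimal sketch of the checkpoint round trip this section (and the `client_sd` line quoted in the README hunk) describes, reusing the `engine` from the loop sketch earlier. Note the bullet above names `store_checkpoint`, while the engine method commonly documented is `save_checkpoint`, which the sketch assumes; the directory, tag, and step bookkeeping are stand-ins, and keyword names may differ across DeepSpeed versions.

```python
# Sketch of the save/load round trip, assuming the save_checkpoint /
# load_checkpoint engine methods; `engine` comes from the loop sketch above.
# Engine-owned state (model, optimizer, loss scaler) is handled internally;
# the client-state dict carries things like the step counter.
step = 100  # hypothetical step counter
engine.save_checkpoint("checkpoints", str(step), {"step": step})

# On resume: returns the checkpoint path plus the client state saved above.
_, client_sd = engine.load_checkpoint("checkpoints", str(step))
step = client_sd["step"]
```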
````diff
@@ -195,9 +195,10 @@ can automatically handle batch creation appropriately.
 ## Performance Analysis and Debugging
 For performance debugging, DeepSpeed can give you a detailed breakdown of the time spent
 in different parts of the training with by simply enabling it in the `deepspeed_config`
-file. See [core API doc](../../API/core_api/core_api.md).
+file.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
 
 ```json
 {
-  "wallclock_breakdwon": true
+  "wallclock_breakdown": true
 }
 ```
````