Unverified commit be4b94be, authored by Arash Ashari, committed by GitHub

Sparse attention: updating code tag in documentation (#394)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Parent b1d4bd73
@@ -240,13 +240,13 @@ Please see the [core API doc](https://deepspeed.readthedocs.io/) for more detail
}
```
## Sparse Attention
-DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse_attention/) tutorial.
+DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse-attention/) tutorial.
-```python
+```bash
--deepspeed_sparse_attention
```
-```python
+```json
"sparse_attention": {
"mode": "fixed",
"block": 16,
......
@@ -115,7 +115,7 @@ if self.sparse_attention_config is not None and pad_len > 0:
* **Enable sparse attention**: To use DeepSpeed Sparse Attention, you need to enable it in the launcher script through the `deepspeed_sparse_attention` argument:
-```python
+```bash
--deepspeed_sparse_attention
```
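For context (not part of this commit), the flag is typically appended to the training command inside such a runner script. A minimal sketch, modeled on the bing_bert example; the script name and config path below are illustrative placeholders:

```bash
# Illustrative invocation modeled on the DeepSpeedExamples bing_bert runner;
# deepspeed_train.py and the JSON path are placeholders, not part of this diff.
deepspeed deepspeed_train.py \
  --deepspeed \
  --deepspeed_config deepspeed_bsz64k_lamb_config_seq128.json \
  --deepspeed_sparse_attention
```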
@@ -123,7 +123,7 @@ Please check [our bing_bert runner script](https://github.com/microsoft/DeepSpee
* **Add sparsity config**: The sparsity config can be set through the [DeepSpeed JSON config file](https://github.com/microsoft/DeepSpeedExamples/blob/master/bing_bert/deepspeed_bsz64k_lamb_config_seq128.json). In this example, we have used the `fixed` sparsity mode, which is described in the [How to config sparsity structures](/tutorials/sparse-attention/#how-to-config-sparsity-structures) section.
-```python
+```json
"sparse_attention": {
"mode": "fixed",
"block": 16,
......
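For reference (not part of this commit), the `sparse_attention` block shown in the diff is truncated. A fuller fixed-mode configuration might look like the sketch below; the field names follow the DeepSpeed Sparse Attention tutorial, and the values are illustrative rather than taken from this commit:

```json
"sparse_attention": {
  "mode": "fixed",
  "block": 16,
  "different_layout_per_head": true,
  "num_local_blocks": 4,
  "num_global_blocks": 1,
  "attention": "bidirectional",
  "horizontal_global_attention": false,
  "num_different_global_patterns": 4
}
```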