Unverified commit be4b94be, authored by Arash Ashari, committed by GitHub

Sparse attention: updating code tag in documentation (#394)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Parent b1d4bd73
@@ -240,13 +240,13 @@ Please see the [core API doc](https://deepspeed.readthedocs.io/) for more detail
}
```
## Sparse Attention
-DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse_attention/) tutorial.
+DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse-attention/) tutorial.
-```python
+```bash
--deepspeed_sparse_attention
```
-```python
+```json
"sparse_attention": {
"mode": "fixed",
"block": 16,
......
@@ -115,7 +115,7 @@ if self.sparse_attention_config is not None and pad_len > 0:
* **Enable sparse attention**: To use DeepSpeed Sparse Attention, you need to enable it in the launcher script through the `deepspeed_sparse_attention` argument:
-```python
+```bash
--deepspeed_sparse_attention
```
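For context (not part of this commit), the flag is typically appended to the training command inside such a runner script. A minimal sketch, modeled on the bing_bert example; the script name and config path below are illustrative placeholders:

```bash
# Illustrative invocation modeled on the DeepSpeedExamples bing_bert runner;
# deepspeed_train.py and the JSON path are placeholders, not part of this diff.
deepspeed deepspeed_train.py \
  --deepspeed \
  --deepspeed_config deepspeed_bsz64k_lamb_config_seq128.json \
  --deepspeed_sparse_attention
```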
@@ -123,7 +123,7 @@ Please check [our bing_bert runner script](https://github.com/microsoft/DeepSpee
* **Add sparsity config**: The sparsity config can be set through the [DeepSpeed JSON config file](https://github.com/microsoft/DeepSpeedExamples/blob/master/bing_bert/deepspeed_bsz64k_lamb_config_seq128.json). In this example, we have used the `fixed` sparsity mode, which is described in the [How to config sparsity structures](/tutorials/sparse-attention/#how-to-config-sparsity-structures) section.
-```python
+```json
"sparse_attention": {
"mode": "fixed",
"block": 16,
......
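For reference (not part of this commit), the `sparse_attention` block shown in the diff is truncated. A fuller fixed-mode configuration might look like the sketch below; the field names follow the DeepSpeed Sparse Attention tutorial, and the values are illustrative rather than taken from this commit:

```json
"sparse_attention": {
  "mode": "fixed",
  "block": 16,
  "different_layout_per_head": true,
  "num_local_blocks": 4,
  "num_global_blocks": 1,
  "attention": "bidirectional",
  "horizontal_global_attention": false,
  "num_different_global_patterns": 4
}
```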