diff --git a/docs/_pages/features.md b/docs/_pages/features.md
index 5eb341b75e15fffb6225dc419e34a489db99bf0a..344c72222fb1be7178be1c7c7453a2641f2657e6 100755
--- a/docs/_pages/features.md
+++ b/docs/_pages/features.md
@@ -240,13 +240,13 @@ Please see the [core API doc](https://deepspeed.readthedocs.io/) for more detail
 }
 ```
 ## Sparse Attention
-DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse_attention/) tutorial.
+DeepSpeed offers sparse attention to support long sequences. Please refer to the [Sparse Attention](/tutorials/sparse-attention/) tutorial.
 
-```python
+```bash
 --deepspeed_sparse_attention
 ```
 
-```python
+```json
 "sparse_attention": {
  "mode": "fixed",
  "block": 16,
diff --git a/docs/_tutorials/sparse-attention.md b/docs/_tutorials/sparse-attention.md
index 5e4027150ef04ce7aa88493c0d5ae3ac731b24c0..6279fe7c768d2f9361752d8671f4f273f2de16bd 100644
--- a/docs/_tutorials/sparse-attention.md
+++ b/docs/_tutorials/sparse-attention.md
@@ -115,7 +115,7 @@ if self.sparse_attention_config is not None and pad_len > 0:
 
 * **Enable sparse attention*: To use DeepSpeed Sparse Attention, you need to enable it in the launcher script through `deepspeed_sparse_attention` argument:
 
-```python
+```bash
 --deepspeed_sparse_attention
 ```
 
@@ -123,7 +123,7 @@ Please check [our bing_bert runner script](https://github.com/microsoft/DeepSpee
 
 * **Add sparsity config**: The sparsity config can be set through the [DeepSpeed JSON config file](https://github.com/microsoft/DeepSpeedExamples/blob/master/bing_bert/deepspeed_bsz64k_lamb_config_seq128.json). In this example, we have used `fixed` sparsity mode that will be described in [How to config sparsity structures](/tutorials/sparse-attention/#how-to-config-sparsity-structures) section.
 
-```python
+```json
 "sparse_attention": {
  "mode": "fixed",
  "block": 16,
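
The `sparse_attention` fragments shown in the hunks above are truncated after `"block": 16,`. For reference, a fuller sketch of a `fixed`-mode config is given below; the field names beyond `mode` and `block` are not part of this diff and are drawn from DeepSpeed's documented fixed sparsity options, so they should be checked against the current schema before use:

```json
"sparse_attention": {
  "mode": "fixed",
  "block": 16,
  "different_layout_per_head": true,
  "num_local_blocks": 4,
  "num_global_blocks": 1,
  "attention": "bidirectional",
  "horizontal_global_attention": false,
  "num_different_global_patterns": 4
}
```

Here `block` is the sparsity block size, `num_local_blocks` controls the width of the local attention window, and `num_global_blocks` the number of blocks attended globally; the remaining keys tune per-head layout variation and global-attention behavior.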