Commit 6cd5f87b authored by Jeff Rasley

[docs] update moe features and news post

Parent 058ab819
......
@@ -153,7 +153,7 @@ overview](https://www.deepspeed.ai/features/) for descriptions and usage.
* Stable and 2.6x faster GPT-2 pre-training with 8x/4x larger batch size/learning rate while maintaining token-wise convergence speed
* Complementary to many other DeepSpeed features
* [Performance Analysis and Debugging](https://www.deepspeed.ai/features/#performance-analysis-and-debugging)
* [Mixture of Experts (MoE)](https://www.deepspeed.ai/tutorials/mixture-of-experts/)
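
The Mixture of Experts tutorial linked above wraps an ordinary expert module with DeepSpeed's `deepspeed.moe.layer.MoE` layer. Below is a minimal sketch of that pattern, assuming the API described in the tutorial; the expert definition, hidden size, expert count, and top-k value are illustrative choices, and the script is intended to be started with the `deepspeed` launcher (e.g. `deepspeed --num_gpus=1 moe_sketch.py`) so the distributed process group the layer expects is available.

```python
# Sketch of wrapping an expert feed-forward block with DeepSpeed's MoE layer,
# following the linked tutorial. Values below are illustrative assumptions,
# not values taken from this commit.
import torch
import deepspeed
from deepspeed.moe.layer import MoE


class ExpertFFN(torch.nn.Module):
    """A plain feed-forward block used as the per-expert module."""

    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = torch.nn.Linear(hidden_size, 4 * hidden_size)
        self.fc2 = torch.nn.Linear(4 * hidden_size, hidden_size)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


def main():
    # Set up the process group the MoE layer expects (launcher provides env vars).
    deepspeed.init_distributed()

    hidden_size = 512
    moe = MoE(hidden_size=hidden_size,
              expert=ExpertFFN(hidden_size),
              num_experts=8,  # illustrative expert count
              k=1)            # top-1 gating

    tokens = torch.randn(4, 16, hidden_size)
    # The forward pass returns the routed output, the auxiliary load-balancing
    # loss, and the per-expert token counts.
    output, aux_loss, exp_counts = moe(tokens)
    print(output.shape, aux_loss.item())


if __name__ == "__main__":
    main()
```
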
# Further Reading
......
---
layout: single
title: "DeepSpeed powers 8x larger MoE model training with high performance"
excerpt: ""
categories: news
link: https://www.microsoft.com/en-us/research/blog/deepspeed-powers-8x-larger-moe-model-training-with-high-performance/
new_post: true
date: 2021-08-18 00:00:00
---
......
@@ -209,7 +209,7 @@ Below we provide a brief feature list, see our detailed [feature overview](https
* Efficient and robust compressed training
* Up to 2.5x convergence speedup for pre-training
* [Performance Analysis and Debugging](https://www.deepspeed.ai/features/#performance-analysis-and-debugging)
* [Mixture of Experts (MoE)](https://www.deepspeed.ai/tutorials/mixture-of-experts/)
# Contributing
DeepSpeed welcomes your contributions! Please see our
......