Merge pull request #165 from astonzhang/zhibo

remove cbow neg sample

Merge pull request #165 from astonzhang/zhibo
remove cbow neg sample
9dd66590 · Aston Zhang · GitHub · e0a850b2 · e1a8d896 · 9dd66590
隐藏空白更改
内联并排

Showing with 0 addition and 8 deletion

chapter_natural-language-processing/word2vec.md chapter_natural-language-processing/word2vec.md +0 -8

未找到文件。
--- a/chapter_natural-language-processing/word2vec.md
+++ b/chapter_natural-language-processing/word2vec.md
@@ -130,15 +130,7 @@ $$ - \text{log} \mathbb{P} (w_o \mid w_c) = -\text{log} \frac{1}{1+\text{exp}(-\

 当我们把$K$取较小值时，每次随机梯度下降的梯度计算开销将由$\mathcal{O}(|\mathcal{V}|)$降为$\mathcal{O}(K)$。

-我们也可以对连续词袋模型进行负采样。有关背景词$w^{(t-m)}, \ldots,  w^{(t-1)},  w^{(t+1)}, \ldots,  w^{(t+m)}$生成中心词$w_c$的损失函数

-$$-\text{log} \mathbb{P}(w^{(t)} \mid  w^{(t-m)}, \ldots,  w^{(t-1)},  w^{(t+1)}, \ldots,  w^{(t+m)})$$
-
-在负采样中可以近似为
-
-$$-\text{log} \frac{1}{1+\text{exp}[-\mathbf{u}_c^\top (\mathbf{v}_{o_1} + \ldots + \mathbf{v}_{o_{2m}}) /(2m)]}  - \sum_{k=1, w_k \sim \mathbb{P}(w)}^K \text{log} \frac{1}{1+\text{exp}[(\mathbf{u}_{i_k}^\top (\mathbf{v}_{o_1} + \ldots + \mathbf{v}_{o_{2m}}) /(2m)]}$$
-
-同样地，当我们把$K$取较小值时，每次随机梯度下降的梯度计算开销将由$\mathcal{O}(|\mathcal{V}|)$降为$\mathcal{O}(K)$。


 ## 层序softmax