readme

1cf74fc3 · zhaoyijin666 · 65001741 · 1cf74fc3 · 1cf74fc3

Showing 2 changed files with 17 additions and 6 deletions

youtube_recall/README.cn.md +9 -3

youtube_recall/README.md +8 -3

--- a/youtube_recall/README.cn.md
+++ b/youtube_recall/README.cn.md
@@ -269,13 +269,19 @@ python infer.py --infer_set_path='./data/infer.txt' \
 ```

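 ## Online prediction
-For online prediction, we use an approximate nearest neighbor (ANN) algorithm to query the top N most relevant videos directly from the user vector. Because our ANN system currently supports only cosine similarity while the model ranks by inner product, the two give very different results.
-Our solution is to apply the SIMPLE-LSH transform\[[4](#References)\] to the user and video vectors, so that ranking by inner product becomes equivalent to ranking by cosine similarity. The details are as follows:
+For online prediction, we use an approximate nearest neighbor (ANN) algorithm to query the top N most relevant videos directly from the user vector. Because our ANN system currently supports only cosine similarity while the model ranks by inner product, the two give very different results.
+
+Our solution is to apply the SIMPLE-LSH transform\[[4](#References)\] to the user and video vectors, so that ranking by inner product becomes equivalent to ranking by cosine similarity.
+
+The details are as follows:
 For a video vector $$\mathbf{v}\in \mathbb{R}^N$$ with $$\left\| \mathbf{v} \right\|\leqslant m$$, the transformed vector $$\tilde{\mathbf{v}}\in \mathbb{R}^{N+1}$$ is
 $$\tilde{\mathbf{v}} = [\frac{\mathbf{v}}{m}; \sqrt{1 -\left\| \frac{\mathbf{v}}{m} \right\|^2}]$$
 For a user vector $$\mathbf{u}\in \mathbb{R}^N$$, the transformed vector $$\tilde{\mathbf{u}}\in \mathbb{R}^{N+1}$$ is
 $$\tilde{\mathbf{u}} = [\mathbf{u}_{norm}; 0]$$, where $$\mathbf{u}_{norm}$$ is $$\mathbf{u}$$ normalized to unit length.
-Online, for a given $$\mathbf{u}$$ we recall $$\mathbf{v}$$ by inner product. Applying the transforms above, $$\mathbf{u}\rightarrow \tilde{\mathbf{u}}, \mathbf{v}\rightarrow \tilde{\mathbf{v}}$$, does not change the inner-product ranking. Moreover, since $$\left\| \tilde{\mathbf{u}} \right\|$$ and $$\left\| \tilde{\mathbf{v}} \right\|$$ are both 1, $$cos(\tilde{\mathbf{u}} ,\tilde{\mathbf{v}}) = \tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}}$$, so recall is compatible with a cosine-based ANN and the results are equivalent. In online use, to preserve precision, the division by $$m$$ can be skipped, which gives $$\tilde{\mathbf{v}} = [\mathbf{v}; \sqrt{m^2 -\left\| \mathbf{v} \right\|^2}]$$; the ranking is still equivalent.
+
+Online, for a given $$\mathbf{u}$$ we recall $$\mathbf{v}$$ by inner product. Applying the transforms above, $$\mathbf{u}\rightarrow \tilde{\mathbf{u}}, \mathbf{v}\rightarrow \tilde{\mathbf{v}}$$, does not change the inner-product ranking. Moreover, since $$\left\| \tilde{\mathbf{u}} \right\|$$ and $$\left\| \tilde{\mathbf{v}} \right\|$$ are both 1, $$cos(\tilde{\mathbf{u}} ,\tilde{\mathbf{v}}) = \tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}}$$, so recall is compatible with a cosine-based ANN and the results are equivalent.
+
+In online use, to preserve precision, the division by $$m$$ can be skipped, which gives $$\tilde{\mathbf{v}} = [\mathbf{v}; \sqrt{m^2 -\left\| \mathbf{v} \right\|^2}]$$; the ranking is still equivalent.


 ## References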

--- a/youtube_recall/README.md
+++ b/youtube_recall/README.md
@@ -254,14 +254,19 @@ python infer.py --infer_set_path='./data/infer.txt' \
 ```

 ## Online prediction
-For online prediction，we adopt Approximate Nearest Neighbor(ANN) to directly recall top N mostly likely watch video. However, our ANN system currently only supports cosin sorting, not by inner product sorting, which leads to big effect difference.
+For online prediction, we adopt Approximate Nearest Neighbor (ANN) search to directly recall the top N videos a user is most likely to watch. However, our ANN system currently supports only cosine-similarity ranking, not inner-product ranking, and the two can give very different results.
+
 As a result, we transform the user and video vectors with a SIMPLE-LSH conversion\[[4](#References)\], so that inner-product ranking is equivalent to cosine ranking after the conversion.
+
 The details are as follows:
 For a video vector $$\mathbf{v}\in \mathbb{R}^N$$ with $$\left\| \mathbf{v} \right\|\leqslant m$$, the modified video vector $$\tilde{\mathbf{v}}\in \mathbb{R}^{N+1}$$ is
 $$\tilde{\mathbf{v}} = [\frac{\mathbf{v}}{m}; \sqrt{1 -\left\| \frac{\mathbf{v}}{m} \right\|^2}]$$
 For a user vector $$\mathbf{u}\in \mathbb{R}^N$$, the modified user vector $$\tilde{\mathbf{u}}\in \mathbb{R}^{N+1}$$ is
-$$\tilde{\mathbf{u}} = [\mathbf{u}_{norm}; 0]$$，in which $$\mathbf{u}_{norm}$$ is normalized $$\mathbf{u}$$，
-When online predicting, For a $$\mathbf{u}$$, we need recall $$\mathbf{v}$$ by inner product sorting. After conversion, $$\mathbf{u}\rightarrow \tilde{\mathbf{u}}, \mathbf{v}\rightarrow \tilde{\mathbf{v}}$$, the order of inner prodct sorting is unchanged. Since $$\left \| \tilde{\mathbf{u}} \right \|$$ and $$\left \| \tilde{\mathbf{v}} \right \|$$ are both equal to 1, $$cos(\tilde{\mathbf{u}} ,\tilde{\mathbf{v}}) = \tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}}$$, which makes cosin-supported-only ANN system works. And in order to retain precision, we find that $$\tilde{\mathbf{v}} = [\mathbf{v}; \sqrt{m^2 -\left \| \mathbf{\mathbf{v}} \right \|^2}]$$ is also equivalent.
+$$\tilde{\mathbf{u}} = [\mathbf{u}_{norm}; 0]$$, where $$\mathbf{u}_{norm}$$ is $$\mathbf{u}$$ normalized to unit length.
+
+At serving time, for a given $$\mathbf{u}$$ we need to recall $$\mathbf{v}$$ by inner-product ranking. After the conversion $$\mathbf{u}\rightarrow \tilde{\mathbf{u}}, \mathbf{v}\rightarrow \tilde{\mathbf{v}}$$, the inner-product ranking is unchanged, because $$\tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}} = \frac{\mathbf{u}\cdot \mathbf{v}}{m\left\| \mathbf{u} \right\|}$$ is a positive multiple of $$\mathbf{u}\cdot \mathbf{v}$$ for a fixed $$\mathbf{u}$$. Since $$\left\| \tilde{\mathbf{u}} \right\|$$ and $$\left\| \tilde{\mathbf{v}} \right\|$$ are both equal to 1, $$cos(\tilde{\mathbf{u}} ,\tilde{\mathbf{v}}) = \tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}}$$, so an ANN system that supports only cosine similarity returns the same ranking.
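
The conversion described above can be checked numerically. The following is a minimal sketch using only the Python standard library, not the repository's implementation: the helper names (`transform_video`, `transform_user`, `cosine`), the random test data, and the choice of `m` as the largest video norm are all assumptions made for illustration.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def transform_video(v, m):
    """Map v (with ||v|| <= m) to [v/m; sqrt(1 - ||v/m||^2)], a unit vector."""
    scaled = [x / m for x in v]
    # max(..., 0.0) guards against a tiny negative value caused by rounding.
    extra = math.sqrt(max(1.0 - dot(scaled, scaled), 0.0))
    return scaled + [extra]


def transform_user(u):
    """Map u to [u/||u||; 0], a unit vector."""
    n = norm(u)
    return [x / n for x in u] + [0.0]


def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))


# Hypothetical data: 200 random video vectors and one user vector.
random.seed(0)
dim = 8
videos = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
user = [random.gauss(0, 1) for _ in range(dim)]

m = max(norm(v) for v in videos)  # assumption: m is the largest video norm
user_t = transform_user(user)
videos_t = [transform_video(v, m) for v in videos]

# Rank video indices by inner product on the original vectors,
# and by cosine similarity on the transformed vectors.
by_inner = sorted(range(len(videos)),
                  key=lambda i: dot(user, videos[i]), reverse=True)
by_cosine = sorted(range(len(videos)),
                   key=lambda i: cosine(user_t, videos_t[i]), reverse=True)

print("top-10 match:", by_inner[:10] == by_cosine[:10])
```

In a real deployment the transformed video vectors would be indexed once in the ANN system and only the user vector would be transformed per request; the brute-force sort here only stands in for the ANN lookup.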
+
+To retain numerical precision, the division by $$m$$ can be skipped: we find that $$\tilde{\mathbf{v}} = [\mathbf{v}; \sqrt{m^2 -\left\| \mathbf{v} \right\|^2}]$$ gives an equivalent ranking.
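
A short check of why the unscaled form keeps the ranking, using only the definitions above and assuming $$m > 0$$ and $$\mathbf{u} \neq \mathbf{0}$$. Every unscaled video vector has the same norm $$m$$:

$$\left\| \tilde{\mathbf{v}} \right\|^2 = \left\| \mathbf{v} \right\|^2 + \left( m^2 - \left\| \mathbf{v} \right\|^2 \right) = m^2$$

so the cosine similarity with $$\tilde{\mathbf{u}} = [\mathbf{u}_{norm}; 0]$$ is

$$cos(\tilde{\mathbf{u}}, \tilde{\mathbf{v}}) = \frac{\tilde{\mathbf{u}}\cdot \tilde{\mathbf{v}}}{\left\| \tilde{\mathbf{u}} \right\| \left\| \tilde{\mathbf{v}} \right\|} = \frac{\mathbf{u}_{norm}\cdot \mathbf{v}}{m} = \frac{\mathbf{u}\cdot \mathbf{v}}{m \left\| \mathbf{u} \right\|}$$

For a fixed $$\mathbf{u}$$ the denominator is a positive constant, so ranking videos by this cosine is the same as ranking them by $$\mathbf{u}\cdot \mathbf{v}$$.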

 ## References
 1. Covington, Paul, Jay Adams, and Emre Sargin. "Deep neural networks for youtube recommendations." Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 2016.