Deploy to GitHub Pages: 7384966f

8db295fd · Travis CI · b489cf6c
6 changed files
--- a/develop/doc/_sources/tutorials/embedding_model/index_en.md.txt
+++ b/develop/doc/_sources/tutorials/embedding_model/index_en.md.txt
@@ -6,9 +6,10 @@ We thank @lipeng for the pull request that defined the model schemas and pretrai
 ## Introduction ###
 ### Chinese Word Dictionary ###
-Our Chinese-word dictionary is built from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, "《红楼梦》" is segmented into "《", "红楼梦", "》", and "《红楼梦》". Our dictionary (UTF-8 encoded) has two columns: a word and its frequency. The total word count is 3206325, including 3 special tokens:
+Our Chinese-word dictionary is built from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, "《红楼梦》" is segmented into "《", "红楼梦", "》", and "《红楼梦》". Our dictionary (UTF-8 encoded) has two columns: a word and its frequency. The total word count is 3206326, including 4 special tokens:
  - `<s>`: the start of a sequence
  - `<e>`: the end of a sequence
+  - `PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING`: a placeholder; ignore it and its embedding
  - `<unk>`: a word not included in the dictionary
 ### Pretrained Chinese Word Embedding Model ###
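
Reviewer note: the two-column layout described in this hunk can be read with a few lines of Python. The following is only a minimal sketch, not the project's loader. The function name `load_dictionary`, the whitespace delimiter, the integer frequencies, and the sample values are all assumptions made for illustration; the documentation only states that the file is UTF-8 and has a word column and a frequency column.

```python
import io

# Special tokens as listed in the documentation (identifier spelling kept as-is).
SPECIAL_TOKENS = ("<s>", "<e>", "PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING", "<unk>")


def load_dictionary(stream):
    """Parse `word frequency` lines into (words, specials) dicts.

    Assumption: columns are separated by whitespace and the frequency
    is an integer; the source does not state the delimiter.
    """
    words, specials = {}, {}
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so the word itself is kept intact.
        word, freq = line.rsplit(None, 1)
        target = specials if word in SPECIAL_TOKENS else words
        target[word] = int(freq)
    return words, specials


# Made-up sample rows; a real file would be opened with encoding="utf-8".
sample = io.StringIO(
    "<s>\t100\n"
    "<e>\t100\n"
    "PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING\t0\n"
    "<unk>\t7\n"
    "红楼梦\t42\n"
    "《\t9\n"
)
words, specials = load_dictionary(sample)
print(len(words), len(specials))
```

Keeping the special tokens in a separate mapping makes it easy to skip the placeholder entry when looking up embeddings.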

--- a/develop/doc/searchindex.js
+++ b/develop/doc/searchindex.js
--- a/develop/doc/tutorials/embedding_model/index_en.html
+++ b/develop/doc/tutorials/embedding_model/index_en.html
@@ -233,10 +233,11 @@
 <span id="introduction"></span><h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
 <div class="section" id="chinese-word-dictionary">
 <span id="chinese-word-dictionary"></span><h3>Chinese Word Dictionary<a class="headerlink" href="#chinese-word-dictionary" title="Permalink to this headline">¶</a></h3>
-<p>Our Chinese-word dictionary is built from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, &#8220;《红楼梦》&#8221; is segmented into &#8220;《&#8221;, &#8220;红楼梦&#8221;, &#8220;》&#8221;, and &#8220;《红楼梦》&#8221;. Our dictionary (UTF-8 encoded) has two columns: a word and its frequency. The total word count is 3206325, including 3 special tokens:</p>
+<p>Our Chinese-word dictionary is built from Baidu ZhiDao and Baidu Baike using an in-house word segmenter. For example, &#8220;《红楼梦》&#8221; is segmented into &#8220;《&#8221;, &#8220;红楼梦&#8221;, &#8220;》&#8221;, and &#8220;《红楼梦》&#8221;. Our dictionary (UTF-8 encoded) has two columns: a word and its frequency. The total word count is 3206326, including 4 special tokens:</p>
 <ul class="simple">
 <li><code class="docutils literal"><span class="pre">&lt;s&gt;</span></code>: the start of a sequence</li>
 <li><code class="docutils literal"><span class="pre">&lt;e&gt;</span></code>: the end of a sequence</li>
+<li><code class="docutils literal"><span class="pre">PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING</span></code>: a placeholder; ignore it and its embedding</li>
 <li><code class="docutils literal"><span class="pre">&lt;unk&gt;</span></code>: a word not included in the dictionary</li>
 </ul>
 </div>

--- a/develop/doc_cn/_sources/tutorials/embedding_model/index_cn.md.txt
+++ b/develop/doc_cn/_sources/tutorials/embedding_model/index_cn.md.txt
@@ -6,9 +6,10 @@
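 ## Introduction ###
 ### Chinese Word Dictionary ###
-Our dictionary is produced by segmenting corpora from Baidu ZhiDao and Baidu Baike with an in-house word segmenter. The segmentation style is as follows: "《红楼梦》" is segmented into "《", "红楼梦", "》", and "《红楼梦》". The dictionary is UTF-8 encoded and has two columns: the word itself and its frequency. It contains 3206325 words and 3 special tokens:
+Our dictionary is produced by segmenting corpora from Baidu ZhiDao and Baidu Baike with an in-house word segmenter. The segmentation style is as follows: "《红楼梦》" is segmented into "《", "红楼梦", "》", and "《红楼梦》". The dictionary is UTF-8 encoded and has two columns: the word itself and its frequency. It contains 3206326 words and 4 special tokens:
  - `<s>`: the start of a segmented sequence
  - `<e>`: the end of a segmented sequence
+  - `PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING`: a placeholder with no actual meaning
  - `<unk>`: an unknown word
 ### Pretrained Chinese Word Embedding Model ###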

--- a/develop/doc_cn/searchindex.js
+++ b/develop/doc_cn/searchindex.js
--- a/develop/doc_cn/tutorials/embedding_model/index_cn.html
+++ b/develop/doc_cn/tutorials/embedding_model/index_cn.html
@@ -240,10 +240,11 @@
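 <span id="id2"></span><h2>Introduction<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h2>
 <div class="section" id="">
 <span id="id3"></span><h3>Chinese Word Dictionary<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h3>
-<p>Our dictionary is produced by segmenting corpora from Baidu ZhiDao and Baidu Baike with an in-house word segmenter. The segmentation style is as follows: &#8220;《红楼梦》&#8221; is segmented into &#8220;《&#8221;, &#8220;红楼梦&#8221;, &#8220;》&#8221;, and &#8220;《红楼梦》&#8221;. The dictionary is UTF-8 encoded and has two columns: the word itself and its frequency. It contains 3206325 words and 3 special tokens:</p>
+<p>Our dictionary is produced by segmenting corpora from Baidu ZhiDao and Baidu Baike with an in-house word segmenter. The segmentation style is as follows: &#8220;《红楼梦》&#8221; is segmented into &#8220;《&#8221;, &#8220;红楼梦&#8221;, &#8220;》&#8221;, and &#8220;《红楼梦》&#8221;. The dictionary is UTF-8 encoded and has two columns: the word itself and its frequency. It contains 3206326 words and 4 special tokens:</p>
 <ul class="simple">
 <li><code class="docutils literal"><span class="pre">&lt;s&gt;</span></code>: the start of a segmented sequence</li>
 <li><code class="docutils literal"><span class="pre">&lt;e&gt;</span></code>: the end of a segmented sequence</li>
+<li><code class="docutils literal"><span class="pre">PALCEHOLDER_JUST_IGNORE_THE_EMBEDDING</span></code>: a placeholder with no actual meaning</li>
 <li><code class="docutils literal"><span class="pre">&lt;unk&gt;</span></code>: an unknown word</li>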
 </ul>
 </div>