Commit 3227f159
Authored Aug 20, 2022 by Javier Rodriguez Zaurin
Review docs. Ready to check installations
Parent: c23fed86
Showing 40 changed files with 1904 additions and 619 deletions
mkdocs/site/objects.inv (+0, -0)
mkdocs/site/pytorch-widedeep/losses.html (+601, -0)
mkdocs/site/pytorch-widedeep/losses.md (+6, -0)
mkdocs/site/pytorch-widedeep/model_components.html (+760, -222)
mkdocs/site/pytorch-widedeep/model_components.md (+8, -7)
mkdocs/site/pytorch-widedeep/preprocessing.html (+55, -48)
mkdocs/site/pytorch-widedeep/self_supervised_pretraining.html (+3, -1)
mkdocs/site/pytorch-widedeep/self_supervised_pretraining.md (+4, -1)
mkdocs/site/pytorch-widedeep/trainer.html (+177, -123)
mkdocs/site/pytorch-widedeep/utils/fastai_transforms.html (+62, -58)
mkdocs/site/pytorch-widedeep/utils/index.html (+2, -0)
mkdocs/site/pytorch-widedeep/utils/index.md (+3, -0)
mkdocs/site/search/search_index.json (+1, -1)
mkdocs/site/sitemap.xml (+36, -36)
mkdocs/site/sitemap.xml.gz (+0, -0)
mkdocs/sources/pytorch-widedeep/losses.md (+6, -0)
mkdocs/sources/pytorch-widedeep/model_components.md (+8, -7)
mkdocs/sources/pytorch-widedeep/self_supervised_pretraining.md (+4, -1)
mkdocs/sources/pytorch-widedeep/utils/index.md (+3, -0)
pytorch_widedeep/losses.py (+33, -17)
pytorch_widedeep/models/fds_layer.py (+5, -0)
pytorch_widedeep/models/image/vision.py (+5, -5)
pytorch_widedeep/models/tabular/mlp/context_attention_mlp.py (+4, -4)
pytorch_widedeep/models/tabular/mlp/self_attention_mlp.py (+8, -5)
pytorch_widedeep/models/tabular/mlp/tab_mlp.py (+5, -7)
pytorch_widedeep/models/tabular/resnet/tab_resnet.py (+6, -7)
pytorch_widedeep/models/tabular/tabnet/tab_net.py (+5, -5)
pytorch_widedeep/models/tabular/transformers/ft_transformer.py (+5, -5)
pytorch_widedeep/models/tabular/transformers/saint.py (+5, -5)
pytorch_widedeep/models/tabular/transformers/tab_fastformer.py (+5, -5)
pytorch_widedeep/models/tabular/transformers/tab_perceiver.py (+5, -5)
pytorch_widedeep/models/tabular/transformers/tab_transformer.py (+5, -5)
pytorch_widedeep/models/text/attentive_rnn.py (+1, -4)
pytorch_widedeep/models/text/basic_rnn.py (+4, -4)
pytorch_widedeep/models/text/stacked_attentive_rnn.py (+4, -4)
pytorch_widedeep/models/wide_deep.py (+8, -1)
pytorch_widedeep/preprocessing/tab_preprocessor.py (+9, -7)
pytorch_widedeep/preprocessing/wide_preprocessor.py (+3, -1)
pytorch_widedeep/training/trainer.py (+30, -13)
pytorch_widedeep/utils/fastai_transforms.py (+10, -5)
mkdocs/site/objects.inv
This file type cannot be previewed.
mkdocs/site/pytorch-widedeep/losses.html
This diff is collapsed.
mkdocs/site/pytorch-widedeep/losses.md
...
...
@@ -50,3 +50,9 @@ from pytorch_widedeep.losses import FocalLoss
 ::: pytorch_widedeep.losses.FocalR_RMSELoss
 ::: pytorch_widedeep.losses.HuberLoss
+::: pytorch_widedeep.losses.InfoNCELoss
+::: pytorch_widedeep.losses.DenoisingLoss
+::: pytorch_widedeep.losses.EncoderDecoderLoss
mkdocs/site/pytorch-widedeep/model_components.html
This diff is collapsed.
mkdocs/site/pytorch-widedeep/model_components.md
...
...
@@ -3,7 +3,9 @@
 This module contains the models that can be used as the four main components
 that will comprise a Wide and Deep model (``wide``, ``deeptabular``,
 ``deeptext``, ``deepimage``), as well as the ``WideDeep`` "constructor"
-class. Note that each of the four components can be used independently.
+class. Note that each of the four components can be used independently. It
+also contains all the documentation for the models that can be used for
+self-supervised pre-training with tabular data.
 ::: pytorch_widedeep.models.tabular.linear.wide.Wide
...
...
@@ -82,16 +84,15 @@ class. Note that each of the four components can be used independently.
 :information_source: **NOTE**: when we started developing the library we
 thought that combining Deep Learning architectures for tabular data, with
 CNN-based architectures (pretrained or not) for images and Transformer-based
-architectures for text would be an _'overkill'_(also, pretrained
+architectures for text would be an _'overkill'_ (also, pretrained
 transformer-based models were not as readily available as they are today).
 Therefore, at that time we made the decision of including in the library
 simple RNN-based architectures for the text dataset. A lot has passed since
 then and it is our intention to integrate this library with the
-[Hugginface's Transformers library]
-(https://huggingface.co/docs/transformers/main/en/index) in the near future.
-Nonetheless, note that it is still possible to use any custom model as the
-`deeptext` component using this library. Please, see the example section in
-this documentation for details
+[Hugginface's Transformers library](https://huggingface.co/docs/transformers/main/en/index)
+in the near future. Nonetheless, note that it is still possible to use any
+custom model as the `deeptext` component using this library. Please, see the
+example section in this documentation for details
 ::: pytorch_widedeep.models.text.attentive_rnn.BasicRNN
     selection:
...
...
mkdocs/site/pytorch-widedeep/preprocessing.html
This diff is collapsed.
mkdocs/site/pytorch-widedeep/self_supervised_pretraining.html
...
...
@@ -1078,7 +1078,9 @@
 user to self-suerpvised pre-training for all tabular models in the library
 with the exception of the <code>TabPerceiver</code> (this is a particular model and
 self-supervised pre-training requires some adjustments that will be
-implemented in future versions).</p>
+implemented in future versions). Please see the examples folder in the repo
+or the examples section in the docs for details on how to use self-supervised
+pre-training with this library.</p>
 <p>The two routines implemented are illustrated in the figures below. The first
 is from <a href="https://arxiv.org/abs/1908.07442">TabNet: Attentive Interpretable Tabular Learning</a>.
 It is a <em>'standard'</em> encoder-decoder architecture and and is designed here for
...
...
mkdocs/site/pytorch-widedeep/self_supervised_pretraining.md
...
...
@@ -4,7 +4,10 @@ In this library we have implemented two methods or routines that allow the
 user to self-suerpvised pre-training for all tabular models in the library
 with the exception of the `TabPerceiver` (this is a particular model and
 self-supervised pre-training requires some adjustments that will be
-implemented in future versions).
+implemented in future versions). Please see the examples folder in the repo
+or the examples section in the docs for details on how to use self-supervised
+pre-training with this library.
 The two routines implemented are illustrated in the figures below. The first
 is from [TabNet: Attentive Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442).
...
...
mkdocs/site/pytorch-widedeep/trainer.html
This diff is collapsed.
mkdocs/site/pytorch-widedeep/utils/fastai_transforms.html
This diff is collapsed.
mkdocs/site/pytorch-widedeep/utils/index.html
...
...
@@ -1035,6 +1035,8 @@ the classes and functions discussed here are available directly from the
 <code>deeptabular_utils</code> submodule can be imported as:</p>
 <div class="highlight"><pre><span></span><code>from pytorch_widedeep.utils import LabelEncoder
 </code></pre></div>
+<p>These are classes and functions that are internally used in the library. We
+include them here in case the user finds them useful for other purposes.</p>
 </article>
...
...
mkdocs/site/pytorch-widedeep/utils/index.md
...
...
@@ -10,3 +10,6 @@ the classes and functions discussed here are available directly from the
```
from pytorch_widedeep.utils import LabelEncoder
```
+These are classes and functions that are internally used in the library. We
+include them here in case the user finds them useful for other purposes.
mkdocs/site/search/search_index.json
This diff is collapsed.
mkdocs/site/sitemap.xml
...
...
@@ -2,182 +2,182 @@
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/index.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/contributing.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/installation.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/quick_start.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/01_Preprocessors_and_utils.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/02_model_components.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/03_Binary_Classification_with_Defaults.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/04_regression_with_images_and_text.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/05_save_and_load_model_and_artifacts.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/06_fineTune_and_warmup.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/07_Custom_Components.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/08_custom_dataLoader_imbalanced_dataset.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/09_extracting_embeddings.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/11_auc_multiclass.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/12_ZILNLoss_origkeras_vs_pytorch_multimodal.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/13_Model_Uncertainty_prediction.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/14_bayesian_models.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/15_DIR-LDS_and_FDS.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/16_Self_Supervised_Pretraning_pt1.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/examples/16_Self_Supervised_Pretraning_pt2.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/bayesian_models.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/bayesian_trainer.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/callbacks.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/dataloaders.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/losses.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/metrics.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/model_components.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/preprocessing.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/self_supervised_pretraining.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/tab2vec.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/trainer.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/utils/index.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/utils/deeptabular_utils.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/utils/fastai_transforms.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/utils/image_utils.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
     <url>
         <loc>https://pytorch-widedeep.readthedocs.io/pytorch-widedeep/utils/text_utils.html</loc>
-        <lastmod>2022-08-17</lastmod>
+        <lastmod>2022-08-20</lastmod>
         <changefreq>daily</changefreq>
     </url>
 </urlset>
\ No newline at end of file
mkdocs/site/sitemap.xml.gz
This file type cannot be previewed.
mkdocs/sources/pytorch-widedeep/losses.md
...
...
@@ -50,3 +50,9 @@ from pytorch_widedeep.losses import FocalLoss
 ::: pytorch_widedeep.losses.FocalR_RMSELoss
 ::: pytorch_widedeep.losses.HuberLoss
+::: pytorch_widedeep.losses.InfoNCELoss
+::: pytorch_widedeep.losses.DenoisingLoss
+::: pytorch_widedeep.losses.EncoderDecoderLoss
mkdocs/sources/pytorch-widedeep/model_components.md
...
...
@@ -3,7 +3,9 @@
 This module contains the models that can be used as the four main components
 that will comprise a Wide and Deep model (``wide``, ``deeptabular``,
 ``deeptext``, ``deepimage``), as well as the ``WideDeep`` "constructor"
-class. Note that each of the four components can be used independently.
+class. Note that each of the four components can be used independently. It
+also contains all the documentation for the models that can be used for
+self-supervised pre-training with tabular data.
 ::: pytorch_widedeep.models.tabular.linear.wide.Wide
...
...
@@ -82,16 +84,15 @@ class. Note that each of the four components can be used independently.
 :information_source: **NOTE**: when we started developing the library we
 thought that combining Deep Learning architectures for tabular data, with
 CNN-based architectures (pretrained or not) for images and Transformer-based
-architectures for text would be an _'overkill'_(also, pretrained
+architectures for text would be an _'overkill'_ (also, pretrained
 transformer-based models were not as readily available as they are today).
 Therefore, at that time we made the decision of including in the library
 simple RNN-based architectures for the text dataset. A lot has passed since
 then and it is our intention to integrate this library with the
-[Hugginface's Transformers library]
-(https://huggingface.co/docs/transformers/main/en/index) in the near future.
-Nonetheless, note that it is still possible to use any custom model as the
-`deeptext` component using this library. Please, see the example section in
-this documentation for details
+[Hugginface's Transformers library](https://huggingface.co/docs/transformers/main/en/index)
+in the near future. Nonetheless, note that it is still possible to use any
+custom model as the `deeptext` component using this library. Please, see the
+example section in this documentation for details
 ::: pytorch_widedeep.models.text.attentive_rnn.BasicRNN
     selection:
...
...
mkdocs/sources/pytorch-widedeep/self_supervised_pretraining.md
...
...
@@ -4,7 +4,10 @@ In this library we have implemented two methods or routines that allow the
 user to self-suerpvised pre-training for all tabular models in the library
 with the exception of the `TabPerceiver` (this is a particular model and
 self-supervised pre-training requires some adjustments that will be
-implemented in future versions).
+implemented in future versions). Please see the examples folder in the repo
+or the examples section in the docs for details on how to use self-supervised
+pre-training with this library.
 The two routines implemented are illustrated in the figures below. The first
 is from [TabNet: Attentive Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442).
...
...
mkdocs/sources/pytorch-widedeep/utils/index.md
...
...
@@ -10,3 +10,6 @@ the classes and functions discussed here are available directly from the
```
from pytorch_widedeep.utils import LabelEncoder
```
+These are classes and functions that are internally used in the library. We
+include them here in case the user finds them useful for other purposes.
pytorch_widedeep/losses.py
...
...
@@ -798,14 +798,18 @@ class HuberLoss(nn.Module):
 class InfoNCELoss(nn.Module):
-    r"""InfoNCE Loss
+    r"""InfoNCE Loss. Loss applied during the Contrastive Denoising Self
+    Supervised Pre-training routine available in this library
-    See `SAINT: Improved Neural Networks for Tabular Data via Row Attention
-    and Contrastive Pre-Training <https://arxiv.org/abs/2106.01342>`_ and
+    :information_source: **NOTE**: This loss is in principle not exposed to
+    the user, as it is used internally in the library, but it is included
+    here for completion.
+    See [SAINT: Improved Neural Networks for Tabular Data via Row Attention
+    and Contrastive Pre-Training](https://arxiv.org/abs/2106.01342) and
     references therein
-    Partially inspired by the code in this `repo
-    <https://github.com/RElbers/info-nce-pytorch>`_
+    Partially inspired by the code in this [repo](https://github.com/RElbers/info-nce-pytorch)
     Parameters:
     -----------
...
...
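Reviewer note: for readers unfamiliar with the loss this docstring describes, the following is a minimal pure-Python sketch of a one-directional InfoNCE objective. It is not the code in `pytorch_widedeep/losses.py`; the function name `info_nce`, the list-of-lists inputs and the default temperature are assumptions made for illustration only.

```python
import math
from typing import List


def info_nce(z: List[List[float]], z_: List[List[float]], temperature: float = 1.0) -> float:
    """Illustrative InfoNCE: row i of `z` should be most similar to row i of `z_`."""

    def dot(a: List[float], b: List[float]) -> float:
        return sum(x * y for x, y in zip(a, b))

    def normalize(v: List[float]) -> List[float]:
        norm = math.sqrt(dot(v, v)) or 1.0  # avoid dividing by zero
        return [x / norm for x in v]

    z = [normalize(v) for v in z]
    z_ = [normalize(v) for v in z_]

    total = 0.0
    for i, anchor in enumerate(z):
        # cosine similarities between one anchor and every candidate row
        logits = [dot(anchor, other) / temperature for other in z_]
        # cross-entropy with target index i, using a stable log-sum-exp
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(z)
```

The loss is small when each row is closest to its own counterpart and grows when the pairing is broken, which is what a contrastive pre-training objective is meant to reward.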
@@ -857,10 +861,15 @@ class InfoNCELoss(nn.Module):
 class DenoisingLoss(nn.Module):
-    r"""Denoising Loss
+    r"""Denoising Loss. Loss applied during the Contrastive Denoising Self
+    Supervised Pre-training routine available in this library
+    :information_source: **NOTE**: This loss is in principle not exposed to
+    the user, as it is used internally in the library, but it is included
+    here for completion.
-    See `SAINT: Improved Neural Networks for Tabular Data via Row Attention
-    and Contrastive Pre-Training <https://arxiv.org/abs/2106.01342>`_ and
+    See [SAINT: Improved Neural Networks for Tabular Data via Row Attention
+    and Contrastive Pre-Training](https://arxiv.org/abs/2106.01342) and
     references therein
     Parameters:
...
...
@@ -898,12 +907,12 @@ class DenoisingLoss(nn.Module):
     ----------
     x_cat_and_cat_: tuple of Tensors or lists of tuples
         Tuple of tensors containing the raw input features and their
-        encodings, referred in the SAINT paper as :math:x` and :math:x''`
+        encodings, referred in the SAINT paper as $x$ and $x''$
         respectively. If one denoising MLP is used per categorical
-        feature 'x_cat_and_cat_' will be a list of tuples, one per
+        feature `x_cat_and_cat_` will be a list of tuples, one per
         categorical feature
     x_cont_and_cont_: tuple of Tensors or lists of tuples
-        same as 'x_cat_and_cat_' but for continuous columns
+        same as `x_cat_and_cat_` but for continuous columns
     Examples
     --------
...
...
@@ -928,7 +937,9 @@ class DenoisingLoss(nn.Module):
         return self.lambda_cat * loss_cat + self.lambda_cont * loss_cont
-    def _compute_cat_loss(self, x_cat_and_cat_):
+    def _compute_cat_loss(
+        self, x_cat_and_cat_: Union[List[Tuple[Tensor, Tensor]], Tuple[Tensor, Tensor]]
+    ) -> Tensor:
         loss_cat = torch.tensor(0.0)
         if isinstance(x_cat_and_cat_, list):
...
...
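Reviewer note: the hunk above shows two things, a weighted sum of a categorical and a continuous term, and an `isinstance(..., list)` branch that distinguishes "one pair per feature" from "one pair for all features". A minimal pure-Python sketch of that control flow follows. Everything except the names `lambda_cat`, `lambda_cont`, `x_cat_and_cat_` and `x_cont_and_cont_` is hypothetical: plain lists stand in for tensors, mean squared error stands in for both per-pair losses, and summing over features is an assumption, since the body of the real method is elided in this diff.

```python
from typing import List, Tuple, Union

# Stand-in for (target, reconstruction); the library uses tensors here.
Pair = Tuple[List[float], List[float]]


def mse(pair: Pair) -> float:
    """Hypothetical per-pair loss: mean squared error."""
    target, recon = pair
    return sum((t - r) ** 2 for t, r in zip(target, recon)) / len(target)


def accumulate(pairs: Union[Pair, List[Pair]]) -> float:
    """A list means one pair per feature; a single tuple means one shared pair."""
    if isinstance(pairs, list):
        return sum(mse(p) for p in pairs)  # assumption: per-feature losses are summed
    return mse(pairs)


def denoising_loss(
    x_cat_and_cat_: Union[Pair, List[Pair]],
    x_cont_and_cont_: Union[Pair, List[Pair]],
    lambda_cat: float = 1.0,
    lambda_cont: float = 1.0,
) -> float:
    """Weighted sum of the categorical and continuous terms."""
    return lambda_cat * accumulate(x_cat_and_cat_) + lambda_cont * accumulate(x_cont_and_cont_)
```

The type annotation added in this commit, `Union[List[Tuple[Tensor, Tensor]], Tuple[Tensor, Tensor]]`, documents exactly the two input shapes that the `isinstance` check separates.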
@@ -940,7 +951,7 @@ class DenoisingLoss(nn.Module):
         return loss_cat
-    def _compute_cont_loss(self, x_cont_and_cont_):
+    def _compute_cont_loss(self, x_cont_and_cont_) -> Tensor:
         loss_cont = torch.tensor(0.0)
         if isinstance(x_cont_and_cont_, list):
...
...
@@ -954,12 +965,17 @@ class DenoisingLoss(nn.Module):
 class EncoderDecoderLoss(nn.Module):
-    r"""Loss applied for the Endoder-Decoder Self Supervised pretraining process
+    r"""'_Standard_' Encoder Decoder Loss. Loss applied during the Endoder-Decoder
+    Self-Supervised Pre-Training routine available in this library
+    :information_source: **NOTE**: This loss is in principle not exposed to
+    the user, as it is used internally in the library, but it is included
+    here for completion.
     The implementation of this lost is based on that at the
-    https://github.com/dreamquark-ai/tabnet repo, which is in itself an
-    adaptation of that in the original TabNet paper: `TabNet: Attentive
-    Interpretable Tabular Learning <https://arxiv.org/abs/1908.07442>`_.
+    [tabnet repo](https://github.com/dreamquark-ai/tabnet), which is in itself an
+    adaptation of that in the original paper [TabNet: Attentive
+    Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442).
     Parameters:
     -----------
...
...
pytorch_widedeep/models/fds_layer.py
...
...
@@ -38,6 +38,11 @@ class FDSLayer(nn.Module):
     :information_source: **NOTE**: Feature Distribution Smoothing is
     available when using ONLY a `deeptabular` component
+    :information_source: **NOTE**: We consider this feature absolutely
+    experimental and we recommend the user to not use it unless the
+    corresponding [publication](https://arxiv.org/abs/2102.09554) is
+    well understood
     The code here is based on the code at the
     [official repo](https://github.com/YyzHarry/imbalanced-regression)
...
...
pytorch_widedeep/models/image/vision.py
...
...
@@ -62,8 +62,8 @@ class Vision(nn.Module):
         used. Alternatively, since Torchvision 0.13 one can use pretrained
         models with different weigths. Therefore, `pretrained_model_setup` can
         also be dictionary with the name of the model and the weights (e.g.
-        {'resnet50': ResNet50_Weights.DEFAULT} or
-        {'resnet50': "IMAGENET1K_V2"}). Aliased as `pretrained_model_name`.
+        `{'resnet50': ResNet50_Weights.DEFAULT}` or
+        `{'resnet50': "IMAGENET1K_V2"}`). <br/> Aliased as `pretrained_model_name`.
     n_trainable: Optional, int, default = None
         Number of trainable layers starting from the layer closer to the
         output neuron(s). Note that this number DOES NOT take into account
...
...
@@ -108,9 +108,6 @@ class Vision(nn.Module):
     ----------
     features: nn.Module
         The pretrained model or Standard CNN plus the optional head
-    output_dim: int
-        The output dimension of the model. This is a required attribute
-        neccesary to build the `WideDeep` class
     Examples
     --------
...
...
@@ -188,6 +185,9 @@ class Vision(nn.Module):
     @property
     def output_dim(self) -> int:
+        r"""The output dimension of the model. This is a required property
+        neccesary to build the `WideDeep` class
+        """
         return (
             self.head_hidden_dims[-1]
             if self.head_hidden_dims is not None
...
...
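Reviewer note: this commit moves the `output_dim` documentation from the class-level attribute list into a docstring on the property itself, a pattern repeated across the model components below. A minimal pure-Python sketch of the pattern follows; the class name `HeadedComponent` and the fallback attribute `backbone_dim` are hypothetical, since the fallback branch of the real property is elided in the hunk above.

```python
from typing import List, Optional


class HeadedComponent:
    """Hypothetical component with an optional head on top of a backbone."""

    def __init__(self, backbone_dim: int, head_hidden_dims: Optional[List[int]] = None):
        self.backbone_dim = backbone_dim  # assumed fallback, not shown in the diff
        self.head_hidden_dims = head_hidden_dims

    @property
    def output_dim(self) -> int:
        """Width of the final layer: the last head layer if a head exists."""
        return (
            self.head_hidden_dims[-1]
            if self.head_hidden_dims is not None
            else self.backbone_dim
        )
```

Because the value is computed from the configuration rather than stored, it cannot drift out of sync with the head, and documenting it on the property lets the docs generator pick it up alongside the code.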
pytorch_widedeep/models/tabular/mlp/context_attention_mlp.py
...
...
@@ -88,11 +88,8 @@ class ContextAttentionMLP(BaseTabularModelWithAttention):
     ----------
     cat_and_cont_embed: nn.Module
         This is the module that processes the categorical and continuous columns
-    attention_blks: nn.Sequential
+    encoder: nn.Module
         Sequence of attention encoders.
-    output_dim: int
-        The output dimension of the model. This is a required attribute
-        neccesary to build the `WideDeep` class
     Examples
     --------
...
...
@@ -181,6 +178,9 @@ class ContextAttentionMLP(BaseTabularModelWithAttention):
     @property
     def output_dim(self) -> int:
+        r"""The output dimension of the model. This is a required property
+        neccesary to build the `WideDeep` class
+        """
         return (
             self.input_dim
             if self.with_cls_token
...
...
pytorch_widedeep/models/tabular/mlp/self_attention_mlp.py
...
...
@@ -17,7 +17,10 @@ class SelfAttentionMLP(BaseTabularModelWithAttention):
     are then passed through a series of attention blocks. Each attention
     block is comprised by what we would refer as a simplified
     `SelfAttentionEncoder`. See
-    `pytorch_widedeep.models.tabular.mlp._attention_layers` for details.
+    `pytorch_widedeep.models.tabular.mlp._attention_layers` for details. The
+    reason to use a simplified version of self attention is because we
+    observed that the '_standard_' attention mechanism used in the
+    TabTransformer has a notable tendency to overfit.
     Parameters
     ----------
...
...
@@ -89,11 +92,8 @@ class SelfAttentionMLP(BaseTabularModelWithAttention):
     ----------
     cat_and_cont_embed: nn.Module
         This is the module that processes the categorical and continuous columns
-    attention_blks: nn.Sequential
+    encoder: nn.Module
         Sequence of attention encoders.
-    output_dim: int
-        The output dimension of the model. This is a required attribute
-        neccesary to build the WideDeep class
     Examples
     --------
...
...
@@ -188,6 +188,9 @@ class SelfAttentionMLP(BaseTabularModelWithAttention):
     @property
     def output_dim(self) -> int:
+        r"""The output dimension of the model. This is a required property
+        neccesary to build the WideDeep class
+        """
         return (
             self.input_dim
             if self.with_cls_token
...
...
pytorch_widedeep/models/tabular/mlp/tab_mlp.py
...
...
@@ -71,12 +71,9 @@ class TabMlp(BaseTabularModelWithoutAttention):
     ----------
     cat_and_cont_embed: nn.Module
         This is the module that processes the categorical and continuous columns
-    tab_mlp: nn.Sequential
+    encoder: nn.Module
         mlp model that will receive the concatenation of the embeddings and
         the continuous columns
-    output_dim: int
-        The output dimension of the model. This is a required attribute
-        neccesary to build the `WideDeep` class
     Examples
     --------
...
...
@@ -152,7 +149,9 @@ class TabMlp(BaseTabularModelWithoutAttention):
         return self.encoder(x)
     @property
-    def output_dim(self):
+    def output_dim(self) -> int:
+        r"""The output dimension of the model. This is a required property
+        neccesary to build the `WideDeep` class"""
         return self.mlp_hidden_dims[-1]
...
...
@@ -170,8 +169,7 @@ class TabMlpDecoder(nn.Module):
     This class is designed to be used with the `EncoderDecoderTrainer` when
     using self-supervised pre-training (see the corresponding section in the
     docs). The `TabMlpDecoder` will receive the output from the MLP
-    and '_reconstruct_' the embeddings in the embeddings layer in the
-    `TabMlp` model.
+    and '_reconstruct_' the embeddings.
     Parameters
     ----------
...
...
pytorch_widedeep/models/tabular/resnet/tab_resnet.py
...
...
@@ -88,16 +88,13 @@ class TabResnet(BaseTabularModelWithoutAttention):
     ----------
     cat_and_cont_embed: nn.Module
         This is the module that processes the categorical and continuous columns
-    tab_resnet_blks: nn.Sequential
+    encoder: nn.Module
         deep dense Resnet model that will receive the concatenation of the
         embeddings and the continuous columns
-    tab_resnet_mlp: nn.Sequential
+    mlp: nn.Module
         if `mlp_hidden_dims` is `True`, this attribute will be an mlp
         model that will receive the results of the concatenation of the
         embeddings and the continuous columns -- if present --.
-    output_dim: int
-        The output dimension of the model. This is a required attribute
-        neccesary to build the `WideDeep` class
     Examples
     --------
...
...
@@ -200,6 +197,9 @@ class TabResnet(BaseTabularModelWithoutAttention):
     @property
     def output_dim(self) -> int:
+        r"""The output dimension of the model. This is a required property
+        neccesary to build the `WideDeep` class
+        """
         return (
             self.mlp_hidden_dims[-1]
             if self.mlp_hidden_dims is not None
...
...
@@ -214,8 +214,7 @@ class TabResnetDecoder(nn.Module):
     This class is designed to be used with the `EncoderDecoderTrainer` when
     using self-supervised pre-training (see the corresponding section in the
     docs). This class will receive the output from the ResNet blocks or the
-    MLP(if present) and '_reconstruct_' the embeddings in the embeddings
-    layer in the `TabResnet` model.
+    MLP(if present) and '_reconstruct_' the embeddings.
     Parameters
     ----------
...
...
pytorch_widedeep/models/tabular/tabnet/tab_net.py
...
...
@@ -95,11 +95,8 @@ class TabNet(BaseTabularModelWithoutAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-tabnet_encoder: nn.Module
+encoder: nn.Module
the TabNet encoder. For details see the [original publication](https://arxiv.org/abs/1908.07442).
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -202,6 +199,9 @@ class TabNet(BaseTabularModelWithoutAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return self.step_dim
...
...
@@ -234,7 +234,7 @@ class TabNetDecoder(nn.Module):
using self-supervised pre-training (see the corresponding section in the
docs). This class will receive the output from the `TabNet` encoder
(i.e. the output from the so called 'steps') and '_reconstruct_' the
-embeddings from the embeddings layer in the `TabNet` encoder.
+embeddings.
Parameters
----------
...
...
pytorch_widedeep/models/tabular/transformers/ft_transformer.py
...
...
@@ -120,13 +120,10 @@ class FTTransformer(BaseTabularModelWithAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-fttransformer_blks: nn.Sequential
+encoder: nn.Module
Sequence of FTTransformer blocks
-fttransformer_mlp: nn.Module
+mlp: nn.Module
MLP component in the model
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -268,6 +265,9 @@ class FTTransformer(BaseTabularModelWithAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.mlp_hidden_dims[-1]
        if self.mlp_hidden_dims is not None
...
...
pytorch_widedeep/models/tabular/transformers/saint.py
...
...
@@ -106,13 +106,10 @@ class SAINT(BaseTabularModelWithAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-saint_blks: nn.Sequential
+encoder: nn.Module
Sequence of SAINT-Transformer blocks
-saint_mlp: nn.Module
+mlp: nn.Module
MLP component in the model
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -241,6 +238,9 @@ class SAINT(BaseTabularModelWithAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.mlp_hidden_dims[-1]
        if self.mlp_hidden_dims is not None
...
...
pytorch_widedeep/models/tabular/transformers/tab_fastformer.py
...
...
@@ -119,13 +119,10 @@ class TabFastFormer(BaseTabularModelWithAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-fastformer_blks: nn.Sequential
+encoder: nn.Module
Sequence of FastFormer blocks.
-fastformer_mlp: nn.Module
+mlp: nn.Module
MLP component in the model
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -274,6 +271,9 @@ class TabFastFormer(BaseTabularModelWithAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.mlp_hidden_dims[-1]
        if self.mlp_hidden_dims is not None
...
...
pytorch_widedeep/models/tabular/transformers/tab_perceiver.py
...
...
@@ -134,15 +134,12 @@ class TabPerceiver(BaseTabularModelWithAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-perceiver_blks: nn.ModuleDict
+encoder: nn.ModuleDict
ModuleDict with the Perceiver blocks
latents: nn.Parameter
Latents that will be used for prediction
-perceiver_mlp: nn.Module
+mlp: nn.Module
MLP component in the model
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -289,6 +286,9 @@ class TabPerceiver(BaseTabularModelWithAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.mlp_hidden_dims[-1]
        if self.mlp_hidden_dims is not None
...
...
pytorch_widedeep/models/tabular/transformers/tab_transformer.py
...
...
@@ -112,13 +112,10 @@ class TabTransformer(BaseTabularModelWithAttention):
----------
cat_and_cont_embed: nn.Module
This is the module that processes the categorical and continuous columns
-transformer_blks: nn.Sequential
+encoder: nn.Module
Sequence of Transformer blocks
-transformer_mlp: nn.Module
+mlp: nn.Module
MLP component in the model
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -279,6 +276,9 @@ class TabTransformer(BaseTabularModelWithAttention):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.mlp_hidden_dims[-1]
        if self.mlp_hidden_dims is not None
...
...
pytorch_widedeep/models/text/attentive_rnn.py
...
...
@@ -77,12 +77,9 @@ class AttentiveRNN(BasicRNN):
word embedding matrix
rnn: nn.Module
Stack of RNNs
-rnn_mlp: nn.Sequential
+rnn_mlp: nn.Module
Stack of dense layers on top of the RNN. This will only exist if
`head_layers_dim` is not `None`
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
pytorch_widedeep/models/text/basic_rnn.py
...
...
@@ -69,12 +69,9 @@ class BasicRNN(nn.Module):
word embedding matrix
rnn: nn.Module
Stack of RNNs
-rnn_mlp: nn.Sequential
+rnn_mlp: nn.Module
Stack of dense layers on top of the RNN. This will only exist if
`head_layers_dim` is not None
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -193,6 +190,9 @@ class BasicRNN(nn.Module):
@property
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.head_hidden_dims[-1]
        if self.head_hidden_dims is not None
...
...
pytorch_widedeep/models/text/stacked_attentive_rnn.py
...
...
@@ -75,12 +75,9 @@ class StackedAttentiveRNN(nn.Module):
word embedding matrix
rnn: nn.Module
Stack of RNNs
-rnn_mlp: nn.Sequential
+rnn_mlp: nn.Module
Stack of dense layers on top of the RNN. This will only exist if
`head_layers_dim` is not `None`
-output_dim: int
-The output dimension of the model. This is a required attribute
-necessary to build the `WideDeep` class
Examples
--------
...
...
@@ -235,6 +232,9 @@ class StackedAttentiveRNN(nn.Module):
    return self.rnn_mlp(x)
def output_dim(self) -> int:
+    r"""The output dimension of the model. This is a required property
+    necessary to build the `WideDeep` class
+    """
    return (
        self.head_hidden_dims[-1]
        if self.head_hidden_dims is not None
...
...
pytorch_widedeep/models/wide_deep.py
...
...
@@ -28,6 +28,12 @@ class WideDeep(nn.Module):
r"""Main collector class that combines all `wide`, `deeptabular`
`deeptext` and `deepimage` models.
+Note that all models described so far in this library must be passed to
+the `WideDeep` class once constructed. This is because the models output
+the last layer before the prediction layer. Such prediction layer is
+added by the `WideDeep` class as it collects the components for every
+data mode.
There are two options to combine these models that correspond to the
two main architectures that `pytorch-widedeep` can build.
...
...
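The note added to the `WideDeep` docstring says that each component stops at its last hidden layer and that `WideDeep` adds the prediction layer on top. A minimal, library-free sketch of that contract is given here. `ToyComponent` and `ToyCollector` are hypothetical names used only for illustration (they are not part of `pytorch-widedeep`), and the fallback to `input_dim` when no MLP is present is an assumption of the sketch. It shows why `output_dim` is required: the collector reads it to size the prediction layer.

```python
import random
from typing import List, Optional


class ToyComponent:
    """Hypothetical stand-in for a model component (not the library's class)."""

    def __init__(self, input_dim: int, mlp_hidden_dims: Optional[List[int]] = None):
        self.input_dim = input_dim
        self.mlp_hidden_dims = mlp_hidden_dims

    @property
    def output_dim(self) -> int:
        # Size of the last hidden layer when an MLP is present. Falling back
        # to input_dim otherwise is an assumption made for this sketch.
        if self.mlp_hidden_dims is not None:
            return self.mlp_hidden_dims[-1]
        return self.input_dim


class ToyCollector:
    """Hypothetical collector that adds a linear prediction layer on top."""

    def __init__(self, component: ToyComponent, pred_dim: int = 1, seed: int = 0):
        rng = random.Random(seed)
        self.component = component
        # The weight matrix has shape pred_dim x component.output_dim.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(component.output_dim)]
            for _ in range(pred_dim)
        ]

    def predict(self, hidden: List[float]) -> List[float]:
        # `hidden` plays the role of the component's last-layer output.
        if len(hidden) != self.component.output_dim:
            raise ValueError("hidden vector does not match output_dim")
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.weights]


component = ToyComponent(input_dim=10, mlp_hidden_dims=[64, 32])
collector = ToyCollector(component, pred_dim=1)
```

Without `output_dim`, the collector would have no way to know the width of the vector it receives, so it could not build the prediction layer.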
@@ -100,7 +106,8 @@ class WideDeep(nn.Module):
Distribution Smoothing. Please, see the docs for the `FDSLayer`.
<br/>
:information_source: **NOTE**: Feature Distribution Smoothing
-is available when using ONLY a `deeptabular` component
+is available when using **ONLY** a `deeptabular` component
+<br/>
:information_source: **NOTE**: We consider this feature absolutely
experimental and we recommend the user to not use it unless the
corresponding [publication](https://arxiv.org/abs/2102.09554) is
...
...
pytorch_widedeep/preprocessing/tab_preprocessor.py
...
...
@@ -76,12 +76,14 @@ class TabPreprocessor(BasePreprocessor):
`False`.
with_attention: bool, default = False
Boolean indicating whether the preprocessed data will be passed to an
-attention-based model. If `True`, the param `cat_embed_cols` must
-just be a list containing just the categorical column names: e.g.
+attention-based model (more precisely a model where all embeddings
+must have the same dimensions). If `True`, the param `cat_embed_cols`
+must just be a list containing just the categorical column names:
+e.g.
_['education', 'relationship', ...]_. This is because they will all be
-encoded using embeddings of the same dim, which will be specified
-later when the model is defined. <br/>
-Param alias: `for_transformer`
+encoded using embeddings of the same dim, which will be specified
+later when the model is defined. <br/> Param alias:
+`for_transformer`
with_cls_token: bool, default = False
Boolean indicating if a `'[CLS]'` token will be added to the dataset
when using attention-based models. The final hidden state
...
...
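The `with_attention` description distinguishes two shapes for `cat_embed_cols`: a plain list of column names when all embeddings share one dimension, or a list of `(name, dim)` tuples (per the `Union[List[str], List[Tuple[str, int]]]` type hint in the signature). The sketch here illustrates that rule with a hypothetical validator, `check_cat_embed_cols`. The function is an assumption made for illustration and is not how `TabPreprocessor` performs the check.

```python
from typing import List, Tuple, Union

CatEmbedCols = Union[List[str], List[Tuple[str, int]]]


def check_cat_embed_cols(cat_embed_cols: CatEmbedCols, with_attention: bool) -> List[str]:
    """Hypothetical validator: return the column names, or raise if the
    format does not fit an attention-based model."""
    has_dims = any(isinstance(col, tuple) for col in cat_embed_cols)
    if with_attention and has_dims:
        # All embeddings share one dimension, set later on the model,
        # so per-column dimensions make no sense here.
        raise ValueError(
            "with_attention=True expects column names only, "
            "e.g. ['education', 'relationship']"
        )
    return [col[0] if isinstance(col, tuple) else col for col in cat_embed_cols]


# Names only: valid for attention-based models.
names = check_cat_embed_cols(["education", "relationship"], with_attention=True)

# (name, embedding dim) tuples: valid only when with_attention is False.
names_with_dims = check_cat_embed_cols(
    [("education", 16), ("relationship", 8)], with_attention=False
)
```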
@@ -144,10 +146,10 @@ class TabPreprocessor(BasePreprocessor):
cat_embed_cols: Union[List[str], List[Tuple[str, int]]] = None,
continuous_cols: List[str] = None,
scale: bool = True,
already_standard: List[str] = None,
auto_embed_dim: bool = True,
embedding_rule: Literal["google", "fastai_old", "fastai_new"] = "fastai_new",
default_embed_dim: int = 16,
already_standard: List[str] = None,
with_attention: bool = False,
with_cls_token: bool = False,
shared_embed: bool = False,
...
...
@@ -157,10 +159,10 @@ class TabPreprocessor(BasePreprocessor):
self.continuous_cols = continuous_cols
self.scale = scale
self.already_standard = already_standard
self.auto_embed_dim = auto_embed_dim
self.embedding_rule = embedding_rule
self.default_embed_dim = default_embed_dim
self.already_standard = already_standard
self.with_attention = with_attention
self.with_cls_token = with_cls_token
self.shared_embed = shared_embed
...
...
pytorch_widedeep/preprocessing/wide_preprocessor.py
...
...
@@ -27,7 +27,7 @@ class WidePreprocessor(BasePreprocessor):
List of Tuples with the name of the columns that will be `'crossed'`
and then label encoded. e.g. _[('education', 'occupation'), ...]_. For
binary features, a cross-product transformation is 1 if and only if
-the constituent features are all 1, and 0 otherwise".
+the constituent features are all 1, and 0 otherwise.
Attributes
----------
...
...
@@ -36,6 +36,8 @@ class WidePreprocessor(BasePreprocessor):
encoding_dict: Dict
Dictionary where the keys are the result of pasting `colname + '_' +
column value` and the values are the corresponding mapped integer.
+inverse_encoding_dict: Dict
+the inverse encoding dictionary
wide_dim: int
Dimension of the wide model (i.e. dim of the linear layer)
...
...
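The `WidePreprocessor` docstrings above describe two ideas: the cross-product of binary features (1 only when every constituent feature is 1) and an `encoding_dict` whose keys are `colname + '_' + column value`. Both can be sketched in plain Python. The helper names `cross_binary` and `build_encoding_dict` are hypothetical, and starting the integer codes at 1 is an assumption of this sketch, not something the diff states.

```python
from typing import Dict, List


def cross_binary(*features: int) -> int:
    """Cross-product of binary features: 1 if and only if all are 1."""
    return int(all(f == 1 for f in features))


def build_encoding_dict(rows: List[Dict[str, str]], cols: List[str]) -> Dict[str, int]:
    """Map 'colname_value' keys to integers in order of first appearance."""
    encoding: Dict[str, int] = {}
    for row in rows:
        for col in cols:
            key = col + "_" + str(row[col])
            if key not in encoding:
                # Assumption of this sketch: codes start at 1.
                encoding[key] = len(encoding) + 1
    return encoding


rows = [
    {"education": "BSc", "occupation": "tech"},
    {"education": "PhD", "occupation": "tech"},
]
encoding_dict = build_encoding_dict(rows, ["education", "occupation"])
# The inverse dictionary simply swaps keys and values.
inverse_encoding_dict = {code: key for key, code in encoding_dict.items()}
# In this sketch the wide dimension is the number of distinct keys.
wide_dim = len(encoding_dict)
```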
pytorch_widedeep/training/trainer.py
...
...
@@ -50,10 +50,10 @@ class Trainer(BaseTrainer):
model: `WideDeep`
An object of class `WideDeep`
objective: str
-Defines the objective, loss or cost function.
+Defines the objective, loss or cost function. <br/>
Param aliases: `loss_function`, `loss_fn`, `loss`,
-`cost_function`, `cost_fn`, `cost`
+`cost_function`, `cost_fn`, `cost`. <br/>
Possible values are:
...
...
@@ -312,12 +312,10 @@ class Trainer(BaseTrainer):
predefined dataloaders are in `pytorch-widedeep.dataloaders`.If
`None`, a standard torch `DataLoader` is used.
finetune: bool, default=False
-param alias: `warmup`
fine-tune individual model components. This functionality can also
-be used to 'warm-up' individual components before the joined
-training starts, and hence its alias. See the Examples folder in
-the repo for more details
+be used to 'warm-up' (and hence the alias `warmup`) individual
+components before the joined training starts, and hence its
+alias. See the Examples folder in the repo for more details
`pytorch_widedeep` implements 3 fine-tune routines.
...
...
@@ -341,7 +339,14 @@ class Trainer(BaseTrainer):
[ULMfit paper](https://arxiv.org/abs/1801.06146).
For details on how these routines work, please see the Examples
-section in this documentation and the Examples folder in the repo.
+section in this documentation and the Examples folder in the repo. <br/>
+Param Alias: `warmup`
+with_lds: bool, default=False
+Boolean indicating if Label Distribution Smoothing will be used. <br/>
+:information_source: **NOTE**: We consider this feature absolutely
+experimental and we recommend the user to not use it unless the
+corresponding [publication](https://arxiv.org/abs/2102.09554) is
+well understood
Other Parameters
----------------
...
...
@@ -355,12 +360,24 @@ class Trainer(BaseTrainer):
for details.
- **Label Distribution Smoothing related parameters**:<br/>
see the source code at `pytorch_widedeep._wd_dataset` for some details
>:information_source: **NOTE**: We consider this feature absolutely
experimental and we recommend the user to not use it unless the
corresponding [publication](https://arxiv.org/abs/2102.09554) is
well understood
- lds_kernel (`Literal['gaussian', 'triang', 'laplace']`):
choice of kernel for Label Distribution Smoothing
- lds_ks (`int`):
LDS kernel window size
- lds_sigma (`float`):
standard deviation of ['gaussian','laplace'] kernel for LDS
- lds_granularity (`int`):
number of bins in histogram used in LDS to count occurrence of sample values
- lds_reweight (`bool`):
option to reweight bin frequency counts in LDS
- lds_y_max (`Optional[float]`):
option to restrict LDS bins by upper label limit
- lds_y_min (`Optional[float]`):
option to restrict LDS bins by lower label limit
See `pytorch_widedeep.trainer._wd_dataset` for more details on
the implications of these parameters
- **Finetune related parameters**:<br/>
see the source code at `pytorch_widedeep._finetune`. Namely, these are:
...
...
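To make the roles of `lds_kernel`, `lds_ks` and `lds_sigma` concrete, here is a rough, library-free sketch of the idea behind Label Distribution Smoothing: a histogram of label counts is convolved with a symmetric kernel, and the smoothed density is inverted to obtain per-bin weights. This is not the code in `pytorch_widedeep._wd_dataset`. The function names are hypothetical, only the `'gaussian'` kernel is shown, the window is normalised to sum to 1, and `ks` is assumed to be odd.

```python
import math
from typing import List


def gaussian_window(ks: int, sigma: float) -> List[float]:
    """Symmetric Gaussian window of size ks (assumed odd), normalised to sum to 1."""
    half = (ks - 1) // 2
    raw = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-half, half + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def smooth_counts(counts: List[float], window: List[float]) -> List[float]:
    """Convolve bin counts with the window, treating out-of-range bins as 0."""
    half = len(window) // 2
    smoothed = []
    for i in range(len(counts)):
        acc = 0.0
        for j, w in enumerate(window):
            k = i + j - half
            if 0 <= k < len(counts):
                acc += w * counts[k]
        smoothed.append(acc)
    return smoothed


def inverse_density_weights(smoothed: List[float], eps: float = 1e-8) -> List[float]:
    """Bins with a low smoothed density receive a larger weight."""
    return [1.0 / max(s, eps) for s in smoothed]


counts = [0.0, 0.0, 10.0, 0.0, 0.0]        # histogram with 5 bins
window = gaussian_window(ks=5, sigma=2.0)   # analogous to lds_ks and lds_sigma
smoothed = smooth_counts(counts, window)
weights = inverse_density_weights(smoothed)
```

Here the empty bins next to the populated one receive non-zero smoothed density, so their weights stay finite and smaller than they would be without smoothing.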
pytorch_widedeep/utils/fastai_transforms.py
...
...
@@ -227,6 +227,9 @@ class Tokenizer:
r"""Class to combine a series of rules and a tokenizer function to tokenize
text with multiprocessing.
+Setting some of the parameters of this class may require some
+familiarity with the source code.
Parameters
----------
tok_func: Callable, default = ``SpacyTokenizer``
...
...
@@ -234,11 +237,13 @@ class Tokenizer:
lang: str, default = "en"
Text's Language
pre_rules: ListRules, Optional, default = None
-Custom type: ``Collection[Callable[[str], str]]``.
-see `pytorch_widedeep.wdtypes`. Preprocessing Rules
+Custom type: ``Collection[Callable[[str], str]]``. These are
+`Callable` objects that will be applied to the text (str) directly as
+`rule(tok)` before being tokenized.
post_rules: ListRules, Optional, default = None
-Custom type: ``Collection[Callable[[str], str]]``.
-see `pytorch_widedeep.wdtypes`. Postprocessing Rules
+Custom type: ``Collection[Callable[[str], str]]``. These are
+`Callable` objects that will be applied to the tokens as
+`rule(tokens)` after the text has been tokenized.
special_cases: Collection, Optional, default= None
special cases to be added to the tokenizer via ``Spacy``'s
``add_special_case`` method
...
...
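The updated `pre_rules` and `post_rules` descriptions amount to a three-stage pipeline: rules applied to the raw string, then tokenization, then rules applied to the token list. A minimal sketch of that flow is given here. It is not the library's `Tokenizer`: the rule functions, the `"<bos>"` marker and the use of `str.split` as the tokenizer are all assumptions made for illustration.

```python
from typing import Callable, Collection, List


def lowercase(text: str) -> str:
    """Hypothetical pre-rule: lower-case the raw text."""
    return text.lower()


def collapse_spaces(text: str) -> str:
    """Hypothetical pre-rule: squeeze runs of whitespace into single spaces."""
    return " ".join(text.split())


def add_bos(tokens: List[str]) -> List[str]:
    """Hypothetical post-rule: prepend a beginning-of-stream marker."""
    return ["<bos>"] + tokens


def process_text(
    text: str,
    pre_rules: Collection[Callable[[str], str]],
    tok_func: Callable[[str], List[str]],
    post_rules: Collection[Callable[[List[str]], List[str]]],
) -> List[str]:
    # 1. Pre-rules act on the raw string.
    for rule in pre_rules:
        text = rule(text)
    # 2. The tokenizer turns the string into tokens.
    tokens = tok_func(text)
    # 3. Post-rules act on the token list.
    for rule in post_rules:
        tokens = rule(tokens)
    return tokens


tokens = process_text(
    "  Hello   World ",
    pre_rules=[lowercase, collapse_spaces],
    tok_func=str.split,
    post_rules=[add_bos],
)
```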
@@ -272,7 +277,7 @@ class Tokenizer:
    return res
def process_text(self, t: str, tok: BaseTokenizer) -> List[str]:
-    """Process and tokenize one text ``t`` with tokenizer ``tok``.
+    r"""Process and tokenize one text ``t`` with tokenizer ``tok``.
Parameters
----------
...
...