Unverified commit e1e6a2ac, authored by Javier, committed by GitHub

Merge pull request #56 from jrzaurin/pmulinka/uncertainty

Embedding, MC and draft requests
...@@ -11,6 +11,7 @@ on: ...@@ -11,6 +11,7 @@ on:
jobs: jobs:
codestyle: codestyle:
runs-on: ubuntu-latest runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v2
- name: Set up Python 3.9 - name: Set up Python 3.9
...@@ -32,6 +33,7 @@ jobs: ...@@ -32,6 +33,7 @@ jobs:
test: test:
runs-on: ubuntu-latest runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
strategy: strategy:
fail-fast: true fail-fast: true
matrix: matrix:
...@@ -59,6 +61,7 @@ jobs: ...@@ -59,6 +61,7 @@ jobs:
finish: finish:
needs: test needs: test
runs-on: ubuntu-latest runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v2
- name: Set up Python 3.9 - name: Set up Python 3.9
......
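The `if:` guard added to each job can be read as a simple predicate: run on every push, and on pull request events only when the pull request is not a draft. A minimal Python sketch of that logic follows. The function name `should_run_job` and its arguments are hypothetical stand-ins for `github.event_name` and `github.event.pull_request.draft`; they are not part of the workflow.

```python
from typing import Optional


def should_run_job(event_name: str, is_draft: Optional[bool]) -> bool:
    """Mirror of: github.event_name == 'push' || !github.event.pull_request.draft"""
    # A push always runs. Otherwise the job runs unless the PR is marked as draft.
    # For events without a pull request payload, is_draft is None, which is falsy.
    return event_name == "push" or not is_draft


if __name__ == "__main__":
    cases = [("push", None), ("pull_request", True), ("pull_request", False)]
    for event, draft in cases:
        print(event, draft, should_run_job(event, draft))
```

Under this reading, draft pull requests skip the `codestyle`, `test` and `finish` jobs until they are marked ready for review.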
...@@ -11,7 +11,7 @@ __pycache__* ...@@ -11,7 +11,7 @@ __pycache__*
Untitled*.ipynb Untitled*.ipynb
# data related dirs # data related dirs
data/ tmp_data/
model_weights/ model_weights/
tmp_dir/ tmp_dir/
weights/ weights/
......
Pytorch-widedeep is being developed and used by many active community members. Your help is very valuable to make it better for everyone.
- **[TBA]** Check the [Roadmap](https://github.com/jrzaurin/pytorch-widedeep/projects/1) or [open an issue](https://github.com/microsoft/jrzaurin/pytorch-widedeep/issues) to report problems or recommend new features, and submit a draft pull request, which will be converted to a regular pull request after initial review.
- Contribute to the [tests](https://github.com/jrzaurin/pytorch-widedeep/tree/master/tests) to make it more reliable.
- Contribute to the [documentation](https://github.com/jrzaurin/pytorch-widedeep/tree/master/docs) to make it clearer for everyone.
- Contribute to the [examples](https://github.com/jrzaurin/pytorch-widedeep/tree/master/examples) to share your experience with other users.
- Join the discussion on [slack](https://join.slack.com/t/pytorch-widedeep/shared_invite/zt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)
\ No newline at end of file
...@@ -24,8 +24,6 @@ using wide and deep models. ...@@ -24,8 +24,6 @@ using wide and deep models.
**Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark) **Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)
**Slack**: if you want to contribute or just want to chat with us, join [slack](https://join.slack.com/t/pytorch-widedeep/shared_invite/zt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)
The content of this document is organized as follows: The content of this document is organized as follows:
1. [introduction](#introduction) 1. [introduction](#introduction)
...@@ -307,6 +305,10 @@ of the package and its functionalities. ...@@ -307,6 +305,10 @@ of the package and its functionalities.
pytest tests pytest tests
``` ```
### How to Contribute
See the [CONTRIBUTING](https://github.com/jrzaurin/pytorch-widedeep/CONTRIBUTING.MD) page.
### Acknowledgments ### Acknowledgments
This library takes from a series of other libraries, so I think it is just This library takes from a series of other libraries, so I think it is just
......
1.0.11 1.0.12
\ No newline at end of file \ No newline at end of file
...@@ -2,27 +2,15 @@ ...@@ -2,27 +2,15 @@
"cells": [ "cells": [
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "731975e2",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Hyperparameter tuning and using Raytune and visualization using Tensorboard" "# Hyperparameter tuning with Raytune and visualization using Tensorboard and Weights & Biases"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* In this notebook we will use the higly imbalanced Protein Homology Dataset from [KDD cup 2004](https://www.kdd.org/kdd-cup/view/kdd-cup-2004/Data)\n",
"\n",
"```\n",
"* The first element of each line is a BLOCK ID that denotes to which native sequence this example belongs. There is a unique BLOCK ID for each native sequence. BLOCK IDs are integers running from 1 to 303 (one for each native sequence, i.e. for each query). BLOCK IDs were assigned before the blocks were split into the train and test sets, so they do not run consecutively in either file.\n",
"* The second element of each line is an EXAMPLE ID that uniquely describes the example. You will need this EXAMPLE ID and the BLOCK ID when you submit results.\n",
"* The third element is the class of the example. Proteins that are homologous to the native sequence are denoted by 1, non-homologous proteins (i.e. decoys) by 0. Test examples have a \"?\" in this position.\n",
"* All following elements are feature values. There are 74 feature values in each line. The features describe the match (e.g. the score of a sequence alignment) between the native protein sequence and the sequence that is tested for homology.\n",
"```"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "ee745c58",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Initial imports" "## Initial imports"
...@@ -30,19 +18,12 @@ ...@@ -30,19 +18,12 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1, "execution_count": 61,
"id": "fdab94eb",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/javier/.pyenv/versions/3.7.7/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject\n",
" return f(*args, **kwds)\n"
]
}
],
"source": [ "source": [
"import os\n",
"import numpy as np\n", "import numpy as np\n",
"import pandas as pd\n", "import pandas as pd\n",
"import torch\n", "import torch\n",
...@@ -51,7 +32,6 @@ ...@@ -51,7 +32,6 @@
"from pytorch_widedeep import Trainer\n", "from pytorch_widedeep import Trainer\n",
"from pytorch_widedeep.preprocessing import TabPreprocessor\n", "from pytorch_widedeep.preprocessing import TabPreprocessor\n",
"from pytorch_widedeep.models import TabMlp, WideDeep\n", "from pytorch_widedeep.models import TabMlp, WideDeep\n",
"from pytorch_widedeep.dataloaders import DataLoaderImbalanced, DataLoaderDefault\n",
"from torchmetrics import F1 as F1_torchmetrics\n", "from torchmetrics import F1 as F1_torchmetrics\n",
"from torchmetrics import Accuracy as Accuracy_torchmetrics\n", "from torchmetrics import Accuracy as Accuracy_torchmetrics\n",
"from torchmetrics import Precision as Precision_torchmetrics\n", "from torchmetrics import Precision as Precision_torchmetrics\n",
...@@ -61,22 +41,21 @@ ...@@ -61,22 +41,21 @@
"from pytorch_widedeep.callbacks import (\n", "from pytorch_widedeep.callbacks import (\n",
" EarlyStopping,\n", " EarlyStopping,\n",
" ModelCheckpoint,\n", " ModelCheckpoint,\n",
" LRHistory,\n",
" RayTuneReporter,\n", " RayTuneReporter,\n",
")\n", ")\n",
"from pytorch_widedeep.datasets import load_bio_kdd04\n",
"\n", "\n",
"from sklearn.model_selection import train_test_split\n", "from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import classification_report\n",
"\n",
"import time\n",
"import datetime\n",
"\n",
"import warnings\n", "import warnings\n",
"\n", "\n",
"warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n", "warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n",
"\n", "\n",
"from ray import tune\n", "from ray import tune\n",
"from ray.tune.schedulers import AsyncHyperBandScheduler\n",
"from ray.tune import JupyterNotebookReporter\n", "from ray.tune import JupyterNotebookReporter\n",
"from ray.tune.integration.wandb import WandbLoggerCallback, wandb_mixin\n",
"import wandb\n",
"\n",
"import tracemalloc\n", "import tracemalloc\n",
"\n", "\n",
"tracemalloc.start()\n", "tracemalloc.start()\n",
...@@ -88,7 +67,8 @@ ...@@ -88,7 +67,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2, "execution_count": 64,
"id": "07c75f0c",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -647,20 +627,20 @@ ...@@ -647,20 +627,20 @@
"4 0.68 -0.59 2.0 -36.0 -6.9 2.02 0.14 -0.23 " "4 0.68 -0.59 2.0 -36.0 -6.9 2.02 0.14 -0.23 "
] ]
}, },
"execution_count": 2, "execution_count": 64,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
], ],
"source": [ "source": [
"header_list = ['EXAMPLE_ID', 'BLOCK_ID', 'target'] + [str(i) for i in range(4,78)]\n", "df = load_bio_kdd04(as_frame=True)\n",
"df = pd.read_csv('data/kddcup04/bio_train.dat', sep='\\t', names=header_list)\n",
"df.head()" "df.head()"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3, "execution_count": 65,
"id": "1e3f8efc",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -671,7 +651,7 @@ ...@@ -671,7 +651,7 @@
"Name: target, dtype: int64" "Name: target, dtype: int64"
] ]
}, },
"execution_count": 3, "execution_count": 65,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
...@@ -683,7 +663,8 @@ ...@@ -683,7 +663,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4, "execution_count": 35,
"id": "214b3071",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -693,7 +674,8 @@ ...@@ -693,7 +674,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 5, "execution_count": 36,
"id": "168c81f1",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -703,6 +685,7 @@ ...@@ -703,6 +685,7 @@
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "87e7b8f0",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Preparing the data" "## Preparing the data"
...@@ -710,7 +693,8 @@ ...@@ -710,7 +693,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6, "execution_count": 37,
"id": "3a7b246b",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -719,7 +703,8 @@ ...@@ -719,7 +703,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 7, "execution_count": 38,
"id": "7a2dac24",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -737,6 +722,7 @@ ...@@ -737,6 +722,7 @@
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "7b9f63e2",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Define the model" "## Define the model"
...@@ -744,7 +730,8 @@ ...@@ -744,7 +730,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 8, "execution_count": 39,
"id": "81bfda03",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -757,7 +744,8 @@ ...@@ -757,7 +744,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9, "execution_count": 40,
"id": "511198d4",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -804,7 +792,7 @@ ...@@ -804,7 +792,7 @@
")" ")"
] ]
}, },
"execution_count": 9, "execution_count": 40,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
...@@ -821,7 +809,8 @@ ...@@ -821,7 +809,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10, "execution_count": 41,
"id": "2d76f463",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -834,7 +823,8 @@ ...@@ -834,7 +823,8 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11, "execution_count": 42,
"id": "a5359b0f",
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -847,42 +837,19 @@ ...@@ -847,42 +837,19 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 33, "execution_count": 60,
"id": "34a18ac0",
"metadata": { "metadata": {
"scrolled": false "scrolled": false
}, },
"outputs": [ "outputs": [],
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 10.5/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/4.32 GiB heap, 0.0/2.16 GiB objects<br>Result logdir: /Users/javier/ray_results/training_function_2021-10-18_15-59-06<br>Number of trials: 2/2 (2 TERMINATED)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> batch_size</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>training_function_8f035_00000</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\"> 1000</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 18.2589</td></tr>\n",
"<tr><td>training_function_8f035_00001</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\"> 5000</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 18.0369</td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-10-18 15:59:28,373\tINFO tune.py:617 -- Total run time: 22.07 seconds (21.91 seconds for the tuning loop).\n"
]
}
],
"source": [ "source": [
"config = {\n", "config = {\n",
" \"batch_size\": tune.grid_search([1000, 5000]),\n", " \"batch_size\": tune.grid_search([1000, 5000]),\n",
" \"wandb\": {\n",
" \"project\": \"test\",\n",
" \"api_key_file\": os.getcwd() + \"/wandb_api.key\",\n",
" },\n",
"}\n", "}\n",
"\n", "\n",
"# Optimizers\n", "# Optimizers\n",
...@@ -890,16 +857,17 @@ ...@@ -890,16 +857,17 @@
"# LR Scheduler\n", "# LR Scheduler\n",
"deep_sch = lr_scheduler.StepLR(deep_opt, step_size=3)\n", "deep_sch = lr_scheduler.StepLR(deep_opt, step_size=3)\n",
"\n", "\n",
"early_stopping = EarlyStopping()\n", "@wandb_mixin\n",
"\n",
"\n",
"def training_function(config, X_train, X_val):\n", "def training_function(config, X_train, X_val):\n",
" early_stopping = EarlyStopping()\n",
" model_checkpoint = ModelCheckpoint(save_best_only=True, \n",
" wb=wandb)\n",
" # Hyperparameters\n", " # Hyperparameters\n",
" batch_size = config[\"batch_size\"]\n", " batch_size = config[\"batch_size\"]\n",
" trainer = Trainer(\n", " trainer = Trainer(\n",
" model,\n", " model,\n",
" objective=\"binary_focal_loss\",\n", " objective=\"binary_focal_loss\",\n",
" callbacks=[RayTuneReporter, LRHistory(n_epochs=10), early_stopping],\n", " callbacks=[RayTuneReporter, early_stopping, model_checkpoint],\n",
" lr_schedulers={\"deeptabular\": deep_sch},\n", " lr_schedulers={\"deeptabular\": deep_sch},\n",
" initializers={\"deeptabular\": XavierNormal},\n", " initializers={\"deeptabular\": XavierNormal},\n",
" optimizers={\"deeptabular\": deep_opt},\n", " optimizers={\"deeptabular\": deep_opt},\n",
...@@ -912,121 +880,67 @@ ...@@ -912,121 +880,67 @@
"\n", "\n",
"X_train = {\"X_tab\": X_tab_train, \"target\": y_train}\n", "X_train = {\"X_tab\": X_tab_train, \"target\": y_train}\n",
"X_val = {\"X_tab\": X_tab_valid, \"target\": y_valid}\n", "X_val = {\"X_tab\": X_tab_valid, \"target\": y_valid}\n",
"\n",
"asha_scheduler = AsyncHyperBandScheduler(\n",
" time_attr=\"training_iteration\",\n",
" metric=\"_metric/val_loss\",\n",
" mode=\"min\",\n",
" max_t=100,\n",
" grace_period=10,\n",
" reduction_factor=3,\n",
" brackets=1,\n",
")\n",
"\n",
"analysis = tune.run(\n", "analysis = tune.run(\n",
" tune.with_parameters(training_function, X_train=X_train, X_val=X_val),\n", " tune.with_parameters(training_function, X_train=X_train, X_val=X_val),\n",
" resources_per_trial={\"cpu\": 1, \"gpu\": 0},\n", " resources_per_trial={\"cpu\": 1, \"gpu\": 0},\n",
" progress_reporter=JupyterNotebookReporter(overwrite=True),\n", " progress_reporter=JupyterNotebookReporter(overwrite=True),\n",
" scheduler=asha_scheduler,\n",
" config=config,\n", " config=config,\n",
" callbacks=[WandbLoggerCallback(\n",
" project=config[\"wandb\"][\"project\"],\n",
" api_key_file=config[\"wandb\"][\"api_key_file\"],\n",
" log_config=True)],\n",
")" ")"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 34, "execution_count": 56,
"id": "fac74d5f",
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"data": {
"text/plain": [
"{'8f035_00000': {'_metric': {'train_loss': 0.007156053689332345,\n",
" 'train_Accuracy_0': 1.0,\n",
" 'train_Accuracy_1': 0.008678881451487541,\n",
" 'train_Precision': 0.9911835193634033,\n",
" 'train_Recall_0': 1.0,\n",
" 'train_Recall_1': 0.008678881451487541,\n",
" 'train_F1_0': 0.9955719113349915,\n",
" 'train_F1_1': 0.017208412289619446,\n",
" 'val_loss': 0.006261252580831448,\n",
" 'val_Accuracy_0': 1.0,\n",
" 'val_Accuracy_1': 0.023255813866853714,\n",
" 'val_Precision': 0.9913550615310669,\n",
" 'val_Recall_0': 1.0,\n",
" 'val_Recall_1': 0.023255813866853714,\n",
" 'val_F1_0': 0.9956578612327576,\n",
" 'val_F1_1': 0.045454543083906174,\n",
" 'lr_deeptabular_0': 0.010000000000000002},\n",
" 'time_this_iter_s': 3.5364139080047607,\n",
" 'done': True,\n",
" 'timesteps_total': None,\n",
" 'episodes_total': None,\n",
" 'training_iteration': 5,\n",
" 'experiment_id': 'f62bb9c9c32a45b9af85a6a3b0e30d94',\n",
" 'date': '2021-10-18_15-59-28',\n",
" 'timestamp': 1634565568,\n",
" 'time_total_s': 18.25888180732727,\n",
" 'pid': 15457,\n",
" 'hostname': 'infinito.bbrouter',\n",
" 'node_ip': '192.168.18.34',\n",
" 'config': {'batch_size': 1000},\n",
" 'time_since_restore': 18.25888180732727,\n",
" 'timesteps_since_restore': 0,\n",
" 'iterations_since_restore': 5,\n",
" 'trial_id': '8f035_00000',\n",
" 'experiment_tag': '0_batch_size=1000'},\n",
" '8f035_00001': {'_metric': {'train_loss': 0.019367387828727562,\n",
" 'train_Accuracy_0': 0.9999827146530151,\n",
" 'train_Accuracy_1': 0.01157184224575758,\n",
" 'train_Precision': 0.991192102432251,\n",
" 'train_Recall_0': 0.9999827146530151,\n",
" 'train_Recall_1': 0.01157184224575758,\n",
" 'train_F1_0': 0.9955761432647705,\n",
" 'train_F1_1': 0.022835396230220795,\n",
" 'val_loss': 0.01834123209118843,\n",
" 'val_Accuracy_0': 1.0,\n",
" 'val_Accuracy_1': 0.0,\n",
" 'val_Precision': 0.9911492466926575,\n",
" 'val_Recall_0': 1.0,\n",
" 'val_Recall_1': 0.0,\n",
" 'val_F1_0': 0.9955549836158752,\n",
" 'val_F1_1': 0.0,\n",
" 'lr_deeptabular_0': 0.010000000000000002},\n",
" 'time_this_iter_s': 3.42478084564209,\n",
" 'done': True,\n",
" 'timesteps_total': None,\n",
" 'episodes_total': None,\n",
" 'training_iteration': 5,\n",
" 'experiment_id': 'd7379f1debc14e41bc971bf4c27b6793',\n",
" 'date': '2021-10-18_15-59-27',\n",
" 'timestamp': 1634565567,\n",
" 'time_total_s': 18.036858797073364,\n",
" 'pid': 15456,\n",
" 'hostname': 'infinito.bbrouter',\n",
" 'node_ip': '192.168.18.34',\n",
" 'config': {'batch_size': 5000},\n",
" 'time_since_restore': 18.036858797073364,\n",
" 'timesteps_since_restore': 0,\n",
" 'iterations_since_restore': 5,\n",
" 'trial_id': '8f035_00001',\n",
" 'experiment_tag': '1_batch_size=5000'}}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [ "source": [
"analysis.results" "analysis.results"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "markdown",
"execution_count": 16, "id": "81450d98",
"metadata": {},
"source": [
"Using Weights and Biases logging you can create [parallel coordinates graphs](https://docs.wandb.ai/ref/app/features/panels/parallel-coordinates) that map parameter combinations to the best (lowest) loss achieved during the training of the networks\n",
"\n",
"![WNB](wnb.png \"parallel coordinates\")"
]
},
{
"cell_type": "markdown",
"id": "56fc4823",
"metadata": {}, "metadata": {},
"outputs": [],
"source": [ "source": [
"# %load_ext tensorboard" "local visualization of raytune reults using tensorboard"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 17, "execution_count": 59,
"metadata": { "id": "e1719cc0",
"scrolled": true "metadata": {},
},
"outputs": [], "outputs": [],
"source": [ "source": [
"# %tensorboard --logdir ~/ray_results" "%load_ext tensorboard\n",
"%tensorboard --logdir ~/ray_results"
] ]
} }
], ],
...@@ -1035,8 +949,7 @@ ...@@ -1035,8 +949,7 @@
"hash": "3b99005fd577fa40f3cce433b2b92303885900e634b2b5344c07c59d06c8792d" "hash": "3b99005fd577fa40f3cce433b2b92303885900e634b2b5344c07c59d06c8792d"
}, },
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3.8.5 64-bit ('base': conda)",
"language": "python",
"name": "python3" "name": "python3"
}, },
"language_info": { "language_info": {
...@@ -1049,7 +962,7 @@ ...@@ -1049,7 +962,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.7" "version": "3.8.5"
}, },
"toc": { "toc": {
"base_numbering": 1, "base_numbering": 1,
......
This diff is collapsed.
...@@ -338,12 +338,13 @@ class ModelCheckpoint(Callback): ...@@ -338,12 +338,13 @@ class ModelCheckpoint(Callback):
Parameters Parameters
---------- ----------
filepath: str filepath: str, default=None
Full path to save the output weights. It must contain only the root of Full path to save the output weights. It must contain only the root of
the filenames. Epoch number and ``.pt`` extension (for pytorch) will the filenames. Epoch number and ``.pt`` extension (for pytorch) will
be added. e.g. ``filepath="path/to/output_weights/weights_out"`` And be added. e.g. ``filepath="path/to/output_weights/weights_out"`` And
the saved files in that directory will be named: ``weights_out_1.pt, the saved files in that directory will be named: ``weights_out_1.pt,
weights_out_2.pt, ...`` weights_out_2.pt, ...``
If set to None, the class just reports the best metric and best_epoch.
monitor: str, default="loss" monitor: str, default="loss"
quantity to monitor. Typically 'val_loss' or metric name (e.g. 'val_acc') quantity to monitor. Typically 'val_loss' or metric name (e.g. 'val_acc')
verbose:int, default=0 verbose:int, default=0
...@@ -362,6 +363,15 @@ class ModelCheckpoint(Callback): ...@@ -362,6 +363,15 @@ class ModelCheckpoint(Callback):
Interval (number of epochs) between checkpoints. Interval (number of epochs) between checkpoints.
max_save: int, default=-1 max_save: int, default=-1
Maximum number of outputs to save. If -1 will save all outputs Maximum number of outputs to save. If -1 will save all outputs
wb: obj, default=None
Weights & Biases API interface used to report the single best result, which is useful for comparing
multiple parameter combinations, e.g. with parallel coordinates:
https://docs.wandb.ai/ref/app/features/panels/parallel-coordinates.
E.g. the W&B summary report `wandb.run.summary["best"]`.
If an external early-stopping scheduler is used (e.g. from RayTune) in combination with W&B,
the scheduler stops the training function, and the summary log is not sent if it is only set
after training, e.g. with:
`wandb.run.summary["best"]=model_checkpoint.best`.
Attributes Attributes
---------- ----------
...@@ -369,6 +379,8 @@ class ModelCheckpoint(Callback): ...@@ -369,6 +379,8 @@ class ModelCheckpoint(Callback):
best metric best metric
best_epoch: int best_epoch: int
best epoch best epoch
best_state_dict: dict
best model state dictionary to restore model to its best state using trainer.model.load_state_dict(ModelCheckpoint.best_state_dict)
Examples Examples
-------- --------
...@@ -386,13 +398,14 @@ class ModelCheckpoint(Callback): ...@@ -386,13 +398,14 @@ class ModelCheckpoint(Callback):
def __init__( def __init__(
self, self,
filepath: str, filepath: Optional[str] = None,
monitor: str = "val_loss", monitor: str = "val_loss",
verbose: int = 0, verbose: int = 0,
save_best_only: bool = False, save_best_only: bool = False,
mode: str = "auto", mode: str = "auto",
period: int = 1, period: int = 1,
max_save: int = -1, max_save: int = -1,
wb: Optional[object] = None,
): ):
super(ModelCheckpoint, self).__init__() super(ModelCheckpoint, self).__init__()
...@@ -403,18 +416,20 @@ class ModelCheckpoint(Callback): ...@@ -403,18 +416,20 @@ class ModelCheckpoint(Callback):
self.mode = mode self.mode = mode
self.period = period self.period = period
self.max_save = max_save self.max_save = max_save
self.wb = wb
self.epochs_since_last_save = 0 self.epochs_since_last_save = 0
if len(self.filepath.split("/")[:-1]) == 0: if self.filepath:
raise ValueError( if len(self.filepath.split("/")[:-1]) == 0:
"'filepath' must be the full path to save the output weights," raise ValueError(
" including the root of the filenames. e.g. 'checkpoints/weights_out'" "'filepath' must be the full path to save the output weights,"
) " including the root of the filenames. e.g. 'checkpoints/weights_out'"
)
root_dir = ("/").join(self.filepath.split("/")[:-1]) root_dir = ("/").join(self.filepath.split("/")[:-1])
if not os.path.exists(root_dir): if not os.path.exists(root_dir):
os.makedirs(root_dir) os.makedirs(root_dir)
if self.max_save > 0: if self.max_save > 0:
self.old_files: List[str] = [] self.old_files: List[str] = []
...@@ -447,7 +462,8 @@ class ModelCheckpoint(Callback): ...@@ -447,7 +462,8 @@ class ModelCheckpoint(Callback):
self.epochs_since_last_save += 1 self.epochs_since_last_save += 1
if self.epochs_since_last_save >= self.period: if self.epochs_since_last_save >= self.period:
self.epochs_since_last_save = 0 self.epochs_since_last_save = 0
filepath = "{}_{}.p".format(self.filepath, epoch + 1) if self.filepath:
filepath = "{}_{}.p".format(self.filepath, epoch + 1)
if self.save_best_only: if self.save_best_only:
current = logs.get(self.monitor) current = logs.get(self.monitor)
if current is None: if current is None:
...@@ -459,35 +475,50 @@ class ModelCheckpoint(Callback): ...@@ -459,35 +475,50 @@ class ModelCheckpoint(Callback):
else: else:
if self.monitor_op(current, self.best): if self.monitor_op(current, self.best):
if self.verbose > 0: if self.verbose > 0:
print( if self.filepath:
"\nEpoch %05d: %s improved from %0.5f to %0.5f," print(
" saving model to %s" "\nEpoch %05d: %s improved from %0.5f to %0.5f,"
% ( " saving model to %s"
epoch + 1, % (
self.monitor, epoch + 1,
self.best, self.monitor,
current, self.best,
filepath, current,
filepath,
)
) )
) else:
print(
"\nEpoch %05d: %s improved from %0.5f to %0.5f"
% (
epoch + 1,
self.monitor,
self.best,
current,
)
)
if self.wb is not None:
self.wb.run.summary["best"] = current # type: ignore[attr-defined]
self.best = current self.best = current
self.best_epoch = epoch self.best_epoch = epoch
torch.save(self.model.state_dict(), filepath) self.best_state_dict = self.model.state_dict()
if self.max_save > 0: if self.filepath:
if len(self.old_files) == self.max_save: torch.save(self.best_state_dict, filepath)
try: if self.max_save > 0:
os.remove(self.old_files[0]) if len(self.old_files) == self.max_save:
except FileNotFoundError: try:
pass os.remove(self.old_files[0])
self.old_files = self.old_files[1:] except FileNotFoundError:
self.old_files.append(filepath) pass
self.old_files = self.old_files[1:]
self.old_files.append(filepath)
else: else:
if self.verbose > 0: if self.verbose > 0:
print( print(
"\nEpoch %05d: %s did not improve from %0.5f" "\nEpoch %05d: %s did not improve from %0.5f"
% (epoch + 1, self.monitor, self.best) % (epoch + 1, self.monitor, self.best)
) )
else: if not self.save_best_only and self.filepath:
if self.verbose > 0: if self.verbose > 0:
print("\nEpoch %05d: saving model to %s" % (epoch + 1, filepath)) print("\nEpoch %05d: saving model to %s" % (epoch + 1, filepath))
torch.save(self.model.state_dict(), filepath) torch.save(self.model.state_dict(), filepath)
......
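The change above lets `ModelCheckpoint` run without a `filepath`, track only the best metric, and push that value to a W&B-style summary while training is still running. The following is a minimal, self-contained sketch of that idea, not the library code: `BestMetricTracker` is a hypothetical, simplified class that assumes "lower is better", and the `wb` object is a stub built with `SimpleNamespace` in place of the real `wandb` module.

```python
from types import SimpleNamespace
from typing import Dict, Optional


class BestMetricTracker:
    """Hypothetical, simplified stand-in for the best-only branch of the callback."""

    def __init__(self, monitor: str = "val_loss", wb: Optional[object] = None):
        self.monitor = monitor
        self.wb = wb
        self.best = float("inf")  # assumption: the monitored metric is minimized
        self.best_epoch = -1

    def on_epoch_end(self, epoch: int, logs: Dict[str, float]) -> None:
        current = logs.get(self.monitor)
        if current is None:
            return
        if current < self.best:
            # Report immediately, so the value survives an external early stop.
            if self.wb is not None:
                self.wb.run.summary["best"] = current
            self.best = current
            self.best_epoch = epoch


if __name__ == "__main__":
    # Stub exposing only the `run.summary` mapping that the sketch writes to.
    fake_wb = SimpleNamespace(run=SimpleNamespace(summary={}))
    tracker = BestMetricTracker(wb=fake_wb)
    for epoch, loss in enumerate([0.9, 0.5, 0.7]):
        tracker.on_epoch_end(epoch, {"val_loss": loss})
    print(tracker.best, tracker.best_epoch, fake_wb.run.summary)
```

Writing the summary inside the epoch hook, rather than after training, matches the motivation given in the docstring: if a scheduler terminates the training function early, code placed after the fit call may never run.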
from ._base import load_adult, load_bio_kdd04
__all__ = ["load_bio_kdd04", "load_adult"]
from importlib import resources
import pandas as pd
def load_bio_kdd04(as_frame: bool = False):
"""Load and return the highly imbalanced Protein Homology
Dataset from [KDD cup 2004](https://www.kdd.org/kdd-cup/view/kdd-cup-2004/Data).
This dataset includes only the bio_train.dat part of the data.
* The first element of each line is a BLOCK ID that denotes to which native sequence
this example belongs. There is a unique BLOCK ID for each native sequence.
BLOCK IDs are integers running from 1 to 303 (one for each native sequence,
i.e. for each query). BLOCK IDs were assigned before the blocks were split
into the train and test sets, so they do not run consecutively in either file.
* The second element of each line is an EXAMPLE ID that uniquely describes
the example. You will need this EXAMPLE ID and the BLOCK ID when you submit results.
* The third element is the class of the example. Proteins that are homologous to
the native sequence are denoted by 1, non-homologous proteins (i.e. decoys) by 0.
Test examples have a "?" in this position.
* All following elements are feature values. There are 74 feature values in each line.
The features describe the match (e.g. the score of a sequence alignment) between
the native protein sequence and the sequence that is tested for homology.
"""
header_list = ["EXAMPLE_ID", "BLOCK_ID", "target"] + [str(i) for i in range(4, 78)]
with resources.path("pytorch_widedeep.datasets.data", "bio_train.dat") as fpath:
df = pd.read_csv(fpath, sep="\t", names=header_list)
if as_frame:
return df
else:
return df.to_numpy()
def load_adult(as_frame: bool = False):
"""Load and return the [adult income dataset](http://www.cs.toronto.edu/~delve/data/adult/desc.html).
You can find a detailed description [here](http://www.cs.toronto.edu/~delve/data/adult/adultDetail.html).
"""
with resources.path("pytorch_widedeep.datasets.data", "adult.csv.zip") as fpath:
df = pd.read_csv(fpath)
if as_frame:
return df
else:
return df.to_numpy()
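The loader above names the columns of a header-less, tab-separated file: three leading fields followed by 74 feature columns labelled "4" to "77". A minimal sketch of that column layout, using only the standard library instead of pandas, is shown below. The helper `parse_rows` and the sample row are hypothetical; the row is synthetic and is not taken from the KDD data.

```python
import csv
import io
from typing import Dict, List

# Same naming scheme as in the loader: 3 leading columns + 74 features "4".."77".
HEADER: List[str] = ["EXAMPLE_ID", "BLOCK_ID", "target"] + [str(i) for i in range(4, 78)]


def parse_rows(text: str) -> List[Dict[str, str]]:
    """Parse header-less, tab-separated text into dicts keyed by HEADER."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(HEADER, row)) for row in reader]


if __name__ == "__main__":
    # Synthetic row: three leading fields followed by 74 identical feature values.
    fake_line = "\t".join(["1", "2", "0"] + ["0.5"] * 74)
    rows = parse_rows(fake_line)
    print(len(HEADER), rows[0]["target"], rows[0]["77"])
```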
This diff is collapsed.
...@@ -4,6 +4,7 @@ https://github.com/awslabs/autogluon/tree/master/tabular/src/autogluon/tabular/m ...@@ -4,6 +4,7 @@ https://github.com/awslabs/autogluon/tree/master/tabular/src/autogluon/tabular/m
""" """
import math import math
import warnings
import torch import torch
from torch import nn from torch import nn
...@@ -18,10 +19,13 @@ class FullEmbeddingDropout(nn.Module): ...@@ -18,10 +19,13 @@ class FullEmbeddingDropout(nn.Module):
self.dropout = dropout self.dropout = dropout
def forward(self, X: Tensor) -> Tensor: def forward(self, X: Tensor) -> Tensor:
mask = X.new().resize_((X.size(1), 1)).bernoulli_(1 - self.dropout).expand_as( if self.training:
X mask = X.new().resize_((X.size(1), 1)).bernoulli_(
) / (1 - self.dropout) 1 - self.dropout
return mask * X ).expand_as(X) / (1 - self.dropout)
return mask * X
else:
return X
DropoutLayers = Union[nn.Dropout, FullEmbeddingDropout] DropoutLayers = Union[nn.Dropout, FullEmbeddingDropout]
...@@ -125,13 +129,16 @@ class CategoricalEmbeddings(nn.Module): ...@@ -125,13 +129,16 @@ class CategoricalEmbeddings(nn.Module):
self.categorical_cols = [ei[0] for ei in embed_input] self.categorical_cols = [ei[0] for ei in embed_input]
self.cat_idx = [self.column_idx[col] for col in self.categorical_cols] self.cat_idx = [self.column_idx[col] for col in self.categorical_cols]
self.bias = ( if use_bias is not None:
nn.Parameter(torch.Tensor(len(self.categorical_cols), embed_dim)) self.bias = nn.Parameter(
if use_bias torch.Tensor(len(self.categorical_cols), embed_dim)
else None )
)
if self.bias is not None:
nn.init.kaiming_uniform_(self.bias, a=math.sqrt(5)) nn.init.kaiming_uniform_(self.bias, a=math.sqrt(5))
if shared_embed:
warnings.warn(
"The current implementation of 'SharedEmbeddings' does not use bias",
UserWarning,
)
# Categorical: val + 1 because 0 is reserved for padding/unseen categories. # Categorical: val + 1 because 0 is reserved for padding/unseen categories.
if self.shared_embed: if self.shared_embed:
...@@ -167,11 +174,10 @@ class CategoricalEmbeddings(nn.Module): ...@@ -167,11 +174,10 @@ class CategoricalEmbeddings(nn.Module):
x = torch.cat(cat_embed, 1) x = torch.cat(cat_embed, 1)
else: else:
x = self.embed(X[:, self.cat_idx].long()) x = self.embed(X[:, self.cat_idx].long())
if self.bias is not None:
x = x + self.bias.unsqueeze(0)
x = self.dropout(x) x = self.dropout(x)
if self.bias is not None:
x = x + self.bias.unsqueeze(0)
return x return x
......
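The change to `FullEmbeddingDropout.forward` above makes the layer a no-op outside training. A minimal, framework-free sketch of that behaviour follows. It assumes the input is laid out as (batch, columns, embedding dim) and uses plain Python lists instead of tensors; the function name `full_embedding_dropout` and the explicit `training` and `rng` arguments are illustrative and not part of the library.

```python
import random
from typing import List

# Assumed layout: (batch, n_columns, embed_dim), as nested lists.
Embeddings = List[List[List[float]]]


def full_embedding_dropout(
    x: Embeddings, p: float, training: bool, rng: random.Random
) -> Embeddings:
    """Drop whole column embeddings with probability p, for 0 <= p < 1."""
    if not training or p == 0.0:
        # Evaluation mode: the input passes through untouched.
        return x
    n_columns = len(x[0])
    # One Bernoulli draw per column, shared by every row in the batch.
    keep = [1.0 if rng.random() < (1 - p) else 0.0 for _ in range(n_columns)]
    # Rescale the surviving columns so the expected value is unchanged.
    scale = 1.0 / (1 - p)
    return [
        [[v * keep[c] * scale for v in row[c]] for c in range(n_columns)]
        for row in x
    ]


batch = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]] for _ in range(4)]
rng = random.Random(0)
train_out = full_embedding_dropout(batch, p=0.5, training=True, rng=rng)
eval_out = full_embedding_dropout(batch, p=0.5, training=False, rng=rng)
```

In training mode each column is either zeroed for the whole batch or scaled by `1 / (1 - p)`; in evaluation mode the output is the input itself.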
...@@ -12,10 +12,24 @@ from pytorch_widedeep.preprocessing.base_preprocessor import ( ...@@ -12,10 +12,24 @@ from pytorch_widedeep.preprocessing.base_preprocessor import (
) )
def embed_sz_rule(n_cat): def embed_sz_rule(n_cat: int, embedding_rule: str = "fastai_new") -> int:
r"""Rule of thumb to pick embedding size corresponding to ``n_cat``. Taken r"""Rule of thumb to pick embedding size corresponding to ``n_cat``. Default rule is taken
from fastai's Tabular API""" from recent fastai's Tabular API. The function also includes previously used rule by fastai
return min(600, round(1.6 * n_cat ** 0.56)) and rule included in the Google's Tensorflow documentation
Parameters
----------
n_cat: int
number of unique categorical values in a feature
embedding_rule: str, default = fastai_new
rule of thumb to be used for embedding vector size
"""
if embedding_rule == "google":
return int(round(n_cat ** 0.25))
elif embedding_rule == "fastai_old":
return int(min(50, (n_cat // 2) + 1))
else:
return int(min(600, round(1.6 * n_cat ** 0.56)))
class TabPreprocessor(BasePreprocessor): class TabPreprocessor(BasePreprocessor):
...@@ -38,8 +52,16 @@ class TabPreprocessor(BasePreprocessor): ...@@ -38,8 +52,16 @@ class TabPreprocessor(BasePreprocessor):
:obj:`pytorch_widedeep.models.transformers._embedding_layers` :obj:`pytorch_widedeep.models.transformers._embedding_layers`
auto_embed_dim: bool, default = True auto_embed_dim: bool, default = True
Boolean indicating whether the embedding dimensions will be Boolean indicating whether the embedding dimensions will be
automatically defined via fastai's rule of thumb': automatically defined via rule of thumb
:math:`min(600, int(1.6 \times n_{cat}^{0.56}))` embedding_rule: str, default = 'fastai_new'
choices of embedding rule of thumb are:
- 'fastai_new' -- :math:`min(600, round(1.6 \times n_{cat}^{0.56}))`
- 'fastai_old' -- :math:`min(50, (n_{cat}//{2})+1)`
- 'google' -- :math:`round(n_{cat}^{0.25})`
default_embed_dim: int, default=16 default_embed_dim: int, default=16
Dimension for the embeddings used for the ``deeptabular`` Dimension for the embeddings used for the ``deeptabular``
component if the embed_dim is not provided in the ``embed_cols`` component if the embed_dim is not provided in the ``embed_cols``
...@@ -118,6 +140,7 @@ class TabPreprocessor(BasePreprocessor): ...@@ -118,6 +140,7 @@ class TabPreprocessor(BasePreprocessor):
continuous_cols: List[str] = None, continuous_cols: List[str] = None,
scale: bool = True, scale: bool = True,
auto_embed_dim: bool = True, auto_embed_dim: bool = True,
embedding_rule: str = "fastai_new",
default_embed_dim: int = 16, default_embed_dim: int = 16,
already_standard: List[str] = None, already_standard: List[str] = None,
for_transformer: bool = False, for_transformer: bool = False,
...@@ -131,6 +154,7 @@ class TabPreprocessor(BasePreprocessor): ...@@ -131,6 +154,7 @@ class TabPreprocessor(BasePreprocessor):
self.continuous_cols = continuous_cols self.continuous_cols = continuous_cols
self.scale = scale self.scale = scale
self.auto_embed_dim = auto_embed_dim self.auto_embed_dim = auto_embed_dim
self.embedding_rule = embedding_rule
self.default_embed_dim = default_embed_dim self.default_embed_dim = default_embed_dim
self.already_standard = already_standard self.already_standard = already_standard
self.for_transformer = for_transformer self.for_transformer = for_transformer
...@@ -250,7 +274,10 @@ class TabPreprocessor(BasePreprocessor): ...@@ -250,7 +274,10 @@ class TabPreprocessor(BasePreprocessor):
embed_colname = [emb[0] for emb in self.embed_cols] embed_colname = [emb[0] for emb in self.embed_cols]
elif self.auto_embed_dim: elif self.auto_embed_dim:
n_cats = {col: df[col].nunique() for col in self.embed_cols} n_cats = {col: df[col].nunique() for col in self.embed_cols}
self.embed_dim = {col: embed_sz_rule(n_cat) for col, n_cat in n_cats.items()} # type: ignore[misc] self.embed_dim = {
col: embed_sz_rule(n_cat, self.embedding_rule) # type: ignore[misc]
for col, n_cat in n_cats.items()
}
embed_colname = self.embed_cols # type: ignore embed_colname = self.embed_cols # type: ignore
else: else:
self.embed_dim = {e: self.default_embed_dim for e in self.embed_cols} # type: ignore self.embed_dim = {e: self.default_embed_dim for e in self.embed_cols} # type: ignore
......
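To compare the three rules of thumb side by side, the sketch below copies the new `embed_sz_rule` body from the diff above into a standalone function and evaluates it for a few cardinalities. The cardinalities (10, 100, 1000) and the `sizes` dictionary are illustrative choices, not part of the commit.

```python
def embed_sz_rule(n_cat: int, embedding_rule: str = "fastai_new") -> int:
    """Standalone copy of the rule-of-thumb function shown in the diff."""
    if embedding_rule == "google":
        return int(round(n_cat ** 0.25))
    elif embedding_rule == "fastai_old":
        return int(min(50, (n_cat // 2) + 1))
    else:
        return int(min(600, round(1.6 * n_cat ** 0.56)))


# Embedding size per rule for a small, medium and large cardinality.
cardinalities = (10, 100, 1000)
sizes = {
    rule: [embed_sz_rule(n, rule) for n in cardinalities]
    for rule in ("google", "fastai_old", "fastai_new")
}

for rule, dims in sizes.items():
    print(rule, dict(zip(cardinalities, dims)))
```

The 'google' rule grows slowest, 'fastai_old' saturates at 50, and 'fastai_new' keeps growing until its cap of 600.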
import os import os
import json import json
import warnings import warnings
from copy import deepcopy
from pathlib import Path from pathlib import Path
import numpy as np import numpy as np
...@@ -20,7 +19,6 @@ from pytorch_widedeep.callbacks import ( ...@@ -20,7 +19,6 @@ from pytorch_widedeep.callbacks import (
History, History,
Callback, Callback,
MetricCallback, MetricCallback,
RayTuneReporter,
CallbackContainer, CallbackContainer,
LRShedulerCallback, LRShedulerCallback,
) )
...@@ -685,8 +683,14 @@ class Trainer: ...@@ -685,8 +683,14 @@ class Trainer:
If a trainer is used to predict after having trained a model, the If a trainer is used to predict after having trained a model, the
``batch_size`` needs to be defined as it will not be defined as ``batch_size`` needs to be defined as it will not be defined as
the :obj:`Trainer` is instantiated the :obj:`Trainer` is instantiated
uncertainty: bool, default = False
If set to True, the model activates the dropout layers, predicts
each sample N times (uncertainty_granularity times) and returns
the {max, min, mean, stdev} values for each sample
uncertainty_granularity: int, default = 1000
number of times the model predicts each sample if uncertainty
is set to True
""" """
preds_l = self._predict(X_wide, X_tab, X_text, X_img, X_test, batch_size) preds_l = self._predict(X_wide, X_tab, X_text, X_img, X_test, batch_size)
if self.method == "regression": if self.method == "regression":
return np.vstack(preds_l).squeeze(1) return np.vstack(preds_l).squeeze(1)
...@@ -697,6 +701,96 @@ class Trainer: ...@@ -697,6 +701,96 @@ class Trainer:
preds = np.vstack(preds_l) preds = np.vstack(preds_l)
return np.argmax(preds, 1) # type: ignore[return-value] return np.argmax(preds, 1) # type: ignore[return-value]
def predict_uncertainty( # type: ignore[return]
self,
X_wide: Optional[np.ndarray] = None,
X_tab: Optional[np.ndarray] = None,
X_text: Optional[np.ndarray] = None,
X_img: Optional[np.ndarray] = None,
X_test: Optional[Dict[str, np.ndarray]] = None,
batch_size: int = 256,
uncertainty_granularity=1000,
) -> np.ndarray:
r"""Returns the predicted uncertainty of the model for the test dataset using a
Monte Carlo method during which dropout layers are activated in the evaluation/prediction
phase and each sample is predicted N times (uncertainty_granularity times). Based on [1].
[1] Gal Y. & Ghahramani Z., 2016, Dropout as a Bayesian Approximation: Representing Model
Uncertainty in Deep Learning, Proceedings of the 33rd International Conference on Machine Learning
Parameters
----------
X_wide: np.ndarray, Optional. default=None
Input for the ``wide`` model component.
See :class:`pytorch_widedeep.preprocessing.WidePreprocessor`
X_tab: np.ndarray, Optional. default=None
Input for the ``deeptabular`` model component.
See :class:`pytorch_widedeep.preprocessing.TabPreprocessor`
X_text: np.ndarray, Optional. default=None
Input for the ``deeptext`` model component.
See :class:`pytorch_widedeep.preprocessing.TextPreprocessor`
X_img : np.ndarray, Optional. default=None
Input for the ``deepimage`` model component.
See :class:`pytorch_widedeep.preprocessing.ImagePreprocessor`
X_test: Dict, Optional. default=None
The test dataset can also be passed in a dictionary. Keys are
`'X_wide'`, `'X_tab'`, `'X_text'`, `'X_img'` and `'target'`. Values
are the corresponding matrices.
batch_size: int, default = 256
If a trainer is used to predict after having trained a model, the
``batch_size`` needs to be defined as it will not be defined as
the :obj:`Trainer` is instantiated
uncertainty_granularity: int, default = 1000
number of times the model predicts each sample with the dropout
layers active
Returns
-------
method == regression : np.ndarray
{max, min, mean, stdev} values for each sample
method == binary : np.ndarray
{mean_cls_0_prob, mean_cls_1_prob, predicted_cls} values for each sample
method == multiclass : np.ndarray
{mean_cls_0_prob, mean_cls_1_prob, mean_cls_2_prob, ... , predicted_cls} values for each sample
"""
preds_l = self._predict(
X_wide,
X_tab,
X_text,
X_img,
X_test,
batch_size,
uncertainty_granularity,
uncertainty=True,
)
preds = np.vstack(preds_l)
samples_num = int(preds.shape[0] / uncertainty_granularity)
if self.method == "regression":
preds = preds.squeeze(1)
preds = preds.reshape((uncertainty_granularity, samples_num))
return np.array(
(
preds.max(axis=0),
preds.min(axis=0),
preds.mean(axis=0),
preds.std(axis=0),
)
).T
if self.method == "binary":
preds = preds.squeeze(1)
preds = preds.reshape((uncertainty_granularity, samples_num))
preds = preds.mean(axis=0)
probs = np.zeros([preds.shape[0], 3])
probs[:, 0] = 1 - preds
probs[:, 1] = preds
probs[:, 2] = (preds > 0.5).astype("int")  # predicted_cls, as documented in Returns
return probs
if self.method == "multiclass":
preds = preds.reshape(uncertainty_granularity, samples_num, preds.shape[1])
preds = preds.mean(axis=0)
preds = np.hstack((preds, np.vstack(np.argmax(preds, 1))))
return preds
def predict_proba( # type: ignore[return] def predict_proba( # type: ignore[return]
self, self,
X_wide: Optional[np.ndarray] = None, X_wide: Optional[np.ndarray] = None,
...@@ -944,14 +1038,11 @@ class Trainer: ...@@ -944,14 +1038,11 @@ class Trainer:
for callback in self.callback_container.callbacks: for callback in self.callback_container.callbacks:
if callback.__class__.__name__ == "ModelCheckpoint": if callback.__class__.__name__ == "ModelCheckpoint":
if callback.save_best_only: if callback.save_best_only:
filepath = "{}_{}.p".format(
callback.filepath, callback.best_epoch + 1
)
if self.verbose: if self.verbose:
print( print(
f"Model weights restored to best epoch: {callback.best_epoch + 1}" f"Model weights restored to best epoch: {callback.best_epoch + 1}"
) )
self.model.load_state_dict(torch.load(filepath)) self.model.load_state_dict(callback.best_state_dict)
else: else:
if self.verbose: if self.verbose:
print( print(
...@@ -1104,7 +1195,7 @@ class Trainer: ...@@ -1104,7 +1195,7 @@ class Trainer:
k: v for k, v in zip(tabnet_backbone.column_idx.keys(), feat_imp) # type: ignore[operator, union-attr] k: v for k, v in zip(tabnet_backbone.column_idx.keys(), feat_imp) # type: ignore[operator, union-attr]
} }
def _predict( def _predict( # noqa: C901
self, self,
X_wide: Optional[np.ndarray] = None, X_wide: Optional[np.ndarray] = None,
X_tab: Optional[np.ndarray] = None, X_tab: Optional[np.ndarray] = None,
...@@ -1112,6 +1203,8 @@ class Trainer: ...@@ -1112,6 +1203,8 @@ class Trainer:
X_img: Optional[np.ndarray] = None, X_img: Optional[np.ndarray] = None,
X_test: Optional[Dict[str, np.ndarray]] = None, X_test: Optional[Dict[str, np.ndarray]] = None,
batch_size: int = 256, batch_size: int = 256,
uncertainty_granularity=1000,
uncertainty: bool = False,
) -> List: ) -> List:
r"""Private method to avoid code repetition in predict and r"""Private method to avoid code repetition in predict and
predict_proba. For parameter information, please, see the .predict() predict_proba. For parameter information, please, see the .predict()
...@@ -1144,20 +1237,41 @@ class Trainer: ...@@ -1144,20 +1237,41 @@ class Trainer:
self.model.eval() self.model.eval()
preds_l = [] preds_l = []
if uncertainty:
for m in self.model.modules():
if m.__class__.__name__.startswith("Dropout"):
m.train()
prediction_iters = uncertainty_granularity
else:
prediction_iters = 1
with torch.no_grad(): with torch.no_grad():
with trange(test_steps, disable=self.verbose != 1) as t: with trange(uncertainty_granularity, disable=uncertainty is False) as t:
for i, data in zip(t, test_loader): for i, k in zip(t, range(prediction_iters)):
t.set_description("predict") t.set_description("predict_UncertaintyIter")
X = {k: v.cuda() for k, v in data.items()} if use_cuda else data
preds = ( with trange(
self.model(X) if not self.model.is_tabnet else self.model(X)[0] test_steps, disable=self.verbose != 1 or uncertainty is True
) ) as tt:
if self.method == "binary": for j, data in zip(tt, test_loader):
preds = torch.sigmoid(preds) tt.set_description("predict")
if self.method == "multiclass": X = (
preds = F.softmax(preds, dim=1) {k: v.cuda() for k, v in data.items()}
preds = preds.cpu().data.numpy() if use_cuda
preds_l.append(preds) else data
)
preds = (
self.model(X)
if not self.model.is_tabnet
else self.model(X)[0]
)
if self.method == "binary":
preds = torch.sigmoid(preds)
if self.method == "multiclass":
preds = F.softmax(preds, dim=1)
preds = preds.cpu().data.numpy()
preds_l.append(preds)
self.model.train() self.model.train()
return preds_l return preds_l
......
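The regression branch of `predict_uncertainty` stacks the predictions from every stochastic pass, reshapes them to (passes, samples) and reduces over the pass axis. A rough standard-library sketch of that aggregation is below. The helper names `split_passes` and `summarise_mc_passes` and the toy numbers are assumptions for illustration; the real method works on NumPy arrays produced by the model.

```python
import statistics
from typing import List, Sequence, Tuple


def split_passes(flat: Sequence[float], n_passes: int) -> List[List[float]]:
    """Cut pass-major stacked predictions into one list per pass."""
    n_samples = len(flat) // n_passes
    return [list(flat[k * n_samples:(k + 1) * n_samples]) for k in range(n_passes)]


def summarise_mc_passes(
    passes: Sequence[Sequence[float]],
) -> List[Tuple[float, float, float, float]]:
    """Return (max, min, mean, std) per sample; passes[k][i] is sample i on pass k."""
    per_sample = zip(*passes)  # transpose to one tuple of draws per sample
    return [
        # pstdev is the population standard deviation, like np.std's default
        (max(v), min(v), statistics.mean(v), statistics.pstdev(v))
        for v in per_sample
    ]


# Two passes over two samples, stacked pass after pass.
flat_preds = [1.0, 10.0, 3.0, 10.0]
passes = split_passes(flat_preds, n_passes=2)
summary = summarise_mc_passes(passes)
```

Here the first sample varies between passes, so its standard deviation is non-zero, while the second is predicted identically on both passes and gets a standard deviation of zero.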
__version__ = "1.0.11" __version__ = "1.0.12"
...@@ -81,6 +81,7 @@ setup_kwargs = { ...@@ -81,6 +81,7 @@ setup_kwargs = {
"Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Artificial Intelligence",
], ],
"zip_safe": True, "zip_safe": True,
"package_data": {"pytorch_widedeep": ["datasets/data/*"]},
"packages": setuptools.find_packages(exclude=["test_*.py"]), "packages": setuptools.find_packages(exclude=["test_*.py"]),
} }
......
...@@ -269,7 +269,15 @@ def test_notfittederror(): ...@@ -269,7 +269,15 @@ def test_notfittederror():
############################################################################### ###############################################################################
def test_embed_sz_rule_of_thumb(): @pytest.mark.parametrize(
"rule",
[
("google"),
("fastai_old"),
("fastai_new"),
],
)
def test_embed_sz_rule_of_thumb(rule):
embed_cols = ["col1", "col2"] embed_cols = ["col1", "col2"]
df = pd.DataFrame( df = pd.DataFrame(
...@@ -279,8 +287,8 @@ def test_embed_sz_rule_of_thumb(): ...@@ -279,8 +287,8 @@ def test_embed_sz_rule_of_thumb():
} }
) )
n_cats = {c: df[c].nunique() for c in ["col1", "col2"]} n_cats = {c: df[c].nunique() for c in ["col1", "col2"]}
embed_szs = {c: embed_sz_rule(nc) for c, nc in n_cats.items()} embed_szs = {c: embed_sz_rule(nc, embedding_rule=rule) for c, nc in n_cats.items()}
tab_preprocessor = TabPreprocessor(embed_cols=embed_cols) tab_preprocessor = TabPreprocessor(embed_cols=embed_cols, embedding_rule=rule)
tdf = tab_preprocessor.fit_transform(df) # noqa: F841 tdf = tab_preprocessor.fit_transform(df) # noqa: F841
out = [ out = [
tab_preprocessor.embed_dim[col] == embed_szs[col] for col in embed_szs.keys() tab_preprocessor.embed_dim[col] == embed_szs[col] for col in embed_szs.keys()
......
import numpy as np
import pandas as pd
import pytest
from pytorch_widedeep.datasets import load_adult, load_bio_kdd04
@pytest.mark.parametrize(
"as_frame",
[
(True),
(False),
],
)
def test_load_bio_kdd04(as_frame):
df = load_bio_kdd04(as_frame=as_frame)
if as_frame:
assert (df.shape, type(df)) == ((145751, 77), pd.DataFrame)
else:
assert (df.shape, type(df)) == ((145751, 77), np.ndarray)
@pytest.mark.parametrize(
"as_frame",
[
(True),
(False),
],
)
def test_load_adult(as_frame):
df = load_adult(as_frame=as_frame)
if as_frame:
assert (df.shape, type(df)) == ((48842, 15), pd.DataFrame)
else:
assert (df.shape, type(df)) == ((48842, 15), np.ndarray)
...@@ -168,9 +168,15 @@ def test_early_stop(): ...@@ -168,9 +168,15 @@ def test_early_stop():
# Test that ModelCheckpoint behaves as expected # Test that ModelCheckpoint behaves as expected
############################################################################### ###############################################################################
@pytest.mark.parametrize( @pytest.mark.parametrize(
"save_best_only, max_save, n_files", [(True, 2, 2), (False, 2, 2), (False, 0, 5)] "fpath, save_best_only, max_save, n_files",
[
("tests/test_model_functioning/weights/test_weights", True, 2, 2),
("tests/test_model_functioning/weights/test_weights", False, 2, 2),
("tests/test_model_functioning/weights/test_weights", False, 0, 5),
(None, False, 0, 0),
],
) )
def test_model_checkpoint(save_best_only, max_save, n_files): def test_model_checkpoint(fpath, save_best_only, max_save, n_files):
wide = Wide(np.unique(X_wide).shape[0], 1) wide = Wide(np.unique(X_wide).shape[0], 1)
deeptabular = TabMlp( deeptabular = TabMlp(
mlp_hidden_dims=[32, 16], mlp_hidden_dims=[32, 16],
...@@ -185,7 +191,7 @@ def test_model_checkpoint(save_best_only, max_save, n_files): ...@@ -185,7 +191,7 @@ def test_model_checkpoint(save_best_only, max_save, n_files):
objective="binary", objective="binary",
callbacks=[ callbacks=[
ModelCheckpoint( ModelCheckpoint(
"tests/test_model_functioning/weights/test_weights", filepath=fpath,
save_best_only=save_best_only, save_best_only=save_best_only,
max_save=max_save, max_save=max_save,
) )
...@@ -193,10 +199,11 @@ def test_model_checkpoint(save_best_only, max_save, n_files): ...@@ -193,10 +199,11 @@ def test_model_checkpoint(save_best_only, max_save, n_files):
verbose=0, verbose=0,
) )
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, n_epochs=5, val_split=0.2) trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, n_epochs=5, val_split=0.2)
n_saved = len(os.listdir("tests/test_model_functioning/weights/")) if fpath:
n_saved = len(os.listdir("tests/test_model_functioning/weights/"))
shutil.rmtree("tests/test_model_functioning/weights/") shutil.rmtree("tests/test_model_functioning/weights/")
else:
n_saved = 0
assert n_saved <= n_files assert n_saved <= n_files
...@@ -340,6 +347,7 @@ def test_modelcheckpoint_mode_options(): ...@@ -340,6 +347,7 @@ def test_modelcheckpoint_mode_options():
model_checkpoint_2 = ModelCheckpoint(filepath=fpath, monitor="val_loss") model_checkpoint_2 = ModelCheckpoint(filepath=fpath, monitor="val_loss")
model_checkpoint_3 = ModelCheckpoint(filepath=fpath, monitor="acc", mode="max") model_checkpoint_3 = ModelCheckpoint(filepath=fpath, monitor="acc", mode="max")
model_checkpoint_4 = ModelCheckpoint(filepath=fpath, monitor="acc") model_checkpoint_4 = ModelCheckpoint(filepath=fpath, monitor="acc")
model_checkpoint_5 = ModelCheckpoint(filepath=None, monitor="acc")
is_min = model_checkpoint_1.monitor_op is np.less is_min = model_checkpoint_1.monitor_op is np.less
best_inf = model_checkpoint_1.best is np.Inf best_inf = model_checkpoint_1.best is np.Inf
...@@ -349,6 +357,8 @@ def test_modelcheckpoint_mode_options(): ...@@ -349,6 +357,8 @@ def test_modelcheckpoint_mode_options():
best_minus_inf = -model_checkpoint_3.best == np.Inf best_minus_inf = -model_checkpoint_3.best == np.Inf
auto_is_max = model_checkpoint_4.monitor_op is np.greater auto_is_max = model_checkpoint_4.monitor_op is np.greater
auto_best_minus_inf = -model_checkpoint_4.best == np.Inf auto_best_minus_inf = -model_checkpoint_4.best == np.Inf
auto_is_max = model_checkpoint_5.monitor_op is np.greater
auto_best_minus_inf = -model_checkpoint_5.best == np.Inf
shutil.rmtree("tests/test_model_functioning/modelcheckpoint/") shutil.rmtree("tests/test_model_functioning/modelcheckpoint/")
...@@ -478,6 +488,16 @@ def test_early_stopping_get_state(): ...@@ -478,6 +488,16 @@ def test_early_stopping_get_state():
def test_ray_tune_reporter(): def test_ray_tune_reporter():
rt_wide = Wide(np.unique(X_wide).shape[0], 1)
rt_deeptabular = TabMlp(
mlp_hidden_dims=[32, 16],
mlp_dropout=[0.5, 0.5],
column_idx=column_idx,
embed_input=embed_input,
continuous_cols=colnames[-5:],
)
rt_model = WideDeep(wide=rt_wide, deeptabular=rt_deeptabular)
config = { config = {
"batch_size": tune.grid_search([8, 16]), "batch_size": tune.grid_search([8, 16]),
} }
...@@ -486,7 +506,7 @@ def test_ray_tune_reporter(): ...@@ -486,7 +506,7 @@ def test_ray_tune_reporter():
batch_size = config["batch_size"] batch_size = config["batch_size"]
trainer = Trainer( trainer = Trainer(
model, rt_model,
objective="binary", objective="binary",
callbacks=[RayTuneReporter], callbacks=[RayTuneReporter],
verbose=0, verbose=0,
...@@ -503,7 +523,9 @@ def test_ray_tune_reporter(): ...@@ -503,7 +523,9 @@ def test_ray_tune_reporter():
analysis = tune.run( analysis = tune.run(
tune.with_parameters(training_function), tune.with_parameters(training_function),
config=config, config=config,
resources_per_trial={"cpu": 1, "gpu": 0}, resources_per_trial={"cpu": 1, "gpu": 0}
if not torch.cuda.is_available()
else {"cpu": 0, "gpu": 1},
verbose=0, verbose=0,
) )
......
...@@ -43,14 +43,14 @@ X_test = {"X_wide": X_wide, "X_tab": X_tab} ...@@ -43,14 +43,14 @@ X_test = {"X_wide": X_wide, "X_tab": X_tab}
# work well # work well
############################################################################## ##############################################################################
@pytest.mark.parametrize( @pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim", "X_wide, X_tab, target, objective, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[ [
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None), (X_wide, X_tab, target_regres, "regression", None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2), (X_wide, X_tab, target_binary, "binary", None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3), (X_wide, X_tab, target_multic, "multiclass", None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None), (X_wide, X_tab, target_regres, "regression", X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2), (X_wide, X_tab, target_binary, "binary", X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3), (X_wide, X_tab, target_multic, "multiclass", X_test, 3, 3, 4),
], ],
) )
def test_fit_objectives( def test_fit_objectives(
...@@ -58,11 +58,10 @@ def test_fit_objectives( ...@@ -58,11 +58,10 @@ def test_fit_objectives(
X_tab, X_tab,
target, target,
objective, objective,
X_wide_test,
X_tab_test,
X_test, X_test,
pred_dim, pred_dim,
probs_dim, probs_dim,
uncertainties_pred_dim,
): ):
wide = Wide(np.unique(X_wide).shape[0], pred_dim) wide = Wide(np.unique(X_wide).shape[0], pred_dim)
deeptabular = TabMlp( deeptabular = TabMlp(
...@@ -76,11 +75,22 @@ def test_fit_objectives( ...@@ -76,11 +75,22 @@ def test_fit_objectives(
trainer = Trainer(model, objective=objective, verbose=0) trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16) trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test) preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary": probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
pass unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else: else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test) assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
assert preds.shape[0] == 32, probs.shape[1] == probs_dim 32,
probs_dim,
uncertainties_pred_dim,
)
############################################################################## ##############################################################################
...@@ -100,7 +110,10 @@ def test_fit_with_deephead(): ...@@ -100,7 +110,10 @@ def test_fit_with_deephead():
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target_binary, batch_size=16) trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target_binary, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test) preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test) probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
assert preds.shape[0] == 32, probs.shape[1] == 2 unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (32, 2, 3)
############################################################################## ##############################################################################
...@@ -109,14 +122,14 @@ def test_fit_with_deephead(): ...@@ -109,14 +122,14 @@ def test_fit_with_deephead():
@pytest.mark.parametrize( @pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim", "X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[ [
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None), (X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2), (X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3), (X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None), (X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2), (X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3), (X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3, 4),
], ],
) )
def test_fit_objectives_tab_transformer( def test_fit_objectives_tab_transformer(
...@@ -129,6 +142,7 @@ def test_fit_objectives_tab_transformer( ...@@ -129,6 +142,7 @@ def test_fit_objectives_tab_transformer(
X_test, X_test,
pred_dim, pred_dim,
probs_dim, probs_dim,
uncertainties_pred_dim,
): ):
wide = Wide(np.unique(X_wide).shape[0], pred_dim) wide = Wide(np.unique(X_wide).shape[0], pred_dim)
tab_transformer = TabTransformer( tab_transformer = TabTransformer(
...@@ -140,11 +154,22 @@ def test_fit_objectives_tab_transformer( ...@@ -140,11 +154,22 @@ def test_fit_objectives_tab_transformer(
trainer = Trainer(model, objective=objective, verbose=0) trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16) trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test) preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary": probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
pass unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else: else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test) assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
assert preds.shape[0] == 32, probs.shape[1] == probs_dim 32,
probs_dim,
uncertainties_pred_dim,
)
############################################################################## ##############################################################################
...@@ -153,14 +178,14 @@ def test_fit_objectives_tab_transformer( ...@@ -153,14 +178,14 @@ def test_fit_objectives_tab_transformer(
@pytest.mark.parametrize( @pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim", "X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[ [
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None), (X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2), (X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3), (X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None), (X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2), (X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3), (X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3, 4),
], ],
) )
def test_fit_objectives_tabnet( def test_fit_objectives_tabnet(
...@@ -173,6 +198,7 @@ def test_fit_objectives_tabnet( ...@@ -173,6 +198,7 @@ def test_fit_objectives_tabnet(
X_test, X_test,
pred_dim, pred_dim,
probs_dim, probs_dim,
uncertainties_pred_dim,
): ):
warnings.filterwarnings("ignore") warnings.filterwarnings("ignore")
wide = Wide(np.unique(X_wide).shape[0], pred_dim) wide = Wide(np.unique(X_wide).shape[0], pred_dim)
...@@ -185,11 +211,22 @@ def test_fit_objectives_tabnet( ...@@ -185,11 +211,22 @@ def test_fit_objectives_tabnet(
trainer = Trainer(model, objective=objective, verbose=0) trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16) trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test) preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary": probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
pass unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else: else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test) assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
assert preds.shape[0] == 32, probs.shape[1] == probs_dim 32,
probs_dim,
uncertainties_pred_dim,
)
############################################################################## ##############################################################################
......