Unverified commit e1e6a2ac authored by J Javier, committed by GitHub

Merge pull request #56 from jrzaurin/pmulinka/uncertainty

Embedding, MC and draft requests
......@@ -11,6 +11,7 @@ on:
jobs:
codestyle:
runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
......@@ -32,6 +33,7 @@ jobs:
test:
runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
strategy:
fail-fast: true
matrix:
......@@ -59,6 +61,7 @@ jobs:
finish:
needs: test
runs-on: ubuntu-latest
if: ${{ github.event_name == 'push' || !github.event.pull_request.draft }}
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
......
......@@ -11,7 +11,7 @@ __pycache__*
Untitled*.ipynb
# data related dirs
data/
tmp_data/
model_weights/
tmp_dir/
weights/
......
Pytorch-widedeep is being developed and used by many active community members. Your help is very valuable to make it better for everyone.
- **[TBA]** Check the [Roadmap](https://github.com/jrzaurin/pytorch-widedeep/projects/1) or [open an issue](https://github.com/jrzaurin/pytorch-widedeep/issues) to report problems or recommend new features, and submit a draft pull request, which will be changed to a pull request after initial review
- Contribute to the [tests](https://github.com/jrzaurin/pytorch-widedeep/tree/master/tests) to make it more reliable.
- Contribute to the [documentation](https://github.com/jrzaurin/pytorch-widedeep/tree/master/docs) to make it clearer for everyone.
- Contribute to the [examples](https://github.com/jrzaurin/pytorch-widedeep/tree/master/examples) to share your experience with other users.
- Join the discussion on [slack](https://join.slack.com/t/pytorch-widedeep/shared_invite/zt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)
\ No newline at end of file
......@@ -24,8 +24,6 @@ using wide and deep models.
**Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)
**Slack**: if you want to contribute or just want to chat with us, join [slack](https://join.slack.com/t/pytorch-widedeep/shared_invite/zt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw)
The content of this document is organized as follows:
1. [introduction](#introduction)
......@@ -307,6 +305,10 @@ of the package and its functionalities.
pytest tests
```
### How to Contribute
Check the [CONTRIBUTING](https://github.com/jrzaurin/pytorch-widedeep/CONTRIBUTING.MD) page.
### Acknowledgments
This library takes from a series of other libraries, so I think it is just
......
1.0.11
\ No newline at end of file
1.0.12
\ No newline at end of file
......@@ -2,27 +2,15 @@
"cells": [
{
"cell_type": "markdown",
"id": "731975e2",
"metadata": {},
"source": [
"# Hyperparameter tuning and using Raytune and visulization using Tensorboard"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* In this notebook we will use the higly imbalanced Protein Homology Dataset from [KDD cup 2004](https://www.kdd.org/kdd-cup/view/kdd-cup-2004/Data)\n",
"\n",
"```\n",
"* The first element of each line is a BLOCK ID that denotes to which native sequence this example belongs. There is a unique BLOCK ID for each native sequence. BLOCK IDs are integers running from 1 to 303 (one for each native sequence, i.e. for each query). BLOCK IDs were assigned before the blocks were split into the train and test sets, so they do not run consecutively in either file.\n",
"* The second element of each line is an EXAMPLE ID that uniquely describes the example. You will need this EXAMPLE ID and the BLOCK ID when you submit results.\n",
"* The third element is the class of the example. Proteins that are homologous to the native sequence are denoted by 1, non-homologous proteins (i.e. decoys) by 0. Test examples have a \"?\" in this position.\n",
"* All following elements are feature values. There are 74 feature values in each line. The features describe the match (e.g. the score of a sequence alignment) between the native protein sequence and the sequence that is tested for homology.\n",
"```"
"# Hyperparameter tuning with Raytune and visulization using Tensorboard and Weights & Biases"
]
},
{
"cell_type": "markdown",
"id": "ee745c58",
"metadata": {},
"source": [
"## Initial imports"
......@@ -30,19 +18,12 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 61,
"id": "fdab94eb",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/javier/.pyenv/versions/3.7.7/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject\n",
" return f(*args, **kwds)\n"
]
}
],
"outputs": [],
"source": [
"import os\n",
"import numpy as np\n",
"import pandas as pd\n",
"import torch\n",
......@@ -51,7 +32,6 @@
"from pytorch_widedeep import Trainer\n",
"from pytorch_widedeep.preprocessing import TabPreprocessor\n",
"from pytorch_widedeep.models import TabMlp, WideDeep\n",
"from pytorch_widedeep.dataloaders import DataLoaderImbalanced, DataLoaderDefault\n",
"from torchmetrics import F1 as F1_torchmetrics\n",
"from torchmetrics import Accuracy as Accuracy_torchmetrics\n",
"from torchmetrics import Precision as Precision_torchmetrics\n",
......@@ -61,22 +41,21 @@
"from pytorch_widedeep.callbacks import (\n",
" EarlyStopping,\n",
" ModelCheckpoint,\n",
" LRHistory,\n",
" RayTuneReporter,\n",
")\n",
"from pytorch_widedeep.datasets import load_bio_kdd04\n",
"\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import classification_report\n",
"\n",
"import time\n",
"import datetime\n",
"\n",
"import warnings\n",
"\n",
"warnings.filterwarnings(\"ignore\", category=DeprecationWarning)\n",
"\n",
"from ray import tune\n",
"from ray.tune.schedulers import AsyncHyperBandScheduler\n",
"from ray.tune import JupyterNotebookReporter\n",
"from ray.tune.integration.wandb import WandbLoggerCallback, wandb_mixin\n",
"import wandb\n",
"\n",
"import tracemalloc\n",
"\n",
"tracemalloc.start()\n",
......@@ -88,7 +67,8 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 64,
"id": "07c75f0c",
"metadata": {},
"outputs": [
{
......@@ -647,20 +627,20 @@
"4 0.68 -0.59 2.0 -36.0 -6.9 2.02 0.14 -0.23 "
]
},
"execution_count": 2,
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"header_list = ['EXAMPLE_ID', 'BLOCK_ID', 'target'] + [str(i) for i in range(4,78)]\n",
"df = pd.read_csv('data/kddcup04/bio_train.dat', sep='\\t', names=header_list)\n",
"df = load_bio_kdd04(as_frame=True)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 65,
"id": "1e3f8efc",
"metadata": {},
"outputs": [
{
......@@ -671,7 +651,7 @@
"Name: target, dtype: int64"
]
},
"execution_count": 3,
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
......@@ -683,7 +663,8 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 35,
"id": "214b3071",
"metadata": {},
"outputs": [],
"source": [
......@@ -693,7 +674,8 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 36,
"id": "168c81f1",
"metadata": {},
"outputs": [],
"source": [
......@@ -703,6 +685,7 @@
},
{
"cell_type": "markdown",
"id": "87e7b8f0",
"metadata": {},
"source": [
"## Preparing the data"
......@@ -710,7 +693,8 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 37,
"id": "3a7b246b",
"metadata": {},
"outputs": [],
"source": [
......@@ -719,7 +703,8 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 38,
"id": "7a2dac24",
"metadata": {},
"outputs": [],
"source": [
......@@ -737,6 +722,7 @@
},
{
"cell_type": "markdown",
"id": "7b9f63e2",
"metadata": {},
"source": [
"## Define the model"
......@@ -744,7 +730,8 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 39,
"id": "81bfda03",
"metadata": {},
"outputs": [],
"source": [
......@@ -757,7 +744,8 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 40,
"id": "511198d4",
"metadata": {},
"outputs": [
{
......@@ -804,7 +792,7 @@
")"
]
},
"execution_count": 9,
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
......@@ -821,7 +809,8 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 41,
"id": "2d76f463",
"metadata": {},
"outputs": [],
"source": [
......@@ -834,7 +823,8 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 42,
"id": "a5359b0f",
"metadata": {},
"outputs": [],
"source": [
......@@ -847,42 +837,19 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 60,
"id": "34a18ac0",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"== Status ==<br>Memory usage on this node: 10.5/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/4.32 GiB heap, 0.0/2.16 GiB objects<br>Result logdir: /Users/javier/ray_results/training_function_2021-10-18_15-59-06<br>Number of trials: 2/2 (2 TERMINATED)<br><table>\n",
"<thead>\n",
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> batch_size</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>training_function_8f035_00000</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\"> 1000</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 18.2589</td></tr>\n",
"<tr><td>training_function_8f035_00001</td><td>TERMINATED</td><td> </td><td style=\"text-align: right;\"> 5000</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 18.0369</td></tr>\n",
"</tbody>\n",
"</table><br><br>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2021-10-18 15:59:28,373\tINFO tune.py:617 -- Total run time: 22.07 seconds (21.91 seconds for the tuning loop).\n"
]
}
],
"outputs": [],
"source": [
"config = {\n",
" \"batch_size\": tune.grid_search([1000, 5000]),\n",
" \"wandb\": {\n",
" \"project\": \"test\",\n",
" \"api_key_file\": os.getcwd() + \"/wandb_api.key\",\n",
" },\n",
"}\n",
"\n",
"# Optimizers\n",
......@@ -890,16 +857,17 @@
"# LR Scheduler\n",
"deep_sch = lr_scheduler.StepLR(deep_opt, step_size=3)\n",
"\n",
"early_stopping = EarlyStopping()\n",
"\n",
"\n",
"@wandb_mixin\n",
"def training_function(config, X_train, X_val):\n",
" early_stopping = EarlyStopping()\n",
" model_checkpoint = ModelCheckpoint(save_best_only=True, \n",
" wb=wandb)\n",
" # Hyperparameters\n",
" batch_size = config[\"batch_size\"]\n",
" trainer = Trainer(\n",
" model,\n",
" objective=\"binary_focal_loss\",\n",
" callbacks=[RayTuneReporter, LRHistory(n_epochs=10), early_stopping],\n",
" callbacks=[RayTuneReporter, early_stopping, model_checkpoint],\n",
" lr_schedulers={\"deeptabular\": deep_sch},\n",
" initializers={\"deeptabular\": XavierNormal},\n",
" optimizers={\"deeptabular\": deep_opt},\n",
......@@ -912,121 +880,67 @@
"\n",
"X_train = {\"X_tab\": X_tab_train, \"target\": y_train}\n",
"X_val = {\"X_tab\": X_tab_valid, \"target\": y_valid}\n",
"\n",
"asha_scheduler = AsyncHyperBandScheduler(\n",
" time_attr=\"training_iteration\",\n",
" metric=\"_metric/val_loss\",\n",
" mode=\"min\",\n",
" max_t=100,\n",
" grace_period=10,\n",
" reduction_factor=3,\n",
" brackets=1,\n",
")\n",
"\n",
"analysis = tune.run(\n",
" tune.with_parameters(training_function, X_train=X_train, X_val=X_val),\n",
" resources_per_trial={\"cpu\": 1, \"gpu\": 0},\n",
" progress_reporter=JupyterNotebookReporter(overwrite=True),\n",
" scheduler=asha_scheduler,\n",
" config=config,\n",
" callbacks=[WandbLoggerCallback(\n",
" project=config[\"wandb\"][\"project\"],\n",
" api_key_file=config[\"wandb\"][\"api_key_file\"],\n",
" log_config=True)],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 56,
"id": "fac74d5f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'8f035_00000': {'_metric': {'train_loss': 0.007156053689332345,\n",
" 'train_Accuracy_0': 1.0,\n",
" 'train_Accuracy_1': 0.008678881451487541,\n",
" 'train_Precision': 0.9911835193634033,\n",
" 'train_Recall_0': 1.0,\n",
" 'train_Recall_1': 0.008678881451487541,\n",
" 'train_F1_0': 0.9955719113349915,\n",
" 'train_F1_1': 0.017208412289619446,\n",
" 'val_loss': 0.006261252580831448,\n",
" 'val_Accuracy_0': 1.0,\n",
" 'val_Accuracy_1': 0.023255813866853714,\n",
" 'val_Precision': 0.9913550615310669,\n",
" 'val_Recall_0': 1.0,\n",
" 'val_Recall_1': 0.023255813866853714,\n",
" 'val_F1_0': 0.9956578612327576,\n",
" 'val_F1_1': 0.045454543083906174,\n",
" 'lr_deeptabular_0': 0.010000000000000002},\n",
" 'time_this_iter_s': 3.5364139080047607,\n",
" 'done': True,\n",
" 'timesteps_total': None,\n",
" 'episodes_total': None,\n",
" 'training_iteration': 5,\n",
" 'experiment_id': 'f62bb9c9c32a45b9af85a6a3b0e30d94',\n",
" 'date': '2021-10-18_15-59-28',\n",
" 'timestamp': 1634565568,\n",
" 'time_total_s': 18.25888180732727,\n",
" 'pid': 15457,\n",
" 'hostname': 'infinito.bbrouter',\n",
" 'node_ip': '192.168.18.34',\n",
" 'config': {'batch_size': 1000},\n",
" 'time_since_restore': 18.25888180732727,\n",
" 'timesteps_since_restore': 0,\n",
" 'iterations_since_restore': 5,\n",
" 'trial_id': '8f035_00000',\n",
" 'experiment_tag': '0_batch_size=1000'},\n",
" '8f035_00001': {'_metric': {'train_loss': 0.019367387828727562,\n",
" 'train_Accuracy_0': 0.9999827146530151,\n",
" 'train_Accuracy_1': 0.01157184224575758,\n",
" 'train_Precision': 0.991192102432251,\n",
" 'train_Recall_0': 0.9999827146530151,\n",
" 'train_Recall_1': 0.01157184224575758,\n",
" 'train_F1_0': 0.9955761432647705,\n",
" 'train_F1_1': 0.022835396230220795,\n",
" 'val_loss': 0.01834123209118843,\n",
" 'val_Accuracy_0': 1.0,\n",
" 'val_Accuracy_1': 0.0,\n",
" 'val_Precision': 0.9911492466926575,\n",
" 'val_Recall_0': 1.0,\n",
" 'val_Recall_1': 0.0,\n",
" 'val_F1_0': 0.9955549836158752,\n",
" 'val_F1_1': 0.0,\n",
" 'lr_deeptabular_0': 0.010000000000000002},\n",
" 'time_this_iter_s': 3.42478084564209,\n",
" 'done': True,\n",
" 'timesteps_total': None,\n",
" 'episodes_total': None,\n",
" 'training_iteration': 5,\n",
" 'experiment_id': 'd7379f1debc14e41bc971bf4c27b6793',\n",
" 'date': '2021-10-18_15-59-27',\n",
" 'timestamp': 1634565567,\n",
" 'time_total_s': 18.036858797073364,\n",
" 'pid': 15456,\n",
" 'hostname': 'infinito.bbrouter',\n",
" 'node_ip': '192.168.18.34',\n",
" 'config': {'batch_size': 5000},\n",
" 'time_since_restore': 18.036858797073364,\n",
" 'timesteps_since_restore': 0,\n",
" 'iterations_since_restore': 5,\n",
" 'trial_id': '8f035_00001',\n",
" 'experiment_tag': '1_batch_size=5000'}}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"analysis.results"
]
},
{
"cell_type": "code",
"execution_count": 16,
"cell_type": "markdown",
"id": "81450d98",
"metadata": {},
"source": [
"Using Weights and Biases logging you can create [parallel coordinates graphs](https://docs.wandb.ai/ref/app/features/panels/parallel-coordinates) that map parametr combinations to the best(lowest) loss achieved during the training of the networks\n",
"\n",
"![WNB](wnb.png \"parallel coordinates\")"
]
},
{
"cell_type": "markdown",
"id": "56fc4823",
"metadata": {},
"outputs": [],
"source": [
"# %load_ext tensorboard"
"local visualization of raytune reults using tensorboard"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"scrolled": true
},
"execution_count": 59,
"id": "e1719cc0",
"metadata": {},
"outputs": [],
"source": [
"# %tensorboard --logdir ~/ray_results"
"%load_ext tensorboard\n",
"%tensorboard --logdir ~/ray_results"
]
}
],
......@@ -1035,8 +949,7 @@
"hash": "3b99005fd577fa40f3cce433b2b92303885900e634b2b5344c07c59d06c8792d"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"display_name": "Python 3.8.5 64-bit ('base': conda)",
"name": "python3"
},
"language_info": {
......@@ -1049,7 +962,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
"version": "3.8.5"
},
"toc": {
"base_numbering": 1,
......
This diff is collapsed.
......@@ -338,12 +338,13 @@ class ModelCheckpoint(Callback):
Parameters
----------
filepath: str
filepath: str, default=None
Full path to save the output weights. It must contain only the root of
the filenames. Epoch number and ``.pt`` extension (for pytorch) will
be added. e.g. ``filepath="path/to/output_weights/weights_out"`` And
the saved files in that directory will be named: ``weights_out_1.pt,
weights_out_2.pt, ...``
If set to None, the class just reports the best metric and best_epoch.
monitor: str, default="val_loss"
quantity to monitor. Typically 'val_loss' or a metric name (e.g. 'val_acc')
verbose: int, default=0
......@@ -362,6 +363,15 @@ class ModelCheckpoint(Callback):
Interval (number of epochs) between checkpoints.
max_save: int, default=-1
Maximum number of outputs to save. If -1 will save all outputs
wb: obj, default=None
Weights & Biases API interface used to report the single best result, which makes it possible
to compare multiple parameter combinations with, e.g., parallel coordinates:
https://docs.wandb.ai/ref/app/features/panels/parallel-coordinates.
The result is reported through the W&B run summary, e.g. `wandb.run.summary["best"]`.
If an external early-stopping scheduler (e.g. from RayTune) is used in combination with W&B,
the scheduler may stop the training function before the summary is logged, so setting it only
after training via e.g. `wandb.run.summary["best"] = model_checkpoint.best` would not work;
passing `wb` here reports the best value as soon as the monitored metric improves.
Attributes
----------
......@@ -369,6 +379,8 @@ class ModelCheckpoint(Callback):
best metric
best_epoch: int
best epoch
best_state_dict: dict
state dictionary of the best model, which can be used to restore the model to its best
state via `trainer.model.load_state_dict(model_checkpoint.best_state_dict)`
Examples
--------
......@@ -386,13 +398,14 @@ class ModelCheckpoint(Callback):
def __init__(
self,
filepath: str,
filepath: Optional[str] = None,
monitor: str = "val_loss",
verbose: int = 0,
save_best_only: bool = False,
mode: str = "auto",
period: int = 1,
max_save: int = -1,
wb: Optional[object] = None,
):
super(ModelCheckpoint, self).__init__()
......@@ -403,18 +416,20 @@ class ModelCheckpoint(Callback):
self.mode = mode
self.period = period
self.max_save = max_save
self.wb = wb
self.epochs_since_last_save = 0
if len(self.filepath.split("/")[:-1]) == 0:
raise ValueError(
"'filepath' must be the full path to save the output weights,"
" including the root of the filenames. e.g. 'checkpoints/weights_out'"
)
if self.filepath:
if len(self.filepath.split("/")[:-1]) == 0:
raise ValueError(
"'filepath' must be the full path to save the output weights,"
" including the root of the filenames. e.g. 'checkpoints/weights_out'"
)
root_dir = ("/").join(self.filepath.split("/")[:-1])
if not os.path.exists(root_dir):
os.makedirs(root_dir)
root_dir = ("/").join(self.filepath.split("/")[:-1])
if not os.path.exists(root_dir):
os.makedirs(root_dir)
if self.max_save > 0:
self.old_files: List[str] = []
......@@ -447,7 +462,8 @@ class ModelCheckpoint(Callback):
self.epochs_since_last_save += 1
if self.epochs_since_last_save >= self.period:
self.epochs_since_last_save = 0
filepath = "{}_{}.p".format(self.filepath, epoch + 1)
if self.filepath:
filepath = "{}_{}.p".format(self.filepath, epoch + 1)
if self.save_best_only:
current = logs.get(self.monitor)
if current is None:
......@@ -459,35 +475,50 @@ class ModelCheckpoint(Callback):
else:
if self.monitor_op(current, self.best):
if self.verbose > 0:
print(
"\nEpoch %05d: %s improved from %0.5f to %0.5f,"
" saving model to %s"
% (
epoch + 1,
self.monitor,
self.best,
current,
filepath,
if self.filepath:
print(
"\nEpoch %05d: %s improved from %0.5f to %0.5f,"
" saving model to %s"
% (
epoch + 1,
self.monitor,
self.best,
current,
filepath,
)
)
)
else:
print(
"\nEpoch %05d: %s improved from %0.5f to %0.5f"
% (
epoch + 1,
self.monitor,
self.best,
current,
)
)
if self.wb is not None:
self.wb.run.summary["best"] = current # type: ignore[attr-defined]
self.best = current
self.best_epoch = epoch
torch.save(self.model.state_dict(), filepath)
if self.max_save > 0:
if len(self.old_files) == self.max_save:
try:
os.remove(self.old_files[0])
except FileNotFoundError:
pass
self.old_files = self.old_files[1:]
self.old_files.append(filepath)
self.best_state_dict = self.model.state_dict()
if self.filepath:
torch.save(self.best_state_dict, filepath)
if self.max_save > 0:
if len(self.old_files) == self.max_save:
try:
os.remove(self.old_files[0])
except FileNotFoundError:
pass
self.old_files = self.old_files[1:]
self.old_files.append(filepath)
else:
if self.verbose > 0:
print(
"\nEpoch %05d: %s did not improve from %0.5f"
% (epoch + 1, self.monitor, self.best)
)
else:
if not self.save_best_only and self.filepath:
if self.verbose > 0:
print("\nEpoch %05d: saving model to %s" % (epoch + 1, filepath))
torch.save(self.model.state_dict(), filepath)
......
from ._base import load_adult, load_bio_kdd04
__all__ = ["load_bio_kdd04", "load_adult"]
from importlib import resources
import pandas as pd
def load_bio_kdd04(as_frame: bool = False):
"""Load and return the higly imbalanced Protein Homology
Dataset from [KDD cup 2004](https://www.kdd.org/kdd-cup/view/kdd-cup-2004/Data.
This datasets include only bio_train.dat part of the dataset
* The first element of each line is a BLOCK ID that denotes to which native sequence
this example belongs. There is a unique BLOCK ID for each native sequence.
BLOCK IDs are integers running from 1 to 303 (one for each native sequence,
i.e. for each query). BLOCK IDs were assigned before the blocks were split
into the train and test sets, so they do not run consecutively in either file.
* The second element of each line is an EXAMPLE ID that uniquely describes
the example. You will need this EXAMPLE ID and the BLOCK ID when you submit results.
* The third element is the class of the example. Proteins that are homologous to
the native sequence are denoted by 1, non-homologous proteins (i.e. decoys) by 0.
Test examples have a "?" in this position.
* All following elements are feature values. There are 74 feature values in each line.
The features describe the match (e.g. the score of a sequence alignment) between
the native protein sequence and the sequence that is tested for homology.
"""
header_list = ["EXAMPLE_ID", "BLOCK_ID", "target"] + [str(i) for i in range(4, 78)]
with resources.path("pytorch_widedeep.datasets.data", "bio_train.dat") as fpath:
df = pd.read_csv(fpath, sep="\t", names=header_list)
if as_frame:
return df
else:
return df.to_numpy()
def load_adult(as_frame: bool = False):
"""Load and return the [adult income datatest](http://www.cs.toronto.edu/~delve/data/adult/desc.html).
you may find detailed description [here](http://www.cs.toronto.edu/~delve/data/adult/adultDetail.html)
"""
with resources.path("pytorch_widedeep.datasets.data", "adult.csv.zip") as fpath:
df = pd.read_csv(fpath)
if as_frame:
return df
else:
return df.to_numpy()
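As a quick orientation for the new loaders above, here is a minimal usage sketch; it only assumes the two functions defined in this module and the bundled data files, and the shapes noted in the comments match those asserted in the tests further below.

```python
# Minimal sketch of the new dataset loaders added in this PR.
# Assumes the packaged data files are installed (see "package_data" in setup.py).
from pytorch_widedeep.datasets import load_adult, load_bio_kdd04

bio_df = load_bio_kdd04(as_frame=True)   # pandas DataFrame, shape (145751, 77)
adult_arr = load_adult(as_frame=False)   # numpy ndarray, shape (48842, 15)

print(bio_df["target"].value_counts())   # highly imbalanced binary target
print(adult_arr.shape)
```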
This diff is collapsed.
......@@ -4,6 +4,7 @@ https://github.com/awslabs/autogluon/tree/master/tabular/src/autogluon/tabular/m
"""
import math
import warnings
import torch
from torch import nn
......@@ -18,10 +19,13 @@ class FullEmbeddingDropout(nn.Module):
self.dropout = dropout
def forward(self, X: Tensor) -> Tensor:
mask = X.new().resize_((X.size(1), 1)).bernoulli_(1 - self.dropout).expand_as(
X
) / (1 - self.dropout)
return mask * X
if self.training:
mask = X.new().resize_((X.size(1), 1)).bernoulli_(
1 - self.dropout
).expand_as(X) / (1 - self.dropout)
return mask * X
else:
return X
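A small sketch of the behaviour change above, namely that the dropout mask is now only applied in training mode; it assumes the `FullEmbeddingDropout` class defined above is in scope and that its constructor simply takes the dropout probability.

```python
# Sketch: FullEmbeddingDropout is now a no-op in eval mode.
import torch

drop = FullEmbeddingDropout(0.5)   # dropout probability (constructor signature assumed)
x = torch.ones(4, 6, 8)            # (batch, n_categorical_cols, embed_dim)

drop.train()
out_train = drop(x)                # whole embedding columns zeroed, rest rescaled by 1/(1-p)

drop.eval()
out_eval = drop(x)                 # identity: dropout is skipped at inference
assert torch.equal(out_eval, x)
```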
DropoutLayers = Union[nn.Dropout, FullEmbeddingDropout]
......@@ -125,13 +129,16 @@ class CategoricalEmbeddings(nn.Module):
self.categorical_cols = [ei[0] for ei in embed_input]
self.cat_idx = [self.column_idx[col] for col in self.categorical_cols]
self.bias = (
nn.Parameter(torch.Tensor(len(self.categorical_cols), embed_dim))
if use_bias
else None
)
if self.bias is not None:
if use_bias:
self.bias = nn.Parameter(
torch.Tensor(len(self.categorical_cols), embed_dim)
)
nn.init.kaiming_uniform_(self.bias, a=math.sqrt(5))
if shared_embed:
warnings.warn(
"The current implementation of 'SharedEmbeddings' does not use bias",
UserWarning,
)
# Categorical: val + 1 because 0 is reserved for padding/unseen categories.
if self.shared_embed:
......@@ -167,11 +174,10 @@ class CategoricalEmbeddings(nn.Module):
x = torch.cat(cat_embed, 1)
else:
x = self.embed(X[:, self.cat_idx].long())
if self.bias is not None:
x = x + self.bias.unsqueeze(0)
x = self.dropout(x)
if self.bias is not None:
x = x + self.bias.unsqueeze(0)
return x
......
......@@ -12,10 +12,24 @@ from pytorch_widedeep.preprocessing.base_preprocessor import (
)
def embed_sz_rule(n_cat):
r"""Rule of thumb to pick embedding size corresponding to ``n_cat``. Taken
from fastai's Tabular API"""
return min(600, round(1.6 * n_cat ** 0.56))
def embed_sz_rule(n_cat: int, embedding_rule: str = "fastai_new") -> int:
r"""Rule of thumb to pick embedding size corresponding to ``n_cat``. Default rule is taken
from recent fastai's Tabular API. The function also includes previously used rule by fastai
and rule included in the Google's Tensorflow documentation
Parameters
----------
n_cat: int
number of unique categorical values in a feature
embedding_rule: str, default = 'fastai_new'
rule of thumb to be used for the embedding vector size
"""
if embedding_rule == "google":
return int(round(n_cat ** 0.25))
elif embedding_rule == "fastai_old":
return int(min(50, (n_cat // 2) + 1))
else:
return int(min(600, round(1.6 * n_cat ** 0.56)))
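For intuition, a quick sketch of what the three rules above return for a few (arbitrary) cardinalities, using only the function defined above:

```python
# Compare the three embedding-size rules for a few cardinalities.
for n_cat in (10, 100, 1000):
    sizes = {
        rule: embed_sz_rule(n_cat, embedding_rule=rule)
        for rule in ("fastai_new", "fastai_old", "google")
    }
    print(n_cat, sizes)

# For n_cat=100 this gives roughly: fastai_new -> 21, fastai_old -> 50, google -> 3
```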
class TabPreprocessor(BasePreprocessor):
......@@ -38,8 +52,16 @@ class TabPreprocessor(BasePreprocessor):
:obj:`pytorch_widedeep.models.transformers._embedding_layers`
auto_embed_dim: bool, default = True
Boolean indicating whether the embedding dimensions will be
automatically defined via fastai's rule of thumb':
:math:`min(600, int(1.6 \times n_{cat}^{0.56}))`
automatically defined via a rule of thumb
embedding_rule: str, default = 'fastai_new'
choice of embedding rule of thumb; options are:
- 'fastai_new' -- :math:`min(600, round(1.6 \times n_{cat}^{0.56}))`
- 'fastai_old' -- :math:`min(50, (n_{cat}//{2})+1)`
- 'google' -- :math:`round(n_{cat}^{0.25})`
default_embed_dim: int, default=16
Dimension for the embeddings used for the ``deeptabular``
component if the embed_dim is not provided in the ``embed_cols``
......@@ -118,6 +140,7 @@ class TabPreprocessor(BasePreprocessor):
continuous_cols: List[str] = None,
scale: bool = True,
auto_embed_dim: bool = True,
embedding_rule: str = "fastai_new",
default_embed_dim: int = 16,
already_standard: List[str] = None,
for_transformer: bool = False,
......@@ -131,6 +154,7 @@ class TabPreprocessor(BasePreprocessor):
self.continuous_cols = continuous_cols
self.scale = scale
self.auto_embed_dim = auto_embed_dim
self.embedding_rule = embedding_rule
self.default_embed_dim = default_embed_dim
self.already_standard = already_standard
self.for_transformer = for_transformer
......@@ -250,7 +274,10 @@ class TabPreprocessor(BasePreprocessor):
embed_colname = [emb[0] for emb in self.embed_cols]
elif self.auto_embed_dim:
n_cats = {col: df[col].nunique() for col in self.embed_cols}
self.embed_dim = {col: embed_sz_rule(n_cat) for col, n_cat in n_cats.items()} # type: ignore[misc]
self.embed_dim = {
col: embed_sz_rule(n_cat, self.embedding_rule) # type: ignore[misc]
for col, n_cat in n_cats.items()
}
embed_colname = self.embed_cols # type: ignore
else:
self.embed_dim = {e: self.default_embed_dim for e in self.embed_cols} # type: ignore
......
import os
import json
import warnings
from copy import deepcopy
from pathlib import Path
import numpy as np
......@@ -20,7 +19,6 @@ from pytorch_widedeep.callbacks import (
History,
Callback,
MetricCallback,
RayTuneReporter,
CallbackContainer,
LRShedulerCallback,
)
......@@ -685,8 +683,14 @@ class Trainer:
If a trainer is used to predict after having trained a model, the
``batch_size`` needs to be defined here, as it is not set when
the :obj:`Trainer` is instantiated
uncertainty: bool, default = False
If set to True, the model activates the dropout layers and predicts
each sample N times (uncertainty_granularity times), returning
{max, min, mean, stdev} values for each sample
uncertainty_granularity: int, default = 1000
number of times the model predicts each sample if uncertainty
is set to True
"""
preds_l = self._predict(X_wide, X_tab, X_text, X_img, X_test, batch_size)
if self.method == "regression":
return np.vstack(preds_l).squeeze(1)
......@@ -697,6 +701,96 @@ class Trainer:
preds = np.vstack(preds_l)
return np.argmax(preds, 1) # type: ignore[return-value]
def predict_uncertainty( # type: ignore[return]
self,
X_wide: Optional[np.ndarray] = None,
X_tab: Optional[np.ndarray] = None,
X_text: Optional[np.ndarray] = None,
X_img: Optional[np.ndarray] = None,
X_test: Optional[Dict[str, np.ndarray]] = None,
batch_size: int = 256,
uncertainty_granularity=1000,
) -> np.ndarray:
r"""Returns the predicted ucnertainty of the model for the test dataset using a
Monte Carlo method during which dropout layers are activated in the evaluation/prediction
phase and each sample is predicted N times (uncertainty_granularity times). Based on [1].
[1] Gal Y. & Ghahramani Z., 2016, Dropout as a Bayesian Approximation: Representing Model
Uncertainty in Deep Learning, Proceedings of the 33rd International Conference on Machine Learning
Parameters
----------
X_wide: np.ndarray, Optional. default=None
Input for the ``wide`` model component.
See :class:`pytorch_widedeep.preprocessing.WidePreprocessor`
X_tab: np.ndarray, Optional. default=None
Input for the ``deeptabular`` model component.
See :class:`pytorch_widedeep.preprocessing.TabPreprocessor`
X_text: np.ndarray, Optional. default=None
Input for the ``deeptext`` model component.
See :class:`pytorch_widedeep.preprocessing.TextPreprocessor`
X_img : np.ndarray, Optional. default=None
Input for the ``deepimage`` model component.
See :class:`pytorch_widedeep.preprocessing.ImagePreprocessor`
X_test: Dict, Optional. default=None
The test dataset can also be passed in a dictionary. Keys are
`X_wide`, `'X_tab'`, `'X_text'`, `'X_img'` and `'target'`. Values
are the corresponding matrices.
batch_size: int, default = 256
If a trainer is used to predict after having trained a model, the
``batch_size`` needs to be defined here, as it is not set when
the :obj:`Trainer` is instantiated
uncertainty_granularity: int, default = 1000
number of times the model predicts each sample if uncertainty
is set to True
Returns
-------
method == regression : np.ndarray
{max, min, mean, stdev} values for each sample
method == binary : np.ndarray
{mean_cls_0_prob, mean_cls_1_prob, predicted_cls} values for each sample
method == multiclass : np.ndarray
{mean_cls_0_prob, mean_cls_1_prob, mean_cls_2_prob, ... , predicted_cls} values for each sample
"""
preds_l = self._predict(
X_wide,
X_tab,
X_text,
X_img,
X_test,
batch_size,
uncertainty_granularity,
uncertainty=True,
)
preds = np.vstack(preds_l)
samples_num = int(preds.shape[0] / uncertainty_granularity)
if self.method == "regression":
preds = preds.squeeze(1)
preds = preds.reshape((uncertainty_granularity, samples_num))
return np.array(
(
preds.max(axis=0),
preds.min(axis=0),
preds.mean(axis=0),
preds.std(axis=0),
)
).T
if self.method == "binary":
preds = preds.squeeze(1)
preds = preds.reshape((uncertainty_granularity, samples_num))
preds = preds.mean(axis=0)
probs = np.zeros([preds.shape[0], 3])
probs[:, 0] = 1 - preds
probs[:, 1] = preds
# third column: predicted class, as documented in the Returns section
probs[:, 2] = (preds > 0.5).astype("int")
return probs
if self.method == "multiclass":
preds = preds.reshape(uncertainty_granularity, samples_num, preds.shape[1])
preds = preds.mean(axis=0)
preds = np.hstack((preds, np.vstack(np.argmax(preds, 1))))
return preds
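For reference, a minimal usage sketch of ``predict_uncertainty`` as defined above; the fitted ``trainer`` and the preprocessed test matrix are assumed to exist already, following the usual pytorch-widedeep workflow.

```python
# Hedged sketch: assumes a fitted Trainer instance ("trainer") for a binary problem
# and a test matrix X_tab_test produced by the same TabPreprocessor used for training.
unc_preds = trainer.predict_uncertainty(
    X_tab=X_tab_test,
    uncertainty_granularity=100,  # number of MC-dropout forward passes per sample
)
# For a binary objective the columns are: mean_cls_0_prob, mean_cls_1_prob, predicted_cls
print(unc_preds.shape, unc_preds[:5])
```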
def predict_proba( # type: ignore[return]
self,
X_wide: Optional[np.ndarray] = None,
......@@ -944,14 +1038,11 @@ class Trainer:
for callback in self.callback_container.callbacks:
if callback.__class__.__name__ == "ModelCheckpoint":
if callback.save_best_only:
filepath = "{}_{}.p".format(
callback.filepath, callback.best_epoch + 1
)
if self.verbose:
print(
f"Model weights restored to best epoch: {callback.best_epoch + 1}"
)
self.model.load_state_dict(torch.load(filepath))
self.model.load_state_dict(callback.best_state_dict)
else:
if self.verbose:
print(
......@@ -1104,7 +1195,7 @@ class Trainer:
k: v for k, v in zip(tabnet_backbone.column_idx.keys(), feat_imp) # type: ignore[operator, union-attr]
}
def _predict(
def _predict( # noqa: C901
self,
X_wide: Optional[np.ndarray] = None,
X_tab: Optional[np.ndarray] = None,
......@@ -1112,6 +1203,8 @@ class Trainer:
X_img: Optional[np.ndarray] = None,
X_test: Optional[Dict[str, np.ndarray]] = None,
batch_size: int = 256,
uncertainty_granularity=1000,
uncertainty: bool = False,
) -> List:
r"""Private method to avoid code repetition in predict and
predict_proba. For parameter information, please, see the .predict()
......@@ -1144,20 +1237,41 @@ class Trainer:
self.model.eval()
preds_l = []
if uncertainty:
for m in self.model.modules():
if m.__class__.__name__.startswith("Dropout"):
m.train()
prediction_iters = uncertainty_granularity
else:
prediction_iters = 1
with torch.no_grad():
with trange(test_steps, disable=self.verbose != 1) as t:
for i, data in zip(t, test_loader):
t.set_description("predict")
X = {k: v.cuda() for k, v in data.items()} if use_cuda else data
preds = (
self.model(X) if not self.model.is_tabnet else self.model(X)[0]
)
if self.method == "binary":
preds = torch.sigmoid(preds)
if self.method == "multiclass":
preds = F.softmax(preds, dim=1)
preds = preds.cpu().data.numpy()
preds_l.append(preds)
with trange(uncertainty_granularity, disable=uncertainty is False) as t:
for i, k in zip(t, range(prediction_iters)):
t.set_description("predict_UncertaintyIter")
with trange(
test_steps, disable=self.verbose != 1 or uncertainty is True
) as tt:
for j, data in zip(tt, test_loader):
tt.set_description("predict")
X = (
{k: v.cuda() for k, v in data.items()}
if use_cuda
else data
)
preds = (
self.model(X)
if not self.model.is_tabnet
else self.model(X)[0]
)
if self.method == "binary":
preds = torch.sigmoid(preds)
if self.method == "multiclass":
preds = F.softmax(preds, dim=1)
preds = preds.cpu().data.numpy()
preds_l.append(preds)
self.model.train()
return preds_l
......
__version__ = "1.0.11"
__version__ = "1.0.12"
......@@ -81,6 +81,7 @@ setup_kwargs = {
"Topic :: Scientific/Engineering :: Artificial Intelligence",
],
"zip_safe": True,
"package_data": {"pytorch_widedeep": ["datasets/data/*"]},
"packages": setuptools.find_packages(exclude=["test_*.py"]),
}
......
......@@ -269,7 +269,15 @@ def test_notfittederror():
###############################################################################
def test_embed_sz_rule_of_thumb():
@pytest.mark.parametrize(
"rule",
[
("google"),
("fastai_old"),
("fastai_new"),
],
)
def test_embed_sz_rule_of_thumb(rule):
embed_cols = ["col1", "col2"]
df = pd.DataFrame(
......@@ -279,8 +287,8 @@ def test_embed_sz_rule_of_thumb():
}
)
n_cats = {c: df[c].nunique() for c in ["col1", "col2"]}
embed_szs = {c: embed_sz_rule(nc) for c, nc in n_cats.items()}
tab_preprocessor = TabPreprocessor(embed_cols=embed_cols)
embed_szs = {c: embed_sz_rule(nc, embedding_rule=rule) for c, nc in n_cats.items()}
tab_preprocessor = TabPreprocessor(embed_cols=embed_cols, embedding_rule=rule)
tdf = tab_preprocessor.fit_transform(df) # noqa: F841
out = [
tab_preprocessor.embed_dim[col] == embed_szs[col] for col in embed_szs.keys()
......
import numpy as np
import pandas as pd
import pytest
from pytorch_widedeep.datasets import load_adult, load_bio_kdd04
@pytest.mark.parametrize(
"as_frame",
[
(True),
(False),
],
)
def test_load_bio_kdd04(as_frame):
df = load_bio_kdd04(as_frame=as_frame)
if as_frame:
assert (df.shape, type(df)) == ((145751, 77), pd.DataFrame)
else:
assert (df.shape, type(df)) == ((145751, 77), np.ndarray)
@pytest.mark.parametrize(
"as_frame",
[
(True),
(False),
],
)
def test_load_adult(as_frame):
df = load_adult(as_frame=as_frame)
if as_frame:
assert (df.shape, type(df)) == ((48842, 15), pd.DataFrame)
else:
assert (df.shape, type(df)) == ((48842, 15), np.ndarray)
......@@ -168,9 +168,15 @@ def test_early_stop():
# Test that ModelCheckpoint behaves as expected
###############################################################################
@pytest.mark.parametrize(
"save_best_only, max_save, n_files", [(True, 2, 2), (False, 2, 2), (False, 0, 5)]
"fpath, save_best_only, max_save, n_files",
[
("tests/test_model_functioning/weights/test_weights", True, 2, 2),
("tests/test_model_functioning/weights/test_weights", False, 2, 2),
("tests/test_model_functioning/weights/test_weights", False, 0, 5),
(None, False, 0, 0),
],
)
def test_model_checkpoint(save_best_only, max_save, n_files):
def test_model_checkpoint(fpath, save_best_only, max_save, n_files):
wide = Wide(np.unique(X_wide).shape[0], 1)
deeptabular = TabMlp(
mlp_hidden_dims=[32, 16],
......@@ -185,7 +191,7 @@ def test_model_checkpoint(save_best_only, max_save, n_files):
objective="binary",
callbacks=[
ModelCheckpoint(
"tests/test_model_functioning/weights/test_weights",
filepath=fpath,
save_best_only=save_best_only,
max_save=max_save,
)
......@@ -193,10 +199,11 @@ def test_model_checkpoint(save_best_only, max_save, n_files):
verbose=0,
)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, n_epochs=5, val_split=0.2)
n_saved = len(os.listdir("tests/test_model_functioning/weights/"))
shutil.rmtree("tests/test_model_functioning/weights/")
if fpath:
n_saved = len(os.listdir("tests/test_model_functioning/weights/"))
shutil.rmtree("tests/test_model_functioning/weights/")
else:
n_saved = 0
assert n_saved <= n_files
......@@ -340,6 +347,7 @@ def test_modelcheckpoint_mode_options():
model_checkpoint_2 = ModelCheckpoint(filepath=fpath, monitor="val_loss")
model_checkpoint_3 = ModelCheckpoint(filepath=fpath, monitor="acc", mode="max")
model_checkpoint_4 = ModelCheckpoint(filepath=fpath, monitor="acc")
model_checkpoint_5 = ModelCheckpoint(filepath=None, monitor="acc")
is_min = model_checkpoint_1.monitor_op is np.less
best_inf = model_checkpoint_1.best is np.Inf
......@@ -349,6 +357,8 @@ def test_modelcheckpoint_mode_options():
best_minus_inf = -model_checkpoint_3.best == np.Inf
auto_is_max = model_checkpoint_4.monitor_op is np.greater
auto_best_minus_inf = -model_checkpoint_4.best == np.Inf
auto_is_max = model_checkpoint_5.monitor_op is np.greater
auto_best_minus_inf = -model_checkpoint_5.best == np.Inf
shutil.rmtree("tests/test_model_functioning/modelcheckpoint/")
......@@ -478,6 +488,16 @@ def test_early_stopping_get_state():
def test_ray_tune_reporter():
rt_wide = Wide(np.unique(X_wide).shape[0], 1)
rt_deeptabular = TabMlp(
mlp_hidden_dims=[32, 16],
mlp_dropout=[0.5, 0.5],
column_idx=column_idx,
embed_input=embed_input,
continuous_cols=colnames[-5:],
)
rt_model = WideDeep(wide=rt_wide, deeptabular=rt_deeptabular)
config = {
"batch_size": tune.grid_search([8, 16]),
}
......@@ -486,7 +506,7 @@ def test_ray_tune_reporter():
batch_size = config["batch_size"]
trainer = Trainer(
model,
rt_model,
objective="binary",
callbacks=[RayTuneReporter],
verbose=0,
......@@ -503,7 +523,9 @@ def test_ray_tune_reporter():
analysis = tune.run(
tune.with_parameters(training_function),
config=config,
resources_per_trial={"cpu": 1, "gpu": 0},
resources_per_trial={"cpu": 1, "gpu": 0}
if not torch.cuda.is_available()
else {"cpu": 0, "gpu": 1},
verbose=0,
)
......
......@@ -43,14 +43,14 @@ X_test = {"X_wide": X_wide, "X_tab": X_tab}
# work well
##############################################################################
@pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim",
"X_wide, X_tab, target, objective, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3),
(X_wide, X_tab, target_regres, "regression", None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_test, 3, 3, 4),
],
)
def test_fit_objectives(
......@@ -58,11 +58,10 @@ def test_fit_objectives(
X_tab,
target,
objective,
X_wide_test,
X_tab_test,
X_test,
pred_dim,
probs_dim,
uncertainties_pred_dim,
):
wide = Wide(np.unique(X_wide).shape[0], pred_dim)
deeptabular = TabMlp(
......@@ -76,11 +75,22 @@ def test_fit_objectives(
trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary":
pass
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
assert preds.shape[0] == 32, probs.shape[1] == probs_dim
assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
##############################################################################
......@@ -100,7 +110,10 @@ def test_fit_with_deephead():
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target_binary, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
assert preds.shape[0] == 32, probs.shape[1] == 2
unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (32, 2, 3)
##############################################################################
......@@ -109,14 +122,14 @@ def test_fit_with_deephead():
@pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim",
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3),
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3, 4),
],
)
def test_fit_objectives_tab_transformer(
......@@ -129,6 +142,7 @@ def test_fit_objectives_tab_transformer(
X_test,
pred_dim,
probs_dim,
uncertainties_pred_dim,
):
wide = Wide(np.unique(X_wide).shape[0], pred_dim)
tab_transformer = TabTransformer(
......@@ -140,11 +154,22 @@ def test_fit_objectives_tab_transformer(
trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary":
pass
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
assert preds.shape[0] == 32, probs.shape[1] == probs_dim
assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
##############################################################################
......@@ -153,14 +178,14 @@ def test_fit_objectives_tab_transformer(
@pytest.mark.parametrize(
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim",
"X_wide, X_tab, target, objective, X_wide_test, X_tab_test, X_test, pred_dim, probs_dim, uncertainties_pred_dim",
[
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3),
(X_wide, X_tab, target_regres, "regression", X_wide, X_tab, None, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", X_wide, X_tab, None, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", X_wide, X_tab, None, 3, 3, 4),
(X_wide, X_tab, target_regres, "regression", None, None, X_test, 1, None, 4),
(X_wide, X_tab, target_binary, "binary", None, None, X_test, 1, 2, 3),
(X_wide, X_tab, target_multic, "multiclass", None, None, X_test, 3, 3, 4),
],
)
def test_fit_objectives_tabnet(
......@@ -173,6 +198,7 @@ def test_fit_objectives_tabnet(
X_test,
pred_dim,
probs_dim,
uncertainties_pred_dim,
):
warnings.filterwarnings("ignore")
wide = Wide(np.unique(X_wide).shape[0], pred_dim)
......@@ -185,11 +211,22 @@ def test_fit_objectives_tabnet(
trainer = Trainer(model, objective=objective, verbose=0)
trainer.fit(X_wide=X_wide, X_tab=X_tab, target=target, batch_size=16)
preds = trainer.predict(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
if objective == "binary":
pass
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
unc_preds = trainer.predict_uncertainty(
X_wide=X_wide, X_tab=X_tab, X_test=X_test, uncertainty_granularity=5
)
if objective == "regression":
assert (preds.shape[0], probs, unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
else:
probs = trainer.predict_proba(X_wide=X_wide, X_tab=X_tab, X_test=X_test)
assert preds.shape[0] == 32, probs.shape[1] == probs_dim
assert (preds.shape[0], probs.shape[1], unc_preds.shape[1]) == (
32,
probs_dim,
uncertainties_pred_dim,
)
##############################################################################
......