diff --git a/docs/callbacks.rst b/docs/callbacks.rst index 2ba47533aa625522273f5cb3777acadfb36541e1..e8e65c4515292f51155394c96fae2a28038eb20f 100644 --- a/docs/callbacks.rst +++ b/docs/callbacks.rst @@ -1,8 +1,8 @@ Callbacks ========= -Here are the 4 callbacks available in ``pytorch-widedepp``: ``History``, -``LRHistory``, ``ModelCheckpoint`` and ``EarlyStopping``. +Here are the 5 callbacks available in ``pytorch-widedeep``: ``History``, +``LRHistory``, ``ModelCheckpoint``, ``EarlyStopping`` and ``RayTuneReporter``. .. note:: ``History`` runs by default, so it should not be passed to the ``Trainer`` diff --git a/docs/examples.rst b/docs/examples.rst index 3af3f85a3c40e689d0d885a7920c036a022920f6..ce5a8024dd00a6b01f14bd8e1d132c4acec59a17 100644 --- a/docs/examples.rst +++ b/docs/examples.rst @@ -17,3 +17,4 @@ them to address different problems * `Using Custom DataLoaders and Torchmetrics `__ * `The Transformer Family `__ * `Extracting Embeddings `__ +* `HyperParameter Tuning With RayTune `__ diff --git a/docs/index.rst b/docs/index.rst index 67f70f6b6cc772b1c310bb9c0d2441ec4123f0af..5153aaf9210df8dfe1ab37754f8d6ac87d9dea54 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -33,11 +33,11 @@ Introduction `_. In general terms, ``pytorch-widedeep`` is a package to use deep learning with -tabular data. In particular, is intended to facilitate the combination of text -and images with corresponding tabular data using wide and deep models. With -that in mind there are a number of architectures that can be implemented with -just a few lines of code. The main components of those architectures are shown -in the Figure below: +tabular and multimodal data. In particular, it is intended to facilitate the +combination of text and images with corresponding tabular data using wide and +deep models. With that in mind there are a number of architectures that can +be implemented with just a few lines of code. The main components of those +architectures are shown in the Figure below: .. 
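The callbacks hunk above adds ``RayTuneReporter`` to the list of available callbacks. As a minimal illustration of the pattern such callbacks follow (not ``pytorch-widedeep``'s actual implementation, and with hypothetical names), here is a sketch of the patience-counter logic behind an early-stopping callback:

```python
# Hypothetical sketch of the early-stopping pattern a callback such as
# ``EarlyStopping`` implements: track the best validation loss seen so far
# and stop once it fails to improve for ``patience`` consecutive epochs.
class EarlyStoppingSketch:
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # epochs to wait without improvement
        self.min_delta = min_delta  # minimum change that counts as improvement
        self.best = float("inf")
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss    # improvement: reset the counter
            self.wait = 0
        else:
            self.wait += 1          # no improvement this epoch
            if self.wait >= self.patience:
                self.stop_training = True


# Simulated validation losses: training halts after two flat epochs.
stopper = EarlyStoppingSketch(patience=2)
for epoch, loss in enumerate([0.9, 0.7, 0.71, 0.72, 0.5]):
    stopper.on_epoch_end(epoch, loss)
    if stopper.stop_training:
        break
```

The real callbacks hook into the ``Trainer`` loop the same way, via per-epoch hooks.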
image:: figures/widedeep_arch.png :width: 700px @@ -88,29 +88,52 @@ into: It is important to emphasize that **each individual component, wide, -deeptabular, deeptext and deepimage, can be used independently** and in -isolation. For example, one could use only ``wide``, which is in simply a -linear model. In fact, one of the most interesting offerings of -``pytorch-widedeep`` is the ``deeptabular`` component. Currently, -``pytorch-widedeep`` offers 4 models for that component: - -1. ``TabMlp``: this is almost identical to the `tabular -model `_ in the fantastic -`fastai `_ library, and consists simply in embeddings -representing the categorical features, concatenated with the continuous -features, and passed then through a MLP. - -2. ``TabRenset``: This is similar to the previous model but the embeddings are +deeptabular, deeptext and deepimage, can be used independently and in +isolation**. For example, one could use only ``wide``, which is simply a +linear model. In fact, one of the most interesting functionalities in +``pytorch-widedeep`` would be the use of the ``deeptabular`` component on its +own, i.e. what one might normally refer to as Deep Learning for Tabular Data. +Currently, ``pytorch-widedeep`` offers the following different models for +that component: + + +1. **TabMlp**: a simple MLP that receives embeddings representing the +categorical features, concatenated with the continuous features. + +2. **TabResnet**: similar to the previous model but the embeddings are passed through a series of ResNet blocks built with dense layers. -3. ``Tabnet``: Details on TabNet can be found in: `TabNet: Attentive -Interpretable Tabular Learning `_. +3. **TabNet**: details on TabNet can be found in `TabNet: Attentive +Interpretable Tabular Learning `_ -4. ``TabTransformer``: Details on the TabTransformer can be found in: +And the ``Tabformer`` family, i.e. Transformers for Tabular data: + +4. 
**TabTransformer**: details on the TabTransformer can be found in `TabTransformer: Tabular Data Modeling Using Contextual Embeddings `_. -For details on these 4 models and their options please see the examples in the +5. **SAINT**: details on SAINT can be found in `SAINT: Improved Neural +Networks for Tabular Data via Row Attention and Contrastive Pre-Training +`_. + +6. **FT-Transformer**: details on the FT-Transformer can be found in +`Revisiting Deep Learning Models for Tabular Data +`_. + +7. **TabFastFormer**: adaptation of the FastFormer for tabular data. Details +on the FastFormer can be found in `FastFormers: Highly Efficient Transformer +Models for Natural Language Understanding +`_ + +8. **TabPerceiver**: adaptation of the Perceiver for tabular data. Details on +the Perceiver can be found in `Perceiver: General Perception with Iterative +Attention `_ + +Note that while there are scientific publications for the TabTransformer, +SAINT and FT-Transformer, the TabFastFormer and TabPerceiver are our own +adaptations of those algorithms for tabular data. + +For details on these models and their options please see the examples in the Examples folder and the documentation. Finally, while I recommend using the ``wide`` and ``deeptabular`` models in @@ -120,13 +143,8 @@ possible as long as the the custom models have an attribute called ``output_dim`` with the size of the last layer of activations, so that ``WideDeep`` can be constructed. Again, examples on how to use custom components can be found in the Examples folder. Just in case -``pytorch-widedeep`` includes standard text (stack of LSTMs) and image -(pre-trained ResNets or stack of CNNs) models. +``pytorch-widedeep`` includes standard text (stack of LSTMs or GRUs) and +image (pre-trained ResNets or stack of CNNs) models. 
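The hunks above describe ``TabMlp`` (embeddings for the categoricals concatenated with the continuous features, passed through an MLP) and note that custom components must expose an ``output_dim`` attribute. A hedged sketch of both ideas in plain PyTorch (the class and all names here are hypothetical, not the library's actual implementation):

```python
import torch
import torch.nn as nn

# Illustrative sketch of the TabMlp idea: one embedding table per categorical
# column, the lookups concatenated with the continuous features and fed to an
# MLP. ``output_dim`` mirrors the requirement that a custom deeptabular
# component exposes the size of its last layer of activations to ``WideDeep``.
class TinyTabMlp(nn.Module):
    def __init__(self, cardinalities, embed_dim, n_cont, hidden=32):
        super().__init__()
        self.embeds = nn.ModuleList(
            [nn.Embedding(card, embed_dim) for card in cardinalities]
        )
        in_dim = embed_dim * len(cardinalities) + n_cont
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        self.output_dim = hidden  # so a wrapping model can size its head

    def forward(self, x_cat, x_cont):
        # one lookup per categorical column, then concatenate everything
        embedded = [emb(x_cat[:, i]) for i, emb in enumerate(self.embeds)]
        return self.mlp(torch.cat(embedded + [x_cont], dim=1))


# Two categorical columns (cardinalities 5 and 3), two continuous columns.
model = TinyTabMlp(cardinalities=[5, 3], embed_dim=4, n_cont=2)
out = model(torch.tensor([[1, 2], [4, 0]]), torch.randn(2, 2))
```

Any custom component following this contract (a ``forward`` producing activations plus an ``output_dim`` attribute) fits the pattern the text describes.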
Indices and tables ================== diff --git a/docs/installation.rst b/docs/installation.rst index 11500c0f9855cbf731ddf2508e24fcde96788fff..a4e2ad8843f18f6d8f73bb176e850c24b5984b1e 100644 --- a/docs/installation.rst +++ b/docs/installation.rst @@ -41,4 +41,5 @@ Dependencies * torchvision * einops * wrapt -* torchmetrics \ No newline at end of file +* torchmetrics +* ray[tune] diff --git a/docs/requirements.txt b/docs/requirements.txt index bab197235030907c1efdbe9be159c0b030b0f7f9..4fde482b8868cc688400c156da6974de512579a6 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -17,4 +17,5 @@ torch torchvision einops wrapt -torchmetrics \ No newline at end of file +torchmetrics +ray[tune] \ No newline at end of file
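Since the hunks above add ``ray[tune]`` to the dependency and requirements lists, a typical install of the library together with that extra might look as follows (assuming the usual PyPI package names):

```shell
# Install pytorch-widedeep plus Ray Tune for hyperparameter tuning.
# Quote the bracket expression so the shell does not try to glob it.
pip install pytorch-widedeep "ray[tune]"
```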