From 5d7d6c8aa56483e1d8e690bdaa46d1851f2638f9 Mon Sep 17 00:00:00 2001 From: Javier Date: Mon, 23 Oct 2017 21:17:34 +0100 Subject: [PATCH] fix typos --- demo1_prepare_data.ipynb | 24 ++++++++++++------------ demo2_building_blocks.ipynb | 4 ++-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/demo1_prepare_data.ipynb b/demo1_prepare_data.ipynb index 422d3d4..9fface4 100644 --- a/demo1_prepare_data.ipynb +++ b/demo1_prepare_data.ipynb @@ -199,7 +199,7 @@ "\n", "DF = pd.read_csv('data/adult_data.csv')\n", "\n", - "# Let's create a feature that will be our target for logistic classification\n", + "# Let's create a feature that will be our target for logistic regression\n", "DF['income_label'] = (DF[\"income_bracket\"].apply(lambda x: \">50K\" in x)).astype(int)\n", "\n", "DF.head()" @@ -213,7 +213,7 @@ "\n", "We need to define the columns in the dataset that will be passed to the *\"wide-\"* and the *\"deep-side\"* of the model. For more details of what I mean by \"wide\" and \"deep\" I recommend either to read [this tutorial](https://www.tensorflow.org/tutorials/wide_and_deep), the [original paper](https://arxiv.org/pdf/1606.07792.pdf) or the demo2 in this repo. \n", "\n", - "In the example below, the wide and crossed column will be passed to the wide side of the model while the embedding_cols and continuous columns will go through the deep side. \n", + "In the example below, the wide and crossed columns will be passed to the wide side of the model, while the embedding columns and continuous columns will go through the deep side. \n", "\n", "We also need to state our target and the method that will be used to fit/predict that target (regression, logistic or multiclass)."
] @@ -253,13 +253,13 @@ }, "outputs": [], "source": [ - "# If embeddings_cols does not include the number of embeddings, it will be set as\n", - "# def_dim (8)\n", - "if len(embeddings_cols[0]) == 1:\n", - " emb_dim = {e:def_dim for e in embeddings_cols}\n", - "else:\n", + "# If embeddings_cols does not include the embedding dimensions, they will be set to\n", + "# def_dim\n", + "if type(embeddings_cols[0]) is tuple:\n", " emb_dim = dict(embeddings_cols)\n", " embeddings_cols = [emb[0] for emb in embeddings_cols]\n", + "else:\n", + " emb_dim = {e:def_dim for e in embeddings_cols}\n", "deep_cols = embeddings_cols+continuous_cols" ] }, @@ -335,7 +335,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "When we one-hot encode this feature later, it will be only 1 if and only if the two constituent features are 1. In other words, the level `Bachelors-Adm-clerical` of the `education_occupation` feature will be 1 if and only if for that particular observation `education=Bachelors` AND `occupation=Adm-clerical`." + "When we one-hot encode this feature later, it will be 1 *if and only if* the two constituent features are 1. In other words, the level `Bachelors-Adm-clerical` of the `education_occupation` feature will be 1 *if and only if* for that particular observation `education=Bachelors` AND `occupation=Adm-clerical`."
] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 8, "metadata": { "collapsed": false }, @@ -482,7 +482,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 9, "metadata": { "collapsed": false }, @@ -519,7 +519,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 10, "metadata": { "collapsed": false }, @@ -538,7 +538,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 11, "metadata": { "collapsed": false }, diff --git a/demo2_building_blocks.ipynb b/demo2_building_blocks.ipynb index 257fd6f..afed301 100644 --- a/demo2_building_blocks.ipynb +++ b/demo2_building_blocks.ipynb @@ -399,7 +399,7 @@ "\n", "Still, the deep part implemented here will be comprised by two layers of 100 and 50 neurons, so strictly speaking and under today's standards, is not very \"deep\". \n", "\n", - "As mentioned earlier, the deep part receives embeddings and can also receive numerical features if one likes. The set up of the deep part is \"stored\" in our favourite dictionar wd_dataset. There we have to entries: " + "As mentioned earlier, the deep part receives embeddings and can also receive numerical features if one likes. The setup of the deep part is \"stored\" in our favourite dictionary `wd_dataset`. There we have two entries: " ] }, { @@ -427,7 +427,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "These should be read as follows: the feature `workclass` has 9 unique values and it will be represented using 10 embeddings. In addition, in the input tensor to the deep part, is at column 2. With this information, plus continuous columns list at the beginning of this notebook, we can build the deep part of the model.\n", + "These should be read as follows: the feature `workclass` has 9 unique values and it will be represented using 10 embeddings. In addition, in the input tensor to the deep part, `workclass` is at column 2. 
With this information, plus the continuous columns list at the beginning of this notebook, we can build the deep part of the model.\n", "\n", "In pytorch, embedding layers are defined as:" ] -- GitLab
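For reference, the `embeddings_cols` logic that the first hunk of this patch corrects can be sketched as a standalone function. This is only a sketch of the behaviour shown in the diff: the helper name `normalize_embeddings` is hypothetical, and the default of 8 is taken from the `def_dim (8)` comment in the removed lines.

```python
def normalize_embeddings(embeddings_cols, continuous_cols, def_dim=8):
    # Entries may be plain column names or (name, dim) tuples.
    # The patch replaces the broken len(...) == 1 check with a tuple check.
    if type(embeddings_cols[0]) is tuple:
        # Dimensions were given explicitly: keep them and strip the names out.
        emb_dim = dict(embeddings_cols)
        embeddings_cols = [emb[0] for emb in embeddings_cols]
    else:
        # No dimensions given: every embedding column defaults to def_dim.
        emb_dim = {e: def_dim for e in embeddings_cols}
    deep_cols = embeddings_cols + continuous_cols
    return emb_dim, deep_cols
```

For example, `normalize_embeddings([('workclass', 10)], ['age'])` yields `({'workclass': 10}, ['workclass', 'age'])`, while `normalize_embeddings(['workclass'], ['age'])` falls back to the default dimension of 8.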
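The crossed-column behaviour described in the demo1 markdown (`education_occupation` is 1 if and only if both constituent features match) can be illustrated with a minimal sketch. The function name `crossed_one_hot` is hypothetical and is not part of the notebooks; it only demonstrates the one-hot semantics stated in the text.

```python
def crossed_one_hot(education, occupation, level="Bachelors-Adm-clerical"):
    # The crossed level fires (returns 1) if and only if BOTH
    # constituent features take the values named in the level.
    edu_level, occ_level = level.split("-", 1)
    return int(education == edu_level and occupation == occ_level)
```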