提交 b1e34495 编写于 作者: M Moez Ali

adding tutorials in dev-1.0.1

上级 099f10c1
因为 它太大了无法显示 source diff 。你可以改为 查看blob
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Anomaly Detection Tutorial (ANO103) - Level Expert</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Anomaly Detection Tutorial (ANO101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Anomaly%20Detection%20Tutorial%20Level%20Beginner%20-%20ANO101.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Anomaly Detection Tutorial (ANO102) - Level Intermediate</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Anomaly Detection Tutorial (ANO101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Anomaly%20Detection%20Tutorial%20Level%20Beginner%20-%20ANO101.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Association Rule Mining Tutorial (ARUL101)</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# 1.0 Objective of Tutorial\n",
"Welcome to Association Rule Mining Tutorial (#ARUL101). This tutorial assumes that you are new to PyCaret and looking to get started with Association Rule Mining using `pycaret.arules` Module.\n",
"\n",
"In this tutorial we will learn:\n",
"\n",
"\n",
"* **Getting Data:** How to import data from PyCaret repository?\n",
"* **Setting up Environment:** How to setup experiment in PyCaret to get started with association rule mining?\n",
"* **Create Model:** How to create a model to evaluate results?\n",
"* **Plot Model:** How to analyze model using various plots?\n",
"\n",
"Read Time : Approx. 15 Minutes\n",
"\n",
"\n",
"## 1.1 Installing PyCaret\n",
"First step to get started with PyCaret is to install pycaret. Installing pycaret is easy and take few minutes only. Follow the instructions below:\n",
"\n",
"#### Installing PyCaret in Local Jupyter Notebook\n",
"`pip install pycaret` <br />\n",
"\n",
"#### Installing PyCaret on Google Colab or Azure Notebooks\n",
"`!pip install pycaret`\n",
"\n",
"\n",
"## 1.2 Pre-Requisites\n",
"- Python 3.x\n",
"- Latest version of pycaret\n",
"- Internet connection to load data from pycaret's repository\n",
"- Basic Knowledge of Association Rule Mining\n",
"\n",
"## 1.3 For Google colab users:\n",
"If you are running this notebook on Google colab, below code of cells must be run at top of the notebook to display interactive visuals.<br/>\n",
"<br/>\n",
"`from pycaret.utils import enable_colab` <br/>\n",
"`enable_colab()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2.0 What is Association Rule Mining?\n",
"Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. For example, the rule {onions, potatoes} --> {burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy burger. Such information can be used as the basis for decisions about marketing activities such as, e.g., promotional pricing or product placements.\n",
"\n",
"__[Learn More about Association Rule Mining](https://en.wikipedia.org/wiki/Association_rule_learning)__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3.0 Overview of Association Rule Module in PyCaret\n",
"PyCaret's association rule module (`pycaret.arules`) is a supervised machine learning module which is used for discovering interesting relations between variables in dataset. This module automatically transforms any transactional database into shape that is acceptable for apriori algorithm. Apriori is an algorithm for frequent item set mining and association rule learning over relational databases."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4.0 Dataset for the Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For this tutorial we will use a small sample from UCI dataset called **Online Retail Dataset**. This is a transactional dataset which contains transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers. Short description of attributes are as follows:\n",
"\n",
"- **InvoiceNo:** Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with letter 'c', it indicates a cancellation.\n",
"- **StockCode:** Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product.\n",
"- **Description:** Product (item) name. Nominal.\n",
"- **Quantity:** The quantities of each product (item) per transaction. Numeric.\n",
"- **InvoiceData:** Invice Date and time. Numeric, the day and time when each transaction was generated.\n",
"- **UnitPrice:** Unit price. Numeric, Product price per unit in sterling.\n",
"- **CustomerID:** Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer.\n",
"- **Country:** Country name. Nominal, the name of the country where each customer resides.\n",
"\n",
"#### Dataset Acknowledgement:\n",
"Dr Daqing Chen, Director: Public Analytics group. chend@lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.\n",
"\n",
"\n",
"The original dataset and data dictionary can be __[found here.](http://archive.ics.uci.edu/ml/datasets/online+retail)__ "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5.0 Getting the Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can download the data from the original source __[found here](http://archive.ics.uci.edu/ml/datasets/online+retail)__ and load it using pandas __[(Learn How)](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)__ or you can use PyCaret's data respository to load the data using `get_data()` function (This will require internet connection)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>InvoiceNo</th>\n",
" <th>StockCode</th>\n",
" <th>Description</th>\n",
" <th>Quantity</th>\n",
" <th>InvoiceDate</th>\n",
" <th>UnitPrice</th>\n",
" <th>CustomerID</th>\n",
" <th>Country</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>536370</td>\n",
" <td>22728</td>\n",
" <td>ALARM CLOCK BAKELIKE PINK</td>\n",
" <td>24</td>\n",
" <td>12/1/2010 8:45</td>\n",
" <td>3.75</td>\n",
" <td>12583.0</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>536370</td>\n",
" <td>22727</td>\n",
" <td>ALARM CLOCK BAKELIKE RED</td>\n",
" <td>24</td>\n",
" <td>12/1/2010 8:45</td>\n",
" <td>3.75</td>\n",
" <td>12583.0</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>536370</td>\n",
" <td>22726</td>\n",
" <td>ALARM CLOCK BAKELIKE GREEN</td>\n",
" <td>12</td>\n",
" <td>12/1/2010 8:45</td>\n",
" <td>3.75</td>\n",
" <td>12583.0</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>536370</td>\n",
" <td>21724</td>\n",
" <td>PANDA AND BUNNIES STICKER SHEET</td>\n",
" <td>12</td>\n",
" <td>12/1/2010 8:45</td>\n",
" <td>0.85</td>\n",
" <td>12583.0</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>536370</td>\n",
" <td>21883</td>\n",
" <td>STARS GIFT TAPE</td>\n",
" <td>24</td>\n",
" <td>12/1/2010 8:45</td>\n",
" <td>0.65</td>\n",
" <td>12583.0</td>\n",
" <td>France</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" InvoiceNo StockCode Description Quantity \\\n",
"0 536370 22728 ALARM CLOCK BAKELIKE PINK 24 \n",
"1 536370 22727 ALARM CLOCK BAKELIKE RED 24 \n",
"2 536370 22726 ALARM CLOCK BAKELIKE GREEN 12 \n",
"3 536370 21724 PANDA AND BUNNIES STICKER SHEET 12 \n",
"4 536370 21883 STARS GIFT TAPE 24 \n",
"\n",
" InvoiceDate UnitPrice CustomerID Country \n",
"0 12/1/2010 8:45 3.75 12583.0 France \n",
"1 12/1/2010 8:45 3.75 12583.0 France \n",
"2 12/1/2010 8:45 3.75 12583.0 France \n",
"3 12/1/2010 8:45 0.85 12583.0 France \n",
"4 12/1/2010 8:45 0.65 12583.0 France "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from pycaret.datasets import get_data\n",
"data = get_data('france')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** If you are downloading the data from original source, you have to filter `Country` for 'France' only, if you wish to reproduce the results in this experiment."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(8557, 8)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#check the shape of data\n",
"data.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6.0 Setting up Environment in PyCaret"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`setup()` function initializes the environment in pycaret and transforms the transactional dataset into shape that is acceptable to Apriori algorithm. It requires two mandatory parameters: `transaction_id` which is the name of column representing transaction id and will be used to pivot the matrix; and `item_id` which is the name of column used for creation of rules. Normally, this will be the variable of interest. You can also pass an optional parameter `ignore_items` to ignore certain values for creation of rule."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from pycaret.arules import *"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"<style type=\"text/css\" >\n",
"</style><table id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7\" ><thead> <tr> <th class=\"col_heading level0 col0\" >Description</th> <th class=\"col_heading level0 col1\" >Value</th> </tr></thead><tbody>\n",
" <tr>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row0_col0\" class=\"data row0 col0\" >session_id</td>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row0_col1\" class=\"data row0 col1\" >6209</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row1_col0\" class=\"data row1 col0\" ># Transactions</td>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row1_col1\" class=\"data row1 col1\" >461</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row2_col0\" class=\"data row2 col0\" ># Items</td>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row2_col1\" class=\"data row2 col1\" >1565</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row3_col0\" class=\"data row3 col0\" >Ignore Items</td>\n",
" <td id=\"T_eb6fed18_5808_11ea_9795_84fdd13f47d7row3_col1\" class=\"data row3 col1\" >None</td>\n",
" </tr>\n",
" </tbody></table>"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x1b5242fb208>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"exp_arul101 = setup(data = data, \n",
" transaction_id = 'InvoiceNo',\n",
" item_id = 'Description') "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the setup is succesfully executed it prints the information grid that contains few important information:\n",
"\n",
"- **# Transactions :** Unique number of transactions in the dataset. In this case unique `InvoiceNo`. <br/>\n",
"<br/>\n",
"- **# Items :** Unique number of items in the dataset. In this case `Description`. <br/>\n",
"<br/>\n",
"- **Ignore Items :** Items to be ignored in rule mining. Many times there are relations which are too obvious and you might want to ignore them for this analysis. For example: many transactional dataset will contain shipping cost which is very obvious relationship that can be ignored in `setup()` using `ignore_items` parameter. In this tutorial, we will run the `setup()` twice, first without ignoring any items and later with ignoring items. <br/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 7.0 Create a Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating a association rule model is simple. `create_model()` requires no mandatory parameter. It has 4 optional parameters which are as follows:\n",
"\n",
"- **metric:** Metric to evaluate if a rule is of interest. Default is set to confidence. Other available metrics include 'support', 'lift', 'leverage', 'conviction'. <br/>\n",
"<br/>\n",
"- **threshold:** Minimal threshold for the evaluation metric, via the `metric` parameter, to decide whether a candidate rule is of interest. Default is set to `0.5`. <br/>\n",
"<br/>\n",
"- **min_support:** A float between 0 and 1 for minumum support of the itemsets returned. The support is computed as the fraction `transactions_where_item(s)_occur / total_transactions`. Default is set to `0.05`. <br/>\n",
"<br/>\n",
"- **round:** Number of decimal places metrics in score grid will be rounded to. <br/>\n",
"\n",
"Let's create an association rule model with all default values."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"model1 = create_model() #model created and stored in model1 variable."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(141, 9)\n"
]
}
],
"source": [
"print(model1.shape) #141 rules created."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>antecedents</th>\n",
" <th>consequents</th>\n",
" <th>antecedent support</th>\n",
" <th>consequent support</th>\n",
" <th>support</th>\n",
" <th>confidence</th>\n",
" <th>lift</th>\n",
" <th>leverage</th>\n",
" <th>conviction</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>(JUMBO BAG WOODLAND ANIMALS)</td>\n",
" <td>(POSTAGE)</td>\n",
" <td>0.0651</td>\n",
" <td>0.6746</td>\n",
" <td>0.0651</td>\n",
" <td>1.0000</td>\n",
" <td>1.4823</td>\n",
" <td>0.0212</td>\n",
" <td>inf</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS, SET/20 RED RETRO...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES)</td>\n",
" <td>0.0868</td>\n",
" <td>0.1085</td>\n",
" <td>0.0846</td>\n",
" <td>0.9750</td>\n",
" <td>8.9895</td>\n",
" <td>0.0752</td>\n",
" <td>35.6616</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS)</td>\n",
" <td>0.0868</td>\n",
" <td>0.1171</td>\n",
" <td>0.0846</td>\n",
" <td>0.9750</td>\n",
" <td>8.3236</td>\n",
" <td>0.0744</td>\n",
" <td>35.3145</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>(POSTAGE, SET/20 RED RETROSPOT PAPER NAPKINS ,...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES)</td>\n",
" <td>0.0716</td>\n",
" <td>0.1085</td>\n",
" <td>0.0694</td>\n",
" <td>0.9697</td>\n",
" <td>8.9406</td>\n",
" <td>0.0617</td>\n",
" <td>29.4208</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(POSTAGE, SET/20 RED RETROSPOT PAPER NAPKINS ,...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS)</td>\n",
" <td>0.0716</td>\n",
" <td>0.1171</td>\n",
" <td>0.0694</td>\n",
" <td>0.9697</td>\n",
" <td>8.2783</td>\n",
" <td>0.0610</td>\n",
" <td>29.1345</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" antecedents \\\n",
"0 (JUMBO BAG WOODLAND ANIMALS) \n",
"1 (SET/6 RED SPOTTY PAPER CUPS, SET/20 RED RETRO... \n",
"2 (SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET... \n",
"3 (POSTAGE, SET/20 RED RETROSPOT PAPER NAPKINS ,... \n",
"4 (POSTAGE, SET/20 RED RETROSPOT PAPER NAPKINS ,... \n",
"\n",
" consequents antecedent support consequent support \\\n",
"0 (POSTAGE) 0.0651 0.6746 \n",
"1 (SET/6 RED SPOTTY PAPER PLATES) 0.0868 0.1085 \n",
"2 (SET/6 RED SPOTTY PAPER CUPS) 0.0868 0.1171 \n",
"3 (SET/6 RED SPOTTY PAPER PLATES) 0.0716 0.1085 \n",
"4 (SET/6 RED SPOTTY PAPER CUPS) 0.0716 0.1171 \n",
"\n",
" support confidence lift leverage conviction \n",
"0 0.0651 1.0000 1.4823 0.0212 inf \n",
"1 0.0846 0.9750 8.9895 0.0752 35.6616 \n",
"2 0.0846 0.9750 8.3236 0.0744 35.3145 \n",
"3 0.0694 0.9697 8.9406 0.0617 29.4208 \n",
"4 0.0694 0.9697 8.2783 0.0610 29.1345 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model1.head() #see the rules"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 8.0 Setup with `ignore_items`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In `model1` created above, notice that the number 1 rule is of `JUMBO BAG WOODLAND ANIMALS` with `POSTAGE` which is very obvious. In example below, we will use `ignore_items` parameter in `setup()` to ignore `POSTAGE` from the dataset and re-create the association rule model."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style type=\"text/css\" >\n",
"</style><table id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7\" ><thead> <tr> <th class=\"col_heading level0 col0\" >Description</th> <th class=\"col_heading level0 col1\" >Value</th> </tr></thead><tbody>\n",
" <tr>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row0_col0\" class=\"data row0 col0\" >session_id</td>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row0_col1\" class=\"data row0 col1\" >1373</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row1_col0\" class=\"data row1 col0\" ># Transactions</td>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row1_col1\" class=\"data row1 col1\" >461</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row2_col0\" class=\"data row2 col0\" ># Items</td>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row2_col1\" class=\"data row2 col1\" >1565</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row3_col0\" class=\"data row3 col0\" >Ignore Items</td>\n",
" <td id=\"T_f4582aee_5808_11ea_8231_84fdd13f47d7row3_col1\" class=\"data row3 col1\" >['POSTAGE']</td>\n",
" </tr>\n",
" </tbody></table>"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x1b524950a08>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"exp_arul101 = setup(data = data, \n",
" transaction_id = 'InvoiceNo',\n",
" item_id = 'Description',\n",
" ignore_items = ['POSTAGE']) "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"model2 = create_model()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(45, 9)\n"
]
}
],
"source": [
"print(model2.shape) #notice how only 45 rules are created vs. 141 above."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>antecedents</th>\n",
" <th>consequents</th>\n",
" <th>antecedent support</th>\n",
" <th>consequent support</th>\n",
" <th>support</th>\n",
" <th>confidence</th>\n",
" <th>lift</th>\n",
" <th>leverage</th>\n",
" <th>conviction</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS)</td>\n",
" <td>0.0868</td>\n",
" <td>0.1171</td>\n",
" <td>0.0846</td>\n",
" <td>0.9750</td>\n",
" <td>8.3236</td>\n",
" <td>0.0744</td>\n",
" <td>35.3145</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS, SET/20 RED RETRO...</td>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES)</td>\n",
" <td>0.0868</td>\n",
" <td>0.1085</td>\n",
" <td>0.0846</td>\n",
" <td>0.9750</td>\n",
" <td>8.9895</td>\n",
" <td>0.0752</td>\n",
" <td>35.6616</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES)</td>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS)</td>\n",
" <td>0.1085</td>\n",
" <td>0.1171</td>\n",
" <td>0.1041</td>\n",
" <td>0.9600</td>\n",
" <td>8.1956</td>\n",
" <td>0.0914</td>\n",
" <td>22.0716</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>(CHILDRENS CUTLERY SPACEBOY )</td>\n",
" <td>(CHILDRENS CUTLERY DOLLY GIRL )</td>\n",
" <td>0.0586</td>\n",
" <td>0.0629</td>\n",
" <td>0.0542</td>\n",
" <td>0.9259</td>\n",
" <td>14.7190</td>\n",
" <td>0.0505</td>\n",
" <td>12.6508</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>(SET/6 RED SPOTTY PAPER CUPS)</td>\n",
" <td>(SET/6 RED SPOTTY PAPER PLATES)</td>\n",
" <td>0.1171</td>\n",
" <td>0.1085</td>\n",
" <td>0.1041</td>\n",
" <td>0.8889</td>\n",
" <td>8.1956</td>\n",
" <td>0.0914</td>\n",
" <td>8.0239</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" antecedents \\\n",
"0 (SET/6 RED SPOTTY PAPER PLATES, SET/20 RED RET... \n",
"1 (SET/6 RED SPOTTY PAPER CUPS, SET/20 RED RETRO... \n",
"2 (SET/6 RED SPOTTY PAPER PLATES) \n",
"3 (CHILDRENS CUTLERY SPACEBOY ) \n",
"4 (SET/6 RED SPOTTY PAPER CUPS) \n",
"\n",
" consequents antecedent support consequent support \\\n",
"0 (SET/6 RED SPOTTY PAPER CUPS) 0.0868 0.1171 \n",
"1 (SET/6 RED SPOTTY PAPER PLATES) 0.0868 0.1085 \n",
"2 (SET/6 RED SPOTTY PAPER CUPS) 0.1085 0.1171 \n",
"3 (CHILDRENS CUTLERY DOLLY GIRL ) 0.0586 0.0629 \n",
"4 (SET/6 RED SPOTTY PAPER PLATES) 0.1171 0.1085 \n",
"\n",
" support confidence lift leverage conviction \n",
"0 0.0846 0.9750 8.3236 0.0744 35.3145 \n",
"1 0.0846 0.9750 8.9895 0.0752 35.6616 \n",
"2 0.1041 0.9600 8.1956 0.0914 22.0716 \n",
"3 0.0542 0.9259 14.7190 0.0505 12.6508 \n",
"4 0.1041 0.8889 8.1956 0.0914 8.0239 "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model2.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 9.0 Plot Model"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" requirejs.config({\n",
" paths: {\n",
" 'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
" }\n",
" });\n",
" require(['plotly'], function(Plotly) {\n",
" window._Plotly = Plotly;\n",
" });\n",
" }\n",
" </script>\n",
" "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"customdata": [
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"CHILDRENS CUTLERY SPACEBOY ",
"CHILDRENS CUTLERY DOLLY GIRL "
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"CHILDRENS CUTLERY DOLLY GIRL ",
"CHILDRENS CUTLERY SPACEBOY "
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE RED "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"PLASTERS IN TIN SPACEBOY",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"DOLLY GIRL LUNCH BOX",
"SPACEBOY LUNCH BOX "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN SPACEBOY"
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"PLASTERS IN TIN SPACEBOY",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE PINK"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN SPACEBOY"
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"ROUND SNACK BOXES SET OF 4 FRUITS ",
"ROUND SNACK BOXES SET OF4 WOODLAND "
],
[
"SPACEBOY LUNCH BOX ",
"DOLLY GIRL LUNCH BOX"
],
[
"LUNCH BAG WOODLAND",
"LUNCH BAG SPACEBOY DESIGN "
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG RED RETROSPOT"
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG WOODLAND"
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG APPLE DESIGN"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN SPACEBOY"
],
[
"STRAWBERRY LUNCH BOX WITH CUTLERY",
"LUNCH BOX WITH CUTLERY RETROSPOT "
],
[
"LUNCH BAG APPLE DESIGN",
"LUNCH BAG SPACEBOY DESIGN "
],
[
"LUNCH BAG APPLE DESIGN",
"LUNCH BAG RED RETROSPOT"
]
],
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "support=%{x}<br>confidence=%{y}<br>antecedents=%{customdata[0]}<br>consequents=%{customdata[1]}<br>antecedents_short=%{text}<br>lift=%{marker.color}",
"legendgroup": "",
"marker": {
"color": [
8.3236,
8.9895,
8.1956,
14.719,
8.1956,
10.7409,
14.719,
10.1901,
9.9037,
7.2031,
9.2944,
7.0923,
9.9037,
9.0331,
8.9895,
6.567,
7.0923,
5.1604,
7.2031,
5.1292,
8.5699,
6.567,
8.5699,
9.0331,
8.3236,
6.5857,
10.7409,
5.6577,
4.4645,
4.4374,
10.1901,
9.2944,
5.1292,
4.0474,
4.0474,
4.1879,
6.5857,
5.3129,
4.0936,
5.3129,
4.9942,
4.4374,
4.2124,
4.9942,
3.9298
],
"coloraxis": "coloraxis",
"opacity": 0.5,
"symbol": "circle"
},
"mode": "markers+text",
"name": "",
"showlegend": false,
"text": [
"SET/6 RED ",
"SET/6 RED ",
"SET/6 RED ",
"CHILDRENS ",
"SET/6 RED ",
"ALARM CLOC",
"CHILDRENS ",
"ALARM CLOC",
"ALARM CLOC",
"SET/6 RED ",
"ALARM CLOC",
"SET/6 RED ",
"ALARM CLOC",
"ALARM CLOC",
"SET/6 RED ",
"SET/20 RED",
"SET/20 RED",
"PLASTERS I",
"SET/20 RED",
"PLASTERS I",
"ALARM CLOC",
"SET/6 RED ",
"ALARM CLOC",
"ALARM CLOC",
"SET/6 RED ",
"DOLLY GIRL",
"ALARM CLOC",
"PLASTERS I",
"PLASTERS I",
"PLASTERS I",
"ALARM CLOC",
"ALARM CLOC",
"PLASTERS I",
"PLASTERS I",
"PLASTERS I",
"ROUND SNAC",
"SPACEBOY L",
"LUNCH BAG ",
"LUNCH BAG ",
"LUNCH BAG ",
"LUNCH BAG ",
"PLASTERS I",
"STRAWBERRY",
"LUNCH BAG ",
"LUNCH BAG "
],
"textposition": "top center",
"type": "scatter",
"x": [
0.0846,
0.0846,
0.1041,
0.0542,
0.1041,
0.0542,
0.0542,
0.0542,
0.0672,
0.0846,
0.0542,
0.0868,
0.0672,
0.0629,
0.0846,
0.0868,
0.0868,
0.0586,
0.0846,
0.0889,
0.0629,
0.0868,
0.0629,
0.0629,
0.0846,
0.0607,
0.0542,
0.0586,
0.0586,
0.0781,
0.0542,
0.0542,
0.0889,
0.0868,
0.0868,
0.0542,
0.0607,
0.0564,
0.0564,
0.0564,
0.0564,
0.0781,
0.0542,
0.0564,
0.0564
],
"xaxis": "x",
"y": [
0.975,
0.975,
0.96,
0.9259,
0.8889,
0.8621,
0.8621,
0.8621,
0.8378,
0.8125,
0.8065,
0.8,
0.7949,
0.7838,
0.78,
0.7692,
0.7692,
0.75,
0.75,
0.7455,
0.7436,
0.7407,
0.725,
0.725,
0.7222,
0.7,
0.6757,
0.675,
0.6585,
0.6545,
0.641,
0.625,
0.6119,
0.597,
0.5882,
0.5814,
0.5714,
0.5532,
0.5417,
0.5417,
0.5417,
0.5294,
0.5208,
0.52,
0.52
],
"yaxis": "y"
}
],
"layout": {
"coloraxis": {
"colorbar": {
"title": {
"text": "lift"
}
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"height": 800,
"legend": {
"tracegroupgap": 0
},
"margin": {
"t": 60
},
"plot_bgcolor": "rgb(240,240,240)",
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "2D Plot of Support, Confidence and Lift"
},
"xaxis": {
"anchor": "y",
"domain": [
0,
1
],
"title": {
"text": "support"
},
"type": "log"
},
"yaxis": {
"anchor": "x",
"domain": [
0,
1
],
"title": {
"text": "confidence"
}
}
}
},
"text/html": [
"<div>\n",
" \n",
" \n",
" <div id=\"b1b71c16-b29b-45d8-bc5a-78c50cb4dee2\" class=\"plotly-graph-div\" style=\"height:800px; width:100%;\"></div>\n",
" <script type=\"text/javascript\">\n",
" require([\"plotly\"], function(Plotly) {\n",
" window.PLOTLYENV=window.PLOTLYENV || {};\n",
" \n",
" if (document.getElementById(\"b1b71c16-b29b-45d8-bc5a-78c50cb4dee2\")) {\n",
" Plotly.newPlot(\n",
" 'b1b71c16-b29b-45d8-bc5a-78c50cb4dee2',\n",
" [{\"customdata\": [[\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"CHILDRENS CUTLERY SPACEBOY \", \"CHILDRENS CUTLERY DOLLY GIRL \"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"CHILDRENS CUTLERY DOLLY GIRL \", \"CHILDRENS CUTLERY SPACEBOY \"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE GREEN\"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE GREEN\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE RED \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"PLASTERS IN TIN SPACEBOY\", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE GREEN\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"DOLLY GIRL LUNCH BOX\", \"SPACEBOY LUNCH BOX \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN SPACEBOY\"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"PLASTERS IN TIN SPACEBOY\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE PINK\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN SPACEBOY\"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"ROUND SNACK BOXES SET OF 4 FRUITS \", \"ROUND SNACK BOXES SET OF4 WOODLAND \"], [\"SPACEBOY LUNCH BOX \", \"DOLLY GIRL LUNCH BOX\"], [\"LUNCH BAG WOODLAND\", \"LUNCH BAG SPACEBOY DESIGN \"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG RED RETROSPOT\"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG WOODLAND\"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG APPLE DESIGN\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN SPACEBOY\"], [\"STRAWBERRY LUNCH BOX WITH CUTLERY\", \"LUNCH BOX WITH CUTLERY RETROSPOT \"], [\"LUNCH BAG APPLE DESIGN\", \"LUNCH BAG SPACEBOY DESIGN \"], [\"LUNCH BAG APPLE DESIGN\", \"LUNCH BAG RED RETROSPOT\"]], \"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"support=%{x}<br>confidence=%{y}<br>antecedents=%{customdata[0]}<br>consequents=%{customdata[1]}<br>antecedents_short=%{text}<br>lift=%{marker.color}\", \"legendgroup\": \"\", \"marker\": {\"color\": [8.3236, 8.9895, 8.1956, 14.719, 8.1956, 10.7409, 14.719, 10.1901, 9.9037, 7.2031, 9.2944, 7.0923, 9.9037, 9.0331, 8.9895, 6.567, 7.0923, 5.1604, 7.2031, 5.1292, 8.5699, 6.567, 8.5699, 9.0331, 8.3236, 6.5857, 10.7409, 5.6577, 4.4645, 4.4374, 10.1901, 9.2944, 5.1292, 4.0474, 4.0474, 4.1879, 6.5857, 5.3129, 4.0936, 5.3129, 4.9942, 4.4374, 4.2124, 4.9942, 3.9298], \"coloraxis\": \"coloraxis\", \"opacity\": 0.5, \"symbol\": \"circle\"}, \"mode\": \"markers+text\", \"name\": \"\", \"showlegend\": false, \"text\": [\"SET/6 RED \", \"SET/6 RED \", \"SET/6 RED \", \"CHILDRENS \", \"SET/6 RED \", \"ALARM CLOC\", \"CHILDRENS \", \"ALARM CLOC\", \"ALARM CLOC\", \"SET/6 RED \", \"ALARM CLOC\", \"SET/6 RED \", \"ALARM CLOC\", \"ALARM CLOC\", \"SET/6 RED \", \"SET/20 RED\", \"SET/20 RED\", \"PLASTERS I\", \"SET/20 RED\", \"PLASTERS I\", \"ALARM CLOC\", \"SET/6 RED \", \"ALARM CLOC\", \"ALARM CLOC\", \"SET/6 RED \", \"DOLLY GIRL\", \"ALARM CLOC\", \"PLASTERS I\", \"PLASTERS I\", \"PLASTERS I\", \"ALARM CLOC\", \"ALARM CLOC\", \"PLASTERS I\", \"PLASTERS I\", \"PLASTERS I\", \"ROUND SNAC\", \"SPACEBOY L\", \"LUNCH BAG \", \"LUNCH BAG \", \"LUNCH BAG \", \"LUNCH BAG \", \"PLASTERS I\", \"STRAWBERRY\", \"LUNCH BAG \", \"LUNCH BAG \"], \"textposition\": \"top center\", \"type\": \"scatter\", \"x\": [0.0846, 0.0846, 0.1041, 0.0542, 0.1041, 0.0542, 0.0542, 0.0542, 0.0672, 0.0846, 0.0542, 0.0868, 0.0672, 0.0629, 0.0846, 0.0868, 0.0868, 0.0586, 0.0846, 0.0889, 0.0629, 0.0868, 0.0629, 0.0629, 0.0846, 0.0607, 0.0542, 0.0586, 0.0586, 0.0781, 0.0542, 0.0542, 0.0889, 0.0868, 0.0868, 0.0542, 0.0607, 0.0564, 0.0564, 0.0564, 0.0564, 0.0781, 0.0542, 0.0564, 0.0564], \"xaxis\": \"x\", \"y\": [0.975, 0.975, 0.96, 0.9259, 0.8889, 0.8621, 0.8621, 0.8621, 0.8378, 0.8125, 0.8065, 0.8, 0.7949, 0.7838, 0.78, 0.7692, 0.7692, 0.75, 0.75, 0.7455, 0.7436, 0.7407, 0.725, 0.725, 0.7222, 0.7, 0.6757, 0.675, 0.6585, 0.6545, 0.641, 0.625, 0.6119, 0.597, 0.5882, 0.5814, 0.5714, 0.5532, 0.5417, 0.5417, 0.5417, 0.5294, 0.5208, 0.52, 0.52], \"yaxis\": \"y\"}],\n",
" {\"coloraxis\": {\"colorbar\": {\"title\": {\"text\": \"lift\"}}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"height\": 800, \"legend\": {\"tracegroupgap\": 0}, \"margin\": {\"t\": 60}, \"plot_bgcolor\": \"rgb(240,240,240)\", \"template\": {\"data\": {\"bar\": [{\"error_x\": {\"color\": \"#2a3f5f\"}, \"error_y\": {\"color\": \"#2a3f5f\"}, \"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"bar\"}], \"barpolar\": [{\"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"barpolar\"}], \"carpet\": [{\"aaxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"baxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"type\": \"carpet\"}], \"choropleth\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"choropleth\"}], \"contour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"contour\"}], \"contourcarpet\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"contourcarpet\"}], \"heatmap\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmap\"}], \"heatmapgl\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmapgl\"}], \"histogram\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"histogram\"}], \"histogram2d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2d\"}], \"histogram2dcontour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2dcontour\"}], \"mesh3d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"mesh3d\"}], \"parcoords\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"parcoords\"}], \"pie\": [{\"automargin\": true, \"type\": \"pie\"}], \"scatter\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter\"}], \"scatter3d\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter3d\"}], \"scattercarpet\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattercarpet\"}], \"scattergeo\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergeo\"}], \"scattergl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergl\"}], \"scattermapbox\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattermapbox\"}], \"scatterpolar\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolar\"}], \"scatterpolargl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolargl\"}], \"scatterternary\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterternary\"}], \"surface\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"surface\"}], \"table\": [{\"cells\": {\"fill\": {\"color\": \"#EBF0F8\"}, \"line\": {\"color\": \"white\"}}, \"header\": {\"fill\": {\"color\": \"#C8D4E3\"}, \"line\": {\"color\": \"white\"}}, \"type\": \"table\"}]}, \"layout\": {\"annotationdefaults\": {\"arrowcolor\": \"#2a3f5f\", \"arrowhead\": 0, \"arrowwidth\": 1}, \"coloraxis\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"colorscale\": {\"diverging\": [[0, \"#8e0152\"], [0.1, \"#c51b7d\"], [0.2, \"#de77ae\"], [0.3, \"#f1b6da\"], [0.4, \"#fde0ef\"], [0.5, \"#f7f7f7\"], [0.6, \"#e6f5d0\"], [0.7, \"#b8e186\"], [0.8, \"#7fbc41\"], [0.9, \"#4d9221\"], [1, \"#276419\"]], \"sequential\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"sequentialminus\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"colorway\": [\"#636efa\", \"#EF553B\", \"#00cc96\", \"#ab63fa\", \"#FFA15A\", \"#19d3f3\", \"#FF6692\", \"#B6E880\", \"#FF97FF\", \"#FECB52\"], \"font\": {\"color\": \"#2a3f5f\"}, \"geo\": {\"bgcolor\": \"white\", \"lakecolor\": \"white\", \"landcolor\": \"#E5ECF6\", \"showlakes\": true, \"showland\": true, \"subunitcolor\": \"white\"}, \"hoverlabel\": {\"align\": \"left\"}, \"hovermode\": \"closest\", \"mapbox\": {\"style\": \"light\"}, \"paper_bgcolor\": \"white\", \"plot_bgcolor\": \"#E5ECF6\", \"polar\": {\"angularaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"radialaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"scene\": {\"xaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"yaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"zaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}}, \"shapedefaults\": {\"line\": {\"color\": \"#2a3f5f\"}}, \"ternary\": {\"aaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"baxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"caxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"title\": {\"x\": 0.05}, \"xaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}, \"yaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}}}, \"title\": {\"text\": \"2D Plot of Support, Confidence and Lift\"}, \"xaxis\": {\"anchor\": \"y\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"support\"}, \"type\": \"log\"}, \"yaxis\": {\"anchor\": \"x\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"confidence\"}}},\n",
" {\"responsive\": true}\n",
" ).then(function(){\n",
" \n",
"var gd = document.getElementById('b1b71c16-b29b-45d8-bc5a-78c50cb4dee2');\n",
"var x = new MutationObserver(function (mutations, observer) {{\n",
" var display = window.getComputedStyle(gd).display;\n",
" if (!display || display === 'none') {{\n",
" console.log([gd, 'removed!']);\n",
" Plotly.purge(gd);\n",
" observer.disconnect();\n",
" }}\n",
"}});\n",
"\n",
"// Listen for the removal of the full notebook cells\n",
"var notebookContainer = gd.closest('#notebook-container');\n",
"if (notebookContainer) {{\n",
" x.observe(notebookContainer, {childList: true});\n",
"}}\n",
"\n",
"// Listen for the clearing of the current output cell\n",
"var outputEl = gd.closest('.output');\n",
"if (outputEl) {{\n",
" x.observe(outputEl, {childList: true});\n",
"}}\n",
"\n",
" })\n",
" };\n",
" });\n",
" </script>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_model(model2)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" requirejs.config({\n",
" paths: {\n",
" 'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
" }\n",
" });\n",
" require(['plotly'], function(Plotly) {\n",
" window._Plotly = Plotly;\n",
" });\n",
" }\n",
" </script>\n",
" "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"customdata": [
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"CHILDRENS CUTLERY SPACEBOY ",
"CHILDRENS CUTLERY DOLLY GIRL "
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"CHILDRENS CUTLERY DOLLY GIRL ",
"CHILDRENS CUTLERY SPACEBOY "
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE RED "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER PLATES",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"SET/20 RED RETROSPOT PAPER NAPKINS ",
"SET/6 RED SPOTTY PAPER CUPS"
],
[
"PLASTERS IN TIN SPACEBOY",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE PINK"
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/20 RED RETROSPOT PAPER NAPKINS "
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE GREEN"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"SET/6 RED SPOTTY PAPER CUPS",
"SET/6 RED SPOTTY PAPER PLATES"
],
[
"DOLLY GIRL LUNCH BOX",
"SPACEBOY LUNCH BOX "
],
[
"ALARM CLOCK BAKELIKE RED ",
"ALARM CLOCK BAKELIKE PINK"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN SPACEBOY"
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"PLASTERS IN TIN SPACEBOY",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"ALARM CLOCK BAKELIKE GREEN",
"ALARM CLOCK BAKELIKE PINK"
],
[
"ALARM CLOCK BAKELIKE PINK",
"ALARM CLOCK BAKELIKE RED "
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN SPACEBOY"
],
[
"PLASTERS IN TIN WOODLAND ANIMALS",
"PLASTERS IN TIN CIRCUS PARADE "
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN WOODLAND ANIMALS"
],
[
"ROUND SNACK BOXES SET OF 4 FRUITS ",
"ROUND SNACK BOXES SET OF4 WOODLAND "
],
[
"SPACEBOY LUNCH BOX ",
"DOLLY GIRL LUNCH BOX"
],
[
"LUNCH BAG WOODLAND",
"LUNCH BAG SPACEBOY DESIGN "
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG RED RETROSPOT"
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG WOODLAND"
],
[
"LUNCH BAG SPACEBOY DESIGN ",
"LUNCH BAG APPLE DESIGN"
],
[
"PLASTERS IN TIN CIRCUS PARADE ",
"PLASTERS IN TIN SPACEBOY"
],
[
"STRAWBERRY LUNCH BOX WITH CUTLERY",
"LUNCH BOX WITH CUTLERY RETROSPOT "
],
[
"LUNCH BAG APPLE DESIGN",
"LUNCH BAG SPACEBOY DESIGN "
],
[
"LUNCH BAG APPLE DESIGN",
"LUNCH BAG RED RETROSPOT"
]
],
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "support=%{x}<br>confidence=%{y}<br>lift=%{z}<br>antecedents=%{customdata[0]}<br>consequents=%{customdata[1]}<br>antecedent support=%{marker.color}",
"legendgroup": "",
"marker": {
"color": [
0.0868,
0.0868,
0.1085,
0.0586,
0.1171,
0.0629,
0.0629,
0.0629,
0.0803,
0.1041,
0.0672,
0.1085,
0.0846,
0.0803,
0.1085,
0.1128,
0.1128,
0.0781,
0.1128,
0.1193,
0.0846,
0.1171,
0.0868,
0.0868,
0.1171,
0.0868,
0.0803,
0.0868,
0.0889,
0.1193,
0.0846,
0.0868,
0.1453,
0.1453,
0.1475,
0.0933,
0.1063,
0.102,
0.1041,
0.1041,
0.1041,
0.1475,
0.1041,
0.1085,
0.1085
],
"coloraxis": "coloraxis",
"opacity": 0.7,
"symbol": "circle"
},
"mode": "markers",
"name": "",
"scene": "scene",
"showlegend": false,
"type": "scatter3d",
"x": [
0.0846,
0.0846,
0.1041,
0.0542,
0.1041,
0.0542,
0.0542,
0.0542,
0.0672,
0.0846,
0.0542,
0.0868,
0.0672,
0.0629,
0.0846,
0.0868,
0.0868,
0.0586,
0.0846,
0.0889,
0.0629,
0.0868,
0.0629,
0.0629,
0.0846,
0.0607,
0.0542,
0.0586,
0.0586,
0.0781,
0.0542,
0.0542,
0.0889,
0.0868,
0.0868,
0.0542,
0.0607,
0.0564,
0.0564,
0.0564,
0.0564,
0.0781,
0.0542,
0.0564,
0.0564
],
"y": [
0.975,
0.975,
0.96,
0.9259,
0.8889,
0.8621,
0.8621,
0.8621,
0.8378,
0.8125,
0.8065,
0.8,
0.7949,
0.7838,
0.78,
0.7692,
0.7692,
0.75,
0.75,
0.7455,
0.7436,
0.7407,
0.725,
0.725,
0.7222,
0.7,
0.6757,
0.675,
0.6585,
0.6545,
0.641,
0.625,
0.6119,
0.597,
0.5882,
0.5814,
0.5714,
0.5532,
0.5417,
0.5417,
0.5417,
0.5294,
0.5208,
0.52,
0.52
],
"z": [
8.3236,
8.9895,
8.1956,
14.719,
8.1956,
10.7409,
14.719,
10.1901,
9.9037,
7.2031,
9.2944,
7.0923,
9.9037,
9.0331,
8.9895,
6.567,
7.0923,
5.1604,
7.2031,
5.1292,
8.5699,
6.567,
8.5699,
9.0331,
8.3236,
6.5857,
10.7409,
5.6577,
4.4645,
4.4374,
10.1901,
9.2944,
5.1292,
4.0474,
4.0474,
4.1879,
6.5857,
5.3129,
4.0936,
5.3129,
4.9942,
4.4374,
4.2124,
4.9942,
3.9298
]
}
],
"layout": {
"coloraxis": {
"colorbar": {
"title": {
"text": "antecedent support"
}
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"height": 800,
"legend": {
"tracegroupgap": 0
},
"scene": {
"domain": {
"x": [
0,
1
],
"y": [
0,
1
]
},
"xaxis": {
"title": {
"text": "support"
}
},
"yaxis": {
"title": {
"text": "confidence"
}
},
"zaxis": {
"title": {
"text": "lift"
}
}
},
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "3d Plot for Rule Mining"
},
"width": 900
}
},
"text/html": [
"<div>\n",
" \n",
" \n",
" <div id=\"844ec277-e712-4ff9-a4bb-b41044cc8227\" class=\"plotly-graph-div\" style=\"height:800px; width:900px;\"></div>\n",
" <script type=\"text/javascript\">\n",
" require([\"plotly\"], function(Plotly) {\n",
" window.PLOTLYENV=window.PLOTLYENV || {};\n",
" \n",
" if (document.getElementById(\"844ec277-e712-4ff9-a4bb-b41044cc8227\")) {\n",
" Plotly.newPlot(\n",
" '844ec277-e712-4ff9-a4bb-b41044cc8227',\n",
" [{\"customdata\": [[\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"CHILDRENS CUTLERY SPACEBOY \", \"CHILDRENS CUTLERY DOLLY GIRL \"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"CHILDRENS CUTLERY DOLLY GIRL \", \"CHILDRENS CUTLERY SPACEBOY \"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE GREEN\"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE GREEN\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE RED \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER PLATES\", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"SET/20 RED RETROSPOT PAPER NAPKINS \", \"SET/6 RED SPOTTY PAPER CUPS\"], [\"PLASTERS IN TIN SPACEBOY\", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE PINK\"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/20 RED RETROSPOT PAPER NAPKINS \"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE GREEN\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"SET/6 RED SPOTTY PAPER CUPS\", \"SET/6 RED SPOTTY PAPER PLATES\"], [\"DOLLY GIRL LUNCH BOX\", \"SPACEBOY LUNCH BOX \"], [\"ALARM CLOCK BAKELIKE RED \", \"ALARM CLOCK BAKELIKE PINK\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN SPACEBOY\"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"PLASTERS IN TIN SPACEBOY\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"ALARM CLOCK BAKELIKE GREEN\", \"ALARM CLOCK BAKELIKE PINK\"], [\"ALARM CLOCK BAKELIKE PINK\", \"ALARM CLOCK BAKELIKE RED \"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN SPACEBOY\"], [\"PLASTERS IN TIN WOODLAND ANIMALS\", \"PLASTERS IN TIN CIRCUS PARADE \"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN WOODLAND ANIMALS\"], [\"ROUND SNACK BOXES SET OF 4 FRUITS \", \"ROUND SNACK BOXES SET OF4 WOODLAND \"], [\"SPACEBOY LUNCH BOX \", \"DOLLY GIRL LUNCH BOX\"], [\"LUNCH BAG WOODLAND\", \"LUNCH BAG SPACEBOY DESIGN \"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG RED RETROSPOT\"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG WOODLAND\"], [\"LUNCH BAG SPACEBOY DESIGN \", \"LUNCH BAG APPLE DESIGN\"], [\"PLASTERS IN TIN CIRCUS PARADE \", \"PLASTERS IN TIN SPACEBOY\"], [\"STRAWBERRY LUNCH BOX WITH CUTLERY\", \"LUNCH BOX WITH CUTLERY RETROSPOT \"], [\"LUNCH BAG APPLE DESIGN\", \"LUNCH BAG SPACEBOY DESIGN \"], [\"LUNCH BAG APPLE DESIGN\", \"LUNCH BAG RED RETROSPOT\"]], \"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"support=%{x}<br>confidence=%{y}<br>lift=%{z}<br>antecedents=%{customdata[0]}<br>consequents=%{customdata[1]}<br>antecedent support=%{marker.color}\", \"legendgroup\": \"\", \"marker\": {\"color\": [0.0868, 0.0868, 0.1085, 0.0586, 0.1171, 0.0629, 0.0629, 0.0629, 0.0803, 0.1041, 0.0672, 0.1085, 0.0846, 0.0803, 0.1085, 0.1128, 0.1128, 0.0781, 0.1128, 0.1193, 0.0846, 0.1171, 0.0868, 0.0868, 0.1171, 0.0868, 0.0803, 0.0868, 0.0889, 0.1193, 0.0846, 0.0868, 0.1453, 0.1453, 0.1475, 0.0933, 0.1063, 0.102, 0.1041, 0.1041, 0.1041, 0.1475, 0.1041, 0.1085, 0.1085], \"coloraxis\": \"coloraxis\", \"opacity\": 0.7, \"symbol\": \"circle\"}, \"mode\": \"markers\", \"name\": \"\", \"scene\": \"scene\", \"showlegend\": false, \"type\": \"scatter3d\", \"x\": [0.0846, 0.0846, 0.1041, 0.0542, 0.1041, 0.0542, 0.0542, 0.0542, 0.0672, 0.0846, 0.0542, 0.0868, 0.0672, 0.0629, 0.0846, 0.0868, 0.0868, 0.0586, 0.0846, 0.0889, 0.0629, 0.0868, 0.0629, 0.0629, 0.0846, 0.0607, 0.0542, 0.0586, 0.0586, 0.0781, 0.0542, 0.0542, 0.0889, 0.0868, 0.0868, 0.0542, 0.0607, 0.0564, 0.0564, 0.0564, 0.0564, 0.0781, 0.0542, 0.0564, 0.0564], \"y\": [0.975, 0.975, 0.96, 0.9259, 0.8889, 0.8621, 0.8621, 0.8621, 0.8378, 0.8125, 0.8065, 0.8, 0.7949, 0.7838, 0.78, 0.7692, 0.7692, 0.75, 0.75, 0.7455, 0.7436, 0.7407, 0.725, 0.725, 0.7222, 0.7, 0.6757, 0.675, 0.6585, 0.6545, 0.641, 0.625, 0.6119, 0.597, 0.5882, 0.5814, 0.5714, 0.5532, 0.5417, 0.5417, 0.5417, 0.5294, 0.5208, 0.52, 0.52], \"z\": [8.3236, 8.9895, 8.1956, 14.719, 8.1956, 10.7409, 14.719, 10.1901, 9.9037, 7.2031, 9.2944, 7.0923, 9.9037, 9.0331, 8.9895, 6.567, 7.0923, 5.1604, 7.2031, 5.1292, 8.5699, 6.567, 8.5699, 9.0331, 8.3236, 6.5857, 10.7409, 5.6577, 4.4645, 4.4374, 10.1901, 9.2944, 5.1292, 4.0474, 4.0474, 4.1879, 6.5857, 5.3129, 4.0936, 5.3129, 4.9942, 4.4374, 4.2124, 4.9942, 3.9298]}],\n",
" {\"coloraxis\": {\"colorbar\": {\"title\": {\"text\": \"antecedent support\"}}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"height\": 800, \"legend\": {\"tracegroupgap\": 0}, \"scene\": {\"domain\": {\"x\": [0.0, 1.0], \"y\": [0.0, 1.0]}, \"xaxis\": {\"title\": {\"text\": \"support\"}}, \"yaxis\": {\"title\": {\"text\": \"confidence\"}}, \"zaxis\": {\"title\": {\"text\": \"lift\"}}}, \"template\": {\"data\": {\"bar\": [{\"error_x\": {\"color\": \"#2a3f5f\"}, \"error_y\": {\"color\": \"#2a3f5f\"}, \"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"bar\"}], \"barpolar\": [{\"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"barpolar\"}], \"carpet\": [{\"aaxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"baxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"type\": \"carpet\"}], \"choropleth\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"choropleth\"}], \"contour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"contour\"}], \"contourcarpet\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"contourcarpet\"}], \"heatmap\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmap\"}], \"heatmapgl\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmapgl\"}], \"histogram\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"histogram\"}], \"histogram2d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2d\"}], \"histogram2dcontour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2dcontour\"}], \"mesh3d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"mesh3d\"}], \"parcoords\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"parcoords\"}], \"pie\": [{\"automargin\": true, \"type\": \"pie\"}], \"scatter\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter\"}], \"scatter3d\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter3d\"}], \"scattercarpet\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattercarpet\"}], \"scattergeo\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergeo\"}], \"scattergl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergl\"}], \"scattermapbox\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattermapbox\"}], \"scatterpolar\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolar\"}], \"scatterpolargl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolargl\"}], \"scatterternary\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterternary\"}], \"surface\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"surface\"}], \"table\": [{\"cells\": {\"fill\": {\"color\": \"#EBF0F8\"}, \"line\": {\"color\": \"white\"}}, \"header\": {\"fill\": {\"color\": \"#C8D4E3\"}, \"line\": {\"color\": \"white\"}}, \"type\": \"table\"}]}, \"layout\": {\"annotationdefaults\": {\"arrowcolor\": \"#2a3f5f\", \"arrowhead\": 0, \"arrowwidth\": 1}, \"coloraxis\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"colorscale\": {\"diverging\": [[0, \"#8e0152\"], [0.1, \"#c51b7d\"], [0.2, \"#de77ae\"], [0.3, \"#f1b6da\"], [0.4, \"#fde0ef\"], [0.5, \"#f7f7f7\"], [0.6, \"#e6f5d0\"], [0.7, \"#b8e186\"], [0.8, \"#7fbc41\"], [0.9, \"#4d9221\"], [1, \"#276419\"]], \"sequential\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"sequentialminus\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"colorway\": [\"#636efa\", \"#EF553B\", \"#00cc96\", \"#ab63fa\", \"#FFA15A\", \"#19d3f3\", \"#FF6692\", \"#B6E880\", \"#FF97FF\", \"#FECB52\"], \"font\": {\"color\": \"#2a3f5f\"}, \"geo\": {\"bgcolor\": \"white\", \"lakecolor\": \"white\", \"landcolor\": \"#E5ECF6\", \"showlakes\": true, \"showland\": true, \"subunitcolor\": \"white\"}, \"hoverlabel\": {\"align\": \"left\"}, \"hovermode\": \"closest\", \"mapbox\": {\"style\": \"light\"}, \"paper_bgcolor\": \"white\", \"plot_bgcolor\": \"#E5ECF6\", \"polar\": {\"angularaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"radialaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"scene\": {\"xaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"yaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"zaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}}, \"shapedefaults\": {\"line\": {\"color\": \"#2a3f5f\"}}, \"ternary\": {\"aaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"baxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"caxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"title\": {\"x\": 0.05}, \"xaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}, \"yaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}}}, \"title\": {\"text\": \"3d Plot for Rule Mining\"}, \"width\": 900},\n",
" {\"responsive\": true}\n",
" ).then(function(){\n",
" \n",
"var gd = document.getElementById('844ec277-e712-4ff9-a4bb-b41044cc8227');\n",
"var x = new MutationObserver(function (mutations, observer) {{\n",
" var display = window.getComputedStyle(gd).display;\n",
" if (!display || display === 'none') {{\n",
" console.log([gd, 'removed!']);\n",
" Plotly.purge(gd);\n",
" observer.disconnect();\n",
" }}\n",
"}});\n",
"\n",
"// Listen for the removal of the full notebook cells\n",
"var notebookContainer = gd.closest('#notebook-container');\n",
"if (notebookContainer) {{\n",
" x.observe(notebookContainer, {childList: true});\n",
"}}\n",
"\n",
"// Listen for the clearing of the current output cell\n",
"var outputEl = gd.closest('.output');\n",
"if (outputEl) {{\n",
" x.observe(outputEl, {childList: true});\n",
"}}\n",
"\n",
" })\n",
" };\n",
" });\n",
" </script>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_model(model2, plot = '3d')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
因为 它太大了无法显示 source diff 。你可以改为 查看blob
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Binary Classification Tutorial (CLF103) - Level Expert</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Binary Classification Tutorial (CLF101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Binary%20Classification%20Tutorial%20Level%20Beginner%20-%20%20CLF101.ipynb)__\n",
"- __[Binary Classification Tutorial (CLF102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Binary%20Classification%20Tutorial%20Level%20Intermediate%20-%20CLF102.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
因为 它太大了无法显示 source diff 。你可以改为 查看blob
因为 它太大了无法显示 source diff 。你可以改为 查看blob
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Clustering Tutorial (CLU103) - Level Expert</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Clustering Tutorial (CLU101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Clustering%20Tutorial%20Level%20Beginner%20-%20CLU101.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Clustering Tutorial (CLU102) - Level Intermediate</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Clustering Tutorial (CLU101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Clustering%20Tutorial%20Level%20Beginner%20-%20CLU101.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
因为 它太大了无法显示 source diff 。你可以改为 查看blob
因为 它太大了无法显示 source diff 。你可以改为 查看blob
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Natural Language Processing Tutorial (NLP103) - Level Expert</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__\n",
"- __[Natural Language Processing Tutorial (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Natural Language Processing Tutorial (NLP102) - Level Intermediate</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"# 1.0 Objective of Tutorial\n",
"Welcome to Natural Language Processing Tutorial (NLP102). This tutorial assumes that you have completed __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__. If you haven't we strongly recommend you to go back and progress through the beginner's tutorial as several key concepts that we aim to cover in this tutorial are inter-connected with Beginner's Tutorial.\n",
"\n",
"Building on the previous tutorial, we will learn the following in this tutorial: \n",
"\n",
"* **Custom Stopwords:** How to define custom stopwords?\n",
"* **Evaluate Topic Model:** How to evaluate performance of a topic model?\n",
"* **Hyperparameter Tuning:** How to tune hyperparameter (# of topics) for a topic model?\n",
"* **Save / Load Experiment:** How to save an entire experiment?\n",
"\n",
"Read Time : Approx. 30 Minutes\n",
"\n",
"\n",
"## 1.1 Installing PyCaret\n",
"If you haven't installed PyCaret yet. Please follow the link to __[Beginner's Tutorial](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__ for instruction on how to install pycaret.\n",
"\n",
"\n",
"## 1.2 Pre-Requisites\n",
"- Python 3.x\n",
"- Latest version of pycaret\n",
"- Internet connection to load data from pycaret's repository\n",
"- Completion of Natural Language Processing Tutorial (NLP101) - Level Beginner\n",
"- Completion of Natural Binary Classification Tutorial (CLF101) - Level Beginner\n",
"\n",
"## 1.3 For Google colab users:\n",
"If you are running this notebook on Google colab, below code of cells must be run at top of the notebook to display interactive visuals.<br/>\n",
"<br/>\n",
"`from pycaret.utils import enable_colab` <br/>\n",
"`enable_colab()`\n",
"\n",
"## 1.4 See also:\n",
"- __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__\n",
"- __[Natural Language Processing Tutorial (NLP103) - Level Expert](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Expert%20-%20NLP103.ipynb)__\n",
"- __[Binary Classification Tutorial (CLF101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Binary%20Classification%20Tutorial%20Level%20Beginner%20-%20%20CLF101.ipynb)__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2.0 Brief Overview of Tutorial\n",
"\n",
"Building on from previous tutorial __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__ we will create a topic model in this tutorial after passing `custom_stopwords`. We will then evaluate and compare the results of topic model with the one we created in last tutorial. We will then see how to evaluate topic models using coherence value and also how to use PyCaret's unique implementation of `tune_model()` function that allows you to optimize supervised learning target (status column in this example). We will finally finish off the tutorial by using the output of topic model in `pycaret.classifcation` module to find the best classifier that can predict loan default using topic information extracted from `pycaret.nlp` module.\n",
"\n",
"Let's get started!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3.0 Dataset for the Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For this tutorial we will be using the same dataset that was used in __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__. You can download the dataset from our github repository __[(Click here to Download)](https://github.com/pycaret/pycaret/blob/master/datasets/kiva.csv)__ or you can use PyCaret's `get_data()` function to import the dataset (This will require internet connection).\n",
"\n",
"#### Dataset Acknowledgement:\n",
"Kiva Microfunds https://www.kiva.org/ "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4.0 Getting the Data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>country</th>\n",
" <th>en</th>\n",
" <th>gender</th>\n",
" <th>loan_amount</th>\n",
" <th>nonpayment</th>\n",
" <th>sector</th>\n",
" <th>status</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Dominican Republic</td>\n",
" <td>\"Banco Esperanza\" is a group of 10 women looki...</td>\n",
" <td>F</td>\n",
" <td>1225</td>\n",
" <td>partner</td>\n",
" <td>Retail</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Dominican Republic</td>\n",
" <td>\"Caminemos Hacia Adelante\" or \"Walking Forward...</td>\n",
" <td>F</td>\n",
" <td>1975</td>\n",
" <td>lender</td>\n",
" <td>Clothing</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Dominican Republic</td>\n",
" <td>\"Creciendo Por La Union\" is a group of 10 peop...</td>\n",
" <td>F</td>\n",
" <td>2175</td>\n",
" <td>partner</td>\n",
" <td>Clothing</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Dominican Republic</td>\n",
" <td>\"Cristo Vive\" (\"Christ lives\" is a group of 10...</td>\n",
" <td>F</td>\n",
" <td>1425</td>\n",
" <td>partner</td>\n",
" <td>Clothing</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Dominican Republic</td>\n",
" <td>\"Cristo Vive\" is a large group of 35 people, 2...</td>\n",
" <td>F</td>\n",
" <td>4025</td>\n",
" <td>partner</td>\n",
" <td>Food</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" country en \\\n",
"0 Dominican Republic \"Banco Esperanza\" is a group of 10 women looki... \n",
"1 Dominican Republic \"Caminemos Hacia Adelante\" or \"Walking Forward... \n",
"2 Dominican Republic \"Creciendo Por La Union\" is a group of 10 peop... \n",
"3 Dominican Republic \"Cristo Vive\" (\"Christ lives\" is a group of 10... \n",
"4 Dominican Republic \"Cristo Vive\" is a large group of 35 people, 2... \n",
"\n",
" gender loan_amount nonpayment sector status \n",
"0 F 1225 partner Retail 0 \n",
"1 F 1975 lender Clothing 0 \n",
"2 F 2175 partner Clothing 0 \n",
"3 F 1425 partner Clothing 0 \n",
"4 F 4025 partner Food 0 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from pycaret.datasets import get_data\n",
"data = get_data('kiva')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(6818, 7)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#check the shape of data\n",
"data.shape"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1000, 7)"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sampling the data to select only 1000 documents\n",
"data = data.sample(1000, random_state=786).reset_index(drop=True)\n",
"data.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5.0 Setting up Environment in PyCaret"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`setup()` function initializes the environment in pycaret and performs several text pre-processing steps that are imperative to work with NLP problems. In last tutorial, we have not passed any custom stopwords, which we will do in this tutorial using `custom_stopwords` parameter. All the custom stopwords passed below are obtained through the analysis we performed in __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__ (refer to section 9.1). These are the words with very high frequency in the documents. As such, this is adding more noise than information. Deciding a list of custom stopwords is a subjective decision and mostly stems from your understanding of the dataset. For example, in this dataset words like 'loan', 'income', 'business', 'usd' etc. are very obvious since we are working on a dataset with customer loans. Rest of the parameters passed in `setup()` below are same as last tutorial."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from pycaret.nlp import *"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"<style type=\"text/css\" >\n",
"</style><table id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7\" ><thead> <tr> <th class=\"col_heading level0 col0\" >Description</th> <th class=\"col_heading level0 col1\" >Value</th> </tr></thead><tbody>\n",
" <tr>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row0_col0\" class=\"data row0 col0\" >session_id</td>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row0_col1\" class=\"data row0 col1\" >123</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row1_col0\" class=\"data row1 col0\" ># Documents</td>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row1_col1\" class=\"data row1 col1\" >1000</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row2_col0\" class=\"data row2 col0\" >Vocab Size</td>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row2_col1\" class=\"data row2 col1\" >4557</td>\n",
" </tr>\n",
" <tr>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row3_col0\" class=\"data row3 col0\" >Custom Stopwords</td>\n",
" <td id=\"T_20befa0a_580b_11ea_bbe3_84fdd13f47d7row3_col1\" class=\"data row3 col1\" >True</td>\n",
" </tr>\n",
" </tbody></table>"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x2d394f43248>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"exp_nlp102 = setup(data = data, target = 'en', session_id = 123,\n",
" custom_stopwords = ['loan', 'income', 'usd', 'many', 'also', 'make', 'business', 'buy', \n",
" 'sell', 'purchase','year', 'people', 'able', 'enable', 'old', 'woman',\n",
" 'child', 'school'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6.0 Create a Topic Model"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"lda = create_model('lda')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" requirejs.config({\n",
" paths: {\n",
" 'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
" }\n",
" });\n",
" require(['plotly'], function(Plotly) {\n",
" window._Plotly = Plotly;\n",
" });\n",
" }\n",
" </script>\n",
" "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"alignmentgroup": "True",
"customdata": [
[
"rice, farmer, land, small, crop, area, sector, use, farm, group"
],
[
"work, product, help, small, use, live, family, start, home, community"
],
[
"marry, pay, stock, family, fee, market, get, request, increase, use"
],
[
"work, steer, hair, salon, cosmetic, machine, service, wood, computer, line"
]
],
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Topic=%{x}<br>Documents=%{y}<br>Keyword=%{customdata[0]}",
"legendgroup": "",
"marker": {
"color": "#636efa"
},
"name": "",
"offsetgroup": "",
"orientation": "v",
"showlegend": false,
"textposition": "auto",
"type": "bar",
"x": [
"Topic 0",
"Topic 1",
"Topic 2",
"Topic 3"
],
"xaxis": "x",
"y": [
29,
583,
372,
16
],
"yaxis": "y"
}
],
"layout": {
"barmode": "relative",
"legend": {
"tracegroupgap": 0
},
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Document Distribution by Topics"
},
"xaxis": {
"anchor": "y",
"domain": [
0,
1
],
"title": {
"text": "Topic"
}
},
"yaxis": {
"anchor": "x",
"domain": [
0,
1
],
"title": {
"text": "Documents"
}
}
}
},
"text/html": [
"<div>\n",
" \n",
" \n",
" <div id=\"3f00f7e4-8ffa-4de8-8d2c-c223e627a7ff\" class=\"plotly-graph-div\" style=\"height:525px; width:100%;\"></div>\n",
" <script type=\"text/javascript\">\n",
" require([\"plotly\"], function(Plotly) {\n",
" window.PLOTLYENV=window.PLOTLYENV || {};\n",
" \n",
" if (document.getElementById(\"3f00f7e4-8ffa-4de8-8d2c-c223e627a7ff\")) {\n",
" Plotly.newPlot(\n",
" '3f00f7e4-8ffa-4de8-8d2c-c223e627a7ff',\n",
" [{\"alignmentgroup\": \"True\", \"customdata\": [[\"rice, farmer, land, small, crop, area, sector, use, farm, group\"], [\"work, product, help, small, use, live, family, start, home, community\"], [\"marry, pay, stock, family, fee, market, get, request, increase, use\"], [\"work, steer, hair, salon, cosmetic, machine, service, wood, computer, line\"]], \"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Topic=%{x}<br>Documents=%{y}<br>Keyword=%{customdata[0]}\", \"legendgroup\": \"\", \"marker\": {\"color\": \"#636efa\"}, \"name\": \"\", \"offsetgroup\": \"\", \"orientation\": \"v\", \"showlegend\": false, \"textposition\": \"auto\", \"type\": \"bar\", \"x\": [\"Topic 0\", \"Topic 1\", \"Topic 2\", \"Topic 3\"], \"xaxis\": \"x\", \"y\": [29, 583, 372, 16], \"yaxis\": \"y\"}],\n",
" {\"barmode\": \"relative\", \"legend\": {\"tracegroupgap\": 0}, \"template\": {\"data\": {\"bar\": [{\"error_x\": {\"color\": \"#2a3f5f\"}, \"error_y\": {\"color\": \"#2a3f5f\"}, \"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"bar\"}], \"barpolar\": [{\"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"barpolar\"}], \"carpet\": [{\"aaxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"baxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"type\": \"carpet\"}], \"choropleth\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"choropleth\"}], \"contour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"contour\"}], \"contourcarpet\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"contourcarpet\"}], \"heatmap\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmap\"}], \"heatmapgl\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmapgl\"}], \"histogram\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"histogram\"}], \"histogram2d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2d\"}], \"histogram2dcontour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2dcontour\"}], \"mesh3d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"mesh3d\"}], \"parcoords\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"parcoords\"}], \"pie\": [{\"automargin\": true, \"type\": \"pie\"}], \"scatter\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter\"}], \"scatter3d\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter3d\"}], \"scattercarpet\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattercarpet\"}], \"scattergeo\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergeo\"}], \"scattergl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergl\"}], \"scattermapbox\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattermapbox\"}], \"scatterpolar\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolar\"}], \"scatterpolargl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolargl\"}], \"scatterternary\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterternary\"}], \"surface\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"surface\"}], \"table\": [{\"cells\": {\"fill\": {\"color\": \"#EBF0F8\"}, \"line\": {\"color\": \"white\"}}, \"header\": {\"fill\": {\"color\": \"#C8D4E3\"}, \"line\": {\"color\": \"white\"}}, \"type\": \"table\"}]}, \"layout\": {\"annotationdefaults\": {\"arrowcolor\": \"#2a3f5f\", \"arrowhead\": 0, \"arrowwidth\": 1}, \"coloraxis\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"colorscale\": {\"diverging\": [[0, \"#8e0152\"], [0.1, \"#c51b7d\"], [0.2, \"#de77ae\"], [0.3, \"#f1b6da\"], [0.4, \"#fde0ef\"], [0.5, \"#f7f7f7\"], [0.6, \"#e6f5d0\"], [0.7, \"#b8e186\"], [0.8, \"#7fbc41\"], [0.9, \"#4d9221\"], [1, \"#276419\"]], \"sequential\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"sequentialminus\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"colorway\": [\"#636efa\", \"#EF553B\", \"#00cc96\", \"#ab63fa\", \"#FFA15A\", \"#19d3f3\", \"#FF6692\", \"#B6E880\", \"#FF97FF\", \"#FECB52\"], \"font\": {\"color\": \"#2a3f5f\"}, \"geo\": {\"bgcolor\": \"white\", \"lakecolor\": \"white\", \"landcolor\": \"#E5ECF6\", \"showlakes\": true, \"showland\": true, \"subunitcolor\": \"white\"}, \"hoverlabel\": {\"align\": \"left\"}, \"hovermode\": \"closest\", \"mapbox\": {\"style\": \"light\"}, \"paper_bgcolor\": \"white\", \"plot_bgcolor\": \"#E5ECF6\", \"polar\": {\"angularaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"radialaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"scene\": {\"xaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"yaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"zaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}}, \"shapedefaults\": {\"line\": {\"color\": \"#2a3f5f\"}}, \"ternary\": {\"aaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"baxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"caxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"title\": {\"x\": 0.05}, \"xaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}, \"yaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}}}, \"title\": {\"text\": \"Document Distribution by Topics\"}, \"xaxis\": {\"anchor\": \"y\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"Topic\"}}, \"yaxis\": {\"anchor\": \"x\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"Documents\"}}},\n",
" {\"responsive\": true}\n",
" ).then(function(){\n",
" \n",
"var gd = document.getElementById('3f00f7e4-8ffa-4de8-8d2c-c223e627a7ff');\n",
"var x = new MutationObserver(function (mutations, observer) {{\n",
" var display = window.getComputedStyle(gd).display;\n",
" if (!display || display === 'none') {{\n",
" console.log([gd, 'removed!']);\n",
" Plotly.purge(gd);\n",
" observer.disconnect();\n",
" }}\n",
"}});\n",
"\n",
"// Listen for the removal of the full notebook cells\n",
"var notebookContainer = gd.closest('#notebook-container');\n",
"if (notebookContainer) {{\n",
" x.observe(notebookContainer, {childList: true});\n",
"}}\n",
"\n",
"// Listen for the clearing of the current output cell\n",
"var outputEl = gd.closest('.output');\n",
"if (outputEl) {{\n",
" x.observe(outputEl, {childList: true});\n",
"}}\n",
"\n",
" })\n",
" };\n",
" });\n",
" </script>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_model(lda, plot = 'topic_distribution')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you compare the output above with the one in section 9.4 of __[last tutorial](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__ you would notice that distribution of topics have changed, but what is more important to observe here is when you hover over the bars, keywords gives you better idea of theme of the topic in this experiment compared to last one because we have removed some noice by removing custom stopwords. For eg. `Topic 3` seems to be about customers seeking trade loans as it include keywords like 'hair', 'salon', 'wood', 'machine'. `Topic 0` is about farming/agricultural loans, `Topic 1` is mostly about retail loans and `Topic 2` seems to be about loans for domestic reasons. \n",
"\n",
"Topic Modeling is very iterative machine learning task, finding the right list of custom stopwords is only possible after several iterations. We encourage you to repeat the experiments to gain actionable insights that finally leads you to the best working and implementable model. So far we have learned how to create and analyze a topic model using `pycaret.nlp` module. In next section we will go a few steps deeper to understand how to evaluate a topic model."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 7.0 Evaluating Topic Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many topic models including Latent Dirichlet allocation are probabilistic models, providing both a predictive and latent topic representation. It is generally assumed that results generated by these models are meaningful and useful and due to its unsupervised training process it is hard to evaluate those assumptions. Nevertheless, it is equally important to identify if a trained model is objectively good or bad, as well have an ability to compare different models/methods. To do so, one would require an objective measure for the quality. Traditionally, and still for many practical applications, to evaluate if \"the correct thing\" has been learned about the corpus, an implicit knowledge and \"eyeballing\" approaches are used. Ideally, we’d like to capture this information in a single metric that can be maximized, and compared. The approaches that are commonly used today:\n",
"\n",
"- **Eye Balling Models :** Look at Top N words, Topics / Documents etc. \n",
"- **Intrinsic Evaluation Metrics:** Interpretability and semantics of model\n",
"- **Extrinsic Evaluation Metrics:** Is model good at performing predefined tasks, such as classification (later in this tutorial we will use our topic model to build a classifier to predict loan default)\n",
"- **Human Judgements:** Does the topic model improves your understanding of the problem?\n",
"\n",
"In this section we will learn how to evaluate coherence value of a topic model using `tune_model()` function. Followed by extrinsic evaluation on number of topics in a topic model to optimize classifier that can predict default using `status` column in the dataset.\n",
"\n",
"__[Read More about Model Evaluation](https://towardsdatascience.com/evaluate-topic-model-in-python-latent-dirichlet-allocation-lda-7d57484bb5d0)__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7.1 Intrinsic Evaluation using Coherence Value"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**What is Topic Coherence?** Topic Coherence measures score a single topic by measuring the degree of semantic similarity between high scoring words in the topic. These measurements help distinguish between topics that are semantically interpretable topics and topics that are artifacts of statistical inference. `tune_model()` function iterates on a pre-defined grid with different number of topics and create a model for each parameter. Topic coherence is then evaluated for different models and are visually presented in a graph the has `Coherence Score` on y-axis as a function of `# Topics` on x-axis. See example below:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"IntProgress(value=0, description='Processing: ', max=25)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "9f3fe29a2c5c420f9df5982d0bb4f8ec",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" requirejs.config({\n",
" paths: {\n",
" 'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
" }\n",
" });\n",
" require(['plotly'], function(Plotly) {\n",
" window._Plotly = Plotly;\n",
" });\n",
" }\n",
" </script>\n",
" "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=Coherence<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=Coherence",
"line": {
"color": "#636efa",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=Coherence",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.5089163264713378,
0.4982793046401531,
0.46415151580222114,
0.4043819739299509,
0.41263177839497134,
0.38878595817291817,
0.37519844359483173,
0.42158590047469074,
0.4373706800043006,
0.4681067603128716
],
"yaxis": "y"
}
],
"layout": {
"legend": {
"tracegroupgap": 0
},
"plot_bgcolor": "rgb(245,245,245)",
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Coherence Value and # of Topics"
},
"xaxis": {
"anchor": "y",
"domain": [
0,
1
],
"title": {
"text": "# Topics"
}
},
"yaxis": {
"anchor": "x",
"domain": [
0,
1
],
"title": {
"text": "Score"
}
}
}
},
"text/html": [
"<div>\n",
" \n",
" \n",
" <div id=\"25b30b02-5f88-409f-a0f8-19f2941dbf1d\" class=\"plotly-graph-div\" style=\"height:525px; width:100%;\"></div>\n",
" <script type=\"text/javascript\">\n",
" require([\"plotly\"], function(Plotly) {\n",
" window.PLOTLYENV=window.PLOTLYENV || {};\n",
" \n",
" if (document.getElementById(\"25b30b02-5f88-409f-a0f8-19f2941dbf1d\")) {\n",
" Plotly.newPlot(\n",
" '25b30b02-5f88-409f-a0f8-19f2941dbf1d',\n",
" [{\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=Coherence<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=Coherence\", \"line\": {\"color\": \"#636efa\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=Coherence\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.5089163264713378, 0.4982793046401531, 0.46415151580222114, 0.4043819739299509, 0.41263177839497134, 0.38878595817291817, 0.37519844359483173, 0.42158590047469074, 0.4373706800043006, 0.4681067603128716], \"yaxis\": \"y\"}],\n",
" {\"legend\": {\"tracegroupgap\": 0}, \"plot_bgcolor\": \"rgb(245,245,245)\", \"template\": {\"data\": {\"bar\": [{\"error_x\": {\"color\": \"#2a3f5f\"}, \"error_y\": {\"color\": \"#2a3f5f\"}, \"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"bar\"}], \"barpolar\": [{\"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"barpolar\"}], \"carpet\": [{\"aaxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"baxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"type\": \"carpet\"}], \"choropleth\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"choropleth\"}], \"contour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"contour\"}], \"contourcarpet\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"contourcarpet\"}], \"heatmap\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmap\"}], \"heatmapgl\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmapgl\"}], \"histogram\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"histogram\"}], \"histogram2d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2d\"}], \"histogram2dcontour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2dcontour\"}], \"mesh3d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"mesh3d\"}], \"parcoords\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"parcoords\"}], \"pie\": [{\"automargin\": true, \"type\": \"pie\"}], \"scatter\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter\"}], \"scatter3d\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter3d\"}], \"scattercarpet\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattercarpet\"}], \"scattergeo\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergeo\"}], \"scattergl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergl\"}], \"scattermapbox\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattermapbox\"}], \"scatterpolar\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolar\"}], \"scatterpolargl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolargl\"}], \"scatterternary\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterternary\"}], \"surface\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"surface\"}], \"table\": [{\"cells\": {\"fill\": {\"color\": \"#EBF0F8\"}, \"line\": {\"color\": \"white\"}}, \"header\": {\"fill\": {\"color\": \"#C8D4E3\"}, \"line\": {\"color\": \"white\"}}, \"type\": \"table\"}]}, \"layout\": {\"annotationdefaults\": {\"arrowcolor\": \"#2a3f5f\", \"arrowhead\": 0, \"arrowwidth\": 1}, \"coloraxis\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"colorscale\": {\"diverging\": [[0, \"#8e0152\"], [0.1, \"#c51b7d\"], [0.2, \"#de77ae\"], [0.3, \"#f1b6da\"], [0.4, \"#fde0ef\"], [0.5, \"#f7f7f7\"], [0.6, \"#e6f5d0\"], [0.7, \"#b8e186\"], [0.8, \"#7fbc41\"], [0.9, \"#4d9221\"], [1, \"#276419\"]], \"sequential\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"sequentialminus\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"colorway\": [\"#636efa\", \"#EF553B\", \"#00cc96\", \"#ab63fa\", \"#FFA15A\", \"#19d3f3\", \"#FF6692\", \"#B6E880\", \"#FF97FF\", \"#FECB52\"], \"font\": {\"color\": \"#2a3f5f\"}, \"geo\": {\"bgcolor\": \"white\", \"lakecolor\": \"white\", \"landcolor\": \"#E5ECF6\", \"showlakes\": true, \"showland\": true, \"subunitcolor\": \"white\"}, \"hoverlabel\": {\"align\": \"left\"}, \"hovermode\": \"closest\", \"mapbox\": {\"style\": \"light\"}, \"paper_bgcolor\": \"white\", \"plot_bgcolor\": \"#E5ECF6\", \"polar\": {\"angularaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"radialaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"scene\": {\"xaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"yaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"zaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}}, \"shapedefaults\": {\"line\": {\"color\": \"#2a3f5f\"}}, \"ternary\": {\"aaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"baxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"caxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"title\": {\"x\": 0.05}, \"xaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}, \"yaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}}}, \"title\": {\"text\": \"Coherence Value and # of Topics\"}, \"xaxis\": {\"anchor\": \"y\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"# Topics\"}}, \"yaxis\": {\"anchor\": \"x\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"Score\"}}},\n",
" {\"responsive\": true}\n",
" ).then(function(){\n",
" \n",
"var gd = document.getElementById('25b30b02-5f88-409f-a0f8-19f2941dbf1d');\n",
"var x = new MutationObserver(function (mutations, observer) {{\n",
" var display = window.getComputedStyle(gd).display;\n",
" if (!display || display === 'none') {{\n",
" console.log([gd, 'removed!']);\n",
" Plotly.purge(gd);\n",
" observer.disconnect();\n",
" }}\n",
"}});\n",
"\n",
"// Listen for the removal of the full notebook cells\n",
"var notebookContainer = gd.closest('#notebook-container');\n",
"if (notebookContainer) {{\n",
" x.observe(notebookContainer, {childList: true});\n",
"}}\n",
"\n",
"// Listen for the clearing of the current output cell\n",
"var outputEl = gd.closest('.output');\n",
"if (outputEl) {{\n",
" x.observe(outputEl, {childList: true});\n",
"}}\n",
"\n",
" })\n",
" };\n",
" });\n",
" </script>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best Model: Latent Dirichlet Allocation | # Topics: 2 | Coherence: 0.5089\n"
]
}
],
"source": [
"tuned_unsupervised = tune_model(model = 'lda', multi_core = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Model with highest coherence score is the best model based on intrinsic evaluation criteria. As appealing as it may sound that performance of a topic model can be captured in one number i.e. Coherence Score, it doesn't come without its downside. We encourage you to do some more reading about Coherence Score to understand more about it. __[[Read More]](https://www.aclweb.org/anthology/D12-1087.pdf)__\n",
"\n",
"We have only covered Coherence in this tutorial. The other popular measure is perplexity. It captures how surprised a model is of new data it has not seen before, and is measured as the normalized log-likelihood of a held-out test set. Focussing on the log-likelihood part, you can think of the perplexity metric as measuring how probable some new unseen data is given the model that was learned earlier. That is to say, how well does the model represent or reproduce the statistics of the held-out data. However, recent studies have shown that predictive likelihood (or equivalently, perplexity) and human judgment are often not correlated, and even sometimes slightly anti-correlated as such optimizing for perplexity may not yield human interpretable topics. __[[Reference]](https://towardsdatascience.com/evaluate-topic-model-in-python-latent-dirichlet-allocation-lda-7d57484bb5d0)__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7.2 Extrinsic Evaluation using Classifier"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dataset we are using is labelled using `status` column (1 means loan default, 0 means no default). We will now use `tune_model()` function to determine the best number of topics. Best in this case is defined by the measure of interest in supervised machine learning which in this case is `Accuracy` since this is a classification problem. See example below:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"IntProgress(value=0, description='Processing: ', max=25)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "cb7233bc9a764e909bf94e5ef9e240be",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" requirejs.config({\n",
" paths: {\n",
" 'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
" }\n",
" });\n",
" require(['plotly'], function(Plotly) {\n",
" window._Plotly = Plotly;\n",
" });\n",
" }\n",
" </script>\n",
" "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=Accuracy<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=Accuracy",
"line": {
"color": "#636efa",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=Accuracy",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.857,
0.874,
0.867,
0.867,
0.86,
0.85,
0.857,
0.859,
0.856,
0.858
],
"yaxis": "y"
},
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=AUC<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=AUC",
"line": {
"color": "#EF553B",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=AUC",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.9194908474142345,
0.9226590501792116,
0.9232030849974399,
0.9181387608806963,
0.9203629032258064,
0.9205949180747568,
0.914986559139785,
0.9108662954429083,
0.9128064196108551,
0.9085861495135689
],
"yaxis": "y"
},
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=Recall<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=Recall",
"line": {
"color": "#00cc96",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=Recall",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.8266129032258065,
0.8387096774193549,
0.844758064516129,
0.8467741935483871,
0.8387096774193549,
0.8346774193548387,
0.8487903225806451,
0.8346774193548387,
0.8326612903225806,
0.8346774193548387
],
"yaxis": "y"
},
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=Precision<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=Precision",
"line": {
"color": "#ab63fa",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=Precision",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.8779443254817987,
0.9004329004329005,
0.8821052631578947,
0.8805031446540881,
0.8739495798319328,
0.8589211618257261,
0.8609406952965235,
0.8752642706131079,
0.8713080168776371,
0.8734177215189873
],
"yaxis": "y"
},
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=F1<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=F1",
"line": {
"color": "#FFA15A",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=F1",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.8515057113187955,
0.8684759916492693,
0.8630278063851698,
0.8633093525179856,
0.8559670781893004,
0.8466257668711658,
0.8548223350253806,
0.8544891640866874,
0.8515463917525773,
0.8536082474226804
],
"yaxis": "y"
},
{
"hoverlabel": {
"namelength": 0
},
"hovertemplate": "Metric=Kappa<br># Topics=%{x}<br>Score=%{y}",
"legendgroup": "Metric=Kappa",
"line": {
"color": "#19d3f3",
"dash": "solid",
"shape": "linear"
},
"mode": "lines",
"name": "Metric=Kappa",
"showlegend": true,
"type": "scatter",
"x": [
2,
4,
8,
16,
32,
64,
100,
200,
300,
400
],
"xaxis": "x",
"y": [
0.7138489122256552,
0.747846690787999,
0.7338935574229692,
0.7339020759639547,
0.7198924386964594,
0.6999135751096316,
0.7139496551393045,
0.7178781233492869,
0.7118801421391299,
0.7158818068316419
],
"yaxis": "y"
}
],
"layout": {
"legend": {
"tracegroupgap": 0
},
"margin": {
"t": 60
},
"plot_bgcolor": "rgb(245,245,245)",
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Logistic Regression Metrics and # of Topics",
"x": 0.45,
"xanchor": "center",
"y": 0.95,
"yanchor": "top"
},
"xaxis": {
"anchor": "y",
"domain": [
0,
1
],
"title": {
"text": "# Topics"
}
},
"yaxis": {
"anchor": "x",
"domain": [
0,
1
],
"range": [
0,
1
],
"title": {
"text": "Score"
}
}
}
},
"text/html": [
"<div>\n",
" \n",
" \n",
" <div id=\"56e41f18-89c3-4f3a-ad6f-56ed4c4d2470\" class=\"plotly-graph-div\" style=\"height:525px; width:100%;\"></div>\n",
" <script type=\"text/javascript\">\n",
" require([\"plotly\"], function(Plotly) {\n",
" window.PLOTLYENV=window.PLOTLYENV || {};\n",
" \n",
" if (document.getElementById(\"56e41f18-89c3-4f3a-ad6f-56ed4c4d2470\")) {\n",
" Plotly.newPlot(\n",
" '56e41f18-89c3-4f3a-ad6f-56ed4c4d2470',\n",
" [{\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=Accuracy<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=Accuracy\", \"line\": {\"color\": \"#636efa\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=Accuracy\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.857, 0.874, 0.867, 0.867, 0.86, 0.85, 0.857, 0.859, 0.856, 0.858], \"yaxis\": \"y\"}, {\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=AUC<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=AUC\", \"line\": {\"color\": \"#EF553B\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=AUC\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.9194908474142345, 0.9226590501792116, 0.9232030849974399, 0.9181387608806963, 0.9203629032258064, 0.9205949180747568, 0.914986559139785, 0.9108662954429083, 0.9128064196108551, 0.9085861495135689], \"yaxis\": \"y\"}, {\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=Recall<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=Recall\", \"line\": {\"color\": \"#00cc96\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=Recall\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.8266129032258065, 0.8387096774193549, 0.844758064516129, 0.8467741935483871, 0.8387096774193549, 0.8346774193548387, 0.8487903225806451, 0.8346774193548387, 0.8326612903225806, 0.8346774193548387], \"yaxis\": \"y\"}, {\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=Precision<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=Precision\", \"line\": {\"color\": \"#ab63fa\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=Precision\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.8779443254817987, 0.9004329004329005, 0.8821052631578947, 0.8805031446540881, 0.8739495798319328, 0.8589211618257261, 0.8609406952965235, 0.8752642706131079, 0.8713080168776371, 0.8734177215189873], \"yaxis\": \"y\"}, {\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=F1<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=F1\", \"line\": {\"color\": \"#FFA15A\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=F1\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.8515057113187955, 0.8684759916492693, 0.8630278063851698, 0.8633093525179856, 0.8559670781893004, 0.8466257668711658, 0.8548223350253806, 0.8544891640866874, 0.8515463917525773, 0.8536082474226804], \"yaxis\": \"y\"}, {\"hoverlabel\": {\"namelength\": 0}, \"hovertemplate\": \"Metric=Kappa<br># Topics=%{x}<br>Score=%{y}\", \"legendgroup\": \"Metric=Kappa\", \"line\": {\"color\": \"#19d3f3\", \"dash\": \"solid\", \"shape\": \"linear\"}, \"mode\": \"lines\", \"name\": \"Metric=Kappa\", \"showlegend\": true, \"type\": \"scatter\", \"x\": [2, 4, 8, 16, 32, 64, 100, 200, 300, 400], \"xaxis\": \"x\", \"y\": [0.7138489122256552, 0.747846690787999, 0.7338935574229692, 0.7339020759639547, 0.7198924386964594, 0.6999135751096316, 0.7139496551393045, 0.7178781233492869, 0.7118801421391299, 0.7158818068316419], \"yaxis\": \"y\"}],\n",
" {\"legend\": {\"tracegroupgap\": 0}, \"margin\": {\"t\": 60}, \"plot_bgcolor\": \"rgb(245,245,245)\", \"template\": {\"data\": {\"bar\": [{\"error_x\": {\"color\": \"#2a3f5f\"}, \"error_y\": {\"color\": \"#2a3f5f\"}, \"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"bar\"}], \"barpolar\": [{\"marker\": {\"line\": {\"color\": \"#E5ECF6\", \"width\": 0.5}}, \"type\": \"barpolar\"}], \"carpet\": [{\"aaxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"baxis\": {\"endlinecolor\": \"#2a3f5f\", \"gridcolor\": \"white\", \"linecolor\": \"white\", \"minorgridcolor\": \"white\", \"startlinecolor\": \"#2a3f5f\"}, \"type\": \"carpet\"}], \"choropleth\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"choropleth\"}], \"contour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"contour\"}], \"contourcarpet\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"contourcarpet\"}], \"heatmap\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmap\"}], \"heatmapgl\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"heatmapgl\"}], \"histogram\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"histogram\"}], \"histogram2d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2d\"}], \"histogram2dcontour\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"histogram2dcontour\"}], \"mesh3d\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"type\": \"mesh3d\"}], \"parcoords\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"parcoords\"}], \"pie\": [{\"automargin\": true, \"type\": \"pie\"}], \"scatter\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter\"}], \"scatter3d\": [{\"line\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatter3d\"}], \"scattercarpet\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattercarpet\"}], \"scattergeo\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergeo\"}], \"scattergl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattergl\"}], \"scattermapbox\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scattermapbox\"}], \"scatterpolar\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolar\"}], \"scatterpolargl\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterpolargl\"}], \"scatterternary\": [{\"marker\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"type\": \"scatterternary\"}], \"surface\": [{\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}, \"colorscale\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"type\": \"surface\"}], \"table\": [{\"cells\": {\"fill\": {\"color\": \"#EBF0F8\"}, \"line\": {\"color\": \"white\"}}, \"header\": {\"fill\": {\"color\": \"#C8D4E3\"}, \"line\": {\"color\": \"white\"}}, \"type\": \"table\"}]}, \"layout\": {\"annotationdefaults\": {\"arrowcolor\": \"#2a3f5f\", \"arrowhead\": 0, \"arrowwidth\": 1}, \"coloraxis\": {\"colorbar\": {\"outlinewidth\": 0, \"ticks\": \"\"}}, \"colorscale\": {\"diverging\": [[0, \"#8e0152\"], [0.1, \"#c51b7d\"], [0.2, \"#de77ae\"], [0.3, \"#f1b6da\"], [0.4, \"#fde0ef\"], [0.5, \"#f7f7f7\"], [0.6, \"#e6f5d0\"], [0.7, \"#b8e186\"], [0.8, \"#7fbc41\"], [0.9, \"#4d9221\"], [1, \"#276419\"]], \"sequential\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]], \"sequentialminus\": [[0.0, \"#0d0887\"], [0.1111111111111111, \"#46039f\"], [0.2222222222222222, \"#7201a8\"], [0.3333333333333333, \"#9c179e\"], [0.4444444444444444, \"#bd3786\"], [0.5555555555555556, \"#d8576b\"], [0.6666666666666666, \"#ed7953\"], [0.7777777777777778, \"#fb9f3a\"], [0.8888888888888888, \"#fdca26\"], [1.0, \"#f0f921\"]]}, \"colorway\": [\"#636efa\", \"#EF553B\", \"#00cc96\", \"#ab63fa\", \"#FFA15A\", \"#19d3f3\", \"#FF6692\", \"#B6E880\", \"#FF97FF\", \"#FECB52\"], \"font\": {\"color\": \"#2a3f5f\"}, \"geo\": {\"bgcolor\": \"white\", \"lakecolor\": \"white\", \"landcolor\": \"#E5ECF6\", \"showlakes\": true, \"showland\": true, \"subunitcolor\": \"white\"}, \"hoverlabel\": {\"align\": \"left\"}, \"hovermode\": \"closest\", \"mapbox\": {\"style\": \"light\"}, \"paper_bgcolor\": \"white\", \"plot_bgcolor\": \"#E5ECF6\", \"polar\": {\"angularaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"radialaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"scene\": {\"xaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"yaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}, \"zaxis\": {\"backgroundcolor\": \"#E5ECF6\", \"gridcolor\": \"white\", \"gridwidth\": 2, \"linecolor\": \"white\", \"showbackground\": true, \"ticks\": \"\", \"zerolinecolor\": \"white\"}}, \"shapedefaults\": {\"line\": {\"color\": \"#2a3f5f\"}}, \"ternary\": {\"aaxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"baxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}, \"bgcolor\": \"#E5ECF6\", \"caxis\": {\"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\"}}, \"title\": {\"x\": 0.05}, \"xaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}, \"yaxis\": {\"automargin\": true, \"gridcolor\": \"white\", \"linecolor\": \"white\", \"ticks\": \"\", \"title\": {\"standoff\": 15}, \"zerolinecolor\": \"white\", \"zerolinewidth\": 2}}}, \"title\": {\"text\": \"Logistic Regression Metrics and # of Topics\", \"x\": 0.45, \"xanchor\": \"center\", \"y\": 0.95, \"yanchor\": \"top\"}, \"xaxis\": {\"anchor\": \"y\", \"domain\": [0.0, 1.0], \"title\": {\"text\": \"# Topics\"}}, \"yaxis\": {\"anchor\": \"x\", \"domain\": [0.0, 1.0], \"range\": [0, 1], \"title\": {\"text\": \"Score\"}}},\n",
" {\"responsive\": true}\n",
" ).then(function(){\n",
" \n",
"var gd = document.getElementById('56e41f18-89c3-4f3a-ad6f-56ed4c4d2470');\n",
"var x = new MutationObserver(function (mutations, observer) {{\n",
" var display = window.getComputedStyle(gd).display;\n",
" if (!display || display === 'none') {{\n",
" console.log([gd, 'removed!']);\n",
" Plotly.purge(gd);\n",
" observer.disconnect();\n",
" }}\n",
"}});\n",
"\n",
"// Listen for the removal of the full notebook cells\n",
"var notebookContainer = gd.closest('#notebook-container');\n",
"if (notebookContainer) {{\n",
" x.observe(notebookContainer, {childList: true});\n",
"}}\n",
"\n",
"// Listen for the clearing of the current output cell\n",
"var outputEl = gd.closest('.output');\n",
"if (outputEl) {{\n",
" x.observe(outputEl, {childList: true});\n",
"}}\n",
"\n",
" })\n",
" };\n",
" });\n",
" </script>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best Model: Latent Dirichlet Allocation | # Topics: 4 | Accuracy : 0.874\n"
]
}
],
"source": [
"tuned_classification = tune_model(model = 'lda', multi_core = True, supervised_target = 'status')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this example the `Accuracy` is optimized when `num_topics` are set to `4`. It is very likely that number of topics to optimize the supervised metric such as accuracy in this case would be different than model with best coherence value. At the end of the day, which one to use is totally dependent on the ues case of topic model. Evaluating topic models is a complex subject. It is unlikely that you will understand them fully if this is your first time doing topic modeling. We recommend you to watch this __[YouTube Video](https://www.youtube.com/watch?v=UkmIljRIG_M)__, if you interested in learning more."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 8.0 Saving the experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In __[Natural Language Processing Tutorial (NLP101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Beginner%20-%20NLP101.ipynb)__ we have learned how to save and load the model. In this experiment we will learn how to save the entire experiment including all the outputs and models we have built in this experiment. Saving experiment is as simple as saving model."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Experiment Succesfully Saved\n"
]
}
],
"source": [
"save_experiment('Experiment_123 08Feb2020')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 9.0 Loading saved experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To load a saved experiment on a future date or in a different environment, we would use the `load_experiment()` function."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Object</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Info</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Dataset</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Corpus</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Dictionary</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Text</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Latent Dirichlet Allocation</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Best Model</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Best Model</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Object\n",
"0 Info\n",
"1 Dataset\n",
"2 Corpus\n",
"3 Dictionary\n",
"4 Text\n",
"5 Latent Dirichlet Allocation\n",
"6 Best Model\n",
"7 Best Model"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"saved_experiment = load_experiment('Experiment_123 08Feb2020')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that when you used `load_experiment()`, it has loaded the entire experiments and all the intermediate outputs in variable `saved_experiment`. You can access specific items in a similar way you would access list elements in Python. See example below in which we are accessing our final stacking ensembler and store it in `final_stack_soft_loaded` variable."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"saved_lda = saved_experiment[5]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"LdaModel(num_terms=4557, num_topics=4, decay=0.5, chunksize=100)\n"
]
}
],
"source": [
"print(saved_lda)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 10.0 Wrap-up / Next Steps?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have covered several key concepts in this tutorial such as model evaluation using intrinsic and extrinsic technique. In next tutorial. We have performed several text pre-processing steps including removal of custom_stopwords using `setup()` then we have created a topic model and compared the results with the one we created in last tutorial. We have also talked about different ways to evaluate topic model and have used `tune_model()` to evaluate coherence value of a LDA model. We have also used `tune_model()` to evaluate number of topics in a supervised setting (in this case we have used it to build a classifier to predict loan status).\n",
"\n",
"In next tutorial we will use `pycaret.nlp` together with `pycaret.classification` and focus on using supervised and unsupervised module of pycaret together. \n",
"\n",
"See you at the next tutorial. Follow the link to __[Natural Language Processing (NLP103) - Level Expert](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Expert%20-%20NLP103.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
因为 它太大了无法显示 source diff 。你可以改为 查看blob
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# <span style=\"color:orange\">Regression Tutorial (REG103) - Level Expert</span>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Date Updated: Feb 25, 2020**\n",
"\n",
"\n",
"# Work in progress\n",
"We are currently working on this tutorial. Please check back soon! \n",
"\n",
"### In the mean time, you can see: \n",
"- __[Regression Tutorial (REG101) - Level Beginner](https://github.com/pycaret/pycaret/blob/master/Tutorials/Regression%20Tutorial%20Level%20Beginner%20-%20REG101.ipynb)__\n",
"- __[Regression Tutorial (REG102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Regression%20Tutorial%20Level%20Intermediate%20-%20REG102.ipynb)__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
因为 它太大了无法显示 source diff 。你可以改为 查看blob
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册