{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Natural Language Processing Tutorial (NLP101) - Level Beginner" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Date Updated: Feb 25, 2020**\n", "\n", "# 1.0 Objective of Tutorial\n", "Welcome to the Natural Language Processing Tutorial (NLP101). This tutorial assumes that you are new to PyCaret and looking to get started with Natural Language Processing using the `pycaret.nlp` module.\n", "\n", "In this tutorial we will learn:\n", "\n", "\n", "* **Getting Data:** How to import data from the PyCaret repository?\n", "* **Setting up Environment:** How to set up the environment in PyCaret and perform critical text pre-processing tasks?\n", "* **Create Model:** How to create a topic model?\n", "* **Assign Model:** How to assign documents/text to topics using a trained model?\n", "* **Plot Model:** How to analyze topic models and the overall corpus using various plots?\n", "* **Save / Load Model:** How to save and load a model for future use?\n", "\n", "Read Time : Approx. 30 Minutes\n", "\n", "\n", "## 1.1 Installing PyCaret\n", "The first step to get started with PyCaret is to install pycaret. Installation is easy and takes only a few minutes. Follow the instructions below:\n", "\n", "#### Installing PyCaret in Local Jupyter Notebook\n", "`pip install pycaret`
\n", "\n", "#### Installing PyCaret on Google Colab or Azure Notebooks\n", "`!pip install pycaret`\n", "\n", "\n", "## 1.2 Pre-Requisites\n", "- Python 3.x\n", "- Latest version of pycaret\n", "- Internet connection to load data from pycaret's repository\n", "- Basic knowledge of NLP \n", "\n", "## 1.3 For Google Colab users:\n", "If you are running this notebook on Google Colab, the following code cells must be run at the top of the notebook to display interactive visuals.
\n", "
\n", "`from pycaret.utils import enable_colab`
\n", "`enable_colab()`\n", "\n", "## 1.4 See also:\n", "- __[Natural Language Processing Tutorial (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__\n", "- __[Natural Language Processing Tutorial (NLP103) - Level Expert](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Expert%20-%20NLP103.ipynb)__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2.0 What is Natural Language Processing?\n", "\n", "Natural Language Processing (NLP for short) is a branch of artificial intelligence that deals with analyzing, understanding and generating the languages that humans use naturally, so that people can interface with computers in both written and spoken contexts using natural human languages instead of computer languages. Some common use cases of NLP in machine learning are: \n", "\n", "- **Topic discovery and modeling:** Capture the meaning and themes in text collections, and apply advanced modeling techniques such as Topic Modeling to group similar documents together.\n", "- **Sentiment Analysis:** Identifying the mood or subjective opinions within large amounts of text, including average sentiment and opinion mining.\n", "- **Document summarization:** Automatically generating synopses of large bodies of text.\n", "- **Speech-to-text and text-to-speech conversion:** Transforming voice commands into written text, and vice versa.\n", "- **Machine translation:** Automatic translation of text or speech from one language to another. 
\n", "\n", "__[Learn More about Natural Language Processing](https://en.wikipedia.org/wiki/Natural_language_processing)__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3.0 Overview of Natural Language Processing Module in PyCaret\n", "PyCaret's NLP module (`pycaret.nlp`) is an unsupervised machine learning module that can be used to analyze text data by creating a topic model to find hidden semantic structure in documents. PyCaret's NLP module comes with a wide range of built-in text pre-processing techniques. Pre-processing is the fundamental step in any NLP problem: it transforms the raw text into a format that machine learning algorithms can learn from.\n", "\n", "As of the first release, PyCaret's NLP module only supports the `English` language and provides several popular implementations of topic models, from Latent Dirichlet Allocation to Non-Negative Matrix Factorization. It has over 5 ready-to-use algorithms and over 10 plots to analyze the text. PyCaret's NLP module also implements a unique function, `tune_model()`, that allows you to tune the hyperparameters of a topic model to optimize a supervised learning objective such as `AUC` for classification or `R2` for regression." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4.0 Dataset for the Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this tutorial we will be using data from **Kiva Microfunds** https://www.kiva.org/. Kiva Microfunds is a non-profit that allows individuals to lend money to low-income entrepreneurs and students around the world. Since starting in 2005, Kiva has crowd-funded millions of loans with a repayment rate of around 98%. At Kiva, each loan request includes both traditional demographic information on the borrower, such as gender and location, and a personal story. In this tutorial we will use the text of the personal story to gain insights into the dataset and understand the hidden semantic structure in the text. 
The dataset contains 6,818 samples. Short descriptions of the features are below:\n", "\n", "- **country:** country of borrower\n", "- **en:** Personal story of the borrower, written when applying for the loan\n", "- **gender:** Gender (M=male, F=female)\n", "- **loan_amount:** Amount of loan approved and disbursed\n", "- **nonpayment:** Type of lender (Lender = personal registered user on Kiva website, Partner = microfinance institution that works with Kiva to find and fund loans)\n", "- **sector:** sector of borrower\n", "- **status:** status of loan (1-default, 0-repaid)\n", "\n", "In this tutorial we will only use the `en` column to create a topic model. In the next tutorial, __[Natural Language Processing (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__, we will use the topic model to build a classifier that predicts the `status` of a loan, i.e. whether the applicant will default or not. \n", "\n", "#### Dataset Acknowledgement:\n", "Kiva Microfunds https://www.kiva.org/ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5.0 Getting the Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can download the data from PyCaret's git repository __[Click Here to Download](https://github.com/pycaret/pycaret/blob/master/datasets/kiva.csv)__ or you can load it using the `get_data()` function (this requires an internet connection)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>country</th><th>en</th><th>gender</th><th>loan_amount</th><th>nonpayment</th><th>sector</th><th>status</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Dominican Republic</td><td>\"Banco Esperanza\" is a group of 10 women looki...</td><td>F</td><td>1225</td><td>partner</td><td>Retail</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>Dominican Republic</td><td>\"Caminemos Hacia Adelante\" or \"Walking Forward...</td><td>F</td><td>1975</td><td>lender</td><td>Clothing</td><td>0</td></tr>\n",
"    <tr><th>2</th><td>Dominican Republic</td><td>\"Creciendo Por La Union\" is a group of 10 peop...</td><td>F</td><td>2175</td><td>partner</td><td>Clothing</td><td>0</td></tr>\n",
"    <tr><th>3</th><td>Dominican Republic</td><td>\"Cristo Vive\" (\"Christ lives\" is a group of 10...</td><td>F</td><td>1425</td><td>partner</td><td>Clothing</td><td>0</td></tr>\n",
"    <tr><th>4</th><td>Dominican Republic</td><td>\"Cristo Vive\" is a large group of 35 people, 2...</td><td>F</td><td>4025</td><td>partner</td><td>Food</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " country en \\\n", "0 Dominican Republic \"Banco Esperanza\" is a group of 10 women looki... \n", "1 Dominican Republic \"Caminemos Hacia Adelante\" or \"Walking Forward... \n", "2 Dominican Republic \"Creciendo Por La Union\" is a group of 10 peop... \n", "3 Dominican Republic \"Cristo Vive\" (\"Christ lives\" is a group of 10... \n", "4 Dominican Republic \"Cristo Vive\" is a large group of 35 people, 2... \n", "\n", " gender loan_amount nonpayment sector status \n", "0 F 1225 partner Retail 0 \n", "1 F 1975 lender Clothing 0 \n", "2 F 2175 partner Clothing 0 \n", "3 F 1425 partner Clothing 0 \n", "4 F 4025 partner Food 0 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from pycaret.datasets import get_data\n", "data = get_data('kiva')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6818, 7)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#check the shape of data\n", "data.shape" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1000, 7)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sampling the data to select only 1000 documents\n", "data = data.sample(1000, random_state=786).reset_index(drop=True)\n", "data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 6.0 Setting up Environment in PyCaret" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `setup()` function initializes the environment in pycaret and performs several text pre-processing steps that are essential for NLP problems. `setup()` must be called before executing any other function in pycaret. It takes two parameters: a pandas dataframe and the name of the text column, passed as the `target` parameter. You can also pass a `list` containing text, in which case you don't need to pass the `target` parameter. 
When setup is executed, the following pre-processing steps are applied automatically:\n", "\n", "- **Removing Numeric Characters:** All numeric characters are removed from the text. They are replaced with blanks.
\n", "
\n", "- **Removing Special Characters:** All non-alphanumeric special characters are removed from the text. They are also replaced with blanks.
\n", "
\n", "- **Word Tokenization:** Word tokenization is the process of splitting a large sample of text into words. This is the core requirement in natural language processing tasks where each word needs to be captured separately for further analysis. __[Read More](https://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html)__
\n", "
\n", "- **Stopword Removal:** A stop word (or stopword) is a word that is often removed from text because it is common and provides little value for information retrieval, even though it might be linguistically meaningful. Examples of such words in the English language are: \"the\", \"a\", \"an\", \"in\" etc. __[Read More](https://en.wikipedia.org/wiki/Stop_words)__
\n", "
\n", "- **Bigram Extraction:** A bigram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. For example, the phrase New York is captured as two different words, \"New\" and \"York\", when tokenization is performed, but if it is repeated enough times, Bigram Extraction will represent it as a single token, i.e. \"New_York\". __[Read More](https://en.wikipedia.org/wiki/Bigram)__
\n", "
\n", "- **Trigram Extraction:** Similar to bigram extraction, trigram is a sequence of three adjacent elements from a string of tokens. __[Read More](https://en.wikipedia.org/wiki/Trigram)__
\n", "
\n", "- **Lemmatizing:** Lemmatization is the process of grouping together the inflected forms of a word so they can be analysed as a single word, identified by the word's lemma, or dictionary form. In English, a word can appear in several inflected forms. For example, the verb 'to walk' may appear as 'walk', 'walked', 'walks', 'walking'. The base form, 'walk', that one might look up in a dictionary, is called the lemma for the word. __[Read More](https://en.wikipedia.org/wiki/Lemmatisation)__
\n", "
\n", "- **Custom Stopwords:** Text often contains words that are not stopwords by the rules of the language but add little or no information. For example, in this tutorial we are using the loan dataset. As such, words like \"loan\", \"bank\", \"money\", \"business\" are too obvious and add no value. More often than not, they also add a lot of noise to the topic model. You can remove those words from the corpus by using the `custom_stopwords` parameter. In the next tutorial, __[Natural Language Processing Tutorial (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__, we will demonstrate the use of the `custom_stopwords` parameter inside `setup()`.
\n", "
\n", "\n", "**Note :** Some functionalities in `pycaret.nlp` require an English language model. The language model is not downloaded automatically when you install pycaret. You will have to download it using a command line interface such as Anaconda Prompt. To download the model, please type the following in your command line:\n", "\n", "`python -m spacy download en_core_web_sm`
\n", "`python -m textblob.download_corpora`
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from pycaret.nlp import *" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<table>\n",
"  <tr><th>Description</th><th>Value</th></tr>\n",
"  <tr><td>session_id</td><td>123</td></tr>\n",
"  <tr><td># Documents</td><td>1000</td></tr>\n",
"  <tr><td>Vocab Size</td><td>4596</td></tr>\n",
"  <tr><td>Custom Stopwords</td><td>False</td></tr>\n",
"</table>
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "exp_nlp101 = setup(data = data, target = 'en', session_id = 123)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the setup is successfully executed, it prints an information grid with the following information: \n", "\n", "- **session_id :** A pseudo-random number distributed as a seed in all functions for later reproducibility. If no `session_id` is passed, a random number is automatically generated and distributed to all functions. In this experiment, session_id is set to `123` for later reproducibility.
\n", "
\n", "- **# Documents :** Number of documents (or samples in dataset if dataframe is passed).
\n", "
\n", "- **Vocab Size :** Size of vocabulary in the corpus after applying all text pre-processing such as removal of stopwords, bigram/trigram extraction, lemmatization etc.
\n", "\n", "Notice that all text pre-processing steps are performed automatically when you execute `setup()`. These steps are imperative to perform any NLP experiment. `setup()` function prepares the corpus and dictionary that is ready-to-use for the topic models that you can create using `create_model()` function. Another way to pass the text is in the form of list. See an example below:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#convert 'en' column of dataset into list format\n", "text_list = list(data['en'])\n", "type(text_list)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
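<table>\n",
"  <tr><th>Description</th><th>Value</th></tr>\n",
"  <tr><td>session_id</td><td>123</td></tr>\n",
"  <tr><td># Documents</td><td>1000</td></tr>\n",
"  <tr><td>Vocab Size</td><td>4596</td></tr>\n",
"  <tr><td>Custom Stopwords</td><td>False</td></tr>\n",
"</table>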
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "exp_nlp101_list = setup(data = text_list, session_id = 123)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the output of `exp_nlp101_list` is identical to the output of `exp_nlp101`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 7.0 Create a Topic Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What is Topic Model?** In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract \"topics\" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovery of hidden semantic structures in a text body. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: \"dog\" and \"bone\" will appear more often in documents about dogs, \"cat\" and \"meow\" will appear in documents about cats, and \"the\" and \"is\" will appear equally in both. A document typically concerns multiple topics in different proportions; thus, in a document that is 10% about cats and 90% about dogs, there would probably be about 9 times more dog words than cat words. The \"topics\" produced by topic modeling techniques are clusters of similar words. A topic model captures this intuition in a mathematical framework, which allows examining a set of documents and discovering, based on the statistics of the words in each, what the topics might be and what each document's balance of topics is. __[Read More](https://en.wikipedia.org/wiki/Topic_model)__\n", "\n", "Creating a topic model in PyCaret is simple and similar to how you would have created a model in supervised modules of pycaret. A topic model is created using `create_model()` function which takes one mandatory parameter i.e. name of model as a string. This function returns a trained model object. 
There are 5 topic models available in PyCaret. See the docstring of `create_model()` for the complete list of models. See an example below where we create a Latent Dirichlet Allocation (LDA) model:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": false }, "outputs": [], "source": [ "lda = create_model('lda')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LdaModel(num_terms=4596, num_topics=4, decay=0.5, chunksize=100)\n" ] } ], "source": [ "print(lda)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have created a Latent Dirichlet Allocation (LDA) model with a single function call, `create_model()`. Notice that the `num_topics` parameter is set to `4`, which is the default value used when you do not pass the `num_topics` parameter to `create_model()`. In the example below, we will create an LDA model with 6 topics and also set the `multi_core` parameter to `True`. When `multi_core` is set to `True`, Latent Dirichlet Allocation (LDA) uses all CPU cores to parallelize and speed up model training." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "lda2 = create_model('lda', num_topics = 6, multi_core = True)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LdaModel(num_terms=4596, num_topics=6, decay=0.5, chunksize=100)\n" ] } ], "source": [ "print(lda2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 8.0 Assign a Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have created a topic model, we would like to assign the topic proportions to our dataset (1,000 documents / samples) to analyze the results. We will achieve this by using the `assign_model()` function. See an example below:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>en</th><th>Topic_0</th><th>Topic_1</th><th>Topic_2</th><th>Topic_3</th><th>Dominant_Topic</th><th>Perc_Dominant_Topic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>praxide marry child primary school train tailo...</td><td>0.888533</td><td>0.001167</td><td>0.002019</td><td>0.108281</td><td>Topic 0</td><td>0.89</td></tr>\n",
"    <tr><th>1</th><td>gynaecology practitioner run year old live wif...</td><td>0.297877</td><td>0.286166</td><td>0.002058</td><td>0.413899</td><td>Topic 3</td><td>0.41</td></tr>\n",
"    <tr><th>2</th><td>live child boy girl range year old sell used w...</td><td>0.208160</td><td>0.070587</td><td>0.002325</td><td>0.718928</td><td>Topic 3</td><td>0.72</td></tr>\n",
"    <tr><th>3</th><td>phanice marry child daughter secondary school ...</td><td>0.738736</td><td>0.001436</td><td>0.002484</td><td>0.257344</td><td>Topic 0</td><td>0.74</td></tr>\n",
"    <tr><th>4</th><td>year old hotel kaptembwa operate hotel last ye...</td><td>0.721475</td><td>0.028989</td><td>0.002059</td><td>0.247477</td><td>Topic 0</td><td>0.72</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " en Topic_0 Topic_1 \\\n", "0 praxide marry child primary school train tailo... 0.888533 0.001167 \n", "1 gynaecology practitioner run year old live wif... 0.297877 0.286166 \n", "2 live child boy girl range year old sell used w... 0.208160 0.070587 \n", "3 phanice marry child daughter secondary school ... 0.738736 0.001436 \n", "4 year old hotel kaptembwa operate hotel last ye... 0.721475 0.028989 \n", "\n", " Topic_2 Topic_3 Dominant_Topic Perc_Dominant_Topic \n", "0 0.002019 0.108281 Topic 0 0.89 \n", "1 0.002058 0.413899 Topic 3 0.41 \n", "2 0.002325 0.718928 Topic 3 0.72 \n", "3 0.002484 0.257344 Topic 0 0.74 \n", "4 0.002059 0.247477 Topic 0 0.72 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lda_results = assign_model(lda)\n", "lda_results.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice how 6 additional columns are now added to the dataframe. `en` is the text after all pre-processing. `Topic_0 ... Topic_3` are the topic proportions and represent the distribution of topics for each document. `Dominant_Topic` is the topic with the highest proportion and `Perc_Dominant_Topic` is the proportion of the dominant topic, on a scale of 0 to 1 (only shown when the model is stochastic, i.e. the proportions sum to 1)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 9.0 Plot a Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `plot_model()` function can be used to analyze the overall corpus or only specific topics extracted through a topic model. Hence `plot_model()` can also work without passing any trained model object. 
See examples below:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.1 Frequency Distribution of Entire Corpus" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "linkText": "Export to plot.ly", "plotlyServerURL": "https://plot.ly", "showLink": true }, "data": [ { "marker": { "color": "rgba(255, 153, 51, 0.6)", "line": { "color": "rgba(255, 153, 51, 1.0)", "width": 1 } }, "name": "count", "orientation": "v", "text": "", "type": "bar", "x": [ "loan", "business", "child", "year", "sell", "buy", "school", "also", "work", "use", "old", "family", "make", "able", "start", "small", "live", "help", "group", "purchase", "product", "income", "increase", "husband", "home", "woman", "community", "customer", "good", "well", "need", "order", "request", "member", "pay", "hope", "marry", "stock", "want", "grow", "people", "time", "new", "provide", "farmer", "store", "get", "many", "client", "entrepreneur", "expand", "area", "support", "go", "money", "improve", "market", "give", "shop", "rice", "hard", "first", "sale", "food", "clothing", "ago", "invest", "plan", "run", "mother", "profit", "take", "clothe", "life", "education", "farm", "high", "service", "continue", "capital", "month", "house", "enable", "young", "local", "item", "fee", "demand", "wife", "large", "operate", "part", "say", "married", "receive", "meet", "repay", "day", "still", "supply" ], "y": [ 1932, 1904, 1199, 1069, 1039, 796, 668, 647, 630, 597, 564, 562, 529, 516, 486, 472, 467, 460, 456, 395, 390, 357, 347, 345, 331, 325, 319, 315, 313, 308, 291, 289, 273, 267, 263, 263, 259, 256, 246, 243, 243, 242, 242, 235, 235, 230, 219, 217, 216, 214, 209, 209, 209, 209, 205, 202, 198, 198, 197, 192, 191, 188, 186, 184, 182, 182, 181, 178, 178, 174, 172, 171, 168, 167, 167, 165, 164, 163, 162, 156, 154, 153, 152, 149, 149, 
142, 141, 140, 137, 136, 133, 133, 132, 131, 131, 130, 130, 128, 126, 124 ] } ], "layout": { "legend": { "bgcolor": "#F5F6F9", "font": { "color": "#4D5663" } }, "paper_bgcolor": "#F5F6F9", "plot_bgcolor": "#F5F6F9", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { 
"marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 
0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", 
"zerolinewidth": 2 } } }, "title": { "font": { "color": "#4D5663" }, "text": "Top 100 words after removing stop words" }, "xaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "" }, "zerolinecolor": "#E1E5ED" }, "yaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "Count" }, "zerolinecolor": "#E1E5ED" } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.2 Top 100 Bigrams on Entire Corpus" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "linkText": "Export to plot.ly", "plotlyServerURL": "https://plot.ly", "showLink": true }, "data": [ { "marker": { "color": "rgba(255, 153, 51, 0.6)", "line": { "color": "rgba(255, 153, 51, 1.0)", "width": 1 } }, "name": "count", "orientation": "v", "text": "", "type": "bar", "x": [ "year old", "request loan", "start business", "marry child", "year ago", "loan buy", "use loan", "expand business", "business year", "loan use", "school fee", "child school", "old child", "repay loan", "second loan", "first loan", "hard work", "old married", "old marry", "primary school", "loan usd", "business sell", "work hard", "small business", "loan purchase", "mother child", "business able", "married child", "pay school", "use buy", "educate child", "increase stock", "need loan", "increase income", "child live", "secondary school", "loan order", "support family", "last year", "business selling", "woman group", "start small", "business training_educational_programs_information", "business grow", "run business", "apply loan", "able buy", "training_educational_programs_information org", "start sell", "group member", "help business", "also sell", "year experience", "loan able", "husband work", "business help", "farmer also", "high school", "give loan", "business loan", "operate business", "child child", "loan pemci", "loan invest", "begin business", "child primary", "pay loan", "sell product", "child age", "receive loan", "able repay", "husband child", "member group", "go school", "rice farmer", "provide family", "loan help", "many loan", "person group", "risk_making loan", 
"loan risk_willing_accept_additional", "risk_willing_accept_additional risk_making", "communities_remains_unsettled_affecte many", "microinsurance_acces business", "fee child", "mifex_offers_client microinsurance_acces", "loan month", "loan intend", "group loan", "able provide", "sell clothing", "willing_repay loan", "attend school", "high demand", "loan group", "live husband", "grow business", "run small", "old mother", "old live" ], "y": [ 403, 194, 175, 168, 161, 160, 147, 115, 112, 105, 105, 99, 87, 82, 81, 80, 78, 76, 76, 74, 68, 68, 67, 67, 66, 66, 60, 59, 58, 57, 57, 51, 51, 50, 50, 50, 49, 49, 47, 46, 46, 46, 46, 46, 45, 44, 44, 44, 43, 43, 43, 42, 42, 42, 42, 41, 41, 41, 40, 40, 39, 39, 39, 39, 38, 37, 37, 37, 37, 37, 36, 36, 35, 35, 35, 34, 34, 34, 33, 32, 32, 32, 32, 32, 31, 31, 30, 29, 29, 29, 29, 29, 28, 28, 28, 28, 28, 28, 28, 27 ] } ], "layout": { "legend": { "bgcolor": "#F5F6F9", "font": { "color": "#4D5663" } }, "paper_bgcolor": "#F5F6F9", "plot_bgcolor": "#F5F6F9", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 
0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], 
"type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, 
"arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { 
"backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "color": "#4D5663" }, "text": "Top 100 bigrams after removing stop words" }, "xaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "" }, "zerolinecolor": "#E1E5ED" }, "yaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "Count" }, "zerolinecolor": "#E1E5ED" } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model(plot = 'bigram')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.3 Frequency Distribution of Topic 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`plot_model()` can also be used to analyze the same plots for specific topics. To generate plots at topic level, function requires trained model object to be passed inside `plot_model()`. In example below we will generate frequency distribution on `Topic 1` only as defined by `topic_num` parameter." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "linkText": "Export to plot.ly", "plotlyServerURL": "https://plot.ly", "showLink": true }, "data": [ { "marker": { "color": "rgba(255, 153, 51, 0.6)", "line": { "color": "rgba(255, 153, 51, 1.0)", "width": 1 } }, "name": "count", "orientation": "v", "text": "", "type": "bar", "x": [ "drug", "rise", "year", "work", "child", "loan", "pharmacy", "health", "stock", "service", "purchase", "train", "clinic", "plan", "nurse", "live", "offer", "perseverance", "midwife", "hard", "old", "also", "family", "facility", "assistance", "care", "financial", "base", "mother", "apply", "maternal", "many", "madre", "client", "general", "communities_remains_unsettled_affecte", "invertir", "infection", "curative", "disclaimer_due_recent_event", "government", "gas", "able", "open", "private", "pablo", "sector", "run", "risk_willing_accept_additional", "risk_making", "provide", "slum", "treat", "una", "training", "son", "poder", "vive", "competitor", "dina", "desea", "actividade", "start", "comprarse", "allergy", "worker", "various", "cannot_afford", "ste", "ante", "syrups", "antimalarial", "area", "ulcer", "community", "sell", "ella", "era", "painkiller", "medication", "parte", "people", "personally", 
"worm", "price", "know", "ingreso", "product", "home", "hiv_aid", "high", "haca", "good", "give", "genere", "futuro", "free", "selling", "skin", "low" ], "y": [ 8, 7, 6, 6, 6, 6, 5, 5, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ] } ], "layout": { "legend": { "bgcolor": "#F5F6F9", "font": { "color": "#4D5663" } }, "paper_bgcolor": "#F5F6F9", "plot_bgcolor": "#F5F6F9", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 
0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ 
{ "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, 
"#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { 
"gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "font": { "color": "#4D5663" }, "text": "Topic 1: Top 100 words after removing stop words" }, "xaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "" }, "zerolinecolor": "#E1E5ED" }, "yaxis": { "gridcolor": "#E1E5ED", "linecolor": "black", "showgrid": true, "tickfont": { "color": "#4D5663" }, "title": { "font": { "color": "#4D5663" }, "text": "Count" }, "zerolinecolor": "#E1E5ED" } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model(lda, plot = 'frequency', topic_num = 'Topic 1')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.4 Topic Distribution" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "customdata": [ [ "loan, child, business, school, year, buy, able, group, pay, also" ], [ "risk_willing_accept_additional, communities_remains_unsettled_affecte, risk_making, disclaimer_due_recent_event, bread, many, rise, health, kenya_security_situation_many, facility" ], [ "loan, also, rice, farmer, use, small, area, land, many, sector" ], [ "business, loan, sell, year, child, work, product, buy, make, help" ] ], "hoverlabel": { "namelength": 0 }, "hovertemplate": "Topic=%{x}
Documents=%{y}
Keyword=%{customdata[0]}", "legendgroup": "", "marker": { "color": "#636efa" }, "name": "", "offsetgroup": "", "orientation": "v", "showlegend": false, "textposition": "auto", "type": "bar", "x": [ "Topic 0", "Topic 1", "Topic 2", "Topic 3" ], "xaxis": "x", "y": [ 415, 8, 26, 551 ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } 
], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, 
"#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 
}, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Document Distribution by Topics" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "title": { "text": "Topic" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "title": { "text": "Documents" } } } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model(lda, plot = 'topic_distribution')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each document is a distribution of topics rather than a single topic. However, if the task is to categorize documents into specific topics, it is reasonable to use the topic with the highest proportion to categorize the document into **a topic**. In the plot above, each document is assigned to one topic using the largest topic weight. We can see that most of the documents are in `Topic 3`, with only a few in `Topic 1`. If you hover over the bars, you will get a basic idea of the themes in each topic by looking at its keywords. For example, if you evaluate `Topic 2`, you will see keywords like 'farmer', 'rice', 'land', which probably means that the loan applicants in this category are seeking agricultural/farming loans. However, if you hover over `Topic 0` and `Topic 3`, you will observe a lot of repetition: keywords overlap across topics, for example the words \"loan\" and \"business\" appear in both `Topic 0` and `Topic 3`. In the next tutorial, __[Natural Language Processing Tutorial (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__, we will demonstrate the use of `custom_stopwords`, at which point we will re-analyze this plot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.5 T-distributed Stochastic Neighbor Embedding (t-SNE)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hoverlabel": { "namelength": 0 }, "hovertemplate": "Dominant_Topic=Topic 0
0=%{x}
1=%{y}
2=%{z}", "legendgroup": "Dominant_Topic=Topic 0", "marker": { "color": "#636efa", "opacity": 0.7, "symbol": "circle" }, "mode": "markers", "name": "Dominant_Topic=Topic 0", "scene": "scene", "showlegend": true, "type": "scatter3d", "x": [ -8.608729362487793, -3.449270486831665, -5.1493988037109375, -8.355928421020508, -8.8486967086792, -3.106522798538208, -4.316102981567383, -4.473966121673584, 8.270657539367676, -2.873662233352661, 1.5604636669158936, -3.0006773471832275, -5.905602931976318, -4.92913818359375, -3.0338642597198486, 0.930622398853302, -3.809321403503418, -1.559649109840393, -2.369685173034668, 8.092514991760254, -7.861120223999023, -1.312436580657959, 5.276008129119873, 8.006168365478516, 7.973714828491211, -8.497553825378418, -3.0834035873413086, -7.652866363525391, 2.8802108764648438, -4.255480766296387, -2.1227211952209473, -5.729995250701904, 2.7832870483398438, 2.303431987762451, 2.9241220951080322, -5.528079032897949, 0.5474454760551453, -6.840963363647461, -3.609055757522583, -5.957661151885986, 7.534295082092285, -2.8011667728424072, -2.818911552429199, -3.7643165588378906, -5.934777736663818, -3.583871364593506, -8.408769607543945, -0.4865702986717224, 3.963343620300293, -8.324466705322266, -4.69083309173584, -7.93284273147583, -8.605924606323242, -2.441927671432495, -8.578213691711426, -2.843726396560669, -5.954280376434326, -4.846769332885742, -6.4718017578125, -4.106837272644043, 1.7550873756408691, -0.9781631827354431, 0.4939008951187134, -6.8309736251831055, 1.746622920036316, 4.058075428009033, 7.272129058837891, 1.0785306692123413, -5.532783031463623, 0.018191581591963768, 7.720111846923828, 2.7534592151641846, 8.162467956542969, 7.783346652984619, -4.851553440093994, -6.034549713134766, 4.617560863494873, -2.396671772003174, 8.326144218444824, 3.1808297634124756, -4.241217136383057, -3.2994580268859863, -2.716110944747925, 4.507784366607666, 1.0880563259124756, -4.012491226196289, -1.8818981647491455, 1.9844415187835693, 
4.998645305633545, 7.014450550079346, 1.810992956161499, -4.530220031738281, 2.601922035217285, -3.2850115299224854, -2.650792121887207, 2.1750783920288086, 5.607980728149414, 6.964809417724609, -3.1164863109588623, 8.1226167678833, -2.707444667816162, -4.3681745529174805, -3.772362232208252, -6.8314032554626465, 3.173532009124756, 8.241069793701172, -0.6367214918136597, -1.9717408418655396, 3.82382869720459, -5.21741247177124, 3.894155979156494, 6.346513748168945, -0.33898887038230896, -3.68632435798645, 0.9410411715507507, -3.275622844696045, -6.4809651374816895, -2.9954397678375244, -5.343206405639648, -4.160217761993408, -2.8710989952087402, -6.107564449310303, -6.234940528869629, 7.897449970245361, 0.9974428415298462, -3.680330276489258, 7.341864585876465, 6.0024871826171875, -8.925536155700684, -3.243818998336792, -3.4610402584075928, -2.4697353839874268, -4.2484049797058105, -4.627392292022705, 3.2683284282684326, 6.513227939605713, 1.9488303661346436, 8.314109802246094, -4.982259273529053, -3.837163209915161, -4.298853397369385, 5.485028266906738, -6.639005661010742, -0.4742388427257538, 5.408748149871826, -4.697657585144043, -8.142009735107422, 6.493923664093018, 3.275724172592163, -8.562911033630371, -2.34922194480896, -4.519927978515625, -2.6134252548217773, 0.16655118763446808, 7.023719310760498, -5.708735942840576, 8.267511367797852, 0.5781689882278442, 7.824449062347412, -3.7917401790618896, 6.369394302368164, -5.2639288902282715, 3.188347101211548, -6.064022541046143, -8.458724975585938, 7.541464328765869, -7.121298313140869, 2.336148500442505, -8.364485740661621, -7.574917793273926, 0.9537202715873718, -8.91515064239502, -4.756725788116455, -8.15151596069336, -5.926100254058838, -8.851580619812012, -2.9197139739990234, 1.7345081567764282, -5.692661285400391, -6.6680755615234375, -4.56101655960083, 4.964341640472412, -2.847907781600952, 8.377347946166992, -4.516342639923096, 1.369204044342041, 7.6496171951293945, 7.46951150894165, 2.570364475250244, 
-7.135148048400879, -1.3272935152053833, -5.943087100982666, 6.660731315612793, -8.859842300415039, -3.9621949195861816, -5.268120765686035, 1.7571638822555542, -0.625186562538147, -8.526070594787598, 7.870302200317383, 6.444554805755615, -6.147066593170166, -2.636779308319092, -6.8055100440979, -4.049270153045654, 8.198321342468262, -5.157452583312988, -8.911176681518555, -2.4145867824554443, 8.37739086151123, -3.9904563426971436, -8.689764976501465, -3.6727983951568604, 6.6334547996521, -7.037793159484863, -2.5300076007843018, 8.112597465515137, 4.728193759918213, 5.347702503204346, -6.052605628967285, -6.0023722648620605, 8.363930702209473, -0.19841821491718292, 6.531321048736572, -8.922245025634766, 7.527495384216309, 4.859605312347412, -3.3463120460510254, 4.447306156158447, -4.583302021026611, -3.8735556602478027, 2.8631463050842285, -8.245158195495605, -0.186861053109169, -8.522489547729492, -0.46440061926841736, -7.891273021697998, -5.2904372215271, -8.612557411193848, -5.344502925872803, -2.0431463718414307, -2.4421143531799316, -8.476845741271973, -8.834551811218262, -2.339165449142456, -8.181684494018555, -0.24352937936782837, -5.776079177856445, 4.045302867889404, -4.585671424865723, 7.750515937805176, 8.206377983093262, -4.314725399017334, 5.966803550720215, -2.7218410968780518, 1.294284701347351, -2.026068925857544, 2.819899559020996, -7.279694080352783, 4.723045349121094, 7.7768940925598145, -4.7547383308410645, 0.06778036057949066, 7.7748284339904785, -2.6685845851898193, -7.61445426940918, 3.7598743438720703, -8.364686965942383, -3.0397918224334717, 8.088624954223633, -3.57975172996521, -2.753681182861328, -1.4837597608566284, 8.260339736938477, -5.35454797744751, 8.135769844055176, 8.317216873168945, -4.804644584655762, -3.672417163848877, -5.399998188018799, 1.876243233680725, 5.225409507751465, -3.086794376373291, 3.939983606338501, -3.053954839706421, -4.322286128997803, -4.015697956085205, -8.944201469421387, 1.7437891960144043, 
-8.242138862609863, 4.6682257652282715, -0.11360710859298706, 1.2034311294555664, -7.075969696044922, -4.273066520690918, 8.319559097290039, 6.9774298667907715, 7.7195563316345215, -6.922321319580078, -5.43720006942749, 7.90116024017334, -8.926554679870605, -5.219162940979004, -8.845494270324707, -4.931650161743164, -2.5691049098968506, -2.2941646575927734, -5.7717976570129395, -2.6244466304779053, 1.9723505973815918, -1.4425878524780273, -5.319399356842041, -3.334491491317749, -5.577247142791748, -4.894825458526611, -5.830865859985352, 5.64504337310791, -5.514723300933838, -8.484657287597656, 5.633171558380127, -4.411510944366455, -8.407997131347656, 1.4185802936553955, 4.164414405822754, -2.5574533939361572, 7.7895121574401855, -4.371315956115723, -4.946470260620117, 6.665078163146973, 5.077589511871338, 1.8037611246109009, -6.5562849044799805, -3.091287612915039, -5.3392415046691895, 6.4955267906188965, 6.809565544128418, -8.897774696350098, 8.158504486083984, -6.524725437164307, 7.411317348480225, -3.799009084701538, -6.258737087249756, -6.450069427490234, -5.762934684753418, 3.8512699604034424, -1.1006628274917603, -4.3772807121276855, -4.061111927032471, 0.8759416937828064, -6.55994176864624, 7.480964183807373, -3.5743563175201416, 3.2225418090820312, -3.288435459136963, 7.99682092666626, -5.7620744705200195, 8.368756294250488, -6.12191915512085, -3.691117286682129, -0.09982374310493469, 8.226402282714844, 7.311054229736328, 5.398276329040527, 2.3997862339019775, -7.17294454574585, -8.10906982421875, 3.233354091644287, -4.148129463195801, -3.8669190406799316, 1.1474299430847168, -6.048267841339111, -2.394587755203247, 2.948960781097412, 7.484297275543213, -4.222890853881836, -3.7742414474487305, -4.461194038391113, 3.9481327533721924, 7.471209526062012, -7.015602111816406, -6.858746528625488, -3.494337320327759, -3.6546742916107178, 8.368766784667969, 1.758576512336731, 2.4928321838378906, 7.4136505126953125, 1.1071833372116089, 2.951479196548462, 
8.040063858032227, -4.121102333068848, 1.1765954494476318, 0.09473879635334015, 8.147856712341309, -3.23895525932312, 6.670626163482666, -5.4531707763671875, -5.722188949584961, -8.78007984161377, -5.648556232452393, -2.782255172729492, -2.6945455074310303, -7.5685529708862305, 3.2311770915985107, 3.2940611839294434, 8.131217002868652, -0.486512690782547, 0.5802382826805115, 7.207589149475098, 2.086496114730835, -0.47125542163848877, -1.1996725797653198, 8.144001960754395, -2.284681797027588, -7.72421932220459 ], "y": [ -3.2324318885803223, 5.277303218841553, -0.018763583153486252, -3.7696444988250732, -0.9735664129257202, 6.414970874786377, -2.4616570472717285, 3.036369800567627, -8.642210960388184, 7.170027732849121, -9.900508880615234, 6.4729790687561035, -7.35194730758667, 2.2820024490356445, 7.397436141967773, -12.928598403930664, -10.097301483154297, -11.963315963745117, -11.543264389038086, -9.346709251403809, -3.753180742263794, -12.103446960449219, -5.8319525718688965, -7.657008647918701, -6.6177825927734375, -3.561178684234619, 6.055457592010498, 0.05856155604124069, -13.009231567382812, -3.1250481605529785, -11.638569831848145, 1.2366719245910645, -12.992560386657715, -12.969182968139648, -5.323522567749023, -8.101351737976074, -12.821091651916504, -6.14075231552124, 4.718807220458984, -2.8261935710906982, -9.944504737854004, -10.716425895690918, 8.235130310058594, -4.106523036956787, -6.162966251373291, 4.967967510223389, -0.560211181640625, -7.081262111663818, -12.846026420593262, -3.826547384262085, -5.271877765655518, -4.7045817375183105, -3.0844812393188477, 8.291351318359375, -0.6271271109580994, 6.974771499633789, -6.721761226654053, 2.4062869548797607, -3.144770383834839, -3.6931540966033936, -9.784075736999512, -8.87219524383545, -6.735124111175537, 0.5624590516090393, -12.878146171569824, -5.486743450164795, -11.045071601867676, -12.529806137084961, -7.889498710632324, -7.155088901519775, -6.361093044281006, -9.309624671936035, 
-9.20474624633789, -0.9351221919059753, -5.250212669372559, -2.6209895610809326, -4.962277889251709, 9.099774360656738, -8.275613784790039, -12.980395317077637, -4.953699111938477, -10.737165451049805, 8.876758575439453, -12.595656394958496, -12.90499210357666, -3.8313212394714355, -8.650429725646973, -12.866043090820312, -12.519328117370605, -11.400992393493652, -6.5207953453063965, -2.0270795822143555, -9.37130355834961, -10.793049812316895, 8.85978889465332, -6.43766975402832, -10.796454429626465, -11.420391082763672, 6.105976581573486, -6.9422478675842285, 8.498481750488281, -4.634562969207764, -10.228311538696289, 0.29767248034477234, -12.97388744354248, -7.310024261474609, -7.109548091888428, -11.759332656860352, -5.453476905822754, -0.341526597738266, -12.220160484313965, -11.834074974060059, -8.893685340881348, -10.4113187789917, -6.70322322845459, 5.801604747772217, 0.7172303795814514, -11.077193260192871, 1.7245938777923584, -7.405261993408203, 7.74803352355957, -7.262660980224609, -7.241642475128174, -6.750965595245361, -8.9594087600708, 4.7823638916015625, -7.121555328369141, -6.240936279296875, -1.2912964820861816, 5.5770063400268555, 5.148241996765137, 9.045000076293945, -4.417532444000244, -0.7983973622322083, -6.168528079986572, -10.444629669189453, -10.043021202087402, -7.462344646453857, -8.795153617858887, 4.339568614959717, -4.603349208831787, -12.312639236450195, -6.389598369598389, -12.458325386047363, -10.858746528625488, 2.636219024658203, -0.2559947371482849, -11.454607963562012, -12.903450012207031, -0.6204216480255127, -8.279253005981445, -0.9064727425575256, 7.683566093444824, -12.755207061767578, -7.614328384399414, -2.782802104949951, -7.534177303314209, -6.8692450523376465, -10.14359188079834, 4.412321090698242, -10.523641586303711, 1.3938053846359253, -6.402314186096191, 0.00980041828006506, -0.6942394971847534, -10.633208274841309, 0.3517674505710602, -12.995100021362305, -0.4527131915092468, -4.757317066192627, -6.246978282928467, 
-1.6423394680023193, -9.035099983215332, -4.399980545043945, 0.6129963397979736, -2.623068332672119, 7.524178504943848, -9.873193740844727, -7.807625770568848, 0.6368944048881531, -1.1446222066879272, -12.045401573181152, -11.064653396606445, -7.780551433563232, 2.9741382598876953, -12.794673919677734, -10.440888404846191, -6.288724899291992, -5.369764804840088, 0.34705641865730286, -11.547798156738281, 0.6070419549942017, -11.669611930847168, -1.050195574760437, -9.980433464050293, 1.7309761047363281, -12.457697868347168, -11.957052230834961, -3.201975107192993, -6.667614936828613, -11.795063018798828, -7.222358703613281, 7.803147792816162, 0.567852258682251, 3.924945831298828, -9.087203979492188, 1.9359396696090698, -1.9111689329147339, 9.037997245788574, -7.880828857421875, 3.973325490951538, -1.6468164920806885, 4.725566864013672, -11.346151351928711, -5.881087779998779, -11.388223648071289, -6.92593240737915, -12.619355201721191, -5.936289310455322, 0.9886465668678284, 1.081111192703247, -8.206315040588379, -6.685049057006836, -10.534004211425781, -1.6399890184402466, -10.646321296691895, -5.480894088745117, 5.7467474937438965, -5.534256458282471, -1.1824109554290771, 4.291285991668701, -13.024026870727539, -1.857722282409668, -7.487233638763428, -0.580228865146637, -12.4597749710083, -4.5473408699035645, -8.45848560333252, -1.1432121992111206, -1.1777616739273071, -11.759846687316895, 8.61285400390625, -2.3537514209747314, -1.0308501720428467, -8.169567108154297, -4.255670070648193, -6.967818737030029, 1.2373806238174438, -12.815945625305176, 2.8258585929870605, -6.274702072143555, -7.951848983764648, -2.3131814002990723, -12.05471134185791, -11.228277206420898, -6.746183395385742, -8.716175079345703, -12.531049728393555, -3.1078264713287354, -5.677170276641846, -10.263808250427246, 2.530517578125, -12.667078018188477, -10.240041732788086, 7.323768138885498, -5.001969814300537, -5.545917510986328, -4.009189128875732, -10.920876502990723, -9.523324012756348, 
3.8661978244781494, 7.514462947845459, -11.590827941894531, -8.80990982055664, -8.278257369995117, -9.389430046081543, -7.557162761688232, -0.47330865263938904, -10.392454147338867, -1.0400813817977905, -9.966798782348633, -12.17703914642334, 6.3203864097595215, -12.852073669433594, 7.289274215698242, -9.593133926391602, -7.736659049987793, -1.8407498598098755, -5.912135601043701, -4.0047478675842285, -12.62319564819336, -12.637042045593262, -12.968555450439453, 0.3732537031173706, -2.7610065937042236, -7.53108024597168, -6.522862434387207, -6.359399795532227, 0.5403794646263123, -3.057241439819336, -9.530138969421387, -2.1753833293914795, 1.827723741531372, -2.6736502647399902, -8.177227020263672, 7.474987030029297, -11.532506942749023, -7.80603551864624, 8.604122161865234, -6.201617240905762, -12.08350658416748, 1.644649863243103, 4.442707538604736, 1.5457873344421387, -5.085255146026611, -7.4031171798706055, -5.945273399353027, -8.167999267578125, -3.7324817180633545, -12.23571491241455, -9.489651679992676, -2.6746647357940674, -9.207059860229492, -12.751825332641602, -11.276749610900879, -1.0308811664581299, -9.573236465454102, 2.180112361907959, -10.580181121826172, -12.496176719665527, -5.864532947540283, -6.654474258422852, 6.437155723571777, -0.531973123550415, -11.689266204833984, -11.43298053741455, -1.8147958517074585, -7.004122257232666, -6.6549296379089355, -5.780554294586182, -8.533978462219238, -2.196058988571167, -6.77315616607666, -6.588069915771484, -5.694440841674805, -7.112741947174072, 3.233166217803955, 3.852717161178589, -12.915281295776367, -6.426896095275879, -10.797722816467285, 3.8906686305999756, -5.348701477050781, 4.565430641174316, -6.664451599121094, 1.3158841133117676, -7.728134632110596, 1.0087112188339233, -10.222164154052734, -12.622815132141113, -8.966133117675781, -11.02711296081543, -5.5915913581848145, -6.338002681732178, -5.744369029998779, -4.452816486358643, -6.458662986755371, -9.846705436706543, 4.237620830535889, 
-9.061651229858398, -7.556887626647949, 9.297043800354004, -12.41849422454834, -6.19303035736084, -4.705785274505615, -9.511556625366211, 3.0840299129486084, -5.581637859344482, -5.915792465209961, -5.883378505706787, -6.104863166809082, 5.213750839233398, -10.413405418395996, -8.17791748046875, -10.100908279418945, -6.648164749145508, -5.78048038482666, -12.869028091430664, -5.3351359367370605, -6.786882400512695, -5.154938220977783, -9.154455184936523, -12.714197158813477, -9.306323051452637, 5.9155988693237305, -9.646931648254395, -8.163620948791504, 1.0285879373550415, -0.9924960732460022, 1.3738409280776978, 7.062137126922607, -8.007708549499512, 0.1077171266078949, -12.949271202087402, -12.261669158935547, -7.8932976722717285, -7.112748146057129, -12.560954093933105, -11.050013542175293, -5.667097568511963, -11.468131065368652, -10.964484214782715, -8.154644966125488, 8.782564163208008, -4.842001438140869 ], "z": [ 6.746514320373535, 12.811984062194824, 10.601083755493164, 6.511877536773682, 8.130897521972656, 12.925955772399902, 9.982985496520996, 12.160283088684082, 4.195158958435059, 13.143065452575684, 7.785168647766113, 12.750600814819336, 6.119930744171143, 11.73922348022461, 13.003804206848145, 7.759198188781738, 6.585062026977539, 7.269756317138672, 7.076412677764893, 4.91074800491333, 6.359663009643555, 7.3319268226623535, 1.0554616451263428, 3.2537856101989746, 1.6278916597366333, 6.614842891693115, 12.695048332214355, 9.059518814086914, 7.882207870483398, 9.638287544250488, 7.135869979858398, 10.865660667419434, 7.862599849700928, 7.834588050842285, 0.2901163697242737, 6.1583571434021, 7.680866718292236, 6.197624683380127, 12.638298988342285, 6.2717084884643555, 5.967748165130615, 6.763901233673096, 13.197595596313477, 9.489364624023438, 6.5219011306762695, 12.766916275024414, 8.163591384887695, 4.460629940032959, 7.765139579772949, 6.492335796356201, 7.772266387939453, 6.332442283630371, 6.766706466674805, 13.089080810546875, 8.38231372833252, 
12.750910758972168, 6.284316062927246, 11.815947532653809, 6.266360759735107, 9.430444717407227, 7.798612594604492, 6.149358749389648, 3.6209561824798584, 9.787267684936523, 7.751767158508301, 0.6504831314086914, 6.339016437530518, 7.169306755065918, 6.158875942230225, 4.2768025398254395, 1.3432549238204956, 7.781469345092773, 4.749871730804443, -0.5629522800445557, 7.634530067443848, 6.336797714233398, 1.2605584859848022, 13.32711410522461, 3.7964823246002197, 7.863204002380371, 8.451432228088379, 6.776581287384033, 13.173887252807617, 7.623557090759277, 7.753581523895264, 9.435323715209961, 6.0516557693481445, 7.746746063232422, 7.518266677856445, 6.600613117218018, 2.835052967071533, 10.038748741149902, 7.788628101348877, 6.789539337158203, 13.391077041625977, 2.6191723346710205, 7.0750651359558105, 6.6339030265808105, 12.909720420837402, 2.109422206878662, 12.953719139099121, 8.552163124084473, 6.607480049133301, 9.407814979553223, 7.856576919555664, 2.642458200454712, 4.5555195808410645, 7.174683094024658, 0.5339387655258179, 7.5262064933776855, 6.912546157836914, 7.0053277015686035, 6.327749729156494, 6.652246952056885, 3.3765392303466797, 12.889914512634277, 10.089974403381348, 6.894048690795898, 11.338295936584473, 5.887945652008057, 12.832844734191895, 6.111665725708008, 6.0962419509887695, 1.992158055305481, 7.723991870880127, 12.782735824584961, 2.764404058456421, 1.6673074960708618, 7.837253093719482, 12.608969688415527, 12.689826011657715, 13.192039489746094, 8.815436363220215, 10.577176094055176, 2.0076982975006104, 7.04223108291626, 7.82831335067749, 2.8230865001678467, 6.271163463592529, 12.665103912353516, 8.641836166381836, 7.359694480895996, 6.187407970428467, 7.506958961486816, 6.289834022521973, 11.952861785888672, 8.70207405090332, 6.894958019256592, 7.789945125579834, 8.371840476989746, 7.344192028045654, 10.597037315368652, 13.105454444885254, 7.637822151184082, 3.6891345977783203, 6.251028060913086, 3.0516700744628906, 3.7425594329833984, 
5.5615715980529785, 12.653823852539062, 5.793785572052002, 11.11937141418457, 2.4325432777404785, 8.443166732788086, 7.8277082443237305, 5.9886369705200195, 9.47583293914795, 7.8586602210998535, 8.492914199829102, 6.323050498962402, 2.5294036865234375, 7.557150363922119, 6.319187641143799, 6.385948657989502, 10.340505599975586, 7.09046745300293, 13.19060230255127, 7.803857326507568, 6.129810333251953, 9.934438705444336, 10.458806037902832, 7.133255958557129, 6.921557903289795, 3.1876542568206787, 12.141319274902344, 7.6870293617248535, 5.823223114013672, 2.0974137783050537, 0.43096455931663513, 9.472490310668945, 7.313355922698975, 10.324886322021484, 6.83273983001709, 8.029459953308105, 6.541597366333008, 11.345478057861328, 7.013215065002441, 7.039764404296875, 6.6930742263793945, 1.8551957607269287, 6.960641860961914, 6.107481956481934, 12.843109130859375, 9.804247856140137, 12.599120140075684, 4.632702350616455, 11.495463371276855, 7.415002346038818, 13.291914939880371, 3.3373866081237793, 12.50910758972168, 7.246981143951416, 12.74697494506836, 6.574223518371582, 6.218181133270264, 7.027169704437256, 2.087451934814453, 7.5831298828125, 1.2534247636795044, 10.545919418334961, 10.650283813476562, 3.700547695159912, 3.999051332473755, 7.027348518371582, 7.566952705383301, 6.0049967765808105, 0.8354551196098328, 12.966381072998047, 0.5942354798316956, 10.425329208374023, 12.696590423583984, 7.894613742828369, 6.193132400512695, 4.667288780212402, 8.402335166931152, 7.5078020095825195, 6.346318244934082, 6.2070841789245605, 7.480309963226318, 6.983563423156738, 7.1606669425964355, 13.386669158935547, 6.735622406005859, 7.999048709869385, 5.776525020599365, 6.409738063812256, 4.244297027587891, 10.846781730651855, 7.740487575531006, 12.045818328857422, 1.066963791847229, 3.5714030265808105, 10.064497947692871, 7.166871070861816, 6.968428134918213, 3.3398938179016113, 6.082853317260742, 7.136936664581299, 6.377022743225098, 0.8609383702278137, 5.649747371673584, 
11.885656356811523, 7.601564407348633, 5.641059875488281, 13.10670280456543, 6.3027753829956055, 0.7592028379440308, 6.492092609405518, 6.863331317901611, 5.028425693511963, 10.25658130645752, 12.707297325134277, 7.342916965484619, 4.353297710418701, 6.187038898468018, 4.9027018547058105, 3.0267391204833984, 10.610507011413574, 6.651885986328125, 7.111292839050293, 7.819421768188477, 7.263932704925537, 12.833979606628418, 7.767245292663574, 12.95699405670166, 6.438746929168701, 5.918197154998779, 7.48997688293457, 1.6741148233413696, 6.45266580581665, 7.587811470031738, 7.582798004150391, 7.80139684677124, 9.514472007751465, 9.843725204467773, 2.946376323699951, 3.1852309703826904, 1.339151382446289, 9.738850593566895, 6.143686771392822, 5.101253986358643, 7.298426628112793, 11.411994934082031, 7.0723876953125, 6.210196018218994, 12.877384185791016, 7.092127323150635, 6.120710372924805, 12.934403419494629, 2.2014856338500977, 7.305784702301025, 11.26805305480957, 10.517182350158691, 11.156737327575684, 7.551323890686035, 6.139267444610596, 1.1759206056594849, 6.163282871246338, 6.582191467285156, 7.301997184753418, 6.414979934692383, 6.641368389129639, 7.742767333984375, 7.681879997253418, 7.010233402252197, -0.44145506620407104, 6.431982040405273, 11.645869255065918, 5.866111755371094, 7.498767375946045, 1.5692119598388672, 6.13582706451416, 12.946629524230957, 7.459321022033691, 6.916593074798584, 6.726354122161865, 7.442661285400391, 2.1936142444610596, 6.142195701599121, 0.06807901710271835, 6.092672348022461, 6.513795852661133, 6.128721237182617, 6.397894859313965, 1.1290664672851562, 4.775674819946289, 12.25947380065918, 12.4917631149292, 7.747496128082275, 6.199736595153809, 6.107022285461426, 10.294180870056152, 0.3254573941230774, 10.591338157653809, 1.6991016864776611, 10.924318313598633, 3.1033976078033447, 10.52253246307373, 6.621813774108887, 7.579616546630859, 4.5113606452941895, 6.306589126586914, 0.4334876537322998, 2.4078173637390137, 
6.221585273742676, 6.376091480255127, 2.5262277126312256, 6.4961371421813965, 12.585611343383789, 7.726626396179199, 6.106338977813721, 13.450275421142578, 6.9987359046936035, 1.0259356498718262, 8.632084846496582, 6.3810858726501465, 12.192031860351562, 0.819706916809082, 0.4922841787338257, 6.224085330963135, 6.20281457901001, 12.833333015441895, 6.657467365264893, 3.663516044616699, 7.817115783691406, 2.9338414669036865, 0.06657177954912186, 7.731274127960205, 0.3158619701862335, 1.8941045999526978, 8.4197359085083, 7.659017562866211, 7.619638919830322, 4.836021423339844, 12.855193138122559, 7.043217658996582, 6.171200752258301, 10.74526309967041, 7.953227996826172, 11.002592086791992, 12.892925262451172, 7.507109642028809, 9.125078201293945, 7.83336067199707, 6.803323268890381, 3.570676565170288, 4.485213756561279, 7.583399772644043, 6.378637790679932, 1.1173348426818848, 7.621819496154785, 7.502367973327637, 3.798527956008911, 13.149773597717285, 6.316201210021973 ] }, { "hoverlabel": { "namelength": 0 }, "hovertemplate": "Dominant_Topic=Topic 1
0=%{x}
1=%{y}
2=%{z}", "legendgroup": "Dominant_Topic=Topic 1", "marker": { "color": "#EF553B", "opacity": 0.7, "symbol": "circle" }, "mode": "markers", "name": "Dominant_Topic=Topic 1", "scene": "scene", "showlegend": true, "type": "scatter3d", "x": [ 0.058246783912181854, 0.1561301052570343, 0.12853141129016876, 0.9836311340332031, 1.2443991899490356, 0.13790865242481232, 0.5545778274536133, 1.2896524667739868 ], "y": [ -4.75504732131958, -4.710216999053955, -4.75202751159668, -4.400564193725586, -4.526569843292236, -4.70463752746582, -4.816626071929932, -4.700214862823486 ], "z": [ -0.17616577446460724, -0.2693259119987488, -0.20382407307624817, -1.2945891618728638, -1.0880329608917236, -0.27115598320961, -0.2405235767364502, -0.7442936301231384 ] }, { "hoverlabel": { "namelength": 0 }, "hovertemplate": "Dominant_Topic=Topic 2
0=%{x}
1=%{y}
2=%{z}", "legendgroup": "Dominant_Topic=Topic 2", "marker": { "color": "#00cc96", "opacity": 0.7, "symbol": "circle" }, "mode": "markers", "name": "Dominant_Topic=Topic 2", "scene": "scene", "showlegend": true, "type": "scatter3d", "x": [ 13.163626670837402, 13.733743667602539, 13.969192504882812, 13.487640380859375, 14.067075729370117, 13.884115219116211, 13.753656387329102, 13.800711631774902, 14.135066986083984, 13.964573860168457, 14.048131942749023, 13.134417533874512, 13.613687515258789, 13.69484806060791, 13.296577453613281, 13.746986389160156, 13.671345710754395, 13.401246070861816, 13.419979095458984, 13.385261535644531, 13.941920280456543, 13.435635566711426, 7.737373352050781, 13.232757568359375, 13.111310958862305, 13.527458190917969 ], "y": [ 2.4953877925872803, 2.7507145404815674, 3.1218438148498535, 2.873567581176758, 2.764153003692627, 2.3648459911346436, 3.0187699794769287, 2.3721120357513428, 2.6975669860839844, 3.166067600250244, 2.9242823123931885, 2.6768717765808105, 3.2372846603393555, 2.450134038925171, 2.7277164459228516, 2.7714297771453857, 2.7318549156188965, 3.139054298400879, 3.1056079864501953, 2.350173234939575, 2.499986171722412, 2.2197625637054443, 0.33688706159591675, 3.077280282974243, 2.5864768028259277, 3.1844606399536133 ], "z": [ 0.6359475255012512, 0.8671063184738159, 0.22300828993320465, -0.045689113438129425, 0.142240971326828, 0.3588857650756836, 0.8157697319984436, 0.19070309400558472, 0.35602471232414246, 0.4118032157421112, 0.6710631251335144, 0.20975320041179657, 0.21337349712848663, 0.7536283135414124, 0.7789515852928162, -0.08175381273031235, -0.08449839055538177, 0.12496815621852875, 0.6769125461578369, 0.1786510944366455, 0.6544660925865173, 0.4211713671684265, -1.7129967212677002, 0.32872340083122253, 0.5160210132598877, 0.6307522058486938 ] }, { "hoverlabel": { "namelength": 0 }, "hovertemplate": "Dominant_Topic=Topic 3
0=%{x}
1=%{y}
2=%{z}", "legendgroup": "Dominant_Topic=Topic 3", "marker": { "color": "#ab63fa", "opacity": 0.7, "symbol": "circle" }, "mode": "markers", "name": "Dominant_Topic=Topic 3", "scene": "scene", "showlegend": true, "type": "scatter3d", "x": [ -3.4518070220947266, -5.974636554718018, -0.3675481379032135, -0.5131378173828125, -5.154693126678467, -4.653284072875977, -2.571751832962036, 1.6958732604980469, 6.561468601226807, 4.227837562561035, 4.980151653289795, 5.250816822052002, -2.75576114654541, -2.742145538330078, -6.118823528289795, -5.105478763580322, 5.615278244018555, -6.1962504386901855, 6.637463092803955, 0.7429002523422241, -0.9151397943496704, 1.8460878133773804, 6.896239280700684, -2.620873212814331, -0.26035648584365845, 5.511272430419922, -3.0356976985931396, -5.232486724853516, -4.802530288696289, -0.34123820066452026, -4.632390975952148, 2.7914156913757324, 2.381577491760254, 1.4784542322158813, -2.6855621337890625, 6.21040153503418, 5.382091045379639, 4.213400363922119, 2.3975064754486084, -5.951471328735352, -3.218024969100952, -1.7511042356491089, 2.165469169616699, 0.6544477343559265, 4.181297779083252, -4.644601345062256, -0.5353297591209412, 4.518369197845459, -6.078373908996582, 4.290629863739014, -6.307061195373535, -0.5403984785079956, 1.9647140502929688, 3.5957844257354736, -0.2169264554977417, 6.7745513916015625, 4.776814937591553, 2.740790605545044, -4.755561828613281, -2.675581693649292, 2.6503148078918457, 7.269653797149658, 0.5517745614051819, 6.820511341094971, 6.7001848220825195, -3.995833158493042, 5.410713195800781, 6.783051490783691, -2.8717257976531982, -6.150622367858887, -2.6980583667755127, -2.7660698890686035, -2.6457393169403076, -5.924203395843506, 6.839329242706299, 3.988854169845581, -5.09930419921875, 3.2048301696777344, -1.1724884510040283, 1.0654455423355103, -4.741488933563232, 7.514071464538574, 0.518671452999115, -6.292123794555664, 4.6876935958862305, 6.761606216430664, -2.65973162651062, 3.8287672996520996, 
-4.392484188079834, -2.79485821723938, 3.224149227142334, 0.47654974460601807, 6.361915588378906, 0.13499711453914642, 3.6531386375427246, -0.31280577182769775, -4.490962505340576, 1.066427230834961, 3.4423866271972656, -3.375094413757324, 5.886269569396973, -2.609355926513672, -0.1121843010187149, -5.80911111831665, -6.179607391357422, 4.4405293464660645, 4.047403335571289, 6.100541591644287, 6.752261161804199, 1.7326972484588623, -0.5717279314994812, -5.864713668823242, -3.47226881980896, 1.284292221069336, -2.095217227935791, 7.409093379974365, 0.8077054619789124, -0.29786670207977295, 0.981610894203186, 6.2772722244262695, -0.8251239061355591, 4.913400650024414, 6.8156962394714355, -4.859147071838379, -0.25578394532203674, -1.2426050901412964, 7.269052982330322, 3.5552480220794678, -2.535705804824829, -2.3468310832977295, -2.6252501010894775, 3.2443065643310547, -0.6275790333747864, -2.6040730476379395, 0.45768389105796814, -1.1908997297286987, 6.894589424133301, 5.764966011047363, -6.339216709136963, 2.7806618213653564, -2.6009280681610107, -5.37055778503418, -5.652210712432861, -3.2331056594848633, -3.5178511142730713, 0.2934303879737854, 2.379102945327759, 2.9665603637695312, -0.8507229089736938, -2.524473190307617, 6.890372276306152, -0.9364815950393677, 1.3633460998535156, -3.080249786376953, 0.43787458539009094, 3.8692121505737305, -6.094329833984375, 3.453906536102295, 2.806138277053833, 7.441344261169434, -3.3355231285095215, 2.5012331008911133, 3.092989921569824, -6.281585693359375, -6.238928318023682, -4.357186794281006, -6.200563430786133, 7.733442783355713, 7.357832908630371, -4.575900554656982, -6.084381580352783, 5.6413187980651855, 1.4352747201919556, 5.159327030181885, -2.244234561920166, 0.37631985545158386, 2.604994773864746, 0.8083765506744385, -2.968345880508423, 0.3003805875778198, 7.273813247680664, 0.6222625970840454, -2.7887473106384277, 1.4394196271896362, 5.96528434753418, -2.0809993743896484, -2.8482141494750977, -0.955662727355957, 
-4.509816646575928, -1.677110195159912, -1.8302712440490723, -5.027673721313477, 1.2807289361953735, 2.4464876651763916, 1.2347030639648438, 7.547338485717773, -4.693323135375977, -5.5411295890808105, -4.225985527038574, -6.047224044799805, 5.505976676940918, -6.270900249481201, 3.4388928413391113, 2.2190592288970947, 6.298388481140137, -3.8384478092193604, 4.26814603805542, 3.806337356567383, -6.086893558502197, 6.555688381195068, 6.888233661651611, -1.4270676374435425, -2.6158676147460938, -3.7137081623077393, -1.7110193967819214, -1.3292139768600464, 1.8271523714065552, -2.307079315185547, 7.742177486419678, -2.7488107681274414, 5.987419128417969, -6.129706859588623, 3.654303550720215, -5.8188157081604, -6.149956226348877, 0.5367414355278015, 5.729249954223633, 4.160065650939941, -0.7964770197868347, 5.342001914978027, -2.5448198318481445, -6.1581339836120605, -1.760185956954956, -0.14353786408901215, -4.8915486335754395, 2.8025565147399902, 1.9717779159545898, 1.1111620664596558, -6.307384490966797, 1.0016088485717773, -6.203761577606201, -3.1304845809936523, 5.525404930114746, 1.1949830055236816, -6.2080793380737305, 3.7714827060699463, -2.545783758163452, -6.021918296813965, 2.976012945175171, -5.820621490478516, -4.923766613006592, -6.197386264801025, 6.881927490234375, 2.7352778911590576, -2.7100796699523926, 5.702954292297363, -5.952491283416748, 3.055121660232544, 5.329661846160889, 5.182342052459717, -3.343639373779297, -6.210087299346924, -2.5785114765167236, -6.157729625701904, 6.637842178344727, -1.5329111814498901, -6.122990131378174, 7.058689594268799, 2.915288209915161, 6.552022933959961, 1.1506438255310059, -2.2499122619628906, -4.250595569610596, -1.1158603429794312, 3.245218515396118, -6.208540439605713, 0.034577444195747375, -5.951157093048096, -6.050309658050537, 3.9066808223724365, 3.6433358192443848, 6.62352180480957, 0.15758664906024933, -4.332118511199951, 1.574273943901062, -2.6694443225860596, 6.650590896606445, -5.660196304321289, 
-3.816059112548828, -6.114686489105225, 6.474085807800293, 2.0566885471343994, -2.8311212062835693, -0.8159781098365784, 4.894783973693848, -1.5914565324783325, 3.9532296657562256, 6.061136245727539, -6.119542121887207, -5.7910308837890625, 6.041424751281738, -2.6873011589050293, -6.023285388946533, 6.183802604675293, -2.8186097145080566, 2.195082902908325, 3.0707552433013916, -5.667844772338867, 1.1710147857666016, 2.8653087615966797, 4.840304851531982, 6.2863874435424805, -2.5770485401153564, 7.278298377990723, 6.450287342071533, -6.306005954742432, -2.6512815952301025, 3.229722499847412, -3.5929906368255615, -6.1645283699035645, -3.29612398147583, 6.463062763214111, 6.2574567794799805, 7.496033668518066, -5.838460445404053, 5.393617153167725, -3.6059036254882812, 6.852855205535889, 5.979652404785156, -6.297787666320801, -0.18919901549816132, 6.347261905670166, -6.229615211486816, -0.42110005021095276, 6.152153015136719, -6.159173965454102, 1.8968969583511353, -0.37843286991119385, 3.0826563835144043, -1.6306650638580322, -2.783228635787964, 4.544470310211182, 6.308055400848389, -0.906629204750061, 7.28689432144165, 0.5297806262969971, 4.261302471160889, -1.6326947212219238, -2.7979862689971924, -3.6582465171813965, 6.589996337890625, -0.2642025053501129, 1.5905438661575317, 7.351256847381592, -4.249001502990723, 1.956833004951477, 2.3637120723724365, -5.912341594696045, -4.789648056030273, 6.742133140563965, -3.1917951107025146, -2.776099920272827, 1.7150485515594482, -4.623565673828125, 6.0916643142700195, 2.6587438583374023, -2.7134461402893066, 6.948448181152344, -4.471816539764404, 5.902249813079834, 1.9568935632705688, -0.09557592123746872, 3.1162989139556885, 7.272781848907471, 6.766915798187256, 5.394679546356201, 3.948556900024414, -6.112876892089844, -0.7622652649879456, 7.023505687713623, 6.721491813659668, 6.331692218780518, 1.736575722694397, -0.6874353289604187, 5.9444684982299805, 3.025883436203003, -3.6941676139831543, -1.7003202438354492, 
-1.7533848285675049, 1.8088995218276978, 5.2111029624938965, -2.9913060665130615, 0.4073651432991028, 4.497563362121582, -4.906501293182373, 2.5807077884674072, -3.4613540172576904, -1.430375576019287, -4.497349262237549, 0.6254693269729614, -5.041782855987549, 7.741985321044922, 2.2318191528320312, 2.7636795043945312, -0.1829439103603363, 3.150782585144043, -2.4834673404693604, -5.152724742889404, 4.809642314910889, 1.8508014678955078, 6.41422700881958, 6.9458794593811035, -1.8131723403930664, 1.0240775346755981, -5.417356014251709, 7.104406833648682, -0.34111830592155457, -3.03507137298584, 2.595288038253784, -4.333835124969482, 3.9685659408569336, 2.4140589237213135, 2.4582860469818115, 3.928769826889038, 3.947068691253662, -1.771388292312622, -5.837461471557617, -4.452534198760986, -4.5517988204956055, -2.688206195831299, -0.6484195590019226, -5.3174004554748535, -5.989104270935059, -2.628960609436035, -4.553346633911133, 1.1934969425201416, -2.42482852935791, -5.945751667022705, 6.505843639373779, 5.823894500732422, -0.02186225913465023, -6.19637393951416, 4.237849235534668, 6.176779747009277, 3.4920833110809326, 2.071037530899048, -6.183388710021973, -3.796663999557495, 3.763983726501465, -6.189035892486572, -3.3829402923583984, 2.0139048099517822, 5.297320365905762, -2.815049409866333, -4.828632354736328, -5.835789680480957, 0.4424566626548767, -1.0385874509811401, 5.85499906539917, -0.03751004859805107, -4.646152019500732, 5.932539463043213, 7.691462516784668, -3.104128360748291, -2.5596988201141357, -1.6407967805862427, -3.5974535942077637, 5.013754367828369, 6.49699592590332, 5.745203971862793, -0.5206701159477234, 1.6748476028442383, 0.5300881862640381, -0.32668742537498474, 2.6251349449157715, -2.8812789916992188, -2.8667304515838623, -4.264454364776611, -6.23938512802124, 3.6735429763793945, -2.641266107559204, -2.386012077331543, 0.7654821276664734, -5.037657260894775, 0.31304943561553955, -3.7442901134490967, -0.16509371995925903, 1.3925365209579468, 
2.7713680267333984, -0.9500067234039307, 5.641291618347168, -0.46724921464920044, -3.541078805923462, 3.054553747177124, -1.4317517280578613, -2.460489511489868, -2.8113248348236084, -0.8030842542648315, -6.060348033905029, -6.261189937591553, 0.7352820634841919, 3.579338788986206, 5.909436225891113, -6.191468715667725, -1.5771645307540894, -6.189692497253418, 6.59331750869751, -2.494593620300293, 6.839545726776123, 2.8790109157562256, -4.660956382751465, -2.110710620880127, -2.6126747131347656, 5.354724407196045, 6.026319980621338, 5.5868821144104, 6.6379313468933105, 3.3770525455474854, 2.8504140377044678, 1.370614767074585, 6.590887069702148, 6.749755859375, 3.671675682067871, 1.2213385105133057, -1.6257281303405762, 5.112521171569824, 3.9780142307281494, -5.064460277557373, 2.6185805797576904, 0.30701127648353577, 6.655488014221191, 5.021397590637207, 6.397559642791748, -4.751299858093262, 0.2931877374649048, -6.316053867340088, 0.27267172932624817, 2.772361993789673, 3.0713868141174316, -2.196528434753418, 0.1379212588071823, -5.369034290313721, -2.1645126342773438, 6.839101791381836, -3.492252826690674, -2.9240145683288574, 5.486019611358643, 1.065589189529419, -1.5610530376434326, -3.09352707862854, 3.6088662147521973 ], "y": [ 0.32797303795814514, 2.404906749725342, 7.369823932647705, -0.4472329914569855, 2.8312361240386963, 0.8835166692733765, 13.471397399902344, 1.312079668045044, -4.702779293060303, 4.6539459228515625, -4.303044319152832, 4.87094783782959, 14.121492385864258, 13.142468452453613, 7.688896179199219, 4.569891452789307, -4.321971893310547, 7.106027603149414, -0.5901292562484741, -3.121424674987793, -0.15305101871490479, -3.3289811611175537, -4.59382963180542, 13.145713806152344, 7.407279014587402, 4.448369026184082, 12.91065502166748, 1.4173755645751953, 10.004203796386719, -0.49690449237823486, 11.887613296508789, -1.9963244199752808, -4.344603061676025, -3.4457004070281982, 6.995433807373047, 4.327411651611328, -4.471351623535156, 
0.32294419407844543, -1.7574776411056519, 2.404705286026001, 0.2367633581161499, 10.9364595413208, -3.5146241188049316, 7.349238872528076, -4.6582465171813965, 0.9381887912750244, 8.224658966064453, 0.43844079971313477, 9.12637996673584, 3.853895664215088, 4.918574810028076, 5.366687774658203, -3.817920446395874, -2.6125810146331787, 8.4547758102417, -0.7623319029808044, 3.396144151687622, 6.270819664001465, 0.9317845106124878, 13.311023712158203, 0.5261668562889099, -5.6756181716918945, -0.9701476097106934, -4.5735273361206055, 3.945324659347534, 2.6613447666168213, 0.0976715087890625, -4.2162981033325195, 14.222201347351074, 9.142143249511719, 13.166787147521973, 0.10140426456928253, 6.8583292961120605, 8.8366060256958, -4.463413715362549, 4.603196620941162, 9.477113723754883, 1.1967206001281738, 6.9614787101745605, -1.000731348991394, 11.769890785217285, 1.171942114830017, -3.2628941535949707, 4.243557929992676, 5.076249599456787, -4.629000186920166, 13.36112117767334, 5.637414455413818, 12.072399139404297, 13.678829193115234, 2.7718734741210938, -2.8571341037750244, -0.5754015445709229, 7.982421875, -2.6577963829040527, 7.855935096740723, 12.102802276611328, -1.3128172159194946, 5.573668956756592, 6.776702880859375, -4.495830059051514, 13.369290351867676, 4.878288269042969, 2.095600128173828, 7.27313232421875, 5.203450679779053, 5.434654235839844, -4.574571132659912, -2.9835329055786133, -4.447168827056885, -0.7158223390579224, 9.81009578704834, 0.34039542078971863, -1.6537930965423584, -0.08328438550233841, 1.4184954166412354, 1.4864917993545532, 7.055913925170898, -0.9730584621429443, -0.4535506069660187, 8.731771469116211, -4.819665908813477, -4.532609462738037, 1.0203006267547607, -1.7141261100769043, 5.21725606918335, -5.590508937835693, 4.177460670471191, 6.2458953857421875, 13.783019065856934, 13.402283668518066, -2.3263843059539795, -1.4630154371261597, 13.305733680725098, -2.242765426635742, -0.2817506194114685, -5.0128607749938965, 2.458569288253784, 
4.715137004852295, 1.7039209604263306, 13.871014595031738, 10.691067695617676, 10.198442459106445, 12.7907075881958, 0.3235166072845459, -2.1461567878723145, -3.9287209510803223, -4.589815616607666, -1.0913163423538208, 13.515128135681152, -4.5851664543151855, -0.3438931107521057, 6.18291711807251, 8.544514656066895, 7.377016544342041, 0.07275700569152832, 2.5896995067596436, 2.977630853652954, -2.0185275077819824, 1.2775769233703613, 2.472529649734497, -1.8182718753814697, -2.215487480163574, 4.647337913513184, 4.080532073974609, 1.6424741744995117, 6.263227462768555, -0.11050783097743988, 3.331388473510742, 0.8419665694236755, 2.5006933212280273, -4.454519748687744, 1.8753315210342407, -4.162442207336426, 13.825906753540039, 4.050835609436035, 5.160576343536377, -4.138684272766113, 12.916318893432617, -1.7091152667999268, -5.616116046905518, 7.637135982513428, 13.084057807922363, -3.3652663230895996, 4.419475555419922, -0.08654096722602844, 14.209547996520996, 6.0264105796813965, 10.6693115234375, -0.16885411739349365, -0.14396391808986664, 1.2218056917190552, 1.0711795091629028, 2.978137969970703, 1.4191699028015137, 0.8929211497306824, 11.834944725036621, 5.556882858276367, 0.6287466287612915, 3.342897653579712, 1.9102706909179688, 5.868223190307617, -2.4497218132019043, -3.645275831222534, -4.470520973205566, 12.548351287841797, 0.20612099766731262, -4.419437408447266, 2.931002616882324, -3.698087453842163, -4.625110626220703, -0.22968889772891998, 13.204788208007812, 0.3904314339160919, 1.8858473300933838, -0.2547886073589325, -3.6503655910491943, -0.021545378491282463, -0.5938901901245117, 6.7316999435424805, 2.478868007659912, 8.760437965393066, -5.303199291229248, 9.843831062316895, 8.320487976074219, -3.145263433456421, -4.48553991317749, 2.765167713165283, 9.08353042602539, -4.30925178527832, 6.616494655609131, 7.815683364868164, 10.912590026855469, -1.6036484241485596, 11.496627807617188, -3.94071888923645, -3.0216403007507324, 6.7007737159729, 
5.684957504272461, 7.1789021492004395, 3.703916072845459, 12.840275764465332, -4.787809371948242, -4.3022685050964355, 7.244716644287109, 2.709876775741577, 13.486662864685059, 2.787303924560547, 3.2228686809539795, 9.837479591369629, 1.166093111038208, 5.93298864364624, 2.020864963531494, 4.857450485229492, 12.932744979858398, -4.4794158935546875, 9.568907737731934, 5.525707721710205, -4.427078723907471, -4.189756870269775, 2.433908700942993, 8.47235107421875, 13.845191955566406, 8.005282402038574, -5.492978096008301, 10.127114295959473, 6.545940399169922, 2.0280601978302, 5.832729339599609, 4.046074390411377, -1.028720736503601, 13.885920524597168, 0.6523146629333496, -0.6223676204681396, 5.88991641998291, 8.421955108642578, -0.6094814538955688, 9.535933494567871, 2.483414888381958, -2.918426990509033, 0.8917533159255981, -0.555415153503418, -0.5240895748138428, 2.7346017360687256, -1.2489216327667236, 13.027658462524414, 1.3131041526794434, 1.909189224243164, 12.561236381530762, 8.27377986907959, 4.104440212249756, 5.964338302612305, 13.122288703918457, -1.3514997959136963, -4.760419845581055, 9.969788551330566, 4.188620090484619, 2.2835614681243896, 8.635319709777832, 9.953506469726562, 4.3577752113342285, 2.040879964828491, 3.5350091457366943, 4.351614952087402, 12.939626693725586, -4.735556602478027, -3.5573558807373047, 5.612346649169922, 1.8833633661270142, 5.389880657196045, 4.898609638214111, -0.4399477243423462, 13.552018165588379, 1.3769923448562622, 0.6353745460510254, 3.242767810821533, 13.354181289672852, -2.300846576690674, 12.715502738952637, 3.6730871200561523, 7.166164398193359, 0.778549313545227, 2.493964433670044, 1.0620484352111816, 2.117069721221924, -3.8943867683410645, 12.65718936920166, -4.7266340255737305, 2.8158082962036133, 4.4241790771484375, 8.425127029418945, -4.605546951293945, 3.2441329956054688, -1.443716287612915, 0.1267869770526886, 5.729201793670654, 5.696499347686768, -0.8948007822036743, -1.6770539283752441, 
1.9627503156661987, 13.23470401763916, 0.24532093107700348, -4.356502056121826, 9.291462898254395, -5.622418403625488, -2.976391077041626, 0.19585783779621124, 10.205821990966797, 13.665322303771973, 12.693394660949707, -2.3711822032928467, 5.409047603607178, -1.269616961479187, -5.699145793914795, 12.300127029418945, 6.149184226989746, 6.426771640777588, 9.505109786987305, 1.013998031616211, -3.364795207977295, 0.23233307898044586, 13.789654731750488, -1.350368857383728, 11.895737648010254, -4.507035732269287, -4.011865139007568, 12.945374488830566, -0.61611008644104, 0.8301447629928589, 1.2918416261672974, -1.489989995956421, -0.5594978928565979, -1.2620559930801392, -5.680200099945068, -3.923320770263672, -3.6990420818328857, -2.9410345554351807, 7.975861549377441, 9.16361141204834, 1.695326566696167, -4.164378643035889, 1.308779001235962, -1.3261998891830444, 9.025400161743164, -4.183596134185791, -3.5946290493011475, 0.38896211981773376, 6.760949611663818, -0.16340836882591248, -1.4097046852111816, -3.7603485584259033, 12.876251220703125, 6.915197849273682, 3.4027373790740967, 9.687360763549805, -3.7357144355773926, 2.4593093395233154, 7.240843772888184, 10.706131935119629, 1.3194490671157837, 9.432827949523926, -0.7850820422172546, 6.4755072593688965, -3.706904888153076, -1.7843542098999023, -4.620924949645996, 13.59253978729248, 1.4719841480255127, -5.420749664306641, -1.4132044315338135, 4.19186544418335, 3.5625953674316406, 11.136853218078613, 3.6427624225616455, 1.515758991241455, 3.389756441116333, 8.470471382141113, 12.894721031188965, -4.161252021789551, 12.195939064025879, -4.385196685791016, -1.7521021366119385, 2.711824417114258, 3.181649923324585, 0.297929584980011, 7.371640205383301, 2.07808518409729, 10.773149490356445, 10.580636978149414, 13.032410621643066, -0.40049096941947937, 1.4975864887237549, 2.951413869857788, 13.377059936523438, 3.826010227203369, 2.034362316131592, -0.14295543730258942, 9.428221702575684, -4.415853500366211, 
-4.525622844696045, -0.5973482131958008, 3.0067520141601562, 0.04262496531009674, 2.046412706375122, 2.483957290649414, -1.6415032148361206, 6.870683193206787, 0.44537198543548584, -2.7594387531280518, 7.008498668670654, 0.29051142930984497, -4.743031978607178, -4.282266616821289, 0.1148669570684433, 9.927711486816406, 2.22890305519104, -1.5114212036132812, -0.043987952172756195, -5.5781707763671875, -1.454829216003418, 11.733251571655273, 2.4644205570220947, 0.19560828804969788, 0.10215191543102264, 14.141342163085938, 2.094242811203003, 12.723898887634277, -4.032718658447266, 0.94820636510849, 0.23724161088466644, 6.740036487579346, -1.3342357873916626, 3.3229451179504395, 8.570246696472168, 2.7552874088287354, 7.440169334411621, 14.220775604248047, 0.6514694094657898, 5.179464817047119, -2.3154590129852295, 13.48257827758789, 14.044302940368652, 1.2080695629119873, 9.258849143981934, 8.081339836120605, 0.4285544157028198, 8.446242332458496, -1.1354026794433594, 2.0882558822631836, 5.873257160186768, -4.387773036956787, 8.869010925292969, 12.680274963378906, 1.3680620193481445, 9.824603080749512, 13.539546012878418, 13.819357872009277, -0.05085138976573944, 9.38965892791748, 5.432363986968994, 2.101693868637085, -2.5838277339935303, 2.1601409912109375, 7.179089546203613, -0.19680364429950714, 5.504307270050049, -0.7709856033325195, 13.68851375579834, -4.463455677032471, -1.3550983667373657, 11.881979942321777, -0.07863171398639679, 14.133028030395508, -4.785550117492676, -4.524905204772949, 4.67458963394165, 2.213501453399658, -5.223025321960449, 6.155317306518555, -1.1453492641448975, -4.628788471221924, -4.370754718780518, -4.883419036865234, 1.2397665977478027, 6.404118537902832, -4.125412464141846, 5.505820274353027, 9.175337791442871, 2.948162794113159, 7.994574069976807, 1.7272365093231201, 0.6515579223632812, -3.611471652984619, 11.723965644836426, 4.078916072845459, 4.142738342285156, 3.5492167472839355, 2.6987597942352295, -1.5787526369094849, 
10.112263679504395, 4.841960906982422, 10.692008018493652, 6.756309986114502, -4.4628448486328125, 12.72258186340332, 12.816941261291504, -4.54249382019043, -1.0009315013885498, -0.1876685619354248, 7.2303242683410645, -2.623818874359131 ], "z": [ -11.55898666381836, -8.720427513122559, -4.586028099060059, -11.849308013916016, -7.943033218383789, -10.721872329711914, 2.753598928451538, -8.179717063903809, -2.7833590507507324, -4.950562000274658, -7.346385478973389, -4.132875442504883, 4.962983131408691, 1.8459479808807373, -5.864584445953369, -6.221900939941406, -7.79464864730835, -6.071702003479004, -4.384982109069824, -3.543145179748535, -6.614151954650879, -1.9756041765213013, -3.8713624477386475, 1.2986172437667847, -4.5434770584106445, -5.076231002807617, -0.16871970891952515, -9.901494026184082, -0.3437176048755646, -11.80817985534668, -2.6497161388397217, -10.125020980834961, -1.8490959405899048, -3.282320499420166, -5.4733734130859375, -3.5777690410614014, -7.834588527679443, -7.003997802734375, -10.451191902160645, -8.70828628540039, -11.754592895507812, -1.657088279724121, -4.1025776863098145, -4.426784038543701, -1.860129952430725, -10.514453887939453, -3.9607558250427246, -6.653895378112793, -5.2328619956970215, -5.2953033447265625, -6.965317726135254, -5.953505039215088, -2.8599493503570557, -9.58108139038086, -3.7459335327148438, -4.18146276473999, -5.217652797698975, -5.069403171539307, -10.678008079528809, 2.297898530960083, -8.12919807434082, -0.12747922539710999, -8.056715965270996, -3.365886688232422, -3.2065823078155518, -7.729677200317383, -5.805863857269287, -4.375550746917725, 5.182742118835449, -5.152862548828125, 0.8383035063743591, -11.830866813659668, -5.542474746704102, -5.442371368408203, -3.987976312637329, -5.10561466217041, -6.171199321746826, -7.306985855102539, -5.082303047180176, -11.313131332397461, -2.823056697845459, -2.4286606311798096, -3.2329111099243164, -7.3290791511535645, -4.439760208129883, -2.8592660427093506, 
2.306075096130371, -4.623790264129639, -2.2794506549835205, 3.7567784786224365, -6.456790447235107, -3.81225848197937, -4.78088903427124, -3.655085563659668, -9.525148391723633, -3.0000996589660645, -2.3971710205078125, -8.187856674194336, -4.841911792755127, -5.727846145629883, -7.023292064666748, 1.7457629442214966, -6.154674530029297, -9.078978538513184, -6.009461402893066, -4.521295547485352, -4.622961044311523, -2.281895399093628, -3.6558570861816406, -1.3289177417755127, -6.245406627655029, -4.705809116363525, -11.687593460083008, -6.791118144989014, -11.904860496520996, -2.6023261547088623, -8.435256958007812, -4.8100266456604, -11.374690055847168, -4.878171443939209, -3.6414172649383545, -1.6524732112884521, -4.544269561767578, -10.494725227355957, -5.173386573791504, -6.161469459533691, -0.3701936900615692, -5.546193599700928, -5.86267614364624, 4.302010536193848, 2.387235403060913, -9.84363079071045, -5.325887680053711, 1.9289768934249878, -4.7002739906311035, -11.942472457885742, -1.9621996879577637, -4.644162654876709, -7.087390422821045, -7.287871837615967, 3.5726630687713623, -3.8909764289855957, -4.367773532867432, -0.5784296989440918, -11.392043113708496, -4.946963787078857, -3.0091896057128906, -1.5381335020065308, -5.533845901489258, 3.4170596599578857, -4.019906997680664, -11.930548667907715, -5.103083610534668, -4.529864311218262, -4.445230484008789, -7.339771270751953, -8.539363861083984, -6.22062873840332, -10.139227867126465, -2.566486358642578, -7.811707496643066, -10.358830451965332, -9.481439590454102, -7.095065116882324, -7.395960807800293, -9.199102401733398, -6.359515190124512, -1.4006561040878296, -2.6597354412078857, -10.745891571044922, -8.646153450012207, -6.955854415893555, -7.8166069984436035, -8.144305229187012, 3.4128971099853516, -6.5354180335998535, -5.406836986541748, -1.7720465660095215, -0.01631959341466427, -4.451266288757324, -0.30715107917785645, -4.2230024337768555, 0.6072561740875244, -3.168635845184326, 
-3.7640814781188965, -11.895476341247559, 5.145760536193848, -5.642000198364258, 0.1301284283399582, -11.731674194335938, -11.994352340698242, -10.167104721069336, -8.679044723510742, -6.684276580810547, -8.332341194152832, -2.3611388206481934, -2.743149995803833, -6.307760238647461, -11.07996940612793, -7.829508304595947, -5.088894367218018, -6.533936500549316, -9.650403022766113, -3.7601022720336914, -6.122432231903076, -1.4404175281524658, -7.024602890014648, -2.457066774368286, -8.1852445602417, -4.971983432769775, -3.557011127471924, -11.988968849182129, 1.7456979751586914, -11.351536750793457, -8.390135765075684, -11.99555492401123, -3.1122703552246094, -11.914706230163574, -0.9951756596565247, -5.638162612915039, -4.432317733764648, -5.393882751464844, 0.15036576986312866, -4.687889099121094, -5.612040042877197, -3.428838014602661, -7.413618087768555, -5.860896110534668, -3.3453283309936523, -8.045114517211914, -5.658245086669922, -5.81728458404541, -1.6930625438690186, -5.252724647521973, -3.093247413635254, -3.43438458442688, -1.5565712451934814, -4.808370590209961, -6.621486186981201, -4.488162517547607, -7.634488105773926, -0.37377774715423584, -1.8779953718185425, -1.5115633010864258, -6.022412300109863, -6.164057731628418, 3.7280209064483643, -8.294407844543457, -6.336790084838867, -4.692972183227539, -10.216141700744629, -6.473692417144775, -3.40090274810791, -5.533727645874023, 0.6443198323249817, -7.436634063720703, -4.905362129211426, -5.034519672393799, -7.860538482666016, -8.14770221710205, -7.8488335609436035, -5.5028510093688965, 4.8237714767456055, -5.7374749183654785, -0.35736358165740967, -2.5513970851898193, -6.235482215881348, -3.119159460067749, -4.89811897277832, -3.331238269805908, -11.270548820495605, 4.061755657196045, -10.986214637756348, -5.873490333557129, -4.737090587615967, -5.543889045715332, -11.688711166381836, -4.942051410675049, -8.64941692352295, -8.94391918182373, -7.149824619293213, -4.402238368988037, -7.677778720855713, 
-7.779723644256592, -11.085433006286621, 1.3908021450042725, -3.9341583251953125, -9.288496017456055, -1.4110503196716309, -5.640522003173828, -3.3934080600738525, -5.178230285644531, 1.141713261604309, -5.348264217376709, -1.8245633840560913, -2.766857147216797, -5.329925537109375, -4.40902042388916, -5.457197189331055, -4.594233989715576, -3.719813585281372, -7.914721965789795, -7.695558071136475, -3.5908424854278564, 0.38222455978393555, -0.8690208196640015, -5.668587684631348, -6.343151569366455, -7.905178546905518, -5.1862688064575195, -4.433113098144531, -4.863656997680664, 3.154247283935547, -2.8754847049713135, -4.4337944984436035, -8.029945373535156, 2.1651926040649414, -9.828082084655762, -1.0791712999343872, -7.639917850494385, -5.533569812774658, -4.382483959197998, -4.1121625900268555, -2.4883902072906494, -9.05966854095459, -7.361113548278809, -1.127992033958435, -3.0259790420532227, -4.337498664855957, -7.22684907913208, -3.7638916969299316, -2.9815733432769775, -7.993737697601318, -5.532262325286865, -4.930290699005127, -6.529343605041504, -5.28941011428833, -6.269709587097168, -9.401144981384277, -8.249138832092285, 1.026442050933838, -6.731062412261963, -5.804935932159424, -3.19897723197937, -0.29700005054473877, -3.6707255840301514, -7.0366339683532715, -2.5085947513580322, 3.5388453006744385, -1.1866238117218018, -4.518500328063965, -5.873241424560547, -11.092483520507812, -0.11570470035076141, -2.0301480293273926, -4.99012565612793, -4.687368869781494, -4.982244968414307, -10.451203346252441, -3.6548447608947754, -11.582770347595215, 4.281044960021973, -11.005560874938965, -2.6355667114257812, -6.651893138885498, -3.038097858428955, 0.938366711139679, -3.853471279144287, -10.6152925491333, -4.907918930053711, -10.807096481323242, -11.712079048156738, -8.954936981201172, -0.11751504987478256, -4.117561340332031, -6.156259059906006, -9.289515495300293, -5.762171745300293, -3.284003734588623, -3.2561652660369873, -4.6527886390686035, 
-4.399907112121582, -10.911364555358887, -3.37939715385437, -6.382122993469238, -5.442935466766357, -11.410212516784668, -5.358147144317627, -12.060132026672363, -10.94332218170166, -7.637814044952393, -0.12015194445848465, -4.7764363288879395, -5.343525409698486, -0.49958348274230957, -3.956634283065796, -7.847753524780273, -4.948116779327393, 0.12189611047506332, -8.713976860046387, -6.276416301727295, -0.8390994668006897, -4.686158180236816, -4.08144998550415, -5.102473735809326, -1.5251365900039673, 3.463165521621704, -9.764908790588379, 0.1656426638364792, -10.87250804901123, -3.428739547729492, -3.0214715003967285, -1.4368022680282593, -6.635255813598633, -9.817567825317383, -2.8874123096466064, -3.7496731281280518, -0.1800733059644699, -2.50980806350708, -2.1667211055755615, -2.6366758346557617, -10.402261734008789, -6.845040798187256, -5.834780693054199, -7.248034954071045, -4.953550815582275, -9.113312721252441, 0.250684529542923, 0.06913436204195023, 0.8940771818161011, -11.848278999328613, -9.804121017456055, -8.12875747680664, 2.4281675815582275, -6.226993083953857, -7.764004707336426, -10.303509712219238, -5.044991970062256, -5.496092319488525, -7.283377170562744, -11.714855194091797, -8.176210403442383, -7.148736953735352, -2.3898773193359375, -6.453646183013916, -8.910049438476562, -6.148642063140869, -11.462176322937012, -9.436209678649902, -6.102930068969727, -11.649415969848633, -0.8003620505332947, -8.012540817260742, -11.795052528381348, -0.37506088614463806, -8.902336120605469, -4.394169330596924, -6.640049457550049, 0.21515995264053345, -5.890520095825195, -2.744603395462036, -4.493929386138916, -1.6957416534423828, -10.032668113708496, 4.883516788482666, -8.076478958129883, -1.0824462175369263, -8.300447463989258, -4.290074348449707, -5.38604736328125, -5.07572078704834, -11.057092666625977, -6.955334663391113, -3.662440061569214, -6.753798007965088, -5.252066135406494, 5.175827980041504, -11.073488235473633, -6.799788951873779, 
-9.170713424682617, 3.128044366836548, 4.425876140594482, -8.798023223876953, -0.7762895822525024, -3.9684362411499023, -11.534854888916016, -3.743471145629883, -11.163490295410156, -7.075694561004639, -5.73482608795166, -7.772233009338379, -3.477954149246216, -1.0426963567733765, -7.311435699462891, -2.8525137901306152, 4.001303672790527, 4.576021194458008, -6.692566394805908, -5.034015655517578, -6.699734210968018, -7.8252081871032715, -9.567848205566406, -2.4089062213897705, -6.045662879943848, -11.979235649108887, -6.630395412445068, -4.468752861022949, 4.570698261260986, -3.987588405609131, -8.830552101135254, -2.6871979236602783, -11.900578498840332, 4.319845199584961, -1.9023139476776123, -6.804488658905029, -3.9699015617370605, -3.6994259357452393, -0.026319235563278198, -4.70905065536499, -11.2097806930542, -3.1140987873077393, -3.144108533859253, -1.0529837608337402, -8.525023460388184, -5.568939208984375, -8.209943771362305, -4.623018264770508, -0.8267196416854858, -6.640819072723389, -4.021506309509277, -3.8234663009643555, -6.049319267272949, -4.263439655303955, -2.8534421920776367, -6.536315441131592, -7.401490688323975, -6.854971408843994, -6.724142074584961, -8.905938148498535, -2.910115957260132, -6.12882661819458, -3.889491081237793, -5.481695175170898, -3.988154888153076, -0.9497745037078857, -0.05490387976169586, -7.63749885559082, -11.313039779663086, -11.937177658081055, -5.446620464324951, -9.569853782653809 ] } ], "layout": { "height": 800, "legend": { "tracegroupgap": 0 }, "scene": { "domain": { "x": [ 0, 1 ], "y": [ 0, 1 ] }, "xaxis": { "title": { "text": "0" } }, "yaxis": { "title": { "text": "1" } }, "zaxis": { "title": { "text": "2" } } }, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { 
"aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" 
} ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, 
"hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "3d TSNE Plot for Topic Model" }, "width": 900 } }, "text/html": [ "
\n", " \n", " \n", "
\n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model(lda, plot = 'tsne')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "T-distributed Stochastic Neighbor Embedding (t-SNE) is a nonlinear dimensionality reduction technique well-suited for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions. \n", "\n", "__[Learn More](https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding)__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 9.6 Uniform Manifold Approximation and Projection Plot" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_model(lda, plot = 'umap')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "UMAP (Uniform Manifold Approximation and Projection) is a manifold learning technique for dimensionality reduction. Its purpose is similar to that of t-SNE and PCA: all three reduce dimensionality so that data can be projected into 2D or 3D. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. \n", "\n", "__[Learn More](https://towardsdatascience.com/how-exactly-umap-works-13e3040e1668)__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 10.0 Evaluate Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to analyze the performance of models is to use the `evaluate_model()` function, which displays a user interface for all of the available plots for a given model. It internally uses the `plot_model()` function. In the example below, we generate a Sentiment Polarity Plot for `Topic 3` using the LDA model stored in the `lda` variable." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ca1c7e1a351d408b9efbc6803987e482", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(ToggleButtons(description='Plot Type:', icons=('',), options=(('Frequency Plot', 'freque…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "evaluate_model(lda)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 11.0 Saving the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you get deeper into Natural Language Processing, you will learn that the training time of topic models grows substantially as the size of the corpus increases. 
As such, if you would like to continue your experiment or analysis at a later point, you don't need to repeat the entire experiment and re-train your model. PyCaret's built-in `save_model()` function allows you to save the model for later use." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model Succesfully Saved\n" ] } ], "source": [ "save_model(lda,'Final LDA Model 08Feb2020')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 12.0 Loading the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To load a saved model at a later date, in the same or a different environment, use PyCaret's `load_model()` function." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model Sucessfully Loaded\n" ] } ], "source": [ "saved_lda = load_model('Final LDA Model 08Feb2020')" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "LdaModel(num_terms=4596, num_topics=4, decay=0.5, chunksize=100)\n" ] } ], "source": [ "print(saved_lda)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 13.0 Wrap-up / Next Steps?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial has covered the entire workflow of a Natural Language Processing experiment. Our task today was to create and analyze a topic model. We performed several text pre-processing steps using `setup()`, created a topic model using `create_model()`, assigned topics to the dataset using `assign_model()`, and analyzed the results using `plot_model()`. All this was completed in fewer than 10 commands that are naturally constructed and very intuitive to remember. 
Re-creating the entire experiment without PyCaret would have taken well over 100 lines of code.\n", "\n", "In this tutorial, we have only covered the basics of `pycaret.nlp`. In the next tutorial, we will demonstrate the use of `tune_model()` to automatically select the number of topics for a topic model. We will also go deeper into a few concepts and techniques, such as `custom_stopwords`, to improve the results of a topic model. \n", "\n", "See you at the next tutorial. Follow the link to __[Natural Language Processing (NLP102) - Level Intermediate](https://github.com/pycaret/pycaret/blob/master/Tutorials/Natural%20Language%20Processing%20Tutorial%20Level%20Intermediate%20-%20NLP102.ipynb)__" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }