13: interactivity

Inter-topic Distance Map (Multidimensional Scaling)

The visualization plots each topic as a circle in a two-dimensional space. The distance between two circles approximates how different the topics' word distributions are, and the area of each circle reflects how prevalent the topic is in the corpus. Circles that sit close together or overlap point to related themes, which helps in gauging the diversity of the content or in segmenting the data for more focused analysis.
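To make the notion of "distance between topics" concrete, here is a minimal sketch of the Jensen-Shannon divergence between topic-word distributions, which is the default inter-topic distance in pyLDAvis before the distances are projected to two dimensions. The helper names and the toy distributions are hypothetical and are not taken from the fitted model; the projection step itself is not shown.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    # Mixture distribution halfway between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def topic_distance_matrix(topic_term):
    """Pairwise divergences between the rows of a topic-term matrix."""
    n = len(topic_term)
    return [[js_divergence(topic_term[i], topic_term[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical topic-word distributions over a 4-word vocabulary.
toy_topics = [
    [0.7, 0.2, 0.1, 0.0],   # topic A
    [0.6, 0.3, 0.1, 0.0],   # topic B, similar to A
    [0.0, 0.0, 0.5, 0.5],   # topic C, mostly different words
]

distances = topic_distance_matrix(toy_topics)
for row in distances:
    print([round(d, 3) for d in row])
```

Topics A and B share most of their probability mass, so their divergence is small and they would be drawn close together; topic C uses different words and would be placed farther away.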

Advanced ML Analytics

Topic Modeling

import pandas as pd
import pyLDAvis
import pyLDAvis.gensim_models as gensimvis
from gensim import corpora, models
from gensim.parsing.preprocessing import preprocess_string

# Load the dataset
data = pd.read_csv('main.csv')

# Preprocess the text
def preprocess(text):
    return preprocess_string(text)

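# Apply preprocessing; missing values become empty strings so preprocess_string never receives NaN
processed_docs = data['cleaned'].fillna('').astype(str).map(preprocess)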

# Create a dictionary and corpus needed for Topic Modeling
dictionary = corpora.Dictionary(processed_docs)
corpus = [dictionary.doc2bow(doc) for doc in processed_docs]

# LDA model
lda_model = models.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15, random_state=42)

# Display the topics
topics = lda_model.print_topics(num_words=5)
for topic in topics:
    print(topic)

# Visualize the topics inline in the notebook
pyLDAvis.enable_notebook()
# sort_topics=False keeps the topics in gensim's order (pyLDAvis numbers them from 1, gensim from 0)
lda_display = gensimvis.prepare(lda_model, corpus, dictionary, sort_topics=False,
                                plot_opts={'width': 800, 'height': 600})  # Adjust width and height as needed
pyLDAvis.display(lda_display)

(0, '0.022*"leav" + 0.021*"forc" + 0.020*"william" + 0.020*"wife" + 0.020*"appeal"')
(1, '0.032*"joe" + 0.031*"giudic" + 0.029*"cheat" + 0.029*"christian" + 0.017*"pollster"')
(2, '0.150*"thehil" + 0.060*"trump" + 0.017*"report" + 0.014*"dem" + 0.012*"gop"')
(3, '0.027*"win" + 0.021*"return" + 0.020*"season" + 0.019*"past" + 0.018*"bring"')
(4, '0.045*"new" + 0.034*"deni" + 0.031*"chef" + 0.030*"kitchen" + 0.026*"star"')
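One way to use topics like these for segmenting the data is to label each document with its most probable topic. The sketch below assumes per-document input in the shape that gensim's `LdaModel.get_document_topics` returns, a list of `(topic_id, probability)` pairs; the `dominant_topic` helper and the sample values are hypothetical.

```python
def dominant_topic(topic_probs):
    """Return the (topic_id, probability) pair with the highest probability.

    Returns None when the list is empty, e.g. for a document whose topics
    all fell below the model's minimum probability.
    """
    if not topic_probs:
        return None
    return max(topic_probs, key=lambda pair: pair[1])


# Hypothetical per-document topic distributions, not output from the model above.
doc_topics = [
    [(0, 0.10), (2, 0.85), (4, 0.05)],
    [(1, 0.55), (3, 0.45)],
    [],
]

labels = [dominant_topic(probs) for probs in doc_topics]
print(labels)
```

With the model fitted above, the same helper could presumably be applied as `[dominant_topic(lda_model.get_document_topics(bow)) for bow in corpus]` and the resulting topic ids stored as a new column for filtering.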

Sunburst Chart of Sentiment and Emotion

This sunburst chart provides a hierarchical view of the sentiment and emotion labels in the dataset. The inner ring shows the high-level sentiment (such as positive, neutral, negative), and the outer ring splits each sentiment into the specific emotions found within it (joy, anger, fear, etc.). The size of each wedge reflects how many rows carry that label, so the chart gives a quick read on the overall emotional tone of the content and how much it varies, which is useful for sentiment analysis or consumer perception studies.
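The hierarchy behind such a chart can be sketched as a simple count per path. The example below uses hypothetical (sentiment, emotion) rows rather than the real dataset, and assumes each wedge is sized by row count, which is what Plotly Express does when `path` is given without a `values` column.

```python
from collections import Counter

# Hypothetical (sentiment, emotion) labels, one pair per row of a dataset.
rows = [
    ("positive", "joy"),
    ("positive", "joy"),
    ("positive", "surprise"),
    ("negative", "anger"),
    ("negative", "fear"),
    ("neutral", "neutral"),
]

# Outer ring: one wedge per (sentiment, emotion) path.
outer_ring = Counter(rows)

# Inner ring: one wedge per sentiment, equal to the sum of its child wedges.
inner_ring = Counter(sentiment for sentiment, _ in rows)

for sentiment, total in inner_ring.items():
    children = {emotion: n for (s, emotion), n in outer_ring.items() if s == sentiment}
    print(sentiment, total, children)
```

Each inner wedge equals the sum of its outer children, which is why the outer ring always lines up under its parent sentiment.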

Emotion Recognition and Sentiment Analysis

import pandas as pd
import plotly.express as px

# Load the dataset
data = pd.read_csv('main.csv')

# Create a sunburst chart with sentiment (inner ring) and emotion (outer ring) using Plotly
fig_sunburst = px.sunburst(data, path=['sentiment', 'emotion'],
                           title='Sunburst Chart of Sentiment and Emotion')
fig_sunburst.show()