Sentiment Analysis in Power BI

How to run natural language sentiment analysis on your text data with Power BI.

I am really starting to fall in love with Power BI now that I can use Python scripts to transform my data and take my visuals to a whole new level. We will use the NLTK Sentiment Intensity Analyzer, which iterates over each of our comments and returns a polarity score ranging from -1 to 1.

How to Use Sentiment Analysis in Power BI Using Python’s Natural Language Toolkit (NLTK)

This is a super easy script to write. Here are the steps we will take to use NLTK’s sentiment analyzer:

Load our data into Power BI.
Write 4 lines of Python script.
Create a conditional column with our polarity scores.

For the first part, I didn’t use a special dataset. I just wrote 6 lines of varying sentiment that our script will analyze. Let’s take a look at the data:

From the data you can see that some are negative, some are positive and some are neutral. Once this data is loaded into Power BI, we can initiate the Python script.
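The six comments themselves are not reproduced here, so to follow along you could generate a stand-in file. This is only a sketch: the sentences, the file name, and the write_sample_csv helper are hypothetical, and the only detail taken from the post is the Message column name that the script expects.

```python
import csv

# Hypothetical stand-in comments; these are NOT the rows used in the post.
SAMPLE_MESSAGES = [
    "I love this product, it works great!",
    "This was a terrible experience.",
    "The package arrived on Tuesday.",
    "Customer support was friendly and helpful.",
    "I am very disappointed with the quality.",
    "The manual is included in the box.",
]


def write_sample_csv(path="sample_comments.csv"):
    """Write the sample comments to a one-column CSV with a 'Message' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Message"])
        for message in SAMPLE_MESSAGES:
            writer.writerow([message])
    return path


if __name__ == "__main__":
    print("Wrote", write_sample_csv())
```

The resulting file can then be loaded into Power BI like any other text/CSV source.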

Access the Python Script

Once your script window is open, enter the following script.

# load in our dependencies
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# load in the sentiment analyzer
sia = SentimentIntensityAnalyzer()

# apply the analyzer over each comment and keep the compound score
dataset['polarity scores'] = dataset['Message'].apply(lambda x: sia.polarity_scores(x)['compound'])

If you look at the code, I have annotated it with comments, which start with the # character. Also, remember that Power BI’s default variable for the incoming data is called dataset. After the script runs, you will need to expand the new table to see the output of the sentiment analysis function.

Once you expand the table you will have the results of the polarity scores.
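Two things can trip the script up in practice: the VADER lexicon has to be downloaded once before SentimentIntensityAnalyzer can load it, and rows with a blank Message are likely to raise an error because the analyzer expects a string. A slightly more defensive variant might look like the sketch below. The add_polarity and vader_compound_scorer helpers are hypothetical names introduced for illustration, and returning NaN for blank rows is an assumption about how you want them handled.

```python
import pandas as pd


def add_polarity(dataset, score_fn, text_col="Message", out_col="polarity scores"):
    """Return a copy of dataset with a polarity column; blank rows get NaN."""
    out = dataset.copy()
    out[out_col] = [
        score_fn(text) if isinstance(text, str) and text.strip() else float("nan")
        for text in out[text_col]
    ]
    return out


def vader_compound_scorer():
    """Build a VADER scorer; NLTK is imported lazily and the lexicon fetched if needed."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time download, needs internet
    sia = SentimentIntensityAnalyzer()
    return lambda text: sia.polarity_scores(text)["compound"]


# Inside Power BI, `dataset` already exists, so the call would be:
# dataset = add_polarity(dataset, vader_compound_scorer())
```

Passing the scorer in as an argument also lets you try the function on a small DataFrame outside Power BI before pasting it into the script window.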

Remember that sentiment ranges from -1 to 1, with -1 being the most negative and 1 being the most positive. Now I am going to set up a conditional column with the words “positive”, “negative”, and “neutral”.
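The same labeling rule can also be written in Python if you would rather keep it in the script. The post does not state which cutoffs it uses, so the ±0.05 threshold below is an assumption (it is a commonly used convention for VADER compound scores), and label_sentiment is a hypothetical helper name.

```python
def label_sentiment(score, threshold=0.05):
    """Map a compound score in [-1, 1] to a label; the threshold is an assumed cutoff."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Example scores chosen for illustration only.
scores = [0.85, -0.6, 0.0, 0.03]
labels = [label_sentiment(s) for s in scores]
print(labels)  # ['positive', 'negative', 'neutral', 'neutral']
```

In the Power BI script this could be applied to the score column with .apply(label_sentiment), which gives the same result as a conditional column built in the query editor with matching rules.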

Visualize your Results

The last part of the adventure is to visualize and communicate your results using the Power BI visuals.

Gaelim Holland

5 Comments

Apoorv

6 years ago

Always gives an error since I am using Anaconda for running python scripts. It says –
"Missing required dependencies {0}".format(missing_dependencies))
ImportError: Missing required dependencies ['numpy']

ErrorCode=-2147467259
ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException

Author

Reply to Apoorv

Yeah, this is an issue with Power BI finding your Python folder. You need to change this option in Power BI. Make sure that, in the Python scripting options, it points to the right folder:
https://docs.microsoft.com/en-us/power-bi/desktop-python-scripts
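One way to check a candidate folder is to run a short diagnostic with the interpreter in that folder. This is a sketch rather than an official troubleshooting step: the REQUIRED list is an assumption based on what this script imports, and failed_imports is a hypothetical helper. With Anaconda in particular, an import can succeed in an activated environment and still fail when Power BI launches the interpreter directly, so a clean result here does not rule that case out.

```python
import importlib
import sys

# Assumed list: packages the sentiment script relies on.
REQUIRED = ("numpy", "pandas", "nltk")


def failed_imports(names=REQUIRED):
    """Return the names of packages that cannot be imported by this interpreter."""
    failed = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:  # ImportError, or a DLL load failure surfaced at import time
            failed.append(name)
    return failed


print("Interpreter:", sys.executable)
print("Failed imports:", failed_imports() or "none")
```

The folder printed as the interpreter location is the one to compare against the Python home directory set in Power BI.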

Dustin

Great post! Thank you. Heads up that you have a variable misspelled in your sample code “polairty score”

Ken

Is this the right path for my python to be used in powerbi? c:\Users\ma.o.p.tolentino\AppData\Local\Programs\Python\Python38-32

Ankur Goswami

Hi, I am new to the Python world. Your code is the first of its kind that I have seen so far. I tried using the polarity score on my data and copied the code as you said, but I got an error message.

Formula.Firewall: Query ‘Reason For Leaving Comments’ (step ‘Run Python script’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

Can you please guide me to fix this?