Create a Seaborn Scatterplot

A scatterplot is one of the best ways to visualize the relationship between two numerical variables. Seaborn offers a number of scatterplot options that help provide immediate insights. This tutorial will show you how to quickly create scatterplots and style them to fit your needs.

To create a scatterplot, you will first need to load the essential libraries and your data.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

These libraries are all you need to load and plot your data. In this case, we will be loading a dataset of diamond prices and features. You can find the dataset here.

data = pd.read_csv('diamonds.csv')

Create Basic Scatterplot

You can create a basic scatterplot with three basic parameters: x, y, and data. Your x and y will be column names, and data will be the dataset that you loaded earlier.

sns.scatterplot(x='carat', y='price', data=data)

Build a basic Seaborn scatterplot with x, y, and your dataset.

As you can see, there is a lot of data here, and the dots are packed too closely together to read clearly, so let's style the plot by changing the marker used to represent each individual diamond. To change the marker, you simply need to add the marker parameter to the code. In the plot below, I am using "+" as my marker with marker='+'.
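As a minimal sketch of the marker change described above, the snippet below uses a tiny stand-in DataFrame so that it runs on its own. The six rows are made-up placeholder values, not real diamonds data; in practice you would keep the data loaded from diamonds.csv.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows so the example is self-contained.
# In the tutorial, use: data = pd.read_csv('diamonds.csv')
data = pd.DataFrame({
    'carat': [0.23, 0.31, 0.70, 1.01, 1.50, 2.00],
    'price': [326, 335, 2757, 4500, 9000, 15000],
})

# Draw every point as a plus sign instead of the default filled circle
ax = sns.scatterplot(x='carat', y='price', marker='+', data=data)
```

Thin, unfilled markers such as '+' tend to overlap less than filled circles, which can make dense regions of the plot easier to read.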

 

You can change the marker in the Seaborn scatterplot with the marker parameter.

Change the Color of the Markers

The next step is to change the color of the markers to get a better understanding of what these closely packed markers mean. We can use the hue parameter to categorize the markers, and each category will get its own color. Naturally, to categorize the data, the column you pass to hue should hold either strings or a categorical variable. In this case, we can use the diamond cut quality to produce the different categories.
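Here is a small sketch of coloring by category with hue. The DataFrame is a made-up stand-in (the values and the selection of cut labels are placeholders for illustration); with the real dataset you would pass the data loaded from diamonds.csv instead.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows so the example is self-contained.
# In the tutorial, use: data = pd.read_csv('diamonds.csv')
data = pd.DataFrame({
    'carat': [0.23, 0.31, 0.70, 1.01, 1.50, 2.00],
    'price': [326, 335, 2757, 4500, 9000, 15000],
    'cut': ['Ideal', 'Premium', 'Good', 'Premium', 'Ideal', 'Fair'],
})

# Each distinct value in the 'cut' column gets its own color and legend entry
ax = sns.scatterplot(x='carat', y='price', hue='cut', data=data)
```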


Change the category colors of your data in the scatterplot with the hue parameter in Seaborn.

Change the Size of the Markers

You can easily change the size of the markers by adding the size parameter. You define it by naming the column of your data that should determine the size. In this example, I am going to use the carat to determine the size of the individual markers.
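The size mapping can be sketched the same way. Again, the rows below are hypothetical placeholder values used only to keep the snippet runnable without the CSV file.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows so the example is self-contained.
# In the tutorial, use: data = pd.read_csv('diamonds.csv')
data = pd.DataFrame({
    'carat': [0.23, 0.31, 0.70, 1.01, 1.50, 2.00],
    'price': [326, 335, 2757, 4500, 9000, 15000],
})

# Larger carat values are drawn with larger markers
ax = sns.scatterplot(x='carat', y='price', size='carat', data=data)

# Collect the distinct marker areas that were actually drawn
marker_sizes = set(ax.collections[0].get_sizes())
```

Because carat is numeric, Seaborn scales the marker area between a minimum and a maximum size and adds a size legend to the plot.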

Change the size of each marker with the size parameter in the seaborn scatterplot.

Putting it all Together

Let’s take a look at the final plot and the final code that you need to create the visual below.

#load in the libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# load your data

data = pd.read_csv('diamonds.csv')

#create your scatter plot

plt.title('Diamond Price and Carat Size')

sns.scatterplot(x='carat',y='price',marker='+', hue='cut', size='carat',data=data)

 

BONUS:

Marker Colors
Style the marker colors with the palette parameter. You can choose from any of the Matplotlib Color Palettes.
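A short sketch of the palette parameter follows. The choice of 'viridis' is only an example of a Matplotlib colormap name, and the DataFrame rows are made-up placeholders standing in for the diamonds data.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows so the example is self-contained.
# In the tutorial, use: data = pd.read_csv('diamonds.csv')
data = pd.DataFrame({
    'carat': [0.23, 0.31, 0.70, 1.01, 1.50, 2.00],
    'price': [326, 335, 2757, 4500, 9000, 15000],
    'cut': ['Ideal', 'Premium', 'Good', 'Premium', 'Ideal', 'Fair'],
})

# palette controls which colors the hue categories are drawn with;
# 'viridis' is just one example palette name
ax = sns.scatterplot(x='carat', y='price', hue='cut',
                     palette='viridis', data=data)
```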

Plot Background
Change the plot background using the plt.style.use() function. You can find a ton of different Matplotlib Style Templates.
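As a rough sketch of the background change, the snippet below switches to the 'ggplot' style before drawing. 'ggplot' is only one of Matplotlib's built-in style names, chosen here as an example, and the DataFrame rows are hypothetical placeholders for the diamonds data.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in rows so the example is self-contained.
# In the tutorial, use: data = pd.read_csv('diamonds.csv')
data = pd.DataFrame({
    'carat': [0.23, 0.31, 0.70, 1.01, 1.50, 2.00],
    'price': [326, 335, 2757, 4500, 9000, 15000],
})

# Apply a style template before creating the plot;
# 'ggplot' is just an example, see plt.style.available for the full list
plt.style.use('ggplot')

ax = sns.scatterplot(x='carat', y='price', data=data)
ax.set_title('Diamond Price and Carat Size')
```

The style applies to every plot created after the call, so it is usually set once near the top of a notebook.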

The final result will look like the plot below:

Use the palette parameter to style your Seaborn scatterplot.
