Day 32 of 50 Days of Python: Advanced Data Visualisation with Seaborn
Part of Week 5: Data Analysis and Visualisation
Hello and welcome to Day 32! Today, we'll explore advanced data visualisation techniques using the Seaborn package. Seaborn is built on top of Matplotlib, so you’ll need both installed. It simplifies the creation of complex visualisations and provides some aesthetically pleasing default styles. So with the jibber jabber done, let’s get into it.
Python Setup
As a sort of norm, I’ll once again let you know what you need to install for this to work:
pip install seaborn matplotlib pandas
Visualising Correlations with Heatmaps
We touched on heatmaps very briefly in Day 31, but they are great for visualising the correlation between variables in a dataset. They provide a colour-coded matrix that highlights relationships, which makes it easier to identify patterns.
Example: Correlation Matrix of the Iris Dataset
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Load the iris dataset
iris = sns.load_dataset("iris")
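# Compute the correlation matrix of the numeric columns
# (numeric_only=True skips the text "species" column, which pandas 2.0+ would otherwise reject)
corr = iris.corr(numeric_only=True)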
# Set the aesthetic style of the plots
sns.set_theme(style="white")
# Create a mask for the upper triangle
mask = np.triu(np.ones_like(corr, dtype=bool))
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))
# Generate a custom diverging colourmap
cmap = sns.diverging_palette(230, 20, as_cmap=True)
# Draw the heatmap with the mask and correct aspect ratio
# vmin/vmax span the full correlation range so strong correlations are not clipped
sns.heatmap(corr, mask=mask, cmap=cmap, vmin=-1, vmax=1, center=0,
            square=True, annot=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.title("Correlation Matrix of Iris Dataset")
plt.show()
It’s easy to just copy and paste some code, but it’s better to understand what it’s actually doing. So, here is a small breakdown so you can grasp the components of the above:
sns.load_dataset("iris"): Loads the Iris dataset.
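iris.corr(): Computes the pairwise correlation of the numeric columns. On pandas 2.0 and later, pass numeric_only=True so the text species column is skipped instead of raising an error.
sns.set_theme(style="white"): Applies Seaborn's "white" style, a plain white background with no grid lines.
np.triu(np.ones_like(corr, dtype=bool)): Creates a boolean mask covering the upper triangle and the diagonal. Masked cells are hidden, so each correlation is shown only once (the matrix is symmetric).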
sns.diverging_palette(230, 20, as_cmap=True): Creates a custom diverging colourmap.
sns.heatmap(...): Plots the heatmap with annotations and customisations.
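To see what the mask from the breakdown above actually contains, here is a minimal sketch using only NumPy. The 3x3 matrix is made up for illustration and simply stands in for a correlation matrix; it is not the Iris result.

```python
import numpy as np

# Made-up symmetric matrix standing in for a correlation matrix
corr = np.array([
    [1.0, 0.2, -0.5],
    [0.2, 1.0, 0.7],
    [-0.5, 0.7, 1.0],
])

# True marks the cells that sns.heatmap will hide:
# the upper triangle plus the diagonal
mask = np.triu(np.ones_like(corr, dtype=bool))
print(mask)

# Blank out the masked cells to preview what stays visible
visible = np.where(mask, np.nan, corr)
print(visible)
```

Only the three values below the diagonal survive, which is all you need because the matrix is symmetric.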
Customising Heatmaps
Seaborn allows for some pretty decent customisation to enhance the readability and aesthetics of your heatmaps.
Changing Colour Palettes
You can experiment with different colour palettes to suit your data and preferences:
# Using a different colour palette
cmap = sns.color_palette("coolwarm", as_cmap=True)
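As a rough sketch of how that palette plugs into a heatmap, the snippet below uses a small made-up DataFrame so it runs without downloading anything. The column names a, b and c and the figure size are placeholders for illustration, not part of the Iris example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data: b rises with a, c falls as a rises
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 4, 3, 2, 1],
})
corr = df.corr()

# Swap in the "coolwarm" palette as a continuous colourmap
cmap = sns.color_palette("coolwarm", as_cmap=True)

fig, ax = plt.subplots(figsize=(5, 4))
# Pin the colour scale to the full correlation range, centred on zero
sns.heatmap(corr, cmap=cmap, vmin=-1, vmax=1, center=0, square=True, ax=ax)
ax.set_title("Heatmap with the coolwarm palette")
```

Add plt.show() at the end to display the figure. Any other named palette that Seaborn accepts can be dropped in the same way.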
Adjusting Annotations
Annotations can be formatted to display values with specific precision:
sns.heatmap(corr, annot=True, fmt=".2f", cmap=cmap)
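Here is a small, self-contained sketch of what fmt does to the annotation text. The two-column DataFrame is made up for illustration (x and y are placeholder names), and reading the labels back from ax.texts is just a convenient way to inspect them.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data, chosen only to give a non-trivial correlation
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, 3, 2, 4]})
corr = df.corr()

fig, ax = plt.subplots(figsize=(4, 3))
# fmt=".2f" renders every annotation with two decimal places
sns.heatmap(corr, annot=True, fmt=".2f", vmin=-1, vmax=1, center=0, ax=ax)

# Each annotated cell becomes a text object on the axes
labels = [t.get_text() for t in ax.texts]
print(labels)
```

Changing fmt to ".1f" or ".3f" adjusts the precision the same way. Add plt.show() to display the figure.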
Next Up: Day 33 - Plotly for Interactive Visualisations
We’ll cover some interesting bits with Plotly, a great Python library that enables the creation of interactive and visually appealing data visualisations. So see you for the next one and as always… Happy coding!