Seaborn BoxPlot in Python

Ways to plot the Seaborn BoxPlot in Python

This tutorial covers seaborn.boxplot(), a Python function for visualizing how numerical data is distributed across categories. It moves from a basic plot to common customization options and ends with the figure-level seaborn.catplot() alternative.

What is a BoxPlot?

A boxplot, also known as a box-and-whisker plot, is a graphical representation of the distribution of numerical data based on five key summary statistics:

  • Minimum: The smallest data point excluding outliers.
  • First Quartile (Q1): 25th percentile of the data.
  • Median (Q2): 50th percentile of the data.
  • Third Quartile (Q3): 75th percentile of the data.
  • Maximum: The largest data point excluding outliers.

Boxplots display the spread and skewness of data, flag potential outliers (by default, points lying more than 1.5 times the interquartile range beyond Q1 or Q3), and make comparisons between groups straightforward.
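To make these statistics concrete, here is a minimal sketch that computes them with NumPy. The sample values are made up for illustration, and the 1.5 × IQR rule is the common default convention for whiskers; other conventions and quartile interpolation methods exist and can give slightly different numbers.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
data = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 30])

# Quartiles, using NumPy's default linear interpolation
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points inside the fences
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Points outside the fences are drawn individually as outliers
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("Q1, median, Q3:", q1, median, q3)
print("Whiskers:", whisker_low, whisker_high)
print("Outliers:", outliers)
```

In this sample, the value 30 falls above the upper fence, so the upper whisker stops at 10 and 30 would be drawn as a separate point.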

Use the seaborn.boxplot() function for creating a Seaborn BoxPlot in Python

Seaborn, a Python data visualization library built on Matplotlib, provides a convenient way to create boxplots using the seaborn.boxplot() function.

Let’s start with a simple example:

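import random
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Generate 30 distinct random integers between 0 and 49
n = random.sample(range(0, 50), 30)
arr = np.array(n)

# Create a boxplot; passing x explicitly avoids relying on
# positional arguments, whose meaning differs between seaborn versions
sns.boxplot(x=arr)
plt.show()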

This code snippet generates a basic boxplot for a single distribution.

To see the individual observations behind the summary, we can overlay a strip plot on top of the boxplot using seaborn.stripplot().

See the example below.

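import random
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Generate 30 distinct random integers between 0 and 49
n = random.sample(range(0, 50), 30)
arr = np.array(n)

# Create a boxplot
sns.boxplot(x=arr)

# Overlay the individual data points on the same axes
sns.stripplot(x=arr, color='red')
plt.show()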

This addition provides a clearer picture of individual data points within the distribution.

One of the most useful applications of boxplots is comparing distributions across categories. When seaborn.boxplot() is given a DataFrame together with x and y column names, it draws one box per category:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Create a sample DataFrame
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                     "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                     "Day" : [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                     "Product": ['A','A','A','A','B','B','B','B',
                                  'A','A','A','A','B','B','B','B']})

# Draw one box of Price values per Quantity value
# (the x variable is treated as categorical)
sns.boxplot(data=df, y="Price", x="Quantity")
plt.show()
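The same pattern works with a text column as the grouping variable. The sketch below groups Price by Product and prints the per-group medians with pandas so the plot can be checked against the numbers. The matplotlib.use("Agg") line is an assumption made so the example also runs without a display; remove it and call plt.show() in an interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same sample DataFrame as in the tutorial
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                   "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                   "Day": [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                   "Product": ['A','A','A','A','B','B','B','B',
                               'A','A','A','A','B','B','B','B']})

# One box per product; order fixes the left-to-right order of the boxes
ax = sns.boxplot(data=df, x="Product", y="Price", order=["A", "B"])

# The line inside each box should match these per-group medians
medians = df.groupby("Product")["Price"].median()
print(medians)

plt.close("all")
```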

Seaborn offers a wide range of customization options for boxplots:

  • linewidth: Controls the thickness of the boxplot lines.
  • palette: Modifies the colors of the boxes.
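  • orient: Sets the orientation of the plot ("v" for vertical or "h" for horizontal); mainly needed when both variables are numeric.
  • Axis limits: boxplot() has no ylim or xlim parameters; set limits on the returned Matplotlib Axes with set_ylim() and set_xlim(), or with plt.ylim() and plt.xlim().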

Here’s an example incorporating some of these parameters:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Create a sample DataFrame
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                     "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                     "Day" : [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                     "Product": ['A','A','A','A','B','B','B','B',
                                  'A','A','A','A','B','B','B','B']})

# Create a customized boxplot
sns.boxplot(data=df, y="Price", x="Quantity", hue='Product',
            linewidth=2.5, palette='Set2')
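The orientation and axis-limit options can be sketched the same way. This example draws the boxes horizontally and sets the x-axis range on the Axes object that boxplot() returns. The limits 5 and 25 are arbitrary values picked for this data, and the "Agg" backend line is again only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same sample DataFrame as in the tutorial
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                   "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                   "Day": [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                   "Product": ['A','A','A','A','B','B','B','B',
                               'A','A','A','A','B','B','B','B']})

# Both columns are numeric, so orient="h" tells seaborn that the
# y variable (Quantity) is the grouping variable
ax = sns.boxplot(data=df, x="Price", y="Quantity", hue="Product",
                 orient="h", linewidth=2.5, palette="Set2")

# Axis limits are set on the Matplotlib Axes, not through boxplot()
ax.set_xlim(5, 25)

plt.close("all")
```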

Use the seaborn.catplot() function for creating a Seaborn BoxPlot in Python

The seaborn.catplot() function is a figure-level interface for categorical plots, including boxplots. It returns a FacetGrid rather than a single Matplotlib Axes, so it can also split the data into subplots. To create a boxplot with catplot(), set the kind parameter to "box".

See the code below.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Create a sample DataFrame
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                     "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                     "Day" : [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                     "Product": ['A','A','A','A','B','B','B','B',
                                  'A','A','A','A','B','B','B','B']})

# Create a grouped boxplot through the figure-level interface
sns.catplot(data=df, y="Price", x="Quantity", hue='Product', kind="box")
plt.show()
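Because catplot() works at the figure level, it can also draw one subplot per category. The sketch below is one possible use, not the only one: it facets the same data by Day using the col parameter and sets shared axis limits through the returned FacetGrid. The height, aspect, and limit values are arbitrary choices for illustration, and the "Agg" backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same sample DataFrame as in the tutorial
df = pd.DataFrame({"Quantity": [5,6,7,8,5,6,7,8,5,6,7,8,5,6,7,8],
                   "Price": [9,10,15,16,13,14,15,18,11,12,14,15,16,17,18,19],
                   "Day": [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                   "Product": ['A','A','A','A','B','B','B','B',
                               'A','A','A','A','B','B','B','B']})

# One subplot per Day, each with one box per Product
g = sns.catplot(data=df, x="Product", y="Price", col="Day",
                kind="box", height=4, aspect=0.8)

# FacetGrid.set() applies Axes properties to every subplot
g.set(ylim=(5, 25))

n_rows, n_cols = g.axes.shape
print(n_rows, n_cols)

plt.close("all")
```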

Conclusion

Seaborn’s boxplot() function is a versatile tool for visualizing and comparing data distributions. By understanding its various parameters and customization options, you can generate informative and visually appealing boxplots that reveal valuable insights from your data.
