kulifmor.com

Crafting Data Visualizations for Medium Stories with Matplotlib

Chapter 1: Introduction to Data Visualization

In this guide, I will walk you through the process of creating the data visualization displayed above. This tutorial is designed to be fast-paced and is not overly detailed, which should suit your busy schedule. The visualization we will create is somewhat complex, featuring sub-plots and numerous reusable functions. While this may seem daunting for those less familiar with Matplotlib, don't worry—by the end, you’ll be able to replicate these visualizations without altering any code. Ready to dive in? Let’s begin!

Step 1: Import Required Libraries

To get started, you’ll need to import the following libraries:

    import requests
    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    from PIL import Image
    from matplotlib.transforms import blended_transform_factory

Step 2: Fetch the Data

I have sourced a dataset from one of my articles for this tutorial. You can access it using this code:

data = requests.get(

).json()

If you prefer to work with your own data, check out this tutorial on extracting and preparing story data from Medium.
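
The functions later in this tutorial index the downloaded JSON first by date and then by reader type. As a rough sketch of the shape they appear to expect (the dates and numbers below are made up for illustration, and the real payload may carry additional fields):

```python
# Hypothetical sample; the structure is inferred from how the later functions read it.
sample_data = {
    "2023-03-30": {
        "earning": 1250,  # divided by 100 later, so presumably stored in cents
        "member": {"readersThatViewed": 40, "readersThatRead": 22},
        "nonmember": {"readersThatViewed": 60, "readersThatRead": 18},
    },
    "2023-04-02": {
        # "earning" may be absent; list_earnings() falls back to 0 in that case
        "member": {"readersThatViewed": 10, "readersThatRead": 5},
        "nonmember": {"readersThatViewed": 15, "readersThatRead": 3},
    },
}

day = sample_data["2023-03-30"]
total_views = day["member"]["readersThatViewed"] + day["nonmember"]["readersThatViewed"]
earning_dollars = day.get("earning", 0) / 100
print(total_views, earning_dollars)
```

If your own export uses different keys, the data functions in Step 5 are the place to adapt.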

Step 3: Set a Seaborn Style

I always start by establishing a style with Seaborn to enhance the aesthetics of my charts. Feel free to customize the style as you see fit. Here are my specific settings for this visualization:

    font_family = "Work Sans"
    background_color = "#302C32"
    grid_color = "#F4CAB9"
    text_color = "#ffffff"
    edgecolor = "#01110A"

    sns.set_style({
        "axes.facecolor": background_color + "00",
        "figure.facecolor": background_color,
        "axes.edgecolor": text_color,
        "axes.grid": False,
        "axes.axisbelow": True,
        "grid.color": grid_color,
        "text.color": text_color,
        "font.family": font_family,
        "xtick.color": text_color,
        "ytick.color": text_color,
        "xtick.bottom": False,
        "xtick.top": False,
        "ytick.left": False,
        "ytick.right": False,
        "axes.spines.left": False,
        "axes.spines.bottom": False,
        "axes.spines.right": False,
        "axes.spines.top": False,
    })

My aim was to produce a dark and minimalist chart that is visually appealing.
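
The style appends "00" to the background color, and the grid lines in Step 6 append "22" to the text color. Matplotlib reads an eight-digit hex string as RGBA, so the last two digits set the opacity. A small check using matplotlib.colors.to_rgba, with variable names mirroring the settings above:

```python
from matplotlib.colors import to_rgba

background_color = "#302C32"
text_color = "#ffffff"

# "00" -> alpha 0: the axes background is fully transparent,
# so the figure's facecolor shows through.
axes_face = to_rgba(background_color + "00")

# "22" -> alpha 0x22 / 255: a faint white, suitable for grid lines.
grid_line = to_rgba(text_color + "22")

print(axes_face[3])            # alpha channel of the axes background
print(round(grid_line[3], 3))  # alpha channel of the grid lines
```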

Step 4: Define Helper Functions

I’ve created some utility functions for later use. The first converts a Matplotlib figure into a PIL image, and the second adds padding around an image; together they make it simpler to control margins and combine multiple charts into one visualization. The last function generates the list of dates used to visualize my story data:

    def create_image_from_figure(fig):
        # Render the figure, then copy its pixels into a PIL image
        fig.tight_layout()
        fig.canvas.draw()
        # buffer_rgba() is used instead of tostring_rgb(), which recent Matplotlib releases deprecate
        image = Image.fromarray(np.asarray(fig.canvas.buffer_rgba())).convert("RGB")
        plt.close(fig)
        return image

    def add_padding_to_chart(chart, left, top, right, bottom, background):
        # Paste the chart onto a larger canvas filled with the background color
        size = chart.size
        image = Image.new("RGB", (size[0] + left + right, size[1] + top + bottom), background)
        image.paste(chart, (left, top))
        return image

    def get_dates(story_data):
        # Every day from the first of the earliest month to the latest date
        start = pd.to_datetime(min(story_data.keys())).replace(day=1)
        end = pd.to_datetime(max(story_data.keys()))
        delta = end - start
        date_list = [(start + pd.Timedelta(days=i)).strftime('%Y-%m-%d') for i in range(delta.days + 1)]
        return date_list
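
To see what get_dates returns, here is a small check with made-up keys. The function is repeated so the snippet runs on its own. Note that taking min and max of the keys relies on ISO-formatted date strings sorting chronologically.

```python
import pandas as pd

def get_dates(story_data):
    # Same logic as above: start at the first day of the earliest month
    start = pd.to_datetime(min(story_data.keys())).replace(day=1)
    end = pd.to_datetime(max(story_data.keys()))
    delta = end - start
    return [(start + pd.Timedelta(days=i)).strftime("%Y-%m-%d") for i in range(delta.days + 1)]

# Hypothetical keys; only the dates matter for this function
dates = get_dates({"2023-03-14": {}, "2023-04-02": {}})
print(dates[0], dates[-1], len(dates))
```

The list starts on the first of March even though the earliest key is the 14th, so the first month is always drawn in full.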

While Matplotlib offers functionality for padding, I often find it tricky to navigate! 😅
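
A quick way to convince yourself that add_padding_to_chart behaves as expected is to pad a small placeholder image and inspect the result. The function is repeated so the snippet is self-contained; the image size, colors, and padding values are arbitrary.

```python
from PIL import Image

def add_padding_to_chart(chart, left, top, right, bottom, background):
    size = chart.size
    image = Image.new("RGB", (size[0] + left + right, size[1] + top + bottom), background)
    image.paste(chart, (left, top))
    return image

# Placeholder standing in for a rendered chart
chart = Image.new("RGB", (200, 100), "#DB5461")
padded = add_padding_to_chart(chart, left=20, top=10, right=20, bottom=30, background="#302C32")

print(padded.size)                # width and height after padding
print(padded.getpixel((0, 0)))    # a corner pixel, taken from the background color
print(padded.getpixel((20, 10)))  # the first pixel of the pasted chart
```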

Step 5: Create Data Functions

Next, I've constructed data functions to efficiently extract subsets of data for visualization. I plot one month at a time in sub-plots and have made these functions reusable.

    def list_earnings(dates, stats):
        # Daily earnings, scaled down by 100; days without data count as 0
        result = []
        for d in dates:
            if d not in stats:
                result.append(0)
            else:
                result.append(stats[d].get("earning", 0) / 100)
        return result

    def list_statistic(dates, stats, readers, statistic):
        # Daily value of one statistic, summed over the given reader types
        result = []
        for d in dates:
            if d not in stats:
                result.append(0)
            else:
                value = sum(stats[d][reader][statistic] for reader in readers)
                result.append(value)
        return result

    def list_total(dates, stats, field):
        return list_statistic(dates, stats, ["member", "nonmember"], field)

    def list_nonmember(dates, stats, field):
        return list_statistic(dates, stats, ["nonmember"], field)

    def list_member(dates, stats, field):
        return list_statistic(dates, stats, ["member"], field)

I prioritize writing readable code. Wrapper functions such as list_total and list_member state their intent at the call site more clearly than an inline sum() over reader types would.
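
To make the behavior concrete, the snippet below runs compact equivalents of these functions on a tiny, made-up stats dictionary. Dates that are missing from the stats simply yield 0.

```python
def list_earnings(dates, stats):
    return [stats[d].get("earning", 0) / 100 if d in stats else 0 for d in dates]

def list_statistic(dates, stats, readers, statistic):
    return [
        sum(stats[d][reader][statistic] for reader in readers) if d in stats else 0
        for d in dates
    ]

def list_total(dates, stats, field):
    return list_statistic(dates, stats, ["member", "nonmember"], field)

def list_member(dates, stats, field):
    return list_statistic(dates, stats, ["member"], field)

# Hypothetical data: one day with stats, one day without
stats = {
    "2023-04-01": {
        "earning": 250,
        "member": {"readersThatViewed": 4},
        "nonmember": {"readersThatViewed": 6},
    }
}
dates = ["2023-04-01", "2023-04-02"]

earnings = list_earnings(dates, stats)
total_views = list_total(dates, stats, "readersThatViewed")
member_views = list_member(dates, stats, "readersThatViewed")
print(earnings, total_views, member_views)
```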

Step 6: Develop Plotting Functions

This step is the most involved, but a little experimentation makes the ideas clear. Earnings get their own plotting function, plot_earnings(), because they are stored as a single value per day rather than split by reader type. The plot_grid_lines() function draws grid lines that extend across multiple subplots.

    def plot_earnings(ax, dates, stats):
        sns.barplot(
            ax=ax, x=dates, y=list_earnings(dates, stats),
            facecolor="#2EC4B6", edgecolor=edgecolor, saturation=1, width=1,
        )

    def plot_bars(ax, dates, stats, settings):
        sns.barplot(
            ax=ax, x=dates, y=settings["function"](dates, stats, settings["field"]),
            facecolor=settings["color"], edgecolor=edgecolor, saturation=1, width=1,
        )

    def write_month(ax, date):
        ax.annotate(
            pd.to_datetime(date).month_name(), (0.5, -0.04), ha="center", va="top", fontsize=32,
            annotation_clip=False, xycoords="axes fraction", fontweight=500
        )

    def plot_grid_lines(fig, line_start=0.081, is_earnings=False):
        transform = blended_transform_factory(fig.transFigure, fig.axes[0].transData)
        for y in fig.axes[0].get_yticks()[1:-1]:
            fig.axes[0].annotate(
                text="${:,}".format(int(y)) if is_earnings else "{:,}".format(int(y)),
                xy=(0.075, y), ha="right", va="center", fontsize=32, fontweight=500,
                annotation_clip=False, xycoords=("figure fraction", "data")
            )
            line = plt.Line2D([line_start, 1], [y, y], transform=transform, color=text_color + "22", zorder=-1)
            fig.lines.extend([line])
        line = plt.Line2D([line_start, 1], [0, 0], transform=transform, color=text_color, zorder=10)
        fig.lines.extend([line])
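
The key trick in plot_grid_lines is the blended transform: x values are read as figure fractions while y values are read in the first axes' data coordinates, which lets a single line span every subplot at a given tick. Here is a minimal sketch of that idea, using a headless backend and arbitrary figure dimensions. It attaches the line with fig.add_artist, an alternative to extending fig.lines directly.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.transforms import blended_transform_factory

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(4, 2), dpi=100)
axes[0].set_ylim(0, 10)

# x in figure fractions (0 = left edge, 1 = right edge), y in data units of axes[0]
transform = blended_transform_factory(fig.transFigure, axes[0].transData)

# One horizontal line at y = 5 spanning the whole figure width
line = Line2D([0, 1], [5, 5], transform=transform, color="#ffffff22", zorder=-1)
fig.add_artist(line)

# Where does (x = 0.5 figure fraction, y = 5 data units) land in pixels?
x_pixel, y_pixel = transform.transform((0.5, 5))
print(x_pixel, y_pixel)
```

Because the y coordinate follows the data transform, the line stays glued to its tick value even if the y-limits change later.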

Step 7: Create Chart Functions

In this step, I define functions that generate an individual chart for each metric I want to analyze. The earnings and views functions are shown here; the members and claps charts called in Step 8 follow a similar structure and vary only in their input.

    def create_earnings_chart(data, dates):
        title = "Total Earnings: ${:,}".format(int(round(sum(list_earnings(dates, data)))))
        earnings_chart = create_bar_chart(data, dates, [{"function": plot_earnings}], title, is_earnings=True)
        return earnings_chart

    def create_views_chart(data, dates):
        total_views = sum(list_total(dates, data, "readersThatViewed"))
        total_reads = sum(list_total(dates, data, "readersThatRead"))
        # Guard against division by zero when a story has no views yet
        read_share = 100 * total_reads / total_views if total_views else 0.0
        title = "Total Views: {:,} ( Read: {:.1f}% )".format(total_views, read_share)
        view_chart = create_bar_chart(data, dates, [
            {"function": list_total, "field": "readersThatViewed", "color": "#DB546144"},
            {"function": list_total, "field": "readersThatRead", "color": "#DB5461"}
        ], title)
        return view_chart
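
Both functions above call create_bar_chart, which is not shown in this article. The sketch below is not that function. It is a hypothetical, stripped-down illustration of the one-subplot-per-month layout described in Step 5, using plain Matplotlib bars instead of Seaborn. The names split_by_month and monthly_subplots, the figure size, and the sample values are all invented for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from itertools import groupby

def split_by_month(dates):
    """Group sorted 'YYYY-MM-DD' strings into one list per calendar month."""
    return [list(group) for _, group in groupby(dates, key=lambda d: d[:7])]

def monthly_subplots(dates, values, color="#DB5461"):
    """Draw one bar subplot per month, all sharing a single y-axis."""
    months = split_by_month(dates)
    lookup = dict(zip(dates, values))
    fig, axes = plt.subplots(
        1, len(months), sharey=True, squeeze=False, figsize=(12, 3),
        # Give each month horizontal space proportional to its number of days
        gridspec_kw={"width_ratios": [len(m) for m in months], "wspace": 0.05},
    )
    for ax, month in zip(axes[0], months):
        ax.bar(range(len(month)), [lookup[d] for d in month], color=color, width=1)
        ax.set_xticks([])
        ax.set_xlabel(month[0][:7])  # year and month, taken from the first date
    return fig

dates = ["2023-03-30", "2023-03-31", "2023-04-01", "2023-04-02", "2023-04-03"]
values = [3, 5, 2, 8, 1]
fig = monthly_subplots(dates, values)
print(len(fig.axes))  # one axes per month
```

Sharing the y-axis is what allows a single set of grid lines and tick labels, like those drawn by plot_grid_lines, to serve every month at once.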

Step 8: Assemble the Final Visualization

Now, let’s put everything together. Here’s the code to download the data, generate the list of dates, create metrics for each chart, and merge them into a single final visualization:

    def create_chart(charts):
        chart = Image.new('RGB', (
            charts[0].size[0], sum(c.size[1] for c in charts)
        ))
        y_offset = 0
        for c in charts:
            chart.paste(c, (0, y_offset))
            y_offset += c.size[1]
        return chart

    data = requests.get(
    ).json()

    dates = get_dates(data)

    charts = []
    title_text = "Spaces vs. Tabs: Impact on Salaries"
    charts.append(create_title(title_text))
    charts.append(create_earnings_chart(data, dates))
    charts.append(create_views_chart(data, dates))
    charts.append(create_members_chart(data, dates))
    charts.append(create_claps_chart(data, dates))

    chart = create_chart(charts)

The resulting chart mirrors the one showcased at the beginning of this tutorial.
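
The stacking step can be tried in isolation with placeholder images. The sketch below repeats create_chart and feeds it two solid-color images of different heights; the sizes, colors, and output filename are arbitrary. Note that the canvas takes its width from the first image, so this approach assumes every chart was rendered at the same width.

```python
from PIL import Image

def create_chart(charts):
    # Canvas as wide as the first chart and as tall as all charts combined
    chart = Image.new("RGB", (charts[0].size[0], sum(c.size[1] for c in charts)))
    y_offset = 0
    for c in charts:
        chart.paste(c, (0, y_offset))
        y_offset += c.size[1]
    return chart

# Placeholders standing in for the title and one rendered chart
title = Image.new("RGB", (300, 50), "#302C32")
body = Image.new("RGB", (300, 120), "#2EC4B6")
combined = create_chart([title, body])

print(combined.size)
# combined.save("story-stats.png")  # hypothetical filename; uncomment to write the image
```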

Data Visualization Example

Conclusion

This tutorial provided a quick overview of how to create visualizations for your Medium stories. By using the code shared here, you can gain insights into how your articles perform over time. I encourage you to apply these concepts to your own data and enhance the code to suit your needs. I hope you found this guide useful, and I look forward to seeing you next time! 😄
