Friday, February 28, 2020

Bad Charting Part 2: Use the Right Chart Type

This is Part 2 in a series on avoiding the pitfalls of bad charting. Today we'll discuss the importance of using the right chart type for what you want to show, and what happens when you get that wrong.

Wrong Chart Type Examples

Before we get to how to choose the right chart type, let's see some examples where using the wrong chart type killed the chart's effectiveness. These examples come from the European Environment Agency's Chart Do's and Don'ts site.

Households by Type

Below we see a stacked column chart depicting households by type. Each column totals 100% but is subdivided to break down percentage of various household types. Before going further, do any particular trends jump out at you from this chart?

The Stacked Column Chart is the wrong choice for this data
source: Chart Do's and Don'ts

Now consider the same data below plotted in a line chart. There's a dramatic difference, and now it's easy to see a nosedive in the Married Couples with Children category. In the earlier chart, this was all but unnoticeable. As this chartmaker did, you might make this the chief message of your chart and highlight it.


A Line Chart is a better choice for this data

Specialization in Nordic Labor Markets

In this example, a bubble chart was plotted over a map to show specialization in high-tech manufacturing and/or R&D in Nordic labor markets in 2005 (that's a mouthful). While the map might be useful to someone very familiar with the local geography, for most readers it obscures rather than helps. It's not the simplest way to show the winners and losers.

The Map hinders rather than helps this chart

The simple bar chart below is a lot easier to digest.


A Bar Chart is a better choice for this purpose


These examples show how important it is to use an appropriate chart type.

What is it You Want to Show?

Choosing the right chart type depends on your objective. Why do you want to visualize this data? Do you want to compare something? Show what something is composed of? Show distribution? Show a relationship? Answering this first question will reduce your chart choices from many to a handful.

Once you know your objective, you'll still have several chart types available. You can settle on the right chart type by also considering the number of variables you need to show, whether there are few or many data points, and whether you'll be showing values over time.
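The two-step narrowing described above (objective first, then the shape of the data) can be sketched as a small lookup function. This is only a rough illustration: the function name `suggest_chart`, its parameters, and the "more than 7 categories counts as many" cutoff are assumptions made for this sketch, not rules taken from the Chart Chooser.

```python
def suggest_chart(objective, over_time=False, num_categories=0,
                  num_series=1, num_variables=2):
    """Suggest a chart type from the objective and the shape of the data.

    The mapping follows the chart types discussed in this post. The
    parameter names and thresholds are illustrative assumptions.
    """
    if objective == "comparison":
        if over_time:
            return "line chart"
        if num_series > 1:
            return "grouped column chart"
        # Assumed cutoff: more than 7 categories counts as "many".
        return "bar chart" if num_categories > 7 else "column chart"
    if objective == "composition":
        if over_time:
            return "area chart"
        if num_series > 1:
            # Several wholes, each broken into parts.
            return "stacked column chart"
        return "pie chart"
    if objective == "relationship":
        return "bubble chart" if num_variables >= 3 else "scatter plot"
    if objective == "distribution":
        return "histogram" if num_variables == 1 else "scatter plot"
    raise ValueError(f"unknown objective: {objective!r}")


print(suggest_chart("comparison", num_categories=4))    # column chart
print(suggest_chart("composition", over_time=True))     # area chart
print(suggest_chart("relationship", num_variables=3))   # bubble chart
```

A real chooser would also weigh what your audience is comfortable reading, which no lookup table captures.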

A guide such as the Chart Chooser diagram by Dr. Andrew Abela can be helpful. Keep in mind that this is not a definitive list. New chart types arise from time to time and some chart types wax and wane in popularity over time. For example, right now the pie chart has fallen into disfavor in some circles. You might want to prune your options to the chart types your organization is comfortable with.

Chart Chooser

Let's look at these 4 big categories of data visualization and see what makes them tick.

Comparison

In a comparison you have multiple items to compare or multiple datasets to compare. If you need to show exact values, consider a table in place of a chart.

When you have just a few categories, a column chart is a good choice, for example to show responses to a survey question or to compare sales for the last two years.



Column Chart

When you have many categories or long names, a horizontal bar chart will work better.

Bar Chart

When you have multiple data series per category, a grouped column chart (also known as a grouped bar chart or a clustered bar graph) may work best. For example, imagine you want to break out responses to a survey question not only by the response given but also by age range.

Grouped Column Chart

To show trends over time for continuous data, use a line chart. For example, a line chart can illustrate how various categories of expenses are trending over time.

Line Chart

Composition

In composition you are revealing what makes up a data set.

To show the composition of something with simple proportions, a pie chart is ideal. Pie charts are one of the most widely understood chart types. Nevertheless, some people are currently pushing back against pie charts, so you should carefully consider whether they are a good choice for your audience. You can help the pie chart's reputation by using it properly, that is, to show percentages that add up to 100%.

Pie Chart

A donut chart is nothing more than a pie chart with a hole in the center. Any place you can use a pie chart a donut chart would be equally appropriate. One thing a donut chart can do that a pie chart can't do is show multiple data series by arranging multiple concentric rings around each other.


Donut Chart

When you want to show trends over time combined with part-to-whole composition, you can use an area chart. Area charts have some similarities to line charts but use filled areas below the line. The categories are stacked upon each other instead of all being plotted from the baseline, so the individual trends are harder to make out. Here's an area chart showing subscription sales over the course of a year, with substrata for students, adults, businesses, and non-profits. We might gain some insights into how school schedules or holidays affect sales.

Area Chart

To show the composition of data across different categories, consider a stacked column chart. In the chart below, the sales revenue of various product categories is decomposed by season.

Stacked Column Chart

When you want to focus on the composition of the data, you can use a stacked column chart and show percentage in the Y-axis. This is sometimes called a Stacked Percent Chart and all columns will be the same height (100%). In the chart below, reusing the same data from the earlier example, apparel revenue percentage is shown, again broken down by season.

Stacked Percent Chart
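The only difference between the stacked column chart and the stacked percent chart is a normalization step: each segment is divided by its column's total. Here is a minimal sketch of that arithmetic. The function name `to_percent_stack` and the revenue figures are made up for illustration and are not the data behind the charts above.

```python
def to_percent_stack(columns):
    """Convert each column's raw values into percentages that sum to 100."""
    result = {}
    for label, parts in columns.items():
        total = sum(parts.values())
        if total == 0:
            raise ValueError(f"column {label!r} has no data to normalize")
        result[label] = {name: 100 * value / total
                         for name, value in parts.items()}
    return result


# Made-up revenue figures (one column per product category,
# one segment per season), for illustration only.
revenue = {
    "Apparel": {"Spring": 30, "Summer": 50, "Fall": 10, "Winter": 10},
    "Footwear": {"Spring": 20, "Summer": 20, "Fall": 40, "Winter": 120},
}

percent = to_percent_stack(revenue)
for category, seasons in percent.items():
    print(category, seasons)
```

Note what the normalization throws away: in this made-up data Footwear earns twice as much as Apparel overall, but both columns end up the same height. That is why this form suits questions about mix rather than magnitude.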

Relationships

When you want to show how one variable is related to one or more other variables, consider a scatter plot chart or a bubble chart. Both can be useful for showing suspected connections between the data.

A scatter plot chart is used to analyze the relationship between two variables (one on each axis). The pattern of intersecting points can highlight a possible relationship. The chart below suggests a relationship between the amount of sugar people eat and their likelihood of tooth decay.

Scatter Plot Chart 
source:  FactTank

A bubble chart is much like a scatter plot chart except it can show a third variable, in the size of each bubble (its area, not its diameter). The bubble chart below shows car sales, with price on the Y-axis, units sold on the X-axis, and revenue in the size of the bubble. We can tell type B generates the most revenue both by the size of the bubble and its placement.

Bubble Chart
source: Infogram.com
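The "area, not diameter" point matters if you ever compute bubble sizes yourself. Because a circle's area grows with the square of its radius, the radius has to scale with the square root of the value. The sketch below shows that calculation. The function name `bubble_radius` and the default `max_radius` of 40 are assumptions for this example, not part of any charting library.

```python
import math


def bubble_radius(value, max_value, max_radius=40.0):
    """Return a radius such that the bubble's area is proportional to value.

    The largest value gets max_radius; every other radius is scaled by the
    square root of its share of max_value.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# A value one quarter as large gets half the radius, so a quarter of the area.
print(bubble_radius(100, 100))  # 40.0
print(bubble_radius(25, 100))   # 20.0
```

Scaling the radius directly by the value would instead make the smaller bubble one sixteenth of the area, exaggerating the difference.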

While the above charts are well-suited for relationship visualization, it's also possible to show relationships using more common chart types like line, column, and bar charts.

Distribution

Distribution charts show how the values of a variable are spread out, which can help identify patterns and outliers.

A scatter plot can show the distribution of two variables. The scatter plot chart below shows Old Faithful geyser eruptions. It shows there are short-wait eruptions and long-wait eruptions.

Scatter Plot showing Distribution
source: Wikipedia

A histogram is another way to show distribution of continuous data. Histograms superficially resemble column charts, but the columns have no spacing and indicate frequency, not specific values. The X-axis shows interval ranges and the Y-axis shows the number of times values occurred in each interval. The histogram below shows the majority of customers wait between 35 and 50 seconds.


Histogram showing Customer Wait Time
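To make the difference from a column chart concrete, here is a rough sketch of the counting step behind a histogram: each value is assigned to an interval, and the column height is the number of values in that interval. The function name `histogram_counts`, the 15-second interval width, and the wait times are all invented for this example and are not the data behind the chart above.

```python
def histogram_counts(values, start, width, bins):
    """Count how many values fall into each equal-width interval.

    Interval i covers [start + i * width, start + (i + 1) * width).
    Values outside the covered range are ignored.
    """
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return counts


# Invented customer wait times in seconds, for illustration only.
wait_seconds = [12, 37, 41, 44, 36, 49, 52, 38, 63, 45]
counts = histogram_counts(wait_seconds, start=0, width=15, bins=5)

for i, count in enumerate(counts):
    low, high = i * 15, (i + 1) * 15
    print(f"{low:>2}-{high:<2} s: {'#' * count}")
```

Changing the interval width changes the shape of the histogram, so it is worth trying a few widths before settling on one.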

Progress Toward Goals

The bullet graph, a 21st century chart, is a nice compact way to show progress toward a goal. These can be vertical or horizontal, and combine the idea of a thermometer chart with red/yellow/green ranges.



Many Variables

Radar Charts, also called Spider Charts, are useful when you have many variables. They let you plot a dozen or more variables and derive a shape from them. Comparing shapes then shows similarity.

In the chart below, attributes of the Overwatch League are plotted (blue shape). On top of that, attributes of the Twitch community are plotted (orange shape). The chart shows similarities in the two shapes, which helps make the case that Twitch is a good streaming home for the Overwatch League.

Radar Chart

source:  the eSports Group


Years back when I worked at Microsoft, we would use radar charts in developer usability studies. We would chart shapes for the developers we tested and compare them to what our APIs provided.

Summary

There are many kinds of charts because there are many purposes for visualizing data. We've reviewed some examples of how the wrong chart type obscures rather than clarifies. We learned how to choose the right chart type for your objective and reviewed some of the more common chart types.

In Part 3, we'll look at how Axis Abuse can make your charts misleading.
