Friday, February 28, 2020

Bad Charting Part 2: Use the Right Chart Type

This is Part 2 in a series on avoiding the pitfalls of bad charting. Today we'll discuss the importance of using the right chart type for what you want to show, and what happens when you get that wrong.

Wrong Chart Type Examples

Before we get to how to choose the right chart type, let's see some examples where using the wrong chart type killed the chart's effectiveness. These examples come from the European Environment Agency's Chart Do's and Don'ts site.

Households by Type

Below we see a stacked column chart depicting households by type. Each column totals 100% but is subdivided to break down percentage of various household types. Before going further, do any particular trends jump out at you from this chart?

The Stacked Column Chart is the wrong choice for this data
source: Chart Do's and Don'ts

Now consider the same data below plotted in a line chart. There's a dramatic difference, and now it's easy to see a nosedive in the Married Couples with Children category. In the earlier chart, this was all but unnoticeable. As this chartmaker did, you might make this the chief message of your chart and highlight it.


A Line Chart is a better choice for this data

Specialization in Nordic Labor Markets

In this example, a bubble chart was plotted over a map to show specialization in high-tech manufacturing and/or R&D in Nordic labor markets in 2005 (that's a mouthful). While the use of the map might be useful to someone very familiar with the local geography, the map obscures rather than helps. It's not the simplest way to show the winners and losers.

The Map hinders rather than helps this chart

The simple bar chart below is a lot easier to digest.


A Bar Chart is a better choice for this purpose


These examples show how important it is to use an appropriate chart type.

What is it You Want to Show?

Choosing the right chart type depends on your objective. Why do you want to visualize this data? Do you want to compare something? Show what something is composed of? Show distribution? Show a relationship? Answering this first question will reduce your chart choices from many to a handful.

Once you know your objective, you'll still have several chart types available. You can settle on the right chart type by also considering the number of variables you need to show, whether there are few or many data points, and whether you'll be showing values over time.
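
To make that narrowing-down process concrete, here is a rough sketch in Python of the rules of thumb covered in the rest of this post. The function name suggest_chart and its parameters are made up for illustration, and the rules are a simplification of this article's guidance, not a definitive chart chooser.

```python
def suggest_chart(objective, *, over_time=False, many_categories=False,
                  multiple_series=False, across_categories=False, variables=2):
    """Suggest a chart type from an objective plus a few facts about the data.

    Hypothetical helper: it encodes only the rules of thumb from this post.
    """
    if objective == "comparison":
        if over_time:
            return "line chart"
        if multiple_series:
            return "grouped column chart"
        # Many categories or long names read better horizontally.
        return "bar chart" if many_categories else "column chart"
    if objective == "composition":
        if over_time:
            return "area chart"
        if across_categories:
            return "stacked column chart"
        return "pie chart"
    if objective == "relationship":
        # A bubble chart adds a third variable as bubble size.
        return "bubble chart" if variables >= 3 else "scatter plot"
    if objective == "distribution":
        return "scatter plot" if variables >= 2 else "histogram"
    raise ValueError(f"unknown objective: {objective!r}")


print(suggest_chart("comparison", many_categories=True))
print(suggest_chart("composition", over_time=True))
```

The first question (the objective) does most of the work. The remaining flags only pick among the handful of chart types left.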

A guide such as the Chart Chooser diagram by Dr. Andrew Abela can be helpful. Keep in mind that this is not a definitive list. New chart types arise from time to time and some chart types wax and wane in popularity over time. For example, right now the pie chart has fallen into disfavor in some circles. You might want to prune your options to the chart types your organization is comfortable with.

Chart Chooser

Let's look at these 4 big categories of data visualization and see what makes them tick.

Comparison

In a comparison you have multiple items to compare or multiple datasets to compare. If you need to show exact values, consider a table in place of a chart.

When you have just a few categories, a column chart is a good choice. For example, you might show responses to a survey question or compare sales for the last two years.



 Column Chart

When you have many categories or long names, a horizontal bar chart will work better.

Bar Chart

When you have multiple data series per category, a grouped column chart (also known as a grouped bar chart or a clustered bar graph) may work best. For example, imagine you want to break out responses to a survey question not only by the response given but also by age range.

Grouped Column Chart

To show trends over time for continuous data, use a line chart. For example, illustrating how various categories of expenses are trending over time.

Line Chart

Composition

In composition you are revealing what makes up a data set.

To show the composition of something with simple proportions, a pie chart is ideal. Pie charts are one of the most widely understood chart types. Nevertheless, some people are currently pushing back against them, so carefully consider whether they are a good choice for your audience. You can help the pie chart's reputation by using it properly, that is, to show percentages that add up to 100%.

Pie Chart

A donut chart is nothing more than a pie chart with a hole in the center. Any place you can use a pie chart a donut chart would be equally appropriate. One thing a donut chart can do that a pie chart can't do is show multiple data series by arranging multiple concentric rings around each other.


 Donut Chart

When you want to show trends over time combined with part-to-whole composition, you can use an area chart. Area charts have some similarities to line charts but use filled areas below the line. The categories are stacked upon each other instead of all being plotted from the baseline. As a result, the individual trends in an area chart are harder to make out. Here's an area chart showing subscription sales over the course of a year, with substrata for students, adults, businesses, and non-profits. We might gain some insights into how school schedules or holidays affect sales.

Area Charts
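
If it helps to see why stacking makes individual trends harder to read, here is a small Python sketch. The helper name stack_series and the numbers are invented for illustration; they are not the data behind the chart above. Each layer's top edge is the running total of that layer plus everything beneath it.

```python
from itertools import accumulate


def stack_series(series):
    """Given {name: [value per period]}, return {name: [top edge per period]}.

    Layers are stacked in dictionary order, so each top edge is the running
    total of that layer and all layers below it.
    """
    names = list(series)
    periods = len(series[names[0]])
    tops = {name: [] for name in names}
    for i in range(periods):
        running = accumulate(series[name][i] for name in names)
        for name, top in zip(names, running):
            tops[name].append(top)
    return tops


# Hypothetical subscription counts for three periods (not real data).
sales = {
    "students": [10, 30, 10],
    "adults": [20, 20, 20],
}
tops = stack_series(sales)
print(tops["adults"])  # [30, 50, 30]
```

Adult sales are flat at 20 in every period, yet the top edge of the adult layer rises and falls because the student layer beneath it moved. Only the bottom layer is drawn from the baseline.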

To show the composition of data across different categories, consider a stacked column chart. In the first chart below, sales revenue for various product categories is broken down by season.

Stacked Column Chart

When you want to focus on the composition of the data, you can use a stacked column chart and show percentage in the Y-axis. This is sometimes called a Stacked Percent Chart and all columns will be the same height (100%). In the chart below, reusing the same data from the earlier example, apparel revenue percentage is shown, again broken down by season.

Stacked Percent Chart
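
Turning a stacked column chart into a stacked percent chart is just a matter of dividing each segment by its column total. Here is a minimal Python sketch. The function name to_percent_columns and the revenue figures are hypothetical, and which dimension forms the columns is an assumption on my part.

```python
def to_percent_columns(columns):
    """Convert {column: {segment: value}} into percentages summing to 100 per column."""
    result = {}
    for column, segments in columns.items():
        total = sum(segments.values())
        if total == 0:
            raise ValueError(f"column {column!r} has no data to normalize")
        result[column] = {
            name: 100 * value / total for name, value in segments.items()
        }
    return result


# Hypothetical revenue by product category and season (not real data).
revenue = {
    "apparel": {"winter": 50, "spring": 25, "summer": 25},
    "footwear": {"winter": 10, "spring": 10, "summer": 20},
}
shares = to_percent_columns(revenue)
print(shares["apparel"]["winter"])   # 50.0
print(shares["footwear"]["summer"])  # 50.0
```

Notice the trade-off: apparel's winter segment and footwear's summer segment both come out at 50% even though the underlying revenue is 50 versus 20. The percent view shows the mix, not the magnitude.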

Relationships

When you want to show how one variable is related to one or more other variables, consider a scatter plot chart or a bubble chart. Both can be useful for showing suspected connections between the data.

A scatter plot chart is used to analyze the relationship between two variables (one on each axis). The pattern of intersecting points can highlight a possible relationship. The chart below suggests a relationship between the amount of sugar people eat and their likelihood of tooth decay.

Scatter Plot Chart 
source:  FactTank

A bubble chart is much like a scatter plot chart except it can show a third variable in the size of each bubble (its area, not its diameter). The bubble chart below shows car sales, with price on the Y-axis, units sold on the X-axis, and revenue in the size of the bubble. We can tell type B generates the most revenue both by the size of the bubble and its placement.

Bubble Chart
source: Infogram.com
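
Because the value belongs in the bubble's area rather than its diameter, the radius has to grow with the square root of the value. Here is a hedged Python sketch. The helper bubble_radius, the 40-unit maximum radius, and the car figures are all made up for illustration, and I'm assuming revenue is simply price times units sold.

```python
import math


def bubble_radius(value, max_value, max_radius=40.0):
    """Return a radius such that bubble AREA is proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical car types: (price, units sold). Not the data in the chart above.
cars = {"A": (20_000, 100), "B": (40_000, 200), "C": (80_000, 25)}

# Assumption: revenue = price * units sold.
revenue = {name: price * units for name, (price, units) in cars.items()}
largest = max(revenue.values())
radii = {name: bubble_radius(r, largest) for name, r in revenue.items()}

print(radii)  # {'A': 20.0, 'B': 40.0, 'C': 20.0}
```

Type A has one quarter of type B's revenue, so it gets half the radius, which works out to one quarter of the area. Scaling the radius directly by the value would have shrunk A to one sixteenth of B's area.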

While the above charts are well-suited for relationship visualization, it's also possible to show relationships using more common chart types like line, column, and bar charts.

Distribution

Distribution charts show how the values of a variable are spread across a range, which can help identify patterns and outliers.

A scatter plot can show the distribution of two variables. The scatter plot chart below shows Old Faithful geyser eruptions. It shows there are short-wait eruptions and long-wait eruptions.

Scatter Plot showing Distribution
source: Wikipedia

A histogram is another way to show the distribution of continuous data. Histograms superficially resemble column charts, but the columns have no spacing and indicate frequency, not specific values. The X-axis shows interval ranges and the Y-axis shows the number of times values occurred in each interval. The histogram below shows the majority of customers wait between 35 and 50 seconds.


Histogram showing Customer Wait Time
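
Under the hood, a histogram is just a count of how many values land in each interval. Here is a minimal Python sketch of that binning step. The function name histogram_counts, the 5-second bin width, and the wait times are hypothetical and are not the data behind the chart above.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each [low, low + bin_width) interval."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter()
    for v in values:
        # Snap each value down to the lower edge of its interval.
        low = start + ((v - start) // bin_width) * bin_width
        counts[low] += 1
    return dict(sorted(counts.items()))


# Hypothetical customer wait times in seconds (not real data).
waits = [32, 36, 38, 41, 44, 47, 49, 52, 58]
bins = histogram_counts(waits, bin_width=5, start=30)

# Print a rough text histogram: one '#' per customer in the interval.
for low, n in bins.items():
    print(f"{low}-{low + 5}s: {'#' * n}")
```

The choice of bin width matters. Bins that are too wide hide the shape of the data, and bins that are too narrow turn it into noise.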

Progress Toward Goals

The bullet graph, a 21st-century chart, is a nice compact way to show progress toward a goal. These can be vertical or horizontal, and combine the idea of a thermometer chart with red/yellow/green ranges.



Many Variables

Radar Charts, also called Spider Charts, are useful when you have many variables. They let you plot a dozen or more variables and derive a shape from them. Comparing shapes then shows similarity.

In the chart below, attributes of the Overwatch League are plotted (blue shape). On top of that, attributes of the Twitch community are plotted (orange shape). The chart shows similarities in the two shapes, which helps make the case that Twitch is a good streaming home for the Overwatch League.

Radar Chart

source:  the eSports Group


Years back when I worked at Microsoft, we would use radar charts in developer usability studies. We would chart shapes for the developers we tested and compare them to what our APIs provided.

Summary

There are many kinds of charts because there are many purposes for visualizing data. We've reviewed some examples of how the wrong chart type obscures rather than clarifies. We learned how to choose the right chart type for your objective and reviewed some of the more common chart types.

In Part 3, we'll look at how Axis Abuse can make your charts misleading.

Saturday, February 15, 2020

Bad Charting Part 1: Charts vs. Infographics

In this post I'll share tips on responsible charting and how to avoid common pitfalls that can result in misleading business graphics. We'll do that by looking at examples of poor charts and pointing out the principles that were violated. In this first post, we'll make the distinction between charts and infographics.

A Bad Chart about Why Charts are Bad

Charts: They're All Around Us

Charts are all around us. Many of us rely on them and create them in our work. Dashboards abound in business and billions of dollars are spent annually on business intelligence products. Charts are a regular output of stock markets, scientists, researchers, statisticians, and governments. They're all over social media, web sites, and television. Charts are part of life.

Just as publishing came to the masses via word processing, desktop publishing, and blogging, the same thing is happening with charting. Charts can now be created in just a few clicks from office productivity software, charting apps, and cloud services. Marketing campaigns frequently leverage infographics in their messaging.

There are good charts, bad charts, and pretender charts. We live in a time where honest communication is in short supply. There's a lot of deliberate misinformation online and charts are often a vehicle for delivering it. An even larger problem is charts that are unintentionally misleading. That comes from people who don't know the fundamentals of good charting, don't understand their source data well, or just copy-and-paste from somewhere else. These days, charts are created for the most trivial subjects or to express humor in social media posts.

I wonder why that would be...
source: https://imgur.com/uCbnAaf


Just What is a Chart Anyway?

Before we go any further we should define what a chart (or graph) is.

A chart is a data visualization. That is, a graphical representation of some numerical data. It's not possible to create a good chart from poor data, so starting with data you understand is critical. You have a duty to select appropriate and complete source data. You have a duty to portray that data responsibly. You have a duty to provide reference to the source data.

A chart is also a type of content, which means you will make decisions that affect how the chart is perceived by viewers. Your chart tells a story or carries a message, even if you don't intend it to. Be aware that some readers will take your content at face value.

A chart is also data reduction, which obligates you to reduce the data in a responsible way. Your viewer should understand how the data was arrived at.

Finally, a chart is a picture. We all know a picture is worth 1,000 words. Your visual choices imply underlying information. The emotion your chart triggers in your viewer is stronger than any figures or text on the page.

Charting, then, is a great responsibility. You are communicating about data that matters greatly to somebody. It needs to be truthful and clear if you want to maintain a reputation for trustworthiness. As we shall see, not everyone takes these responsibilities seriously.

Charts are not Infographics

One of the reasons we have rampant problems with charts is the confusion between a business chart and an infographic. An infographic is "a collection of imagery, charts, and minimal text that gives an easy-to-understand overview of a topic." Infographics often contain charts, but they simply aren't created with the same rigor that business charts are. Let's look at a chart from an infographic:


The above chart is so bad it made Business Insider's list of the Worst Charts of All Time. We can note some deficiencies in this chart:
  1. It has no title, but we can intuit we are being shown sales revenue for fast food restaurants. 
  2. The Y axis isn't labelled, but we can intuit billions of dollars from the chart value labels. 
  3. Columns are replaced with corporate logos, and that's a problem. Look at Burger King ($11.3B in sales) vs. McDonald's ($41B). Is the McD logo 4 times the BK logo? No, it's more like 12 times the area. The "clever" use of logos is a bad idea because it breaks the visual contract with the chart reader.
  4. The data set isn't identified. This clearly isn't all restaurants but we aren't told the criteria for the ones that are included. 
  5. The time frame isn't identified either. Which year's sales revenue are we being shown?
  6. The chart values are an apples and oranges comparison, because the country of Afghanistan is included in the mix.
Below you can see our chart is part of a larger infographic from Princeton University. Now we can see the chart's purpose in support of a message about how Starbucks and McDonald's are global hubs that connect some of the earth's wealthiest and poorest countries. The deficiencies we noted would likely be explained away by the designer as justified in getting the message out in a compelling way. That's the problem with infographics: they're a marketing tool, not a business communication tool. They each have their place but you don't want to confuse them.


Here's another infographic below, this one showing the mobile games market volume by language. Note the sizes of the bubbles. Chinese is $15B and English is $8B, but the Chinese bubble is not 2 times the area of the English bubble. It's closer to 4 times, because both height and width were multiplied by roughly 2. We saw this same error earlier with the use of logos and here it is again. The makers of infographics are usually far more concerned with getting your attention than with accuracy.
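
The arithmetic behind this kind of distortion is easy to check. Here is a short Python sketch, assuming the designer scaled both the width and the height of each shape by the ratio of the values. That assumption, and the function name apparent_area_ratio, are mine. The dollar figures are the ones quoted in this post.

```python
def apparent_area_ratio(value_a, value_b):
    """Area ratio a viewer sees when width AND height are both scaled by value_a / value_b."""
    linear_ratio = value_a / value_b
    return linear_ratio ** 2


# Mobile games bubbles: Chinese $15B vs. English $8B.
print(15 / 8)                                      # 1.875 (the honest ratio)
print(round(apparent_area_ratio(15, 8), 2))        # 3.52 (what the eye sees)

# Fast food logos: McDonald's $41B vs. Burger King $11.3B.
print(round(41 / 11.3, 1))                         # 3.6
print(round(apparent_area_ratio(41, 11.3), 1))     # 13.2
```

Under that assumption, a value that is less than 2 times larger looks about 3.5 times larger, and one that is about 3.6 times larger looks roughly 13 times larger, which is in the same ballpark as the eyeballed "12 times" for the logos.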

 

Here are some of the important differences between a business chart and an infographic:


An infographic is often designed to push a message and includes data that supports the message. A business chart has a message too, but there's a difference. In your business chart, you had some reason to show the data you did in the way you did. There may well be an important revelation you want to communicate, but the viewer may discover additional insights from the data being visualized. The business chart doesn't force a conclusion. It says, "Look at this. Look what's happening here."

Don't make the mistake of using an infographic when you should have a business chart. In an infographic, the excitement comes from the design and messaging. In a business chart, the excitement comes from the data.

Everyone Says So

Why are infographics so popular? Well, one reason commonly given is shown (fittingly) by the infographics below: the human brain processes images 60,000 times faster than text!




You'll find thousands of infographics and designer/marketing web sites promoting this fact. Now that you've learned this fun fact, you can forget it, because it's not true!


source: visme.co

This is a big, big problem today. People see something interesting, especially something that supports the story they want to promote, and they pass it on without a second thought. Using or passing on unverified information has no place in business or in your charts.

When is a Chart Warranted?

Before you embark on creating a chart, ask yourself whether you have a good reason for having one. If the only reason for your chart is to have something pretty, I suggest you reconsider. Charts do make sense in your documents and presentations when any of the following are true:
  • The chart will help your audience understand the data better.
  • The shape of the data contains your message.
  • You want to highlight or reveal something that would not be evident from a table of figures or wall of text.
  • The chart accompanies and augments a narrative or data set (rather than replacing it).
A chart should never be a replacement for its source data. Either include that original data or provide a link to it.

Characteristics of a Good Chart

A good chart will have these characteristics:
  1. Accurately show facts.
  2. Grab the reader's attention.
  3. Show trends or changes.
  4. Be clear and easy to read.
  5. Have a title and axis labels.
  6. Identify units for values.
  7. Use colors to show differences but not rely solely on color.
  8. Use white space around graphic elements and text for readability. 
  9. Follow established conventions.
In Part 2, we'll look at how to choose the right chart type and what happens when you get it wrong.