What types of chart best tell my data’s story?
The rule of thumb is to use the simplest chart to showcase your results. Far too many people try to jam their data into a funky type of chart for the sake of it, but if it doesn’t fit well with the data type, it just over-complicates the visual and can be hard to figure out for the uninitiated. That being said, many of the newer chart types (coming later in this article) are intuitive and don’t need any explanation to figure out, so if used in the right circumstance can be very effective and even fun!
In selecting the most appropriate chart type (and this is usually done subconsciously if you’ve worked with data long enough), you should take into account the message you’re trying to convey, the data itself and your audience. Qlik have made a good ‘map’ of the selection process for the more traditional types of chart:
Source: Qlik View
This puts the message that you’re trying to convey at the forefront – which of these story types is your data telling?
- A comparison, to compare the magnitude of different categories or time periods to each other, e.g. “which products sell best?” or “how are our sales doing compared to last year?” This may involve an ordered ranking, where horizontal bar charts come into their own, for example in comparing performance per salesperson across the year. If unordered, comparisons are easier to judge in a vertical display, and it is often helpful to add a target / budget line or a long-term average line to guide expectation.
- A relationship, to see how two or more sets of data relate to each other (for instance to find correlations, outliers and clusters), e.g. “are we seeing more sales from a higher advertising spend?” or “how are unemployment and inflation related?”
- A distribution, to see how quantitative values are distributed along an axis from lowest to highest, to identify the range, density and shape of the data. e.g. “how many customers do we have per age group?”
- Or a composition, to show how the total is split into parts, e.g. “how big a market share do we have in this region?” or “what areas is our budget divided into?” Relative values are usually more appropriate here (i.e. % of total) but some do look at absolute differences (the value itself). Note that the much-used pie chart is not best practice in the field: research suggests that the eye finds it easier to compare sizes in a pie chart than in a (stacked) bar chart in only a very few cases – more info here.
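To make the selection process a little more concrete, here is a minimal Python sketch of the four story types as a lookup table. The names `SUGGESTED_CHARTS` and `suggest_charts` are my own for illustration, and the histogram entry for distributions is my assumption rather than something taken from the Qlik map. Treat the suggestions as starting points, not rules.

```python
# Hypothetical lookup table summarising the four story types above.
# The suggestions are starting points only; "histogram" is an assumption
# added for illustration and is not named in the article.
SUGGESTED_CHARTS = {
    "comparison": [
        "horizontal bar (ordered ranking)",
        "vertical bar (unordered)",
        "line (over time)",
    ],
    "relationship": ["scatterplot"],
    "distribution": ["histogram"],
    "composition": ["stacked bar", "100% stacked bar"],
}


def suggest_charts(story_type):
    """Return candidate chart types for one of the four story types."""
    key = story_type.strip().lower()
    if key not in SUGGESTED_CHARTS:
        known = ", ".join(sorted(SUGGESTED_CHARTS))
        raise ValueError(f"Unknown story type {story_type!r}; expected one of: {known}")
    return SUGGESTED_CHARTS[key]


# Example: "how are unemployment and inflation related?" is a relationship.
candidates = suggest_charts("relationship")
```

A table like this only captures the message; you would still weigh up the data itself and your audience before settling on a chart.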
Examples of how – and how not – to select your chart
The message you’re trying to convey can change as you explore the data further, and a different chart may be more appropriate. As an example to further illustrate this selection & exploration process, let’s say you have some data on China’s unemployment – you want to show how it changes over time (a time series) so you select a comparison line chart. Then you notice there is a big dip in a particular period, and you think this is explained by inflation but this might not be obvious to viewers. So you add on a second line for inflation to the chart, but now you wonder if that really does explain the issue since there are areas of high inflation but no corresponding dip in unemployment. To test it out you look at the relationship instead and switch to a scatterplot of inflation versus unemployment, colour-coded by year – this leads you to consider adding in a third factor of population size, and suddenly you’ve found a much more interesting story to convey!
It’s also instructive to see the importance of chart selection when it goes wrong. I’m sure you can think of many examples of charts that are unlabelled, or that have too much data, too much garish colour, too similar a background etc. More serious examples include distorting axes (e.g. not starting from zero) or omitting parts of a chart because they don’t fit your story. Various hilarious (better to laugh than to cry!) examples of the consequences can be found here: www.statisticshowto.com/misleading-graphs/. But here I refer instead to an inappropriate selection of visualisation, which might seem more esoteric, but which can at best lead to readers missing the message and at worst be misleading.
For example, the first chart below shows composition data, which is typically displayed in pie or bar charts; selecting a line chart instead (as below) incorrectly suggests an element of change over time. In the second chart below there are two things wrong: the stacking of costs and sales in the bar chart implies that they add up to something meaningful (when in fact costs should be deducted from sales, so a clustered bar would be more appropriate), and the 100% stacked column doesn’t show any absolute change over time, which is usually important for this type of data.
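The problem with the 100% stacked column can be shown with a little arithmetic. The sketch below uses made-up figures (not the data behind the chart) in which sales and costs both double from one year to the next: the percentage shares that a 100% stacked column would draw are identical in both years, while the figure that actually matters, sales minus costs, has doubled.

```python
# Made-up illustrative figures (not from the chart): sales and costs both double.
years = {
    2015: {"sales": 100.0, "costs": 60.0},
    2016: {"sales": 200.0, "costs": 120.0},
}


def percent_shares(values):
    """Shares of the stacked total, i.e. what a 100% stacked column draws."""
    total = sum(values.values())
    return {name: 100.0 * value / total for name, value in values.items()}


def profit(values):
    """Costs are deducted from sales, not added to them."""
    return values["sales"] - values["costs"]


shares = {year: percent_shares(values) for year, values in years.items()}
profits = {year: profit(values) for year, values in years.items()}

for year in sorted(years):
    print(year, shares[year], "profit:", profits[year])
```

Both years come out at the same split of sales to costs, so the two columns would look exactly alike even though the business has doubled in size.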
And now for the most exciting part…
What new types of charts are there & when are they suitable?
There is a plethora of new (or at least less commonly used) types of chart out there to make use of, but I would caution once again that some are more intuitive than others. You might end up spending more time explaining how to interpret some of them than they’re worth – the opposite of the ethos behind the adage “a picture is worth a thousand words”. Essentially, don’t dismiss the good old bar chart if that’s all you need to show your point!
- Sankey diagram, used to analyse and visually size flows. Whilst originally used to model engineering applications such as water or energy flows through systems of pipes, they can also be effectively applied to business, e.g. how differently sized pockets of capital from different borrower sources are applied to different lender pools, how customers flow through a service, how employees progress through the ranks, or, as here, where different types of communication originate from and are dispersed to, sized by volume (Source: LinkedIn):
- Choropleth, used to visually compare a variable across a map, such as the unemployment rate or sales per region. Used here by me to compare London house prices with the rest of the UK, at two points in time:
- Heatmap (not to be confused with choropleth, this one’s on an artificial layout) used here by data.london.gov.uk to display median 2014 house prices by London borough:
- Chord, a cross between a bar chart and a flow diagram, showing the sizes of, and links between, various categories. This one looks at which US college major subjects are commonly associated with employees in STEM jobs. Source: Duke
- Back-to-back interactive charts: whilst the chart itself is not new, they can now be interactive or playable through time. These are frequently used to analyse the changing demographics of a company’s customer base, or used here to look at change in the US population’s weight – click on the link for the playable version: http://flowingdata.com/2016/06/14/growing-to-obesity/
- Bubbles / circle packing: this beautiful chart can be used to get an idea of the different “clusters” of participants in a market (e.g. each colour cluster is a different type of capital, with bubble sizes representing individual lender volume appetite), or to visually represent two hierarchies (e.g. each colour cluster is a UK region, with bubbles representing individual county populations within that region):
- Sunburst, essentially a multi-level donut (I know, it should be doughnut!) chart, showing the sizes of one category within another. In business this can be used to analyse, say, sales per product type within different regions, or as OrgVue have done here you could look at the whole employee hierarchy to see at-a-glance the gender balance (or lack thereof!) amongst different role types at each pay grade:
- Colour pinwheel – we all recognise different brands by their selected colours and ModeAnalytics has looked at thousands of Instagram photos relating to brands from 2015, reducing each photo to its most dominant colour and creating a “colour pinwheel” for each brand:
- Word cloud, as a much-needed improvement upon the original pie chart showing causes of deaths in Shakespeare plays (Source: JunkCharts). More business-like examples include analysing sentiment from customers in the different types of words they associate with a company brand on social media:
- Treemap, an explorable version of a bar chart in which the area of each rectangle represents the magnitude of each category. In business these are used to break down KPIs into their component parts. Below is a distinctly non-business case study of Pokemon characters’ powers! (Source: JKunst)
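To illustrate the idea behind the treemap, where area stands in for magnitude, here is a minimal Python sketch of the simplest possible layout: every category gets a full-height rectangle whose width is proportional to its value. This is a plain "slice" layout for illustration only, not the algorithm behind the chart above, and the function name `slice_layout` and the category values are made up.

```python
def slice_layout(values, width=100.0, height=50.0):
    """Lay categories out left to right as (x, y, width, height) rectangles.

    Each rectangle spans the full height, so its area is proportional
    to the category's share of the total.
    """
    total = sum(values.values())
    x = 0.0
    rects = {}
    for name, value in values.items():
        rect_width = width * value / total
        rects[name] = (x, 0.0, rect_width, height)
        x += rect_width
    return rects


# Made-up category values for illustration.
rects = slice_layout({"A": 50, "B": 30, "C": 20})
areas = {name: w * h for name, (x, y, w, h) in rects.items()}
```

Charting libraries generally use more sophisticated layouts that keep the rectangles closer to square, but the principle of area being proportional to value is the same.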
Many more examples of chart types available at: https://github.com/d3/d3/wiki/gallery
I hope you’ve enjoyed this 3-part special look into how visualisation is used in business (and elsewhere) – this has been my favourite blog to write so far!