Relationships make the world go around and make life interesting. Without other human beings around, life would feel like a lonely and empty existence. How people interact with each other tells a lot about the relationships they have.
In a sense, the same could be said for data: when one variable or set of data interacts with and affects another, it is said to have a ‘relationship’. When two variables vary together, the relationship is described as a correlation.
In this post on Displaying Data, I will be exploring the types of charts ideal for comparing two or more variables together in order to discover a relationship.
In a Scatterplot, a collection of points is plotted using Cartesian coordinates, with each axis representing one variable. Plotting the two variables together can help you detect whether one may impact the other, because any correlation between them becomes visible.
The types of correlation that can appear on a Scatterplot vary and can be interpreted through the patterns they produce on the graph:
When the variables increase together and the points form an upwards slope, this is known as a positive correlation. If one variable decreases while the other increases, producing a downwards slope, it’s described as a negative correlation. However, when the values of one variable stay the same while the other’s increase, the correlation is null.
The shape of the correlation slope can also show whether the correlation is linear, exponential or even U-shaped. How closely the points are packed together into a line shows the strength of the correlation. If no line emerges after all the points have been plotted, then there is no correlation.
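To put a number on the direction and strength described above, one common option is the Pearson correlation coefficient. It isn't mentioned elsewhere in this post, so treat the following as a minimal sketch rather than the only way to measure correlation. The function name `pearson_r` and the sample data are made up for illustration, and returning 0.0 when one variable never changes is a convention chosen here (strictly speaking, the coefficient is undefined in that case).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Ranges from +1 (perfect upwards line) through 0 to -1 (perfect
    downwards line). It only measures *linear* correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption for this sketch: a variable that never changes is
        # reported as 0.0 (the "null" case) instead of raising an error.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Made-up sample data, one pair of lists per pattern
print(pearson_r([1, 2, 3], [2, 4, 6]))               # upwards slope
print(pearson_r([1, 2, 3], [6, 4, 2]))               # downwards slope
print(pearson_r([1, 2, 3], [5, 5, 5]))               # one variable stays the same
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])) # U-shaped
```

Note the last case: a perfectly U-shaped pattern scores zero even though the variables are clearly related, which is one reason plotting the points is still worth doing alongside any single summary number.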
Working in the same way as a Scatterplot, Bubble Charts take displaying data a step further by turning each point into a ‘bubble’ whose size varies with a third variable, and by labelling or categorising each bubble. This allows for comparisons between bubbles, as they are now distinguished from each other by labels or colouring.
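One practical detail when sizing bubbles is whether the value maps to the bubble's radius or its area. A widely used convention (an assumption here, not something this post prescribes) is to scale the area, because scaling the radius makes large values look disproportionately big. A rough sketch, with the hypothetical helper `bubble_radius` and an arbitrary maximum radius of 40 units:

```python
import math

def bubble_radius(value, max_value, max_radius=40.0):
    """Radius for a bubble whose *area* is proportional to `value`.

    `max_value` is the largest value in the data set; it gets `max_radius`.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if value < 0:
        raise ValueError("bubble sizes cannot represent negative values")
    # Area grows with the square of the radius, so take the square root
    return max_radius * math.sqrt(value / max_value)

print(bubble_radius(100, 100))  # the largest value gets the full radius
print(bubble_radius(25, 100))   # a quarter of the value gets half the radius
```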
Like Scatterplots, Bubble Charts are useful for analysing patterns and correlations between variables. Time can also be shown, either by assigning it to one of the axes or by animating the visualisation. A great example of this is the Gapminder World tool below, which allows you to explore various statistics for all the countries in the world over time.
Bubble Charts are quite limited in the amount of data they can visualise: having too many bubbles can make the chart cluttered and therefore hard to read. Of course, interactivity can remedy this by offering an option to reorganise or filter out categories. Another solution to overcluttering is to use transparent bubbles, which can prevent smaller bubbles from being hidden under larger ones.
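Transparency is often paired with a simple drawing order: paint the largest bubbles first so the smaller ones end up on top. The sketch below only prepares that order and does not draw anything; the function name `draw_order`, the `(label, size)` tuple layout and the default opacity of 0.6 are all assumptions made for illustration.

```python
def draw_order(bubbles, alpha=0.6):
    """Return bubbles sorted largest-first, each tagged with an opacity.

    `bubbles` is a list of (label, size) tuples. Drawing in the returned
    order places small bubbles on top of large ones.
    """
    ordered = sorted(bubbles, key=lambda bubble: bubble[1], reverse=True)
    return [(label, size, alpha) for label, size in ordered]

# Made-up example data
for label, size, alpha in draw_order([("A", 5), ("B", 20), ("C", 10)]):
    print(label, size, alpha)
```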
Using a matrix with one variable along each axis, Heatmaps visualise the relationship between them by varying the shade of colour in each cell. The colour-coding system can vary, but the amount of shading is proportional to the value that occurs where the two variables intersect.
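As a rough illustration of how the cells of such a matrix might be filled, the sketch below counts how often each pair of categories occurs and scales the counts to a shade between 0.0 (lightest) and 1.0 (darkest). The function name `heatmap_matrix`, the linear scaling and the sample observations are all assumptions; a real chart may use a different colour scale.

```python
from collections import Counter

def heatmap_matrix(pairs, row_labels, col_labels):
    """Build a grid of shade intensities (0.0 to 1.0) from observed pairs.

    `pairs` is an iterable of (row_value, col_value) observations. Each
    cell's shade is its count divided by the largest count in the grid.
    """
    rows, cols = set(row_labels), set(col_labels)
    counts = Counter(p for p in pairs if p[0] in rows and p[1] in cols)
    peak = max(counts.values(), default=0)
    return [
        [counts.get((r, c), 0) / peak if peak else 0.0 for c in col_labels]
        for r in row_labels
    ]

# Made-up observations: (day, time of day)
observations = [
    ("Mon", "morning"), ("Mon", "morning"),
    ("Mon", "evening"), ("Tue", "evening"),
]
grid = heatmap_matrix(observations, ["Mon", "Tue"], ["morning", "evening"])
for row in grid:
    print(row)
```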
Below is a great example of the use of Heatmaps from the Wall Street Journal graphics editor, Stuart A. Thompson, who has used the chart to visualise the sleeping patterns across different cities. From looking at these visualisations, viewers can gain insight into which cities get the most sleep or what the lifestyle trends are.
Because a Heatmap relies on colour to communicate values, it is better suited to giving a generalised view of the data; exact numerical values are difficult to read from it accurately.
Remember that correlation does not imply causation: another unnoticed or unconsidered variable may be influencing the results.
If you are considering visualising correlations when displaying data and are still unsure which visualisation is best to apply, then please feel free to get in contact with us and we’ll be happy to talk you through it.