The Problem with Data Visualisation

Eric Sandosham, Ph.D.
Sep 9, 2023

Photo by Choong Deng Xiang on Unsplash

Chartjunk

Data Visualisation has been capturing our visual attention since the mid-19th century, with illustrious contributors such as Florence Nightingale (her famous Rose Chart on the causes of mortality) and John Snow (his famous Geospatial Chart on the cholera outbreak in London). The line chart, bar chart, and pie chart had been ‘invented’ some decades earlier, by William Playfair around the turn of the 19th century.

Humans ingest data through their eyes, which are literally and biologically photo-sensitive protrusions of the brain; sight is the only sense that’s connected directly into the brain. This explains the astronomical rise in data visualisation tools: we think with our eyes, and anything that reduces that friction is an attractive proposition.

However, despite the improvements in tools, we are seeing more chartjunk than ever before. Organisations are using their data visualisation tools to improve the aesthetics of their charts with very little understanding of the foundational appropriateness of those same charts. Pretty doesn’t cut it anymore! Another major issue I’m seeing is that many organisations are using these visualisation tools to create interactive management (visual) dashboards filled with lots of ill-conceived and disconnected metrics. But I will talk about that problem with dashboards in my next post.

Visual Vocabulary

I co-teach an adult course in visual analytics through the use of Tableau. It is anchored on the construct of visual vocabulary. The term was introduced into the current zeitgeist by the amazing people at the Visual and Data Journalism unit of the Financial Times in 2016. It was, in turn, inspired by the work of Jon Schwabish (then working for the US Congressional Budget Office) and Severino Ribecca in The Graphic Continuum. The key learning about visual vocabulary is that there is a right chart to convey the desired data message. For example, if the key message is to show deviation from a specific target or baseline, then you are restricted to the set of charts that are appropriate for this; pie charts are obviously not among them!
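To make the deviation example concrete, here is a minimal sketch of a diverging bar chart drawn in plain text. The function name, the region names and the sales figures are all hypothetical; the point is only that every bar grows left or right from one shared baseline, which a pie chart cannot do.

```python
def deviation_bars(actuals, target, scale=1.0):
    """Return one text line per item, with bars diverging from a shared baseline."""
    # Widest bar on either side, so the baseline "|" lines up in every row.
    width = max(round(abs(v - target) / scale) for v in actuals.values())
    label_width = max(len(name) for name in actuals)
    lines = []
    for name, value in actuals.items():
        units = round(abs(value - target) / scale)
        if value < target:
            left, right = "-" * units, ""   # shortfall grows to the left
        else:
            left, right = "", "+" * units   # surplus grows to the right
        row = f"{name:<{label_width}} {left:>{width}}|{right:<{width}}"
        lines.append(row.rstrip())
    return lines


# Hypothetical regional sales against a target of 100.
sales = {"North": 104, "South": 97, "East": 100, "West": 92}
for line in deviation_bars(sales, target=100):
    print(line)
```

Because the baseline sits in the same column on every row, the eye reads direction first (left or right of the line) and magnitude second (length of the bar).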

The concept of visual vocabulary blew me away when I first encountered it. It was so obvious that I wondered why no one had thought of it before. In my adult course, I’ve worked hard to help the learners unpack exactly why the charts work to convey the target message, and which visual cues make the information easy for the eyes to consume in an unambiguous and unequivocal way, by incorporating new insights from neuroscience and information science. Allow me to summarise some of my own learnings from this data visualisation journey.

Designing the Right Visual

What makes for good visuals ultimately comes down to the information processing load or overload. This can be intuitively understood in terms of (i) eye-tracking, (ii) cognitive load (commonly framed through Miller’s Law) and (iii) audience anticipation.

Consider the chart below which was created at the onset of the recent Covid pandemic by the W.H.O.

You can see that there is a lot of eye-tracking (the visual focus jumping from point to point) and mental cognitive work to compare the size of the bubbles between countries and against the reference legend. The reader has to hold in their ‘mind’s eye’ the image of the bubble to do the comparison. A simple use of colour to additionally differentiate the bubble sizes would have reduced eye-tracking and memory-holding. The key lesson here is that the more eye-tracking a chart demands, the faster the reader experiences mental fatigue (again, because eyes are extrusions of the brain). Good visuals must minimise eye-tracking.
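As a rough sketch of that colour suggestion, the value behind each bubble can be binned into a small number of ordered colour classes, so the reader compares colours rather than remembered areas. The class breaks, the palette names and the country counts below are assumptions made up for illustration; they are not taken from the W.H.O. chart.

```python
from bisect import bisect_right

# Hypothetical class breaks and a light-to-dark sequential palette.
BREAKS = [100, 1_000, 10_000]
PALETTE = ["pale yellow", "orange", "red", "dark red"]


def colour_for(cases, breaks=BREAKS, palette=PALETTE):
    """Map a count to an ordered colour class (darker means more cases)."""
    # bisect_right returns how many breaks the value has reached or passed,
    # which is exactly the index of its colour class.
    return palette[bisect_right(breaks, cases)]


# Hypothetical counts, not real figures.
counts = {"Country A": 45, "Country B": 2_300, "Country C": 18_000}
for country, cases in counts.items():
    print(country, cases, colour_for(cases))
```

Size and colour then carry the same piece of information, which adds no cognitive load but removes the need to look back at the legend for every bubble.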

Cognitive overload is another important consideration in good chart design. For example, a simple line chart showing the growth in the number of customers over time contains 2 pieces of information: number of customers, and time. In 1956, Harvard psychology professor George Miller posited that the human mind can take on a cognitive load of 7 ± 2 pieces of information. This became popularly known as Miller’s Law, which has been utilised in many situations since. A commonly cited example from the pre-smartphone era is the landline telephone number, which in many countries was limited to around 7 digits (excluding country and regional prefix codes), keeping it within reach of human memory. Interestingly, later research has broadly supported Prof. Miller’s conjecture that working memory is tightly limited, although some estimates put the limit closer to 4 pieces of information. To optimise the cognitive load on a visual chart, we need to consider the following:

  1. Are there ‘empty calories’ in the chart? E.g. duplication of information or unnecessary visual cues.
  2. Can colour be better used to carry information? E.g. the use of ‘traffic light’ colours to convey deviation from target or the use of heatmap or gradient colours to communicate frequency or intensity.
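The ‘traffic light’ idea in the second point can be sketched as a small classification rule. This is only an illustration: the function name, the 5% tolerance and the assumption that higher is better are mine, and a real dashboard would set its own thresholds.

```python
def traffic_light(actual, target, tolerance=0.05):
    """Classify performance against target as 'green', 'amber' or 'red'.

    Assumes higher is better and target > 0. `tolerance` is the shortfall,
    as a fraction of target, that still counts as amber (assumed 5%).
    """
    shortfall = (target - actual) / target
    if shortfall <= 0:
        return "green"   # on or above target
    if shortfall <= tolerance:
        return "amber"   # slightly below target
    return "red"         # clearly below target


# Hypothetical metric values against a target of 100.
for value in (105, 97, 90):
    print(value, traffic_light(value, target=100))
```

The colour now carries the deviation message on its own, so the chart does not need a separate ‘variance’ column or label to say the same thing twice.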

Anticipating the follow-up questions from your reader audience is almost never talked about in the data visualisation literature, which I find rather surprising. The best data visualisations captivate and resonate with us because they work hard at anticipating the questions that would likely follow from the primary information being conveyed, and directly incorporate the answers into the visual as additional pieces of information (while being sensitive to cognitive load). If the information being presented visually is useful, then it will logically trigger 2 kinds of questions: (a) peer-to-peer type comparison or (b) deconstructive / drill-down details.

The most popular example of anticipating the audience is the amazing data storytelling presentation by the late Hans Rosling (see video below) where he incorporates 5 pieces of information into a 2-dimensional plane, without the audience ever feeling cognitively overloaded. Hans talks about the linear relationship between health and wealth across the countries (2 pieces of information), proxied by life expectancy and GDP per capita respectively. He anticipates that the audience would naturally want to know if the big countries are the healthiest (and therefore wealthiest), which he shows by representing each country’s population by the size of its bubble (3rd piece of information). Another question that the audience would ask is whether the geographic regions move in tandem, because there is likely more intra-regional economic similarity than inter-regional; Hans uses colour to depict the regions (4th piece of information). The audience then wants to know if this trend is consistent over time; so Hans animates his visual through time (5th piece of information).
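One way to make that tally explicit is to write the chart down as a mapping from visual channel to data field and count the distinct fields. The sketch below does this for the chart as described above; the field names are hypothetical, and using the lower end of the 7 ± 2 range as the budget is my own conservative assumption.

```python
# Visual channel -> data field, following the description above.
# Field names are hypothetical labels, not taken from any real dataset.
ENCODINGS = {
    "x position": "gdp_per_capita",
    "y position": "life_expectancy",
    "bubble size": "population",
    "bubble colour": "region",
    "animation frame": "year",
}

# Conservative end of the 7 ± 2 range (an assumption, not a rule).
MILLER_LOWER_BOUND = 7 - 2


def pieces_of_information(encodings):
    """Count distinct fields; a field shown through two channels counts once."""
    return len(set(encodings.values()))


def within_budget(encodings, budget=MILLER_LOWER_BOUND):
    """True if the chart asks the reader to hold no more than `budget` fields."""
    return pieces_of_information(encodings) <= budget


print(pieces_of_information(ENCODINGS), within_budget(ENCODINGS))
```

Each channel here carries a different field, so the chart answers five questions while staying at the conservative end of the range.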

Conclusion

Data visualisation is a competency that can be taught and mastered. But sadly, it’s rarely covered in academic education. It seems to me that many teachers and professors, even those advancing data science curricula, are simply unaware of it or don’t see the importance of it. Data visualisation is fast becoming a basic competency requirement in the knowledge economy … for everybody and not just the data analysts. The tools have gotten so much better, and it’s time we equally raise the bar on our cognitive abilities in this domain.

Written by Eric Sandosham, Ph.D.

Founder & Partner of Red & White Consulting Partners LLP. A passionate and seasoned veteran of business analytics. Former CAO of Citibank APAC.
