Visual Diagnostic Analytics

Eric Sandosham, Ph.D.
5 min read · Dec 1, 2024


A better way to see.

Photo by Adrien on Unsplash

Background

I just concluded a 3-day pilot class teaching working adults at the Singapore Management University with my co-instructor KOO Ping Shung (he writes about AI and data science on Substack). For over 5 years, we’ve been co-instructing a very popular course on Tableau that unpacks the concept of effective visual vocabulary. Starting this year, we expanded our Tableau course into a 12-day, 5-module advanced certification programme for adult learners (part of the nation’s agenda on continuous learning). The 3-day class mentioned above focuses on approaching diagnostic analytics via the use of visual analytic tools.

This article isn’t about the course, but rather about the choice of tools for data analytics / data science work. We all know the standard rhetoric: learn SQL, learn Python, learn TensorFlow, etc. to be good at data science work. I’ve written often that technical mastery alone isn’t going to get you very far; it needs to be paired with cognitive competencies such as data sensemaking, problem-framing and computational thinking. Diagnostic analytics is one of those abilities that is made up of both technical and cognitive competencies. Interestingly, the practice of diagnostic analytics is poorly covered in formal and informal literature. A little while back, I made an attempt to contribute to the space with my article “How To Perform Diagnostic Analytics”.

My 67th article is a follow-up, and in it, I argue that modern visual analytics tools like Tableau are the best way to go about doing diagnostic analytics work.

(I write a weekly series of articles where I call out bad thinking and bad practices in data analytics / data science which you can find here.)

Expansive Thinking

It’s important to use the right tools for the right analytics job. To that end, I like to think in simplified terms. In the world of data analytics / data science, the tools can be divided into two hemispheres: tools for problem formulation, and tools for solution articulation. SQL, Excel, and visual analytics tools such as Tableau belong to the former, while Python and R belong to the latter. SQL is great at data extraction and precision queries. Python and R are great for data cleaning / preparation and constructing data solutions (e.g. predictive models). But despite their ability (and flexibility) to manhandle data, Python and R are not really great at doing diagnostic analytics work, which falls into the problem formulation hemisphere. For that, the default reigning king is Excel.

There’s a simple reason for it. Robust diagnostic analytics work requires agility in (a) framing and re-framing, (b) creating a range of transformed variables to distil the right information signal from the original set of variables, (c) creating and validating hypotheses, and (d) sensing interesting and non-obvious relationships within the data for deeper dives. Excel’s tabular canvas format lends itself intuitively to all these activities. In a sense, diagnostic analytics requires expansive thinking — the ability to see everything “at one go”.

Now, a tool like Tableau has proven itself to be much better than Excel when it comes to charting and visualisation with its real-time user interactivity. What is less obvious is that Tableau can also be better than Excel when it comes to diagnostic analytics. Here’s why.

Diagnostic Thinking

A good diagnostic analytics tool must accomplish the following with ease:

  1. Be usable by people who are not professional data analysts / data scientists.
  2. Be able to manage both transactional level data and person-level data, including creating new transformed variables at both levels.
  3. Be able to trigger pattern recognition.

Let’s address point (1). In a knowledge economy, everyone is exposed to data and has an obligation to leverage it to find opportunities to constantly improve their deliverables. Those “closest” to the data are more likely to interpret it correctly and to see counter-intuitive patterns that can lead to meaningful insights. So a “regular” knowledge worker should be able to interface with the (diagnostic analytics) tool. You can safely assume that a knowledge worker is proficient enough in Excel, and any additional analytics tool would have to be a close mimic in terms of usability. Tableau passes the mark here.

Regarding point (2), a robust diagnostic analysis of any phenomenon would logically consist of interaction- or transaction-level data, paired with aggregated person- or entity-level data. As an example, if you had to do diagnostics on a declining retail business, you would need transaction-level data representing item purchases across time, and you would need information about the customers associated with those transactions and how those attributes might be changing over time as well. And you would need to be able to quickly, and experimentally, create transformed variables from the existing variables to amplify the information signal held within the dataset, e.g. creating labels like “high turn-over item” vs “low turn-over item” for retail purchases, or creating generation labels like “X”, “Y” and “Z” from age. Now Excel excels at these tasks; through the use of the pivot table functionality, you can aggregate transaction-level data in Excel to customer-level, and then enrich that by “vlookup-ing” it with customer attributes. Tableau passes the mark here as well. In fact, with flying colours. Tableau is arguably better than Excel on this front as it allows you to do customer-level analysis on transaction-level data without having to build pivot tables. You can create transformed variables at either the transaction level or the person-aggregated level within the same transaction-level data, so that everything is held in one place and is re-usable and interchangeably accessible for visualisation. Did you know you can create the classic 9-box grid, odds ratio, and even the extremely useful heuristics of RFM (recency x frequency x monetary) scoring on Tableau?
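To make the transaction-to-customer roll-up concrete, here is a minimal sketch in plain Python (not Tableau) of the steps described above: aggregate transactions to customer level, enrich with a customer attribute, derive a generation label from age, and compute a simple RFM code. Everything in it is an assumption for illustration: the sample data, the function names (aggregate, rank_score, generation, build_profiles), the generation age cut-offs, and the choice of a 1-to-3 rank-based score rather than the quintiles often used in practice.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level data: (customer_id, purchase_date, amount)
transactions = [
    ("C1", date(2024, 11, 20), 120.0),
    ("C1", date(2024, 10, 5), 80.0),
    ("C1", date(2024, 9, 14), 60.0),
    ("C2", date(2024, 6, 1), 300.0),
    ("C3", date(2024, 11, 28), 15.0),
    ("C3", date(2024, 11, 2), 25.0),
]

# Hypothetical customer attributes (the table you would "vlookup" against)
customers = {"C1": {"age": 34}, "C2": {"age": 52}, "C3": {"age": 23}}

AS_OF = date(2024, 12, 1)  # assumed analysis date


def generation(age):
    """Transformed variable: generation label from age (illustrative cut-offs)."""
    if 12 <= age <= 27:
        return "Z"
    if 28 <= age <= 43:
        return "Y"
    if 44 <= age <= 59:
        return "X"
    return "Other"


def aggregate(transactions, as_of):
    """Roll transactions up to one row per customer (the pivot-table step)."""
    rows = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for cust, day, amount in transactions:
        row = rows[cust]
        row["last"] = day if row["last"] is None else max(row["last"], day)
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        cust: {
            "recency_days": (as_of - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for cust, row in rows.items()
    }


def rank_score(values, higher_is_better=True, bins=3):
    """Score each customer 1..bins by rank position; ties follow input order."""
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {cust: 1 + (i * bins) // n for i, cust in enumerate(ordered)}


def build_profiles(transactions, customers, as_of):
    """Customer-level view: aggregates, enriched attribute, and RFM code."""
    agg = aggregate(transactions, as_of)
    # Fewer days since last purchase is better, so recency is scored in reverse.
    r = rank_score({c: v["recency_days"] for c, v in agg.items()}, higher_is_better=False)
    f = rank_score({c: v["frequency"] for c, v in agg.items()})
    m = rank_score({c: v["monetary"] for c, v in agg.items()})
    return {
        c: {
            **agg[c],
            "generation": generation(customers[c]["age"]),
            "rfm": f"{r[c]}{f[c]}{m[c]}",
        }
        for c in agg
    }


profiles = build_profiles(transactions, customers, AS_OF)
for cust, profile in sorted(profiles.items()):
    print(cust, profile)
```

The point of the sketch is the shape of the work rather than the code itself: the transaction table stays untouched, and the customer-level measures and labels are derived from it on demand, which is the same "everything held in one place" idea described above.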

On point (3), nothing is more effective for pattern detection than visualisation. Even simple colour-coding of tabular data can reveal patterns faster than reading the numbers one by one. While Excel has adequate charting capabilities, they are honestly not that great: you are largely confined to templates, and you will struggle with multiple axes and overlays. All these shortcomings are overcome in Tableau, making it superior to Excel. You can easily visualise and adaptively re-visualise the diagnostic outputs not as point- or tabular-data but as beautiful charts (visual vocabulary is important here!), allowing you to “spar” with the data (Q&A style) at the speed of thought.

Conclusion

My team of consultants no longer use Excel and PowerPoint in their client presentations. It’s Tableau all the way through, from data exploration to diagnostic analysis to client communication / presentation. I was inspired by them to give it a go (I’m an Excel veteran) and have been wonderfully surprised by the increased ease of use and functional utility. To that end, every data analyst and data scientist should be looking to incorporate Tableau into their ever-expanding toolbox as a must-have tool.


Written by Eric Sandosham, Ph.D.

Founder & Partner of Red & White Consulting Partners LLP. A passionate and seasoned veteran of business analytics. Former CAO of Citibank APAC.
