Quantitative vs Qualitative Data in Data Analytics
There shouldn’t be a distinction.
I’ve been on a tear to challenge and unpack some very common terms and perspectives in data analytics. We use some of these terms, or believe in some of these concepts, without ever really challenging their foundational principles. In my two most recent articles, I unpacked what it really means to “connect the dots” in data analytics, and the errors data scientists make when they mistake a complex problem for a complicated one.
This week, I would like to explore how data analysts perceive and use quantitative vs qualitative data. There is a common perception that quantitative data is objective and qualitative data is subjective. But this is far from reality. And so I dedicate my 89th article to a discourse on whether there really needs to be a difference in the way we approach quantitative vs qualitative data.
(I write a weekly series of articles where I call out bad thinking and bad practices in data analytics / data science which you can find here.)
It’s ALL Subjective
Most knowledge reference sites depict quantitative data as “numbers-based, countable, or measurable”, and qualitative data as “interpretation-based, descriptive, and unstructured”. They further add that while quantitative data tells us the what, qualitative data tells us the how and why. There is also the impression that because it’s numbers-based, quantitative data is objective, and because qualitative data is interpretation-based, it is therefore subjective. These descriptions of the two kinds of data give the impression that quantitative data is primary while qualitative data is complementary; there’s a first-class vs second-class comparison going on.
Firstly, it’s important to note that there is no such thing as objective or subjective data. Those descriptions refer to information, which is derived from data when you interpret it. The sheer fact that data needs to be interpreted to turn it into information makes the latter subjective and probabilistic. So ALL information is subjective; there is no such thing as objective data or objective information. One could then argue that the degree of subjectivity arising from the underlying data may differ between quantitative and qualitative data. However, I would argue that the issue isn’t so much the resulting variance in subjectivity, but rather the nature of the subjectivity.
Information derived from quantitative data is often biased. That’s the primary root of its subjectivity. It’s biased because of the way the data is instrumented or collected. In the world of surveys, for example, you can collect data by asking respondents to select their choice from a pre-defined set of options. That pre-defined set of options is built on the survey designer’s assumptions and hypotheses, and that is where the bias creeps in.
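To make the point concrete, here is a minimal, hypothetical sketch in Python. The churn survey, its option list, and the free-text responses are all invented for illustration; they are not drawn from any real study.

```python
# A minimal, hypothetical sketch of how a pre-defined option set bakes
# assumptions into "objective" quantitative survey data.

from collections import Counter

# The survey designer's hypothesis about why customers churn:
PREDEFINED_OPTIONS = ["Price", "Missing features", "Poor support", "Other"]

# What respondents might actually say if asked in their own words:
free_text_reasons = [
    "too expensive",
    "the onboarding emails felt pushy and impersonal",
    "a colleague recommended a competitor",
    "the onboarding emails felt spammy",
]

def force_into_options(reason: str) -> str:
    """Map a free-text reason onto the pre-defined options.

    Anything the designer did not anticipate collapses into 'Other',
    which is where the bias enters the numbers-based data.
    """
    if "expensive" in reason or "price" in reason:
        return "Price"
    if "feature" in reason:
        return "Missing features"
    if "support" in reason:
        return "Poor support"
    return "Other"

tallies = Counter(force_into_options(r) for r in free_text_reasons)
print(tallies)  # e.g. Counter({'Other': 3, 'Price': 1})
# The dominant signal (off-putting onboarding emails) never appears as a
# category; the quantitative summary inherits the designer's assumptions.
```

The tallies look perfectly countable and measurable, yet they can only ever confirm or refute the hypotheses that were baked into the option list in the first place.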
Value in Data Analytics
Data analytics is essentially the search for counter-intuitive or non-obvious patterns. That’s where insights live. That “hunt” typically starts with quantitative data: it’s what data analysts / data scientists reach for because they can run statistics on it and use it to build models; it’s extremely malleable. An effective next step for explaining the counter-intuition is to dive into qualitative data. Unfortunately, many analysts stick with quantitative data to try to uncover the reasons behind the seeming counter-intuition; figuring out the right kind of qualitative data isn’t their strong suit. And even when they do locate potentially useful qualitative data, they “shoehorn” it into quantitative data by writing rules based on its perceived underlying patterns and parsing it accordingly. These rules can end up distorting the very information signals they are meant to extract.
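As a hypothetical illustration of that “shoehorning”, here is a short Python sketch in which free-text feedback is scored with hand-written keyword rules; the comments and the keyword pattern are invented for illustration.

```python
# A hypothetical sketch of "shoehorning": turning free-text feedback into
# numbers with hand-written parsing rules. The rules encode the analyst's
# perceived patterns and can distort the signal.

import re

POSITIVE_PATTERN = re.compile(r"\b(great|love|fast|easy)\b", re.IGNORECASE)

def rule_based_score(comment: str) -> int:
    """Score a comment +1 if it matches any 'positive' keyword, else 0."""
    return 1 if POSITIVE_PATTERN.search(comment) else 0

comments = [
    "Setup was easy and support replied fast",          # genuinely positive
    "It would be great if exporting actually worked",   # a complaint
    "I love how often it crashes right before a demo",  # sarcasm
]

scores = [rule_based_score(c) for c in comments]
print(scores)  # [1, 1, 1] -- the rules read two complaints as praise
# The quantitative output looks clean, but the information signal in the
# original qualitative data has been distorted rather than extracted.
```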
The ultimate purpose of data is to “host” information signals, and qualitative data is arguably more information-rich. Seeing and extracting these information signals, however, requires a different set of cognitive abilities. Data analysts / data scientists aren’t often exposed to what is known as qualitative thinking. This includes techniques such as interpretivism, abductive reasoning, and reflexivity.
- Interpretivism is the intentional effort to construct meaning in a given data study, instead of passively waiting for the data to “speak”.
- Abductive reasoning is the process of generating new explanations based on observations and prior assumptions about the world.
- Reflexivity is the sensitivity to examine how one’s assumptions, experiences, and relationships influence one’s methodology and interpretations.
Conclusion
Qualitative data hasn’t gotten the same kind of love as quantitative data has in the data analytics domain. We have loads of techniques for quantitative data, but have not put in nearly as much time and effort to leverage qualitative data, often treating it as supplementary. We need to first drop the nonsensical belief that quantitative data is objective and qualitative data is subjective, which is probably the primary driver of the data analytics community favouring the former over the latter. A good starting point would be to encourage and expose the data analytics community to qualitative thinking.