
WTF is Building Trust in Data?

3 min read · May 11, 2025

Old wine in a new bottle.

AI image generated by author

Background

In the ever-evolving world of AI and its potential, we often hear organisations and vendors say that a fundamental pillar is “building trust in data”, particularly in the development and application of Gen AI. But what do we really mean by that? How is this different from just saying that we need to have reliable AI products? Is “building trust in data” the same as “building trustable data”? I personally don’t think so. And so I dedicate my 90th article to calling out bullshit on this new nomenclature; it’s simply old wine in a new bottle.

(I write a weekly series of articles where I call out bad thinking and bad practices in data analytics / data science which you can find here.)

Trustable Data

Let’s quickly unpack the definition of trust. In simple layman’s terms, trust is a belief in the reliability, goodness, honesty, or effectiveness of someone or something. When you say you trust someone or something, what you are saying is that you feel confident that that someone or something will act in a way that is consistent with your expectations, or in your best interest.

Let’s apply this to data. Wait … the only attribute of trust that extends to data is reliability. Data has no agency; it can’t be effective or honest, or have your best interest at heart! And if reliability is the only attribute that applies, then why don’t we just say we need to have reliable data? And isn’t reliable data one of the key principles and deliverables of data governance? This is what I found on the website Data Science Central: “Trustable data can be defined as data that comes from specific and trusted sources and is used according to its intended use. It is delivered in the appropriate format and time frames for specific users.” Again, isn’t this data governance?

Several other websites describe “trustable data” as data evaluated along the following dimensions: Accuracy, Consistency, Completeness, Security, Usefulness, Privacy, Reliability, Interpretability. Again, these are simply good data (governance) practices. There are vendors that have developed “trust scores” based on these dimensions. Good grief! Just like the word “AI”, “trust” is also a monetisable buzzword.
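To make the point concrete, here is a minimal sketch of what a “trust score” could look like under the hood. This is purely illustrative and not any vendor’s actual method: the function names, the sample records, the equal weights, and the choice of a duplicate-key check as a stand-in for “consistency” are all my own assumptions. Only two of the dimensions are scored, to keep it short.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def completeness(records: List[Record], fields: List[str]) -> float:
    """Share of expected field values that are present (not None or empty)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return filled / total


def uniqueness(records: List[Record], key: str) -> float:
    """Share of distinct key values; a crude, assumed proxy for consistency."""
    if not records:
        return 0.0
    keys = [r.get(key) for r in records]
    return len(set(keys)) / len(keys)


def trust_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension scores. The weights are arbitrary."""
    total_weight = sum(weights[d] for d in scores)
    return sum(scores[d] * weights[d] for d in scores) / total_weight


# Hypothetical sample data: one missing email, one duplicated id.
records = [
    {"id": "1", "email": "a@example.com"},
    {"id": "2", "email": None},
    {"id": "2", "email": "c@example.com"},
    {"id": "4", "email": "d@example.com"},
]

scores = {
    "completeness": completeness(records, ["id", "email"]),
    "consistency": uniqueness(records, "id"),
}
weights = {"completeness": 0.5, "consistency": 0.5}  # assumed equal weights
score = trust_score(scores, weights)
print(f"trust score: {score:.4f}")
```

Strip away the label and what remains is a weighted average of ordinary data quality checks, the kind any data governance programme would already be running.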

Trust in Data

I submit to you that the problem of “trust” isn’t about the data, but rather, about the user. Trust exists and is shaped at the intersection of interaction. It’s a 2-way belief system. Building trust in data suggests that the onus is on the user to have enough understanding and skills to know what they are doing with the data. If I’m an experienced data practitioner, I would, of course, know if the data I’m using is reliable. But the casual end-user (through dashboards and visualisation tools) doesn’t have the same cognitive expertise to sense when the data doesn’t triangulate.

The rise of the term “data trust” is driven in part by the rise of AI. At the heart of the issue is that users don’t really understand the workings of AI; they just want to get a useful output. Can they “trust” the outputs, can they “trust” the “intentions” of the tools, can they trust themselves? The first 2 questions are beyond your control, but the last question is very much in play. That’s why data fluency (beyond data literacy or even savviness) must be dialled up and invested in. Trust in data must start with trust in abilities. Making sure the end-user is sufficiently equipped, cognitively and digitally, must be paramount in any endeavour involving the introduction and adoption of data/AI-driven tools. The responsibility isn’t just about getting the “data ingredients” right.

Conclusion

I continue to be amused by the new lexicon sprouting up in this AI-infused epoch. It is quite obvious that those coining these terms are making them up as they go along, without any serious depth in first-principles thinking. The new terms are meant to create some kind of emotional resonance that hijacks our limbic systems. Never trust the media when it starts using buzzwords.

Eric Sandosham, Ph.D.

Written by Eric Sandosham, Ph.D.

Founder & Partner of Red & White Consulting Partners LLP. A passionate and seasoned veteran of business analytics. Former CAO of Citibank APAC.
