How to Operationalise “Responsible AI”
How should we translate principles to procedures?
Background
A recent conversation with a client from the financial services industry inspired this week’s article. They asked me the following question: “How do you, and should you, operationalise responsible AI across the enterprise as we scale up our use of AI?” The client is considered a leader in the practice of data analytics / data science, and over the years they have been actively working to provide enterprise-level guidance on using data responsibly. But the scale and speed of AI development and implementation meant that AI was now everyone’s business, and there was a need to ensure consistent thinking and a consistent approach.
And so I dedicate my 91st article to unpacking how an organisation might go about operationalising responsible AI.
(I write a weekly series of articles where I call out bad thinking and bad practices in data analytics / data science which you can find here.)
Responsible AI
Responsible AI is NOT the same thing as Responsible Data Use or Ethical AI. Responsible Data Use is about how you collect, store, utilise and share data in ways that prioritise individual privacy, data security, and fairness (i.e. the absence of bias). Ethical AI is about the development and deployment of AI systems that prioritise fairness, transparency, accountability, and respect for human values. Based on the description from IBM (which I thought was pretty good and representative of what most others are articulating in the public domain), Responsible AI seems to marry Responsible Data Use and Ethical AI, while adding elements of system robustness.
I’ve summarised and paraphrased the principles of Responsible AI as defined by IBM below:
- Explainability — accurate prediction, traceability, how the AI makes decisions.
- Fairness — diverse & representative data, bias-aware, bias-mitigation, developed by diverse talent, subject to ethical review.
- Robustness — not brittle when encountering (intentional & unintentional) anomalous data, strong cyber and digital security.
- Transparency — end-users must be able to see how it works, evaluate the strengths and weaknesses of its features and functionality.
- Privacy — protection from leakage or misuse of personal data.
These are worthy principles. You get no quarrel from me. But given their breadth and scope, I can understand the challenge my client was facing. How do you translate these principles into an SOP (standard operating procedure) that can be efficiently and systematically applied and enforced across the enterprise?
Operational Frameworks
Principles are conceptual frameworks. They introduce concepts and ideas as guidance, but leave room for more nuanced interpretation in practice. Operational frameworks are much more structured. They deal with SOPs, policies, and roles & responsibilities. So what I’m doing here is taking a stab at creating an operational framework that embodies the principles of Responsible AI. It’s really draft-level thinking at this point.
Some of the Responsible AI principles are easier to translate into policies and procedures than others. For example, the Fairness principle in the IBM framework is mostly addressed through existing AI/ML model governance, and the Privacy principle is addressed within most organisations’ existing IT oversight. But the Transparency principle is harder to translate. After much consideration, I landed on this as a first pass: Responsible AI operational framework = model governance + experience design + diagnostic + integration. Let’s unpack each:
AI Model Governance
AI/ML model governance has been in place for over a decade, so this part is largely a lift-&-shift. However, with the introduction of Gen AI, there is a need to upgrade the methodologies and evaluation metrics to accommodate the more unpredictable and less structured nature of this new form of AI. Work is happening at a global level, so organisations simply have to keep their ears to the ground and participate.
AI Experience Design
Designing with AI in mind requires policies and procedures that address psychological and digital safety. The default assumption should be that users are uncomfortable, unsure, and untrusting of AI solutions. There should be explicit checklists covering what information and/or warnings must be surfaced to users, including the available recourse pathways. There should also be testing checklists covering what could go wrong in terms of digital safety (privacy and bias), including unintended harm arising from undue influence or recommendations from the AI. The SOP would therefore centre on the creation of standard checklists, owned perhaps by the Compliance function.
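To make this concrete, here is a minimal sketch of what a machine-readable version of such a checklist might look like. The item wording, the Compliance ownership, and the sign-off helper are illustrative assumptions, not a prescribed standard.

```python
# Illustrative sketch only: checklist items, owner, and helper are assumptions.

EXPERIENCE_DESIGN_CHECKLIST = {
    "owner": "Compliance",
    "disclosure": [
        "User is told they are interacting with an AI system",
        "Known limitations and failure modes are stated up front",
        "Recourse pathways (human escalation, opt-out) are visible in the UI",
    ],
    "digital_safety_tests": [
        "Personal data is not echoed back or leaked in responses",
        "Outputs are screened for biased or offensive content",
        "Recommendations are reviewed for undue influence on user decisions",
    ],
}

def unresolved_items(signoffs: dict) -> list:
    """Return the checklist items that have not been explicitly signed off."""
    all_items = (EXPERIENCE_DESIGN_CHECKLIST["disclosure"]
                 + EXPERIENCE_DESIGN_CHECKLIST["digital_safety_tests"])
    return [item for item in all_items if not signoffs.get(item, False)]
```

The point is not the code itself, but that the checklist becomes an artefact a solution team can be audited against, rather than a principle left to interpretation.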
AI Diagnostic
Beyond standard AI/ML model governance, there must be clear architectural design instructions that allow troubleshooting diagnostics to be conducted easily on AI solutions, and clear procedural steps that the data scientists, data engineers, and supporting production team must take when confronted with specifically defined sets of failure and complaint issues (i.e. “who does what”). There should be an SOP that covers this, owned by the Analytics function.
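As an illustration, a “who does what” playbook could be captured as a simple routing table. The issue categories, owners, and steps below are assumptions for the sake of the sketch; each organisation would define its own.

```python
# Illustrative sketch only: categories, owners, and steps are assumptions.

DIAGNOSTIC_PLAYBOOK = {
    "model_drift": {
        "owner": "Data Scientist",
        "steps": ["Re-run the evaluation suite", "Compare against baseline metrics",
                  "Recommend retraining or rollback"],
    },
    "data_pipeline_failure": {
        "owner": "Data Engineer",
        "steps": ["Check upstream feeds", "Validate schema and freshness",
                  "Re-run or backfill the pipeline"],
    },
    "user_complaint": {
        "owner": "Production Support",
        "steps": ["Log and classify the complaint", "Attempt to reproduce the issue",
                  "Escalate to the owning team with diagnostics attached"],
    },
}

def route(issue_type: str) -> dict:
    """Look up the owner and procedural steps for a defined issue type."""
    if issue_type not in DIAGNOSTIC_PLAYBOOK:
        raise ValueError(f"Undefined issue type '{issue_type}': escalate to the Analytics lead")
    return DIAGNOSTIC_PLAYBOOK[issue_type]
```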
AI Integration
There must also be clear architectural design instructions that call out and minimise critical co-dependencies in how the AI solution is integrated into the various operating systems and workflows. The AI solution has to be designed with the possibility of failure in mind: if it fails, can we pull the plug without bringing the house down? The responsibility to operationalise this should logically sit with the organisation’s Tech Architects community.
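One way to make the “pull the plug” requirement concrete is a feature flag with a non-AI fallback path, so that switching the AI off degrades the workflow gracefully rather than breaking it. The function names and flag below are hypothetical; this is a sketch of the pattern, not a reference implementation.

```python
# Illustrative sketch only: the flag source and function names are assumptions.

AI_ENABLED = True  # in practice this would come from a config or feature-flag service

def draft_reply_with_ai(ticket: str) -> str:
    raise RuntimeError("AI service unavailable")  # stand-in for the real AI call

def draft_reply_manually(ticket: str) -> str:
    return f"[Routed to human queue] {ticket}"

def draft_reply(ticket: str) -> str:
    """Use the AI path only when enabled, and fall back on any failure."""
    if not AI_ENABLED:
        return draft_reply_manually(ticket)
    try:
        return draft_reply_with_ai(ticket)
    except Exception:
        # The failure is contained to this step; the rest of the workflow continues.
        return draft_reply_manually(ticket)
```

Whether the switch lives in code, configuration, or an orchestration layer is an architectural choice; the point is that the off-switch and the fallback are designed in from the start.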
Example
Consider the example of integrating a co-pilot into employee workflows. A good Responsible AI operating framework should be able to consider and accommodate the following scenarios (a rough sketch of how some of these might be handled follows the list):
- Can the AI agent get caught in a loop? And if so, how will the user identify it, and what steps should the user take to break it?
- Can it produce offensive or triggering text and images? And if so, how would the solution team know or be alerted?
- If the co-pilot output becomes consistently problematic, is there an SOP for the solution team to be notified immediately and to distribute the troubleshooting work?
- If the co-pilot solution is somehow unavailable, what would be the contingency switch-over, and how seamless would it be for the organisation?
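As a rough sketch of how the first two scenarios might be handled in code (the repetition threshold and the notification hook are assumptions for illustration):

```python
# Illustrative sketch only: thresholds and the notification hook are assumptions.
from collections import Counter

def looks_like_a_loop(recent_outputs: list, threshold: int = 3) -> bool:
    """Flag the session if the same output has been repeated too many times."""
    if not recent_outputs:
        return False
    most_common_count = Counter(recent_outputs).most_common(1)[0][1]
    return most_common_count >= threshold

def notify_solution_team(message: str) -> None:
    print(f"[ALERT to solution team] {message}")  # stand-in for real paging/alerting

def review_output(output: str, flagged_by_content_filter: bool, session_outputs: list) -> None:
    """Run after each co-pilot response to catch loops and flagged content."""
    session_outputs.append(output)
    if flagged_by_content_filter:
        notify_solution_team("Co-pilot produced content flagged as offensive or triggering")
    if looks_like_a_loop(session_outputs):
        notify_solution_team("Co-pilot appears stuck in a loop; prompt the user to reset the session")
```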
Conclusion
Translating guiding principles into practical operational frameworks is a non-trivial endeavour. Responsible AI goes beyond the model development lifecycle. It has to cover end-user experience and recourse. It has to cover architecture and integration design. Let’s get cracking!