When False Positives Aren’t False
A counter-intuitive case of lateral thinking in predictive modelling.
Background
A few years back, I did a piece of consulting work for a corporate bank, developing a point solution to improve their outbound foreign remittance business through predictive analytics. Now, for those in the know, the standard approach would be to optimise the predictive model for precision, i.e. reduce the false positives in the model output, because the salespeople who act on that output must trust that every call-out to a targeted customer has a very high probability of resulting in a conversion. However, when I developed the model, I very intentionally opted to optimise it for recall instead, i.e. reduce the false negatives, so that the model would sweep up all the opportunities and not leave any on the table. This recall optimisation, however, has a mathematical corollary: it reduces precision and produces more false positives.
Why did I do that? I argued that the false positives generated by the predictive model were in fact not really false positives (huh?), and hence I wanted to increase the ratio of false positives by optimising for recall. And so I dedicate my 59th article to explaining why false positives aren't necessarily false positives at all.
(I write a weekly series of articles where I call out bad thinking and bad practices in data analytics / data science, which you can find here.)
Precision vs Recall
Students of predictive modelling will be familiar with the confusion matrix, a quirkily named 2×2 matrix that tabulates the actual vs predicted output of a predictive model. The diagram below summarises this confusion matrix, with a further explanation of what it means to optimise for accuracy, precision, and recall. Optimising for accuracy means I want as many true positives and true negatives as possible in the total output. Optimising for precision means I want to reduce the false positives in my predicted output (e.g. great for telesales). Optimising for recall means I want to reduce the false negatives in my predicted output (e.g. great for fraud detection). Mathematically, precision and recall pull against each other; you have to pick which one to prioritise. If you optimise for recall, you WILL get more false positives.
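To make that trade-off concrete, here is a minimal sketch using scikit-learn on synthetic data. The dataset, model, and thresholds are illustrative assumptions only, not the bank's actual setup; the point is simply that lowering the decision threshold raises recall but lets in more false positives.

```python
# A minimal sketch of the precision/recall trade-off on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, confusion_matrix

# Imbalanced synthetic data: roughly 10% positives, as in many targeting problems.
X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
probs = model.predict_proba(X)[:, 1]

for threshold in (0.5, 0.2):  # lowering the threshold favours recall
    preds = (probs >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, preds).ravel()
    print(f"threshold={threshold}: "
          f"precision={precision_score(y, preds):.2f}, "
          f"recall={recall_score(y, preds):.2f}, "
          f"false positives={fp}, false negatives={fn}")
```

Optimising for recall, as I did for the remittance model, is effectively the choice to sit at the lower end of that threshold range and live with the extra false positives.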
Response vs Propensity
I need to explain one further difference in predictive modelling approaches before we unpack the case of false positives. You can build a response predictive model, which essentially predicts the likelihood of responding to a given stimulus. Or you can build a propensity predictive model, which predicts the likelihood of a state or behaviour change. Consider the simple example of credit cards. I can predict the likelihood of a credit card customer taking up a specifically priced balance transfer offer (and in so doing becoming a revolver), or I can predict the likelihood of a credit card customer becoming a revolver (through various means). Response models typically require you to run a controlled experiment to acquire data, since the response is tied to a specific stimulus. Propensity models, on the other hand, do not require new experiments to acquire data but utilise historically available data. I was taught that response models are the best kind of model to build if one could afford to run the experiments, but I've since changed my mind.
While response models may be highly accurate / precise, they assume the invariance of the stimulus. But the reality is that the world is dynamic and messy, which arguably makes response models more brittle than ever. Propensity models, on the other hand, are much noisier, given that the historic "responses" came from various stimuli or may even have occurred naturally. But propensity models may ultimately prove more resilient in a dynamically changing world. Thus, response models need to be continuously rebuilt (running new experiments to gather new data) to cater for changing stimuli, whereas propensity models, though less accurate / precise, require fewer rebuilds.
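For illustration, a bare-bones propensity model might look like the sketch below: trained purely on historical data, no new experiment required. The file name and feature columns (avg_monthly_spend, utilisation, and so on) are hypothetical stand-ins for whatever signals are actually available, and logistic regression is just one reasonable choice of learner.

```python
# A minimal sketch of a propensity model built from historical data alone.
# All column names and the CSV file are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

history = pd.read_csv("card_customer_history.csv")  # hypothetical historical extract
features = ["avg_monthly_spend", "utilisation", "min_payment_ratio", "tenure_months"]
X = history[features]
y = history["became_revolver"]  # the state change, observed historically under any stimulus

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score at a deliberately low threshold to favour recall, i.e. sweep up the opportunities.
preds = (model.predict_proba(X_test)[:, 1] >= 0.2).astype(int)
print("recall:", recall_score(y_test, preds))
```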
Nature of False Positives
The definition is straightforward enough: an output is deemed a false positive if it is predicted to be true when in fact it is not. But there can be different shades of "true". In a propensity model, a false positive simply means that the state / behaviour change did not materialise as predicted. Let's unpack that further in layman's terms: the information signals suggest that the state / behaviour change should occur, but it did not.
Now, it's not too difficult to see that in a competitive environment, the state / behaviour change could instead have materialised with a competitor. Consider the example of predicting the likelihood of a credit card customer becoming a revolver (propensity modelling). Despite showing the same information signals, the customer eventually opted to revolve with a competitor (most people have more than one credit card). This would be classified as a false positive in your propensity model. But if the real question is whether the customer has a "revolving need", then the false positives in the propensity model aren't necessarily false positives.
And this was precisely the thought process I had while developing the predictive model for the corporate bank: predicting which corporate customers were likely to have a need for outbound foreign remittances. Firstly, I wanted to capture ALL the opportunities in this space. Unlike retail customers, a corporate customer with an outbound foreign remittance need will fulfil it immediately. And so, predicting the remittance need was not about predicting a future need but about identifying those whose needs were already being fulfilled by a competitor. These were corporate customers with the same information signals as those whose needs were already fulfilled by my client (i.e. the corporate bank that wanted the predictive model).
And so I intentionally set out to generate more false positives: I optimised for recall. I then took a sampled output from the model, i.e. those that were predicted to be true. Measured conventionally, only 1 in 5 of these were true positives; 4 in 5 were false positives. I got the client to call the customers in the sampled output to have a conversation, and at least 3 in 5 revealed that they were indeed having their outbound foreign remittance needs fulfilled by a competitor bank. The focus of the model execution was therefore to design an offer to win over these customers.
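As a back-of-envelope recount, the same sample looks very different once a competitor-fulfilled need counts as a real need. This is a sketch only: the sample size of 100 is a placeholder, and it assumes the competitor-confirmed group does not overlap with the conventional true positives, which the numbers above imply but do not state.

```python
# Recounting the sampled call-outs when a competitor-fulfilled need counts as a real need.
# The 1-in-5 and 3-in-5 ratios come from the article; the sample size is a placeholder.
sample_size = 100                                  # hypothetical number of predicted positives called

converted_with_client = sample_size * 1 // 5       # conventional true positives
confirmed_with_competitor = sample_size * 3 // 5   # "false" positives with a real, competitor-served need

conventional_precision = converted_with_client / sample_size
need_level_hit_rate = (converted_with_client + confirmed_with_competitor) / sample_size

print(f"conventional precision: {conventional_precision:.0%}")        # 20%
print(f"need-level hit rate:    at least {need_level_hit_rate:.0%}")  # 80%
```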
Conclusion
Data science isn't just about knowing and following the rules of data handling and modelling. Even from a technical perspective, you need to think about information signals and what they might be trying to tell us. Predictive modelling is just a method to distil those information signals; the ability to frame and interpret them distinguishes those who can solve problems from those who will simply execute.