How more data sometimes leads to poorer outcomes for businesses

In round 1, each handicapper got to choose five attributes to predict the winner. They were also asked to comment on their confidence in their predictions. If they had picked a horse at random, their success rate would have been 10%, since each race had 10 horses. But in round 1, the correct-prediction rate was 17%, implying that the five attributes used by handicappers had given them better-than-random predictive power.

In the second, third and fourth rounds of prediction, respectively, 10, 20 and all 40 attributes were provided. The study found that predictive ability remained around 17%, but the handicappers’ confidence in their predictions increased substantially.

While this spike in confidence was only human, there could be several reasons why prediction quality did not improve with more data. One is that the human ability to process abundant information is limited. Another is that the extra data was not relevant to the task. At times, just because data is available, one starts believing that it must be useful, a problem of availability bias.

These lessons may haunt organizations that are inundated with data but still starved of actionable insights. They see little improvement in business outcomes despite believing that their decision-making has become more ‘data-driven.’ Reflecting on the following aspects of data use could help.

Information value versus information cost: The cost of information is the amount spent on buying or capturing data and the subsequent computation cost of extracting information from it. The value of information, on the other hand, is the rupee value of business gains attributable to improvements in decision quality on account of that data.
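As a rough illustration of the comparison above, the sketch below nets information value against information cost for two made-up initiatives. The class name, field names and all rupee figures are hypothetical and exist only to show the arithmetic; attributing gains to a specific data source is, in practice, the hard part.

```python
from dataclasses import dataclass


@dataclass
class DataInitiative:
    """Hypothetical record of one data investment; all figures in rupees."""
    name: str
    acquisition_cost: float   # spent on buying or capturing the data
    computation_cost: float   # spent on extracting information from it
    attributable_gain: float  # business gains credited to better decisions

    def information_cost(self) -> float:
        return self.acquisition_cost + self.computation_cost

    def net_information_value(self) -> float:
        # Positive only when information value exceeds information cost.
        return self.attributable_gain - self.information_cost()


# Illustrative numbers only, not taken from any real organization.
initiatives = [
    DataInitiative("credit-bureau data", 2_000_000, 500_000, 6_000_000),
    DataInitiative("voice-clip capture", 4_000_000, 3_000_000, 1_000_000),
]

for item in initiatives:
    print(item.name, item.net_information_value())
```

In this toy case the first initiative clears its cost and the second does not, even though the second captures far more data.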

From the same set-up for data infrastructure, one organization may generate more information value than another. Organizations that excel in decision quality can improve business outcomes based on data.

However, this ability may not be very common. Going by Gartner’s estimates, more than 60% of data infrastructure capabilities fail to provide the business value expected.

While the outcomes of most processes at most businesses can be improved with data-backed decision-making, for a lot of organizations, the returns on their investment in data assets have been underwhelming.

So, the question is: how can an organization ensure that information value stays higher than information cost?

What type of decisions can be improved by data? The business outcome of a data-driven decision support system is affected by two types of uncertainty. ‘Aleatoric uncertainty’ (AU), where ‘alea’ is Latin for dice, reflects the inherent randomness associated with most business outcomes. The other is ‘epistemic uncertainty’ (EU), where ‘episteme’ means ‘knowledge’ in Greek; this can be reduced by using data that is more relevant to the specific decision.

Most business decisions have both types of uncertainty in some combination. In some cases, such as predicting stock-price returns in the short-term or customer acceptance of a new product, AU dominates. In such cases, extracting information value that beats the information cost is difficult. Data does not dent AU. In EU-dominated processes, investment in data assets can improve outcomes. But here, the question is: what type of data?
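The distinction can be made concrete with a toy simulation. It is a sketch under strong assumptions, not a model of any real business process: the outcome is assumed to be the sum of ten independent "drivers" that data could in principle reveal (the epistemic part) plus a dice-roll noise term that no data can reveal (the aleatoric part). The function name and parameters are invented for illustration.

```python
import random
import statistics


def mean_squared_error(n_known, n_factors=10, noise_sd=1.0, trials=20_000, seed=0):
    """Simulate predicting an outcome when only the first n_known drivers are observed.

    outcome = sum of all drivers (knowable in principle) + random noise (not knowable)
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        drivers = [rng.gauss(0, 1) for _ in range(n_factors)]
        noise = rng.gauss(0, noise_sd)
        outcome = sum(drivers) + noise
        prediction = sum(drivers[:n_known])  # best guess from the observed drivers
        errors.append((outcome - prediction) ** 2)
    return statistics.fmean(errors)


# EU-dominated process: little noise, so observing more drivers removes most error.
# AU-dominated process: heavy noise, so even observing every driver barely helps.
for label, noise_sd in [("EU-dominated", 0.5), ("AU-dominated", 10.0)]:
    no_data = mean_squared_error(0, noise_sd=noise_sd)
    all_data = mean_squared_error(10, noise_sd=noise_sd)
    print(label, round(no_data, 2), round(all_data, 2))
```

In this set-up the expected squared error is the number of unobserved drivers plus the noise variance, so data can only ever remove the first term. When the second term dominates, extra data leaves most of the error in place.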

What is the correct data for the case at hand? Loan decisions, for example, are EU-dominant. Thus, relevant data has always improved credit outcomes. In personal loans, past credit behaviour significantly predicts future delinquency.

However, farm-loan repayments display a higher dependence on weather conditions and agri-commodity prices. Thus, a farmer’s credit record may prove to be a weak predictor of a default. Here, satellite data on cultivation patterns, weather forecasts and agri-commodity prices would be needed in addition to the borrower’s credit history.

While a lender may have data about the apps installed on a farmer’s phone, this data may not be relevant to the likelihood of a farm-loan default. Empirical validation of the usefulness of any such data would be required.
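One simple form such a validation could take is sketched below: compare default rates between borrowers with and without a candidate attribute, then check how often a random relabelling produces a gap at least as large. The portfolio is synthetic and entirely hypothetical; it is constructed so that defaults depend on a drought flag and not on an app flag. The function names, probabilities and sample size are assumptions made for illustration, and a real exercise would use held-out loan data and a proper credit model.

```python
import random


def default_rate_gap(flags, defaults):
    """Absolute difference in default rate between flagged and unflagged borrowers."""
    with_flag = [d for f, d in zip(flags, defaults) if f]
    without_flag = [d for f, d in zip(flags, defaults) if not f]
    if not with_flag or not without_flag:
        return 0.0
    return abs(sum(with_flag) / len(with_flag) - sum(without_flag) / len(without_flag))


def permutation_p_value(flags, defaults, rounds=200, seed=1):
    """Share of random relabellings whose gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = default_rate_gap(flags, defaults)
    shuffled = list(flags)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        if default_rate_gap(shuffled, defaults) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)


# Synthetic portfolio: defaults are driven by drought, not by the app flag.
rng = random.Random(42)
n = 4000
drought = [rng.random() < 0.5 for _ in range(n)]
has_app = [rng.random() < 0.5 for _ in range(n)]
defaults = [rng.random() < (0.4 if d else 0.1) for d in drought]

for name, flags in [("drought", drought), ("has_app", has_app)]:
    gap = default_rate_gap(flags, defaults)
    print(name, round(gap, 3), round(permutation_p_value(flags, defaults), 3))
```

A large gap with a small p-value suggests the attribute carries information about defaults. A small gap that random relabelling reproduces easily suggests the data is merely available, not useful.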

Focus on data basics: Some organizations seem over-enthusiastic about capturing all the customer data they possibly can, such as mobility patterns and video and voice clips. However, at best, low-value processes could be improved this way. In many cases, a focus on improving the governance of existing structured data (even if it’s ‘boring’) and painstakingly raising data quality to feed more important business decisions would have yielded much better outcomes.

Without a clear idea of which business decisions must be improved and why, organizational initiatives to acquire expensive data assets and support infrastructure could have the effect of raising confidence in decisions without achieving better outcomes, as seen in the case of horse-racing bets.

The larger risk of bad investments in data capabilities is that they may unduly persuade some organizations to re-adopt judgemental decisions driven by gut feel, with data used retroactively as a mere crutch in support of business calls that led to suboptimal results.

The author is a risk management and AI consultant, and a member of the visiting faculty, IIM Ahmedabad and IIM Calcutta.
