Half My AI-Generated Content is Wrong, But Which Half?


There is a great quote, often (and, I understand, falsely) attributed to David Ogilvy, the eponymous head of the famous Ogilvy & Mather advertising agency, now part of WPP.

“Half the money I spend on advertising is wasted, but the trouble is I don’t know which half”.

Whilst that is both true and amusing despite the years of progress since, it sparked a similar thought in our current plans, work and discussions.

“Half my AI-generated answers are plain wrong, and the trouble is I don’t know which half”.

Back to the Future? 😀

It turns out that this is true with both Generative AI and more “classical” AI Machine Learning, but for different reasons.

For simplicity, let’s call the collection of AIs built on Large Language Models (LLMs) “GPTs”, from whatever stable they come.

The fact is, GPTs hallucinate. Amusingly, they don’t call it an “error”!

They make mistakes in their analysis.

They answer our prompts without asking clarifying questions.

We can’t just take their suggestions without verifying them.

You know when part of your response from ChatGPT or Bard (now Google Gemini) is just plain wrong?

HfS Research shared a witty but to-the-point image of the Dunning-Kruger effect applied to Generative AI, below . . . 

AI hallucinations (also occasionally called confabulations or delusions) are confident but incorrect or misleading results caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model, or biases in the data used to train the model.

This is all rather unfortunate in an era where more classical AI Machine Learning has been responsible for amplifying “fake news”.

This phenomenon stems from one of the unfortunate side-effects of using statistical analysis as an indicator of veracity – the vast number of people who view, scroll and click on social media content is not correlated with accuracy, just with simple human interest, intrigue or even shock.

We have lived with this for 10-15 years now and, as a consumer, you can choose to avoid it.

But can you avoid it in business scenarios? Consider THIS answer below – is it OK?

“There is an 85% likelihood of accuracy, but the specific response might just be plain wrong” (paraphrasing Mark Zuckerberg’s response in the recent Senate hearing).

The big business challenge is how to exploit AI in a way that doesn’t frustrate your customers, suppliers and colleagues more than it helps them.

Would it be OK if your AI customer service chatbot gave incorrect information to your customers 15 times out of a hundred?

I would say not.
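
To put that failure rate into perspective, here is a back-of-the-envelope sketch in Python. It assumes each response is independently correct with probability 0.85, which is itself a simplification, but it shows how quickly per-answer accuracy erodes over a conversation.

```python
# Back-of-the-envelope sketch: if each chatbot response is independently
# correct with probability 0.85, how often is a whole conversation clean?
# (Independence is a simplifying assumption for illustration only.)

per_response_accuracy = 0.85

for turns in (1, 3, 5, 10):
    p_all_correct = per_response_accuracy ** turns
    print(f"{turns:>2} turns: {p_all_correct:.0%} chance every answer is correct")

# Roughly: 1 turn -> 85%, 3 turns -> 61%, 5 turns -> 44%, 10 turns -> 20%.
```

A handful of follow-up questions is enough to make at least one wrong answer more likely than not.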

From discussions with the experts and illuminati on the topic, our own AI practitioners and our customer community, it seems to me that AI needs to be focused on the “human in the loop” (HITL).

AI is about human augmentation and amplification, not replacement.

The HITL and the habits, responses and behaviours of the humans are central to success.
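
As a rough illustration of what “human in the loop” can look like in practice, here is a minimal routing sketch. The confidence threshold, the Draft structure and the scoring are hypothetical placeholders for illustration, not any particular product’s API.

```python
# Minimal human-in-the-loop routing sketch (illustrative only): drafts the
# model is not confident about are queued for a person instead of auto-sent.

from dataclasses import dataclass


@dataclass
class Draft:
    question: str
    answer: str
    confidence: float  # 0.0-1.0, however your model or scorer defines it


CONFIDENCE_THRESHOLD = 0.9  # arbitrary example value; tune for your own risk appetite


def route(draft: Draft) -> str:
    """Send high-confidence drafts out; escalate everything else to a human."""
    if draft.confidence >= CONFIDENCE_THRESHOLD:
        return "auto-send"
    return "human-review"


# Hypothetical usage:
drafts = [
    Draft("When is my invoice due?", "On the 30th of this month.", 0.97),
    Draft("Can I get a refund under clause 14b?", "Yes, always.", 0.55),
]
for d in drafts:
    print(route(d), "-", d.question)
```

The design choice is simply that nothing below the threshold reaches a customer without a person seeing it first.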

For most real business problems and Key Business Questions, we need Large Language Models (LLMs) focused on curated “gold standard” data in the enterprise.

Some people call this the “SLM (rather than LLM) Approach”. 

Smaller, Process-Specific Models where data veracity is key.
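
One way to picture that “smaller, curated” approach: the model (or the system around it) is only allowed to answer from a vetted, gold-standard store, and says so when it cannot. The toy sketch below illustrates the idea, with a made-up CURATED_FACTS dictionary standing in for a real retrieval layer.

```python
# Toy sketch of the "answer only from curated data" idea: if the question
# doesn't match anything in the vetted store, decline rather than guess.

CURATED_FACTS = {
    "payment terms": "Standard payment terms are 30 days from invoice date.",
    "approval limit": "Purchase orders above 10,000 require CFO approval.",
}


def answer(question: str) -> str:
    q = question.lower()
    for topic, fact in CURATED_FACTS.items():
        if topic in q:
            return fact  # grounded in the curated, "gold standard" record
    return "No verified answer available - escalating to a human colleague."


print(answer("What are our payment terms?"))
print(answer("Who won the 1966 World Cup?"))
```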

Never before has data quality been such a priority.

Remember that HITL (and thus the source data) will make or break your success with Generative AI (or any other AI!)

Otherwise, you will end up with GIGO. You have to be of a certain age to know that acronym (maybe the first I ever encountered); otherwise, just Google it! 😀

Ironically, we humans have our own “hallucinations”, caused by cognitive biases: the Dunning-Kruger effect, confirmation bias, the availability heuristic, the framing effect, stereotyping et al . . .

But that’s another story!

IBM give a good overview of AI hallucinations here  . . . 

There are many times in business, as in life, where you need the facts.

Or as close to fact as you can get.

Maybe “well-informed opinion” is good enough?

Thanks for reading . . . .