Controlling hallucinations in LLMs

There is understandable concern about hallucinations in LLMs (large language models) right now.

Hallucinations refer to the phenomenon where an LLM answers a question in a way that is factually incorrect based on the underlying data.

Is there anything we can do about this, or do we just have to hope that LLMs get better and stop hallucinating?

In this article, I will introduce a framework that you can use to control hallucinations in your solutions involving LLMs. Most of the framework can be automated in software that calls the LLM.

There is also a common myth that the only way to control hallucinations is to train your own LLM. In reality, this is overkill in most cases and does NOT eliminate hallucinations.

You can start by implementing this framework and see if you get accurate results, then move on to the expensive process of training new LLMs if needed. (You will need this framework even for your new LLMs.)

Why do hallucinations happen?

As we learned in the Hallucinations in ChatGPT article, AI models identify relationships between different data elements (or features) during training.

In How ChatGPT works: Creating a map of words, we learned that, in the case of LLMs, this relationship is modeled as a map of words. LLMs create the map by reading sentences in their content library and calculating relationships (distances) between various words and phrases.

(Technically this is a map of terms and sequences of terms but we’ll keep it simple and call them “words”.)

Example: “George Washington was born in 1809”.

In the above case, the AI model mistakenly built a relationship between George Washington and 1809 as the birth year. This could be because George Washington is a famous US president and the birth year of another famous US president, Abraham Lincoln, is 1809.

What does an LLM use to generate an answer?

An LLM uses four different data sources to generate an answer:

  1. Training data
    • LLM training uses this data to create the map of words (How ChatGPT works: Creating a map of words).
    • The main goal of this exercise is to learn language.
    • Since there are about 40,000 words that comprise almost all English conversation, the LLM is learning the relationships between these words.
  2. Content you provide
    • This is content that you provide to the LLM when you ask a question.
    • This content can be included in the prompt, referred to in the prompt, or requested by the LLM using a plugin.
    • LLMs assign higher importance to this content than to their training content.
  3. Prompt with your question
    • Your prompt tells LLMs what content to focus on when answering the question.
  4. Reinforcement learning feedback data you provide (if any)
    • This data teaches the LLM which responses are good or bad for your use case.
    • This data is provided as a list of question/answer pairs with a flag that indicates whether the response was good (a minimal example of this shape follows the list).
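
For illustration, such feedback data might look like the records below. The field names are made up for this example; the exact format depends on the feedback mechanism your LLM provider supports.

```python
# Illustrative shape only: field names are hypothetical.
feedback_data = [
    {"question": "What is my copay for a specialist visit?",
     "answer": "Your specialist copay is $40 per visit.",
     "good_response": True},
    {"question": "Should I increase the dosage of my medicine?",
     "answer": "Yes, double it.",
     "good_response": False},  # unsafe answer, flagged as bad
]
```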

Framework for controlling hallucinations

This framework can be used to control hallucinations.

There are eight steps in this framework:

  1. Curate
  2. Instruct
  3. Filter
  4. Ask
  5. Evaluate
  6. Fact Check
  7. Notify
  8. Learn

1 – Curate

Any AI model will only be as good as the data it uses to answer the question.

Answering questions based on incorrect data will, of course, result in incorrect answers.

One way to provide this data to an LLM is to train it from scratch with only this data.

Recall that the main goal during training is actually to learn the language and the relationships between words. So training a new model is overkill in most cases.

Another way is to provide the content as part of the question/answer process and instruct the LLM to use only your data when answering a question.

This avoids the expensive process of training new models and still gives good answers.

I would suggest you implement this framework (all eight steps) first with the second approach and only move to training new LLMs if you're not getting accurate enough answers.
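
As a rough sketch of the second approach, the curation side can look like this: keep a small, vetted content store and select only the entries relevant to the question before they ever reach the LLM. The content store and the keyword-matching rule below are simple placeholders; a real solution might use a search index or embeddings instead.

```python
# A minimal sketch of curating content at question time. Only curated,
# trusted content should ever reach the prompt.
VETTED_CONTENT = [
    "Springfield Clinic is open Monday to Friday, 8am to 5pm.",
    "Dr. Jane Smith is a cardiologist at Springfield Clinic.",
    "Telehealth visits can be booked through the patient portal.",
]

def select_relevant_content(question: str, max_items: int = 3) -> list[str]:
    # Naive keyword overlap, used here only to illustrate the idea of
    # narrowing the curated store down to what the question needs.
    words = set(question.lower().split())
    scored = [
        (len(words & set(doc.lower().split())), doc) for doc in VETTED_CONTENT
    ]
    scored.sort(reverse=True)
    return [doc for score, doc in scored[:max_items] if score > 0]
```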

2 – Instruct

By default, the LLM in ChatGPT will answer based on ALL the content it has read. All 250 billion sentences! So you should instruct ChatGPT to answer ONLY based on the content you are giving it.

And by default, it will try to give you creative answers. You can turn the temperature setting down to 0. This essentially turns down the creativity so the answers you get are more factual and less creative.
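
Here is a minimal sketch of the Instruct step, assuming the OpenAI Python SDK and a chat model; the model name is an example, and any comparable client and model should work the same way.

```python
# A minimal sketch of the Instruct step: restrict answers to the content
# you provide and set temperature to 0.
from openai import OpenAI

client = OpenAI()

def ask_with_instructions(question: str, content: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # example model name; substitute your own
        temperature=0,         # turn down creativity for factual answers
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer ONLY based on the content provided below. "
                    "If the content does not contain the answer, say so.\n\n"
                    + content
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```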

3 – Filter

You want to restrict the categories of questions you will or will not answer. For example, an LLM should (probably) refuse to answer the question: "Should I increase the dosage of my medicine?"

You can ask ChatGPT to categorize the user's question into a few defined buckets.

Prompt: Categorize this question into one of the following categories. If the question does not fit any of them, say "no match": 1) Question about my health record, 2) Question about healthcare services, 3) Question about changing medication.

Then your solution can politely refuse to answer questions that fall into the third category and refer the user to their doctor.

You can also use this technique to guide the user into other workflows that may be better for this category of question. For example, a question about scheduling should probably be redirected to your scheduling workflow.
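
A minimal sketch of this Filter step might look like the following. The categories mirror the example prompt above, and categorize is assumed to be any function that sends a prompt to the LLM and returns its text answer.

```python
# A minimal sketch of the Filter step: bucket the question, then route
# or refuse based on the bucket.
CATEGORIZE_PROMPT = (
    "Categorize this question into one of the following categories. "
    'If the question does not fit any of them, say "no match":\n'
    "1) Question about my health record\n"
    "2) Question about healthcare services\n"
    "3) Question about changing medication\n\n"
    "Question: {question}\n"
    'Answer with just the category number or "no match".'
)

def route_question(question: str, categorize) -> str:
    # `categorize` is any function that sends the prompt to the LLM
    # and returns its raw text answer.
    category = categorize(CATEGORIZE_PROMPT.format(question=question)).strip()
    if category.startswith("3"):
        return "Please talk to your doctor about medication changes."
    if category.lower().startswith("no match"):
        return "Sorry, I can't help with that question."
    return "OK"  # safe to pass the question on to the answering step
```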

4 – Ask

This is the part where you send the question to the LLM to answer.

The key thing here is to have your prompt be as specific as possible.

Asking a vague question is more likely to get you a hallucination.

So instead of asking "Which is the best car", ask "Which car had the best mileage in 2024".

Another technique is to force the LLM to put more importance on recent events. By default, LLMs do not have any concept of time, so they will treat a fact from 30 years ago as equal to a fact from today.

You can instruct the LLM to only use data from the past x years to avoid this issue. You can also ask it to give more preference to content created in the past x years.
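
As a small sketch, you can build this specificity and recency constraint into the prompt before sending it; the wording below is only an example.

```python
# A minimal sketch of tightening the Ask step: keep the question specific
# and add a recency constraint to the prompt.
from datetime import date

def build_specific_prompt(question: str, max_age_years: int = 2) -> str:
    cutoff_year = date.today().year - max_age_years
    return (
        f"{question}\n\n"
        f"Only use information from {cutoff_year} or later. "
        "If you cannot answer from information that recent, say so."
    )

# Example: instead of "Which is the best car", send something like
# build_specific_prompt("Which car had the best mileage in 2024?")
```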

5 – Evaluate

In this stage you evaluate the answer(s) you received from the LLM to see whether they are good enough to send to the user.

One technique is to have ChatGPT rephrase the question a few different ways. You ask the LLM to answer all variations of the question. Then you ask ChatGPT to evaluate whether all the answers are essentially the same. If the answer is no, then you are more likely to have a hallucination. You can ask the user to phrase the question differently.
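
A sketch of this rephrase-and-compare check is shown below; ask_llm stands in for whatever function you use to get a text answer from the model and is an assumption, not a specific API.

```python
# A minimal sketch of the rephrase-and-compare check: answer several
# phrasings of the question and ask the model whether the answers agree.
def answers_agree(question: str, ask_llm, n_variants: int = 3) -> bool:
    # 1. Have the model rephrase the question a few different ways.
    variants_text = ask_llm(
        f"Rephrase this question {n_variants} different ways, one per line:\n{question}"
    )
    variants = [question] + [v.strip() for v in variants_text.splitlines() if v.strip()]

    # 2. Answer every variant.
    answers = [ask_llm(v) for v in variants[: n_variants + 1]]

    # 3. Ask the model whether all the answers say essentially the same thing.
    joined = "\n---\n".join(answers)
    verdict = ask_llm(
        "Do the following answers all say essentially the same thing? "
        "Reply with only YES or NO.\n\n" + joined
    )
    return verdict.strip().upper().startswith("YES")
```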

Another technique is to have ChatGPT evaluate the mood of the response. If the mood is not acceptable then again you can refuse to pass along the answer.

6 – Fact Check

Ask ChatGPT to rephrase some of the facts in the answer in a structured format that you can then compare with the underlying data.

For example, you can ask ChatGPT to provide the NPI (National Provider Identifier) of the doctor it is recommending. Then you can do a quick lookup in the NPI database to ensure the doctor exists.

Extracting facts from the response like this is also useful to show to the user so they can cross-check the answer and trust it more.
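
A minimal sketch of such a fact check, using the doctor/NPI example: ask_llm and npi_registry are stand-ins for your LLM call and your copy of the NPI data, assumed only for illustration.

```python
# A minimal sketch of the Fact Check step: pull a structured fact out of
# the answer and verify it against your own data.
import json

def verify_doctor(answer: str, ask_llm, npi_registry: dict) -> bool:
    extracted = ask_llm(
        "From the answer below, extract the recommended doctor's name and NPI "
        'as JSON in the form {"name": "...", "npi": "..."}. '
        "If no doctor is recommended, return {}.\n\n" + answer
    )
    try:
        fact = json.loads(extracted)
    except json.JSONDecodeError:
        return False  # could not get a structured fact; treat as unverified
    if not fact:
        return True   # nothing to check
    # Cross-check the extracted NPI against the registry.
    return npi_registry.get(fact.get("npi")) == fact.get("name")
```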

7 – Notify

If you see a car with a Student Driver bumper sticker doing something wrong, you are more likely to let it slide.

The same principle applies to AI model output.

Clearly notify the user that this response is coming from AI.

Provide them a way to see the extracted facts from the Fact Check step and allow them to click on them to go to the source of the data.

8 – Learn

As discussed in the "What does an LLM use to generate an answer?" section, the fourth type of data used by LLMs is reinforcement learning feedback data.

Before rolling out your solution, do an exercise with a small set of (forgiving) users. Have them use your solution and then press a button when the response is good and another button when the response is bad.

Collect this data and send it to the LLM as reinforcement learning data. You will notice that the answers will get better.
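
As a rough sketch, collecting this feedback can be as simple as appending one record per button press; the JSON-lines file below is purely illustrative, and how you later feed it back depends on the fine-tuning or feedback mechanism your LLM provider supports.

```python
# A minimal sketch of collecting thumbs-up / thumbs-down feedback so it
# can later be used as reinforcement learning data.
import json

def record_feedback(question: str, answer: str, good: bool,
                    path: str = "feedback.jsonl") -> None:
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({
            "question": question,
            "answer": answer,
            "good_response": good,
        }) + "\n")

# Wire record_feedback() to the "good" and "bad" buttons in your UI, then
# periodically export feedback.jsonl to your provider's feedback pipeline.
```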

I would recommend keeping this mechanism even when you have real users. The feedback data will help you with reinforcement learning AND the users will feel better that they are able to give feedback.

Summary

While hallucination is a concerning issue with LLMs, you can implement a framework to reduce hallucinations and still make good use of LLMs:

Curate -> Instruct -> Filter -> Ask -> Evaluate -> Fact Check -> Notify -> Learn

Training new LLMs can help but is a very expensive process and is not needed in most cases.

