LLM Architecture for Healthcare

In healthcare, we have spent hundreds of millions of dollars over two decades building our current technology architecture. However, we haven’t really seen the return on investment (ROI) that we had hoped for.

LLM (large language model) technology gives us a chance to rethink our approach and achieve those goals at a much lower total cost of ownership.

How can we incorporate LLMs without a rip and replace of our existing technology infrastructure?

We’ll start by reviewing the current architecture in healthcare, why it is so incredibly expensive to build and maintain, and why it doesn’t give clinicians, patients and other healthcare workers what they want.

We’ll then introduce the LLM architecture for healthcare, which achieves the same goals at a much lower cost to build and maintain and gives clinicians, patients and other healthcare workers quick answers.

The LLM architecture leverages our current architecture and lets you migrate to it gradually, and eventually completely, over time.

Current Architecture in Healthcare

While there are variations across organizations, the current architecture in healthcare can be stripped down to the following conceptual diagram:

This process starts with data in various databases.

For example, the primary database is the one that stores all the data from the EMR (Electronic Medical Record). The EMR is the tool that clinicians use to enter clinical data about the patient.

In addition, we have databases that contain other data such as insurance claims and reference data.

Note that each database has its own schema. A schema is just how the data is organized in the database. For example, a column containing date of birth may be called “date_of_birth” in one database, “birth_date” in another and “dob” in yet another database.

The issue of difference in the name of a column is the simplest case. Often the data element will be stored completely differently in different tables.

Similarly the content may also differ in each database. In one database, the date of birth can be “1980-01-01”, in another database “January 1, 1980” and in another database “1/1/1980”.

This is again the simplest example. This problem goes really deep down the rabbit hole.
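To make the mismatch concrete, here is a minimal sketch of the hand-written code needed just to cope with the three column names and three date formats mentioned above. The function names and the lists of aliases and formats are illustrative assumptions, not taken from any real ETL tool, and a real pipeline would need far more of both.

```python
from datetime import date, datetime

# Column aliases from the examples above; every new source may add another.
DOB_ALIASES = ("date_of_birth", "birth_date", "dob")

# Date formats from the examples above; real data contains many more.
DATE_FORMATS = ("%Y-%m-%d", "%B %d, %Y", "%m/%d/%Y")


def parse_date(raw: str) -> date:
    """Try each known format in turn and fail loudly on anything else."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


def extract_dob(row: dict) -> date:
    """Find the date of birth under any known column alias and normalize it."""
    for name in DOB_ALIASES:
        if name in row:
            return parse_date(row[name])
    raise KeyError("No date-of-birth column found")


# The same birthday, as three different databases might store it.
print(extract_dob({"date_of_birth": "1980-01-01"}))
print(extract_dob({"birth_date": "January 1, 1980"}))
print(extract_dob({"dob": "1/1/1980"}))
```

Every additional source system means another alias, another format and another round of testing, for every data element, not just date of birth.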

1 – Standardize schema and format

Then we have an ETL (Extract, Transform and Load) process that reads each database and converts all the schema and content to yet another format that we use in the data warehouse.

This ETL process is INCREDIBLY expensive to build and maintain. For the past two decades, healthcare organizations have continued to spend tens of millions on it every year.

I’ve yet to see a healthcare organization that can say ALL their data is in the data warehouse!

Most organizations are only able to move a small subset of their data into the data warehouse.

And we continue to spend the same amount or more every year on this! For twenty years and going!

2 – User Interfaces

We build user interfaces on top of the data to enable clinicians to access this data. The user interface has buttons, dropdowns and other controls to allow the user to interact with the data.

This is also an incredibly expensive process. How much do you pay for your EMR?

We can only afford to design the user interfaces to answer a small set of questions efficiently. For everything else, the user is forced to DIY (do-it-yourself) and dig through multiple screens to piece together the data to answer their question.

The more questions we want to answer, the more complex the user interface becomes. Our current EMRs are at the point where clinicians spend a long time clicking through various screens to answer even simple questions.

What if the clinicians want to get to an answer quickly? Well, in most cases, they have to wait months for the EMR vendor to add another feature.

For patients, we create user interfaces like patient portals. Since patients are not as loud as clinicians, we build really crappy patient portals that can answer VERY few patient questions. Just go to the patient portal of your health system to see for yourself.

3 – SQL Layer

Part of why improving these user interfaces is so expensive is the “SQL layer”.

This layer takes input from the user and then translates it into a language the database understands. Most databases only understand SQL (Structured Query Language). Some of the EMRs like Epic use a different language but the idea is the same.

Every time we introduce something new to the user interface, we then have to write software to convert that user input into a SQL query.

And the database just returns raw data, so for every new feature we also need to write software that converts that data into tables and other UI elements to show to the user. So more work to do!
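As a rough illustration of the two pieces of software described above, the sketch below turns two UI inputs into a SQL query and then turns the raw rows back into something displayable. The table name, column names and function names are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_bp_query(patient_id: int, since: str):
    """Translate two UI inputs (a patient picker and a start date) into parameterized SQL."""
    sql = (
        "SELECT taken_on, systolic, diastolic FROM blood_pressure "
        "WHERE patient_id = ? AND taken_on >= ? ORDER BY taken_on"
    )
    return sql, (patient_id, since)


def rows_to_table(rows) -> str:
    """Turn the raw rows the database returns into a text table for the screen."""
    lines = [f"{'Date':<12}{'Systolic':>10}{'Diastolic':>11}"]
    for taken_on, systolic, diastolic in rows:
        lines.append(f"{taken_on:<12}{systolic:>10}{diastolic:>11}")
    return "\n".join(lines)


# Hypothetical in-memory table standing in for the EMR database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blood_pressure "
    "(patient_id INTEGER, taken_on TEXT, systolic INTEGER, diastolic INTEGER)"
)
conn.executemany(
    "INSERT INTO blood_pressure VALUES (?, ?, ?, ?)",
    [(1, "2024-01-01", 120, 80), (1, "2024-01-08", 130, 85), (2, "2024-01-05", 110, 70)],
)

sql, params = build_bp_query(patient_id=1, since="2024-01-05")
print(rows_to_table(conn.execute(sql, params).fetchall()))
```

Both functions are specific to this one screen. A new question, such as an average over a date range, needs a new query builder and a new renderer.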

4 – Other Healthcare Workers

EMRs are primarily designed for physicians (and nurses to a lesser extent). Every other healthcare worker has to just learn how to use a tool that is not designed for them.

Have you ever watched a call center operator or home health aide learn to use an EMR?

We then decide that we are scared about these healthcare workers messing something up in the EMR so we either invent alternate processes for them or kick them out of the EMR and try to build a separate user interface for them (see cost of building user interface above).

The end result is that most of the healthcare workers are not able to contribute effectively to patient care.

5 – Aggregation from multiple sources

One of the processes we haven’t discussed yet is how to aggregate data from multiple sources.

For example, let’s say we get a condition (diagnosis) from the EMR database and we get another condition from another database.

Should we now have two diagnoses for this patient or is this the same diagnosis coming from two sides?

This is also an incredibly hard problem since each system may be storing and denoting diagnosis differently.
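A minimal sketch of how this is typically handled with rules: a hand-maintained lookup that maps each code or spelling to one canonical name. The lookup entries and function names below are illustrative assumptions, and real terminology mapping is far larger than this.

```python
# Hand-maintained lookup: every new code or spelling needs another entry.
SYNONYMS = {
    "e11.9": "type 2 diabetes",  # ICD-10 code used by one source
    "type 2 diabetes mellitus": "type 2 diabetes",
    "t2dm": "type 2 diabetes",
    "i10": "hypertension",  # ICD-10 code used by one source
    "high blood pressure": "hypertension",
}


def canonical(diagnosis: str) -> str:
    """Map a diagnosis to its canonical name, or fall back to the lowercased text."""
    key = diagnosis.strip().lower()
    return SYNONYMS.get(key, key)


def merge_diagnoses(*sources):
    """Keep the first spelling seen for each canonical diagnosis across all sources."""
    seen = {}
    for source in sources:
        for diagnosis in source:
            seen.setdefault(canonical(diagnosis), diagnosis)
    return list(seen.values())


emr = ["E11.9", "Hypertension"]
claims = ["Type 2 diabetes mellitus", "High blood pressure", "Asthma"]
print(merge_diagnoses(emr, claims))  # prints ['E11.9', 'Hypertension', 'Asthma']
```

Anything missing from the lookup is treated as a new diagnosis, which is how duplicates creep into the merged record.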

Impact of the current architecture

Due to the problems we discussed above, we end up with:

Data warehouses that are yet another silo of data. Exactly the problem they were expected to solve!
Clinicians are unhappy because the EMRs are so hard to use
Healthcare workers that are not clinicians are unable to effectively contribute to patient care
Patients and their caregivers, who were promised these patient portals, get very little value from them.
For two decades, each healthcare organization has continued to spend tens of millions of dollars or more every year with no signs of relief.

Surely there must be a better way now that we are living in the age of AI.

LLMs (large language models) offer us a new way to look at this problem.

We can evolve from our current architecture to an architecture that resolves the problems above. And we get to keep and continue to get value from our existing investment.

We start with the same databases we already have today.

1 – Convert to text

Instead of spending time converting this data to the data warehouse schema and format, we can just export the data as text from each database. If we already have a data warehouse, we can export the data as text from that too.

This is a much easier problem!!

For example, we don’t need to rename the columns from each database: “date_of_birth” vs “birth_date” vs “dob”. Since LLMs understand English they already know that these are referring to the same information.

Similarly LLMs can also handle the different date formats: “1980-01-01” vs “January 1, 1980” vs “1/1/1980”. LLMs already know that these are different ways to represent the same date.

Converting to text is as simple as writing out text like:

Name: Imran Qureshi

Born on: 1st of January, 1980

Address: 123 Main Street, Walnut Creek, CA 94598

Notice that the text can be in any format since LLMs understand English:

Qureshi, Imran

Dob: 1980-01-01

Lives at: Walnut Creek in California

There is NO work required on your part to handle these two different formats! Just think about how big a deal that is….
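A minimal sketch of such an export is shown here, assuming each database row arrives as a Python dictionary. The function name and the sample rows are made up for illustration. Note that nothing is renamed to a shared schema and no date is reformatted.

```python
def record_to_text(record: dict) -> str:
    """Write each non-empty field as a 'Label: value' line, without mapping to a shared schema."""
    lines = []
    for column, value in record.items():
        if value is None or value == "":
            continue  # skip empty fields rather than writing blank lines
        label = column.replace("_", " ").capitalize()
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


# Two hypothetical rows for the same person from two different databases.
emr_row = {"name": "Imran Qureshi", "date_of_birth": "1st of January, 1980"}
claims_row = {"member_name": "Qureshi, Imran", "dob": "1980-01-01", "middle_name": None}

print(record_to_text(emr_row))
print(record_to_text(claims_row))
```

The same function works for any table, because it never needs to know what the columns mean.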

2 – Knowledge Store

You can then store this text in a knowledge store. A knowledge store is typically built on a vector database, which stores text alongside numeric representations (embeddings) that let the text be searched by meaning.

In a data warehouse, we typically use a SQL (relational) database. This kind of database requires very strict schema and format. It also requires the query to be written in a very strict form (SQL). This forces us to do a lot of work to get the data into a data warehouse and to create queries to get results.

In a knowledge store, we use a vector database. Vector databases allow you to store unstructured text. You can query a vector database by just passing in the question in English, and the vector database will return the text snippets most closely related to the question.

Vector databases are offered by all the major database vendors (e.g., MongoDB, Elasticsearch etc.) and all the cloud platforms (AWS, Azure, Google).
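The toy sketch below shows the idea of storing snippets and ranking them against a question. It is an assumption-laden stand-in: the "embedding" here is just word counts, whereas a real vector database uses a learned embedding model that captures meaning rather than word overlap. The class and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyKnowledgeStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, question: str, top_k: int = 2):
        """Return the top_k stored snippets most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


store = ToyKnowledgeStore()
store.add("Blood pressure reading: 120/80 on 2024-01-01")
store.add("Address: 123 Main Street, Walnut Creek, CA 94598")
store.add("Allergy: penicillin")
print(store.search("What was the blood pressure reading?", top_k=1))
```

The returned snippets are then handed to the LLM together with the question, so the LLM answers from the patient's own record.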

3 – Language Interface – Query in English

Since you are using an LLM, you can let the user ask the question in plain English.

Or in Spanish. LLMs understand many of the popular languages so it doesn’t matter which language the user uses.

The users are not limited by dropdowns, buttons and screens.

The language interface can answer questions we have never seen before. Without changing our software!

And when we get data back from the LLM, it will be an answer and not just data that the user has to piece together to form an answer.

For example, in the current UI, if you want to know what the average of the patient’s blood pressure was over the past two weeks, you will get a list of blood pressure readings and then have to calculate the average yourself. Or wait for the EMR software vendor to add this functionality in a year or two.

In an LLM architecture, the LLM would return ONLY the average blood pressure. No need for the user to calculate anything.

Again with no work needed from you. LLMs understand language and can already perform math operations.
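For reference, this is the arithmetic behind that blood pressure answer, written out as a small sketch on made-up readings. The function name and data are hypothetical. Whether the LLM does this calculation itself or hands it to a helper like this one is a design choice this article does not prescribe.

```python
from datetime import date, timedelta
from statistics import mean


def average_bp(readings, today, days=14):
    """Average the readings taken in the last `days` days, as 'systolic/diastolic'."""
    cutoff = today - timedelta(days=days)
    recent = [(s, d) for taken_on, s, d in readings if cutoff <= taken_on <= today]
    if not recent:
        return None  # nothing recorded in the window
    systolic = round(mean(s for s, _ in recent))
    diastolic = round(mean(d for _, d in recent))
    return f"{systolic}/{diastolic}"


# Made-up readings: (date taken, systolic, diastolic).
readings = [
    (date(2024, 1, 1), 150, 95),  # older than two weeks, so ignored
    (date(2024, 1, 20), 120, 80),
    (date(2024, 1, 25), 130, 85),
    (date(2024, 1, 30), 125, 82),
]
print(average_bp(readings, today=date(2024, 1, 31)))  # prints 125/82
```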

4 – Other Healthcare Workers

In the LLM architecture for healthcare, the healthcare workers, who are not physicians, are able to ask questions of the data and get answers WITHOUT involving data analysts.

And the answers can be phrased by the LLM in a way that the user can actually understand. A nurse would get a differently worded answer than a nursing aide, a home health aide or a call center operator.
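One way this could work is to tell the LLM who is asking. The sketch below only builds the prompt text and does not call any LLM. The role descriptions, wording and function name are all assumptions made for illustration.

```python
# Hypothetical style instructions per role; a real deployment would tune these.
ROLE_STYLE = {
    "nurse": "Use clinical terminology and include exact values.",
    "home health aide": "Use plain language and say what to watch for.",
    "call center operator": "Use plain language and keep it to one or two sentences.",
}


def build_prompt(role: str, question: str, snippets) -> str:
    """Combine the asker's role, the retrieved record snippets and the question into one prompt."""
    style = ROLE_STYLE.get(role, "Use plain language.")
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return (
        f"You are answering a question from a {role}.\n"
        f"{style}\n"
        "Answer only from the patient record excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}"
    )


print(build_prompt(
    "home health aide",
    "Is the patient's blood pressure okay?",
    ["Blood pressure reading: 120/80 on 2024-01-01"],
))
```

The same question and the same record excerpts produce a different prompt, and therefore a differently worded answer, for each role.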

5 – Risk Management Layer

This layer applies techniques to minimize the risk associated with the data and AI.

I covered this in Framework to Control Hallucinations in LLMs so you can read that to learn more.

6 – Aggregation from multiple sources

Recall that in the Current Architecture section, we brought up the problem of aggregating data from multiple sources.

For example, let’s say we get a condition (diagnosis) from the EMR database and we get another condition from another database.

Since LLMs understand language, they can figure out whether two differently worded diagnoses refer to the same condition.

Impact of the LLM Architecture

LLM Architecture for Healthcare can solve some of the problems in the current architecture.

The knowledge store eliminates the problem of silos of data. It is so easy to copy data into it that we can actually finish copying all our data to it. We can even copy data from our existing data warehouse.
Clinicians are happier because they can just ask questions and get answers without having to deal with complex user interfaces.
Healthcare workers that are not clinicians are able to get answers in terms that they can understand. So now they can contribute more to patient care.
Patients and their caregivers are able to get answers to their healthcare questions so they are more engaged in their care.
We can build and support the LLM architecture at a much, much lower cost than what we spend on our current architecture.

Disclaimer

In order to keep this article short and easy to understand I’ve skipped the usual caveats. There are corner cases I haven’t mentioned that can be handled.

We would also implement triage so that the system answers certain types of questions and refers other types of questions to appropriate workflows such as “call your doctor” or “go to urgent care”.

This article is about the big idea and not all the details. There will, of course, be cases that the LLM Architecture will not solve.

There are steps we can define in our evolution so we can start to get the benefits incrementally.

Summary

Our current architecture in healthcare costs each healthcare organization tens of millions of dollars every year. Even after two decades we can’t seem to get all our data into the data warehouse.

Clinicians are very frustrated by their EMR experience. Other healthcare workers are not able to contribute effectively to patient care. Patients and caregivers find it hard to get answers to questions about their health record.

The LLM Architecture for Healthcare can change this.

Clinicians will be able to get answers at their fingertips. Other healthcare workers will be able to take some of the load off the clinicians by being able to handle simple, routine care needs. Patients and caregivers will finally have answers about their health information.

All at a much, much lower total cost of ownership.
