Generative AI in health, particularly its first-generation product, dictation support, has received a lot of attention in the last year, with companies like Abridge and Nabla claiming funding rounds topping $200M. In this article, we’ll take a closer look at the opportunities versus the cost of implementing these tools from a development standpoint.
Most ambient EHR systems that rely on AI use general GPT-4o backbones, accessed either through OpenAI’s token-based API or via Microsoft Azure. As health care works up to training data that is reasonably “enough” for both replicability and correctness of representation in a generative output system, the issues are twofold:
1. The unclear rules for data interoperability across data silos in a clinical environment
Why does this matter? Around 70% of hospital clinical systems today maintain some degree of data compartmentalization across clinical floors or departments to prevent cross-contamination of data. Under these circumstances, much of the training data is ported in from other sources. Each resulting data set therefore offers an incomplete view of the patient for improving treatment protocols, while adding substantially to the cost and accuracy burden in regulated environments.
The likely hope here rests on state efforts to corral and aggregate health data via public health exchanges (such as the DxF program in California). This becomes a more subsidized effort that can meet state-level regulatory requirements for data quality through the respective state health services departments.
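To make the aggregation problem concrete, here is a minimal sketch, in Python, of stitching records from two departmental silos into one longitudinal patient view keyed by a shared patient identifier. The resource shapes are loosely FHIR-like but simplified for illustration; the field names and sample data are assumptions, not any exchange’s actual schema.

```python
from collections import defaultdict

def merge_silo_bundles(*bundles):
    """Merge bundle-style exports from separate departmental silos into
    one longitudinal record per patient, keyed by patient reference.

    Each bundle is a dict with an "entry" list of resources; each
    resource carries a "subject" reference like "Patient/123".
    (Shapes are simplified for illustration, not a full FHIR model.)
    """
    longitudinal = defaultdict(list)
    for bundle in bundles:
        for entry in bundle.get("entry", []):
            resource = entry["resource"]
            patient_id = resource["subject"]["reference"]
            longitudinal[patient_id].append(resource)
    # Sort each patient's record chronologically so downstream use sees
    # a coherent timeline instead of per-silo fragments.
    for resources in longitudinal.values():
        resources.sort(key=lambda r: r.get("effectiveDateTime", ""))
    return dict(longitudinal)

# Two "silos": a cardiology observation and a lab result for one patient.
cardiology = {"entry": [{"resource": {
    "resourceType": "Observation", "code": "heart-rate",
    "subject": {"reference": "Patient/123"},
    "effectiveDateTime": "2024-03-02"}}]}
labs = {"entry": [{"resource": {
    "resourceType": "Observation", "code": "hba1c",
    "subject": {"reference": "Patient/123"},
    "effectiveDateTime": "2024-01-15"}}]}

merged = merge_silo_bundles(cardiology, labs)
```

The hard part in practice is not the merge itself but agreeing on the shared identifier and the regulatory terms under which the silos may be joined, which is exactly what a state exchange is positioned to standardize.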
2. Most public training data is just that: public
Specificity under unique environments is central to how a generative AI output set may be tasked across different hospital environments and constraints. Most of the publicly available data comes via data.cdc.gov or more local population health datasets. For an internally valid AI system to be truly precise, it requires a strong foundation of “like” data, such as historic longitudinal patient data from the same center or set of centers.
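As a rough illustration of what “like” data looks like in practice, the sketch below turns historic transcript/signed-note pairs from a single center into chat-style fine-tuning records, one JSON object per line. The record layout follows common chat fine-tuning conventions, and every field name here is an assumption for illustration, not a specific vendor’s schema.

```python
import json

def build_training_records(encounters, system_prompt):
    """Convert historic (transcript, signed note) pairs from one center
    into chat-style fine-tuning records in JSON Lines form.

    `encounters` is a list of dicts with "transcript" and "signed_note"
    keys; both key names are illustrative, not a vendor schema.
    """
    lines = []
    for enc in encounters:
        record = {"messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": enc["transcript"]},
            {"role": "assistant", "content": enc["signed_note"]},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)

# One hypothetical historic encounter from the same center.
encounters = [
    {"transcript": "Patient reports two weeks of dry cough...",
     "signed_note": "HPI: 2-week dry cough. Plan: chest X-ray."},
]
jsonl = build_training_records(
    encounters, "Summarize the visit transcript into a clinical note.")
```

The value of this kind of corpus is that the target notes already reflect the center’s own documentation style and patient mix, which public population datasets cannot supply.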
Additionally, training, even with the same data, takes a larger-than-normal number of iterations to remove hallucinations and bias and to ensure accuracy in any area touching clinical diagnosis support, or even dictational follow-up, once you move beyond imaging or fixed small data sets. Costs usually start around $100,000 for simple nurse call support tools and run to over $500,000 for anything touching a patient action field that resembles unsupervised learning, a very costly venture for startups aiming to get into this area.
This is what makes it all the more incumbent on emerging companies to focus on achieving excellence with small tools that serve less regulatorily heavy uses of data, such as consumer health, chat support, and care plan complements, at far lower cost. That usually means some form of outsourcing to reduce production load in the early MVP stages while the company moves toward a GTM-ready initial product. More to come in the next blog…