Jenner.ai: A Text-Based AI Diagnostic Tool for Medicine

Sep 12, 2024

During Miami Hack Week 2023, I developed Jenner.ai, an AI-powered medical diagnostic tool designed to enable text-based interactions with patients. Essentially, it’s a chatbot-like “AI doctor” that assists in diagnosing illnesses based on user input. Text-based diagnostics through large language models (LLMs) have considerable potential, but several complexities need to be addressed before such tools can gain government approval as medical devices suitable for direct consumer use.

Challenges of AI in Direct Patient Care

Deploying an LLM for direct patient diagnosis and treatment faces numerous hurdles, primarily related to regulatory standards, medical accuracy, and patient safety. These models would need to reach a point where they can safely prescribe treatments and track patient progress autonomously. However, I believe these obstacles will eventually be overcome, leading to AI doctors providing direct patient care in the not-too-distant future.

Jenner.ai as a Doctor’s Assistant

Through my experience, I realized that a more immediate and practical use case for Jenner is to serve as a virtual assistant for doctors and other medical professionals. Rather than replacing human judgment, Jenner can act as a supplemental tool that offers multiple potential diagnoses and possible causes of symptoms, and that refreshes a doctor’s memory of previously studied illnesses. Medical professionals would retain full responsibility for decisions and would cross-check Jenner’s suggestions to ensure patient safety, while potentially gaining in productivity and diagnostic quality. Keeping a human in the loop also mitigates risks such as “AI hallucinations”, where the model generates inaccurate or incomplete diagnoses because of missing information or insufficient context.

I discussed this approach with a practicing doctor, who found Jenner’s suggestions valuable. In one instance, Jenner offered plausible diagnoses that the doctor hadn’t initially considered. Suggestions like these could prompt further testing, and they showed that the diagnosis wasn’t as straightforward as it first appeared. Interactions like these highlight the tool’s potential as a support system for medical professionals rather than a replacement.

The Potential of General Models

At Miami Hack Week in May 2023, Jenner demonstrated an impressive 81% accuracy on 11 test cases, which is remarkable given it was built during a short two-week competition. This experience convinced me that with the right tweaks and adjustments, general-purpose LLMs can achieve substantial success in fields requiring specialized knowledge, like medicine. These models can synthesize information that isn’t always readily available or neatly packaged online.
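To put the 81% figure in context, it helps to look at how much uncertainty a sample of 11 cases carries. The sketch below computes a 95% Wilson score interval for a proportion. It assumes that 81% corresponds to 9 correct answers out of 11 (9/11 ≈ 0.818), which is an inference from the numbers above and not a reported count; the function name is illustrative.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Assumption: 81% accuracy on 11 test cases means 9 correct answers.
low, high = wilson_interval(correct=9, total=11)
print(f"point estimate: {9 / 11:.1%}")
print(f"95% interval:   {low:.1%} to {high:.1%}")
```

Under that assumption the interval runs from roughly 52% to 95%, so the result is encouraging but far from a precise measurement, which is why larger-scale testing matters.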

With the advancements in LLMs since then, such as GPT-4 and GPT-4-turbo, Jenner would likely perform even better today. Although I had early beta access to GPT-4 before it was public, I wasn’t able to conduct large-scale tests, so I can’t say with numbers how much the improvements in the size of these models have affected their medical capabilities.

The Road Ahead: Rigorous Testing and Limitations

For Jenner to be viable for end-consumers, extensive testing is required. Its accuracy may vary depending on the illness, patient demographics (such as ethnicity), and even geographical factors. Certain diseases are more common in specific regions, and others manifest differently across ethnic groups. These variables can result in inconsistent advice from a general AI model, especially when compared to specialists who are deeply familiar with these conditions.
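One way such testing could surface these differences is to report accuracy per subgroup instead of as a single overall number. The following is a minimal sketch of that idea, not Jenner’s actual evaluation code. The record fields (`region`, `expected`, `predicted`) and the sample records are hypothetical placeholders made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(cases, key):
    """Return {group: (correct, total, accuracy)} grouped by the given field."""
    counts = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for case in cases:
        group = case[key]
        counts[group][1] += 1
        if case["predicted"] == case["expected"]:
            counts[group][0] += 1
    return {g: (c, t, c / t) for g, (c, t) in counts.items()}


# Hypothetical records for illustration only; this is not real test data.
cases = [
    {"region": "A", "expected": "influenza", "predicted": "influenza"},
    {"region": "A", "expected": "dengue", "predicted": "influenza"},
    {"region": "B", "expected": "dengue", "predicted": "dengue"},
    {"region": "B", "expected": "malaria", "predicted": "malaria"},
]

result = accuracy_by_group(cases, "region")
for region, (correct, total, acc) in sorted(result.items()):
    print(f"region {region}: {correct}/{total} correct ({acc:.0%})")
```

The same function can be called with a different key, such as an illness category or a demographic field, provided the test records carry that field. A large gap between groups would flag where a general model needs more scrutiny.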

Another challenge is that some diseases are underrepresented in freely accessible data. For example, rare illnesses or conditions that predominantly affect specific populations may not be well documented in the online datasets that these models are trained on. Consequently, patients with these less common conditions might not receive optimal guidance from general-purpose AI tools, which would most likely excel at better-known illnesses and more common patient conditions.

Specialized Knowledge and Local Insights

In conversations with medical students, I learned that certain universities and hospitals have access to privileged, location-specific knowledge about diseases that affect local populations. This knowledge might not yet be published in the form of scientific papers. When it is, the papers might be written in languages other than English and receive less distribution than papers concerned with larger populations.

These illnesses may present with symptoms that, outside of that region, are commonly associated with different illnesses. This type of nuanced, localized knowledge isn’t always reflected in publicly available datasets, meaning medical LLMs may not have sufficient exposure to these rare or geographically specific conditions.

This underscores the importance of specialized training for AI models, particularly in incorporating diverse and region-specific medical data. Without this, the utility of AI tools like Jenner might be limited in regions where common illnesses differ significantly from global trends.

Conclusion

The development of an AI medical assistant like Jenner.ai presents both exciting possibilities and formidable challenges. As LLMs continue to evolve, they will play an increasingly important role in supporting healthcare professionals. However, before AI tools can be used independently by patients, we must address the numerous variables and complexities inherent in medical diagnostics, including geographic, ethnic, and condition-specific variations. For now, Jenner’s most immediate value lies in enhancing the decision-making process of doctors, acting as a helpful assistant rather than a primary diagnostician.

Disclaimer: GPT-4o was used to improve this article.