In photo: a medical bed and procedure room inside Tulsa Women's Clinic. Reuters

KEY POINTS

  • Med-PaLM 2 is designed to answer medical questions and potentially even summarize medical data
  • Google is also working with Mayo Clinic for a radiotherapy model
  • Med-PaLM 2 scored 67.2% on a medical licensing exam dataset

Search giant Google has reportedly been testing a medical chatbot since April as the company continues to work on artificial intelligence.

Google's Med-PaLM 2, a version of the company's PaLM 2 large language model (LLM), has been going through tests at the Rochester-based Mayo Clinic and other unspecified hospitals, The Wall Street Journal reported.

The medical chatbot, which was designed to answer medical questions, is expected to be particularly helpful in countries with "more limited access to doctors," according to an internal email viewed by WSJ. The search engine behemoth believes Med-PaLM 2 will be better at handling healthcare conversations than general-purpose chatbots such as Google's own Bard, Microsoft's Bing and OpenAI's ChatGPT.

Vivek Natarajan, a research scientist working on the Med-PaLM chatbot, told International Business Times that LLMs like Med-PaLM "possess the potential to act as care multipliers and significantly enhance the standard of care in geographies with limited numbers of qualified medical professionals."

Its applications include triaging patient concerns, simplifying complex medical terminology for laypeople and even taking a comprehensive history from patients ahead of a visit. Outside of direct patient care, LLMs like Med-PaLM 2 can help disseminate healthcare information effectively and develop educational resources.

While there are numerous opportunities for the application of medical chatbots, Natarajan reiterated that such models still need further development and validation to ensure concerns related to fairness, privacy and equity are addressed.

Providing an overview of Med-PaLM 2, Alan Karthikesalingam, a senior staff clinical research scientist at Google Health, revealed the chatbot was "the first AI system to exceed the pass mark" in multiple-choice medical licensing exam questions.

Karthikesalingam said the passing mark for new doctors was "often around 60%" in U.S. medical licensing exams, while AI systems usually "plateaued at 50%." Google's medical LLM scored 67.2%.

The tech giant said in April it was looking to collaborate with Google Cloud customers on how the medical chatbot can be used to find insights "in complicated and unstructured medical texts." The company hinted that Med-PaLM may also help with other tasks such as drafting "short- and long-form responses and summarize documentation and insights from internal data sets and bodies of scientific knowledge."

In a study published in May, Google detailed the shortcomings of Med-PaLM. It found that the medical chatbot's answers contained more inaccuracies and irrelevant information than those of doctors.

On the other hand, Med-PaLM performed roughly on par with actual doctors in showing evidence of reasoning and providing consensus-supported answers. The AI chatbot also showed no signs of incorrect comprehension.

Greg Corrado, Google senior research director, reiterated Med-PaLM 2 is still in its early stages, adding that he wouldn't want the chatbot to be involved in his own family's healthcare journey. Still, Corrado believes the bot "takes the places in healthcare where AI can be beneficial and expands them by 10-fold," WSJ reported.

Google also revealed in mid-March that it will "soon" publish research findings on AI support for radiotherapy planning. Google said it was "formalizing" an agreement with Mayo Clinic "to explore further research, model development and commercialization" on the radiotherapy model.

News of Google's medical chatbot comes as other AI labs and companies race toward generative AI domination, with some also working on models for the medical field.

A team at NYU Grossman School of Medicine unveiled a new AI tool last month that demonstrated the ability to interpret physicians' notes and accurately anticipate a patient's risk of death, among other outcomes. In April, British experts developed an AI model that could detect cancer at an early stage, raising hopes for faster diagnosis and treatment among patients.

Despite the promising potential of AI in healthcare, there are still concerns regarding clinical safety, ethical practices and accountability.

Med-PaLM 2 and similar AI models can still produce "factually inaccurate statements" and make mistakes. "Therefore, it is important to ground the responses from the model and ensure that the knowledge encoded in the model is continuously refreshed with the latest medical information," Natarajan said.

(This article has been updated to include Vivek Natarajan's responses to IBT's request for comments.)