Expert consensus outlines a standardized framework to evaluate clinical large language models
Peer-Reviewed Publication
This month, we’re focusing on artificial intelligence (AI), a topic that continues to capture attention everywhere. Here, you’ll find the latest research news, insights, and discoveries shaping how AI is being developed and used across the world.
Last Updated: 27-Jan-2026 18:11 ET (27-Jan-2026 23:11 GMT/UTC)
Large language models (LLMs) play a key role in advancing intelligent healthcare. While LLMs are increasingly applied in medical fields such as disease screening, diagnostic assistance, and health management, there have been no evidence-based guidelines for assessing their effectiveness in healthcare. Now, researchers have developed a consensus that provides a systematic, evidence-based evaluation framework for assessing the effectiveness of LLMs in medical applications. The framework includes scientific evaluation metrics and procedures, providing guidance for model evaluators.
Highlights
• The lack of an accepted and reliable guide for selecting in vitro or in vivo brain metastasis models hinders the development of brain metastasis therapies.
• There is an urgent need to employ accurate in vitro and in vivo models to recapitulate the complexities of brain tumor metastasis and to unravel the intricate cellular and physiological processes involved.
• Precise in vitro and in vivo brain metastasis models are crucial for investigating cellular and molecular mechanisms and serve as preclinical platforms to assess novel treatments.
• An array of emerging techniques, such as bio-three-dimensional (3D) printing, novel real-time imaging, artificial intelligence, and precise gene editing, holds promise for redefining the landscape of cancer brain metastasis model development.
Prediabetes is an extremely heterogeneous metabolic disorder. Scientists from several partner institutes of the German Center for Diabetes Research (DZD)* have now used artificial intelligence (AI) to identify epigenetic markers that indicate an elevated risk of complications. A simple blood test could be sufficient to identify individuals at high risk of developing type 2 diabetes and its complications at an early stage. The study shows how data-driven approaches and molecular medicine interact in the diagnostic process.
Researchers have conducted a comprehensive review of recent advances in multimodal natural interaction techniques for Extended Reality (XR) headsets, revealing significant trends in spatial computing technologies. This timely review analyzes how recent breakthroughs in artificial intelligence (AI) and large language models (LLMs) are transforming how users interact with virtual environments, offering valuable insights for the future development of more natural, efficient, and immersive XR experiences.
In machine learning, it is often necessary to statistically compare the overall performance of two algorithms (e.g., our proposed algorithm and each compared baseline) based on multiple benchmark datasets. In this case, our proposed algorithm is typically referred to as the control algorithm. However, in some cases, it is also essential to conduct pairwise statistical comparisons of multiple algorithms without a control algorithm.
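To make the two comparison settings concrete, here is a minimal sketch in Python using SciPy. The Wilcoxon signed-rank test is a common choice for comparing a control algorithm against one baseline over the same benchmark datasets, and the Friedman test is a common first step for pairwise comparison of several algorithms with no control (typically followed by a post-hoc procedure such as Nemenyi's on the average ranks). The accuracy scores below are hypothetical, not from the source.

```python
from scipy.stats import wilcoxon, friedmanchisquare

# Hypothetical accuracies of algorithms on the same 8 benchmark datasets.
proposed = [0.91, 0.87, 0.83, 0.95, 0.78, 0.88, 0.90, 0.85]
baseline = [0.89, 0.84, 0.80, 0.94, 0.75, 0.86, 0.88, 0.84]
algo_c   = [0.85, 0.83, 0.79, 0.90, 0.74, 0.82, 0.86, 0.81]

# Control-algorithm setting: paired comparison of two algorithms.
stat, p = wilcoxon(proposed, baseline)
print(f"Wilcoxon signed-rank p-value: {p:.4f}")

# No-control setting: omnibus test over three or more algorithms;
# a significant result would motivate post-hoc pairwise rank tests.
stat_f, p_f = friedmanchisquare(proposed, baseline, algo_c)
print(f"Friedman p-value: {p_f:.4f}")
```

Both tests operate on per-dataset scores rather than pooled averages, which is why the lists must be aligned dataset-by-dataset.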
Current continual learning methods can utilize labeled data to alleviate catastrophic forgetting effectively. However, obtaining labeled samples can be difficult and tedious as it may require expert knowledge. In many practical application scenarios, labeled and unlabeled samples exist simultaneously, with more unlabeled than labeled samples in streaming data. Unfortunately, existing class-incremental learning methods face limitations in effectively utilizing unlabeled data, thereby impeding their performance in incremental learning scenarios.
Database optimization has long relied on traditional methods that struggle with the complexities of modern data environments. These methods often fail to efficiently handle large-scale data, complex queries, and dynamic workloads, leading to suboptimal performance and increased computational costs. To address these challenges, researchers have turned to AI4DB (Artificial Intelligence for Database), integrating advanced machine learning and deep learning techniques to enhance database optimization.