Empowering medical systematic reviews with large language models: methods, development directions, and applications

Abstract: With the explosive growth of biomedical literature, traditional keyword-matching retrieval methods increasingly fail to meet the dual demands of efficiency and precision in clinical and research practice. In recent years, large language models (LLMs), exemplified by ChatGPT and DeepSeek, have shown significant potential for supporting medical systematic reviews owing to their strong natural language processing capabilities. However, inherent challenges such as the "hallucination" problem and lagging knowledge updates limit the reliability of applying them directly. This paper systematically introduces six categories of core technical approaches currently used to mitigate hallucinations in LLMs, with a particular focus on the principles and application advantages of retrieval-augmented generation (RAG). After comprehensively reviewing the technical characteristics and application scenarios of 22 representative studies on systematic review tasks, the paper further identifies LLMs capable of "structured understanding and generation based on levels of evidence" as one of the important future directions. The paper aims to provide a systematic reference for medical researchers and clinical practitioners, helping them use LLMs scientifically and efficiently to improve the efficiency of medical literature processing and the quality of evidence-based medical decision-making.

     
