Abstract:
With the exponential growth of biomedical literature, traditional keyword-based retrieval methods are increasingly inadequate for meeting the dual demands of efficiency and precision in clinical and research contexts. In recent years, large language models (LLMs), exemplified by ChatGPT and DeepSeek, have demonstrated significant potential for supporting medical systematic reviews owing to their powerful natural language processing capabilities. However, their inherent challenges, such as the "hallucination" problem and lagging knowledge updates, limit the reliability of their direct application. This paper systematically introduces six core technical approaches currently used to mitigate hallucinations in LLMs, with a particular focus on the principles and application advantages of retrieval-augmented generation (RAG). After comprehensively reviewing the technical characteristics and application scenarios of 22 representative studies in the context of systematic reviews, the paper further identifies LLMs capable of "structured understanding and generation based on levels of evidence" as a key future direction. The goal is to provide systematic guidance for medical researchers and clinical practitioners, helping them make scientific and efficient use of LLMs to enhance the efficiency of biomedical literature processing and the quality of evidence-based medical decision-making.