[Paper] LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Abstract: Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs are becoming increasingly lengthy, even exceeding tens of thousands of tokens. To accelerate model inference and reduce cost, this paper presents LLML.. — 2025. 1. 9.

[Paper] Learning to Filter Context for Retrieval-Augmented Generation (arxiv.org)
Abstract: On-the-fly retrieval of relevant knowledge has proven an essential element of reliable systems for tasks such as open-domain question answering and fact verification. However, because retrieval systems are not perfect, generation models are required to gen.. — 2025. 1. 9.