prompts (1)

[Paper] LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models

Abstract: Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs are becoming increasingly lengthy, even exceeding tens of thousands of tokens. To accelerate model inference and reduce cost, this paper presents LLML..

2025. 1. 9.