A single GPT inference pass takes an input and generates the next token, so why is there a KV cache?

2024-03-15

Updated 2024-11-21