LLM RAG
For LLM system design, useful references are Microsoft GraphRAG, Google NotebookLM, and Deep Research
1. requirements
freshness: the correct answer may change over time
use case
agent capabilities
multi-turn follow-up
whether to provide references (citations)
personalization
constraint
latency
Throughput
Availability
Scalability
2. ML task & pipeline
indexing
retrieval
generation
optional
rewrite
routing
3. data
document chunk
data augmentation (e.g. query expansion)
query rewrite
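A minimal sketch of the document chunking item above, assuming fixed-size character windows with overlap. The names `chunk_text`, `chunk_size`, and `overlap` are illustrative and not taken from any library; real pipelines often split on sentence or token boundaries instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (illustrative only)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of a larger index.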
4. model
retrieval
hybrid
finetuning loss
finetuning dataset preparation
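For the hybrid retrieval item above, one common way to merge a sparse ranking (e.g. BM25) with a dense (embedding) ranking is reciprocal rank fusion. The notes do not say which fusion method is intended, so this is an assumed choice; `reciprocal_rank_fusion` is an illustrative name and `k = 60` is a conventional default, not a tuned value.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); documents that rank
            # well in several lists accumulate a higher score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

sparse_hits = ["a", "b", "c"]   # hypothetical keyword-search result
dense_hits = ["b", "c", "d"]    # hypothetical vector-search result
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
```

Rank-based fusion avoids having to calibrate BM25 scores against cosine similarities, which live on different scales.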
llm
finetuning
prompt engineering
5. evaluation
retrieval
MRR, NDCG
relevance, coherence
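The two retrieval metrics above can be sketched as follows. This assumes binary relevance for MRR and graded relevance with linear gain for NDCG (some definitions use an exponential gain instead); the function names are illustrative.

```python
import math

def mrr(ranked_lists: list[list[str]], relevant_sets: list[set[str]]) -> float:
    """Mean reciprocal rank of the first relevant document per query."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)

def ndcg_at_k(ranked: list[str], gains: dict[str, float], k: int) -> float:
    """NDCG@k with linear gain; `gains` maps doc id to graded relevance."""
    def dcg(values: list[float]) -> float:
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc_id, 0.0) for doc_id in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

MRR only rewards the first relevant hit, so it suits single-answer lookups; NDCG accounts for every relevant chunk and its position, which matters when several chunks feed the generator.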
generation
ROUGE-L, keyword overlap
subjective evaluation: quality, accuracy
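As a rough illustration of ROUGE-L, the sketch below computes an F1 score from the longest common subsequence (LCS) of two token sequences. It assumes whitespace tokenization, no stemming, and equal weighting of precision and recall; a real evaluation would more likely use an established ROUGE package.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, using one DP row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L F1 over whitespace tokens."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Because it only measures surface overlap with a reference answer, this kind of score is usually paired with the subjective quality and accuracy checks listed above.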
online
AB testing
useful/truthful
6. deploy & service
KV cache
7. monitoring & maintenance
8. optimization and Q&A
NL2SQL
hallucination
How do you update a single document in the knowledge base?
Incremental update: give each document a version number
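A minimal sketch of the incremental, versioned update described above. An in-memory dict stands in for the vector store, and `KnowledgeBase`, `DocRecord`, and `upsert` are hypothetical names. Skipping unchanged content via a hash is an added assumption, not something the notes specify.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class DocRecord:
    version: int
    content_hash: str
    chunks: list[str]

@dataclass
class KnowledgeBase:
    docs: dict[str, DocRecord] = field(default_factory=dict)

    def upsert(self, doc_id: str, text: str, chunk_size: int = 200) -> int:
        """Insert or update one document; return its current version."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        current = self.docs.get(doc_id)
        if current is not None and current.content_hash == digest:
            return current.version  # unchanged: nothing to re-index
        # Only this document is re-chunked; other documents are untouched.
        # A real system would also re-embed these chunks and delete the
        # vectors that belong to the previous version.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        version = 1 if current is None else current.version + 1
        self.docs[doc_id] = DocRecord(version, digest, chunks)
        return version
```

Storing the version alongside each chunk lets retrieval filter out stale chunks while an update is still in progress.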
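How do you implement RAG for multi-turn conversations?
Rewrite the query from history: given the multi-turn conversation record and the current question, call the LLM to generate a new standalone question. LlamaIndex provides CondenseQuestionChatEngine and ContextChatEngine.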
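The history-based rewrite above can be sketched without any framework. This is not LlamaIndex's implementation: `condense_question`, `CONDENSE_PROMPT`, and the `llm` callable (prompt string in, completion string out) are hypothetical placeholders for whatever model client is in use.

```python
from typing import Callable

CONDENSE_PROMPT = (
    "Given the conversation below and a follow-up question, rewrite the "
    "follow-up as a standalone question that keeps all needed context.\n\n"
    "Conversation:\n{history}\n\n"
    "Follow-up question: {question}\n"
    "Standalone question:"
)

def condense_question(
    history: list[tuple[str, str]],
    question: str,
    llm: Callable[[str], str],
) -> str:
    """Turn a follow-up into a standalone query for the retriever."""
    if not history:
        return question  # first turn: nothing to resolve
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    prompt = CONDENSE_PROMPT.format(history=transcript, question=question)
    return llm(prompt).strip()
```

The rewritten question is what gets sent to the retriever, so pronouns such as "it" or "that one" are resolved before embedding or keyword search.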