YouTube video search
The core of search is relevance.
1. requirements
Product & scenarios
What are the specific use cases and scenarios where it will be applied? General search or vertical-domain search?
Do we need to consider personalization? Not required.
Content: the video plus its title and description
Objective
What is the primary (business) objective of the search system?
Constraints
Is there any data available? In what format?
What are the system requirements (such as response time, accuracy, scalability, and integration with existing systems or platforms)?
What is the expected scale of the system in terms of data and user interactions?
How many languages need to be supported?
2. ML task & pipeline
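Task: use historical interactions to recommend the items a user is likely to interact with.
High-level design: convert the query into an embedding; convert each video into either a single overall embedding or several per-modality embeddings; fine-tune with contrastive learning; at inference time, take the nearest neighbors.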
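A minimal sketch of the inference step in the high-level design, assuming the embeddings already exist. The toy 3-d vectors, the video ids, and the helpers `cosine`, `fuse`, and `search` are all hypothetical; averaging per-modality embeddings is only one possible fusion choice, and a production system would likely use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def fuse(modality_embeddings):
    """Average per-modality embeddings (e.g. visual, title) into one video vector.
    Assumption: simple averaging; the notes do not specify a fusion method."""
    n = len(modality_embeddings)
    return [sum(vals) / n for vals in zip(*modality_embeddings)]

def search(query_emb, video_index, k=2):
    """Return the ids of the k videos whose embeddings are closest to the query."""
    scored = [(cosine(query_emb, emb), vid) for vid, emb in video_index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [vid for _, vid in scored[:k]]

# Toy hand-written embeddings standing in for encoder outputs (hypothetical).
video_index = {
    "cat_video": fuse([[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]),
    "dog_video": fuse([[0.1, 0.9, 0.0], [0.2, 0.8, 0.0]]),
    "car_video": fuse([[0.0, 0.1, 0.9], [0.0, 0.2, 0.8]]),
}
query_emb = [1.0, 0.0, 0.0]  # pretend this is the encoded query
top = search(query_emb, video_index, k=2)
print(top)
```

Keeping one fused vector per video means a single index lookup per query; keeping per-modality vectors separate would instead need one lookup per modality followed by a score merge.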
3. data collection
Query 1
Video 1
Query 2
Video 2
4. feature
5. model
text
video
loss
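Section 2 calls for fine-tuning with contrastive learning. One common form, assumed here because the notes do not name a specific loss, is an in-batch softmax (InfoNCE-style) loss: query i is paired with video i, and every other video in the batch serves as a negative. The function name, the temperature value, and the toy vectors are illustrative only.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def in_batch_contrastive_loss(query_embs, video_embs, temperature=0.1):
    """Mean cross-entropy over the batch, query-to-video direction only.
    The positive for query i is video i; the other videos are negatives."""
    losses = []
    for i, q in enumerate(query_embs):
        logits = [dot(q, v) / temperature for v in video_embs]
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

queries = [[1.0, 0.0], [0.0, 1.0]]
# Each query matches its paired video: the loss should be near zero.
aligned = in_batch_contrastive_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
# Each query matches the wrong video: the loss should be large.
shuffled = in_batch_contrastive_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

A symmetric variant would add the video-to-query direction and average the two terms.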
6. evaluation
Offline
Precision@k, mAP, Recall@k, MRR
We choose MRR (the mean, over queries, of the reciprocal rank of the first relevant result) because our eval data consists of <video, text> pairs, so each query has exactly one relevant video.
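A small sketch of computing MRR, with Recall@k for comparison, under the assumption that each query has exactly one relevant video, matching the <video, text> pair format. The query ids, video ids, and function names are made up for illustration.

```python
def reciprocal_rank(ranked_ids, relevant_id):
    """1 / rank of the relevant video, or 0.0 if it was not retrieved."""
    for rank, vid in enumerate(ranked_ids, start=1):
        if vid == relevant_id:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(results, ground_truth):
    """Average the reciprocal rank over all evaluation queries."""
    total = sum(reciprocal_rank(results[q], ground_truth[q]) for q in ground_truth)
    return total / len(ground_truth)

def recall_at_k(results, ground_truth, k):
    """Fraction of queries whose single relevant video appears in the top k."""
    hits = sum(1 for q in ground_truth if ground_truth[q] in results[q][:k])
    return hits / len(ground_truth)

# Hypothetical ranked search results and ground-truth pairs.
results = {
    "q1": ["v3", "v1", "v7"],  # relevant video at rank 2
    "q2": ["v2", "v9", "v4"],  # relevant video at rank 1
    "q3": ["v5", "v6", "v8"],  # relevant video not retrieved
}
ground_truth = {"q1": "v1", "q2": "v2", "q3": "v0"}

mrr = mean_reciprocal_rank(results, ground_truth)  # (1/2 + 1 + 0) / 3 = 0.5
r_at_1 = recall_at_k(results, ground_truth, k=1)
print(mrr, r_at_1)
```

With one relevant item per query, Precision@k and mAP add little beyond what MRR already captures, which is consistent with the choice above.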
Online (A/B test)
CTR. Problem: it doesn't track relevance and is inflated by clickbait.
Video completion rate. Problem: a partially watched video might still be relevant to the user.
total watch time
We choose total watch time: it is a good indicator of relevance.
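To make the trade-off between the three online metrics concrete, here is a sketch that computes them from impression logs. The log schema (`clicked`, `watch_seconds`, `video_seconds`), the 90% completion threshold, and the two toy arms are all assumptions made for illustration, not real data.

```python
def online_metrics(impressions, completion_threshold=0.9):
    """Compute CTR, completion rate, and total watch time for one A/B arm.
    Each impression is a dict with keys: clicked, watch_seconds, video_seconds."""
    clicks = [imp for imp in impressions if imp["clicked"]]
    ctr = len(clicks) / len(impressions)
    completed = sum(
        1 for imp in clicks
        if imp["watch_seconds"] >= completion_threshold * imp["video_seconds"]
    )
    completion_rate = completed / len(clicks) if clicks else 0.0
    total_watch_time = sum(imp["watch_seconds"] for imp in clicks)
    return {
        "ctr": ctr,
        "completion_rate": completion_rate,
        "total_watch_time": total_watch_time,
    }

def imp(clicked, watch_seconds, video_seconds=600):
    """Helper to build one hypothetical impression record."""
    return {"clicked": clicked, "watch_seconds": watch_seconds,
            "video_seconds": video_seconds}

# Arm A: clickbait results, many clicks that are abandoned within seconds.
clickbait_arm = [imp(True, 5), imp(True, 8), imp(True, 4), imp(False, 0)]
# Arm B: relevant results, fewer clicks but long views (one only half watched).
relevant_arm = [imp(True, 300), imp(True, 590), imp(False, 0), imp(False, 0)]

a = online_metrics(clickbait_arm)
b = online_metrics(relevant_arm)
print(a)
print(b)
```

In this toy setup arm A wins on CTR while arm B wins on total watch time, and the half-watched video in arm B still contributes 300 seconds even though it does not count as completed.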
7. deployment & prediction service
A/B testing
Scaling
8. monitoring & maintenance
9. optimization and Q&A
reference
Close reading
Further reading