POI Recommendation
Scenarios: Yelp, Meituan, Airbnb
Design a system to find nearby restaurants
Design a system to match drivers with riders for Uber
Design a system to compute ETA for food delivery
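Characteristics: for event recommendation, where timeliness and location matter most, an event no longer exists once it has taken place, so every item can be treated as a cold-start item
Location can be mined with graph features or graph models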
1. requirements
products/use cases
User Search: Users search for restaurants based on location, cuisine, and preferences.
Real-Time Recommendations: Provide real-time recommendations based on user queries.
Cold Start: Handle new restaurants with limited data.
objective
Connect Users with Local Businesses: Help users discover great local businesses
Increase Engagement: Encourage users to explore and interact with more POIs.
constraint
Data Constraints: Limited data on new restaurants.
Volume: Handle a high volume of users and queries.
Latency: Provide real-time results with low latency (e.g., < 200ms).
2. ML task & pipeline
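Prediction targets
Click (whether the user clicks)
Dwell time: for dwell time t, use t/(t+1) as the label. This equals sigmoid(ln t), so it approaches 1 when t is large and 0 when t is small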
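The dwell-time transform can be sketched as follows. This is a minimal illustration: the function names and the `scale` parameter are assumptions added here (with `scale=1.0` it reduces to the t/(t+1) form in the notes), and the time unit is left to the caller.

```python
import math


def dwell_time_label(t: float, scale: float = 1.0) -> float:
    """Squash a non-negative dwell time t into [0, 1) as t / (t + scale).

    `scale` is a hypothetical knob (not in the notes): it is the dwell time
    that maps to a label of 0.5. With scale=1.0 this is exactly t / (t + 1).
    """
    if t < 0:
        raise ValueError("dwell time must be non-negative")
    if scale <= 0:
        raise ValueError("scale must be positive")
    return t / (t + scale)


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


# For t > 0, t / (t + 1) is the same as sigmoid(ln t).
t = 30.0
same = math.isclose(dwell_time_label(t), sigmoid(math.log(t)))
```

With raw seconds and `scale=1.0` the label saturates quickly (10 seconds already maps to about 0.91), so in practice a larger scale or a coarser time unit may be worth trying.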
3. data
Data collection
User Profiles: Demographics, preferences, and past interactions
POI Data: Location, cuisine, ratings, reviews, and other attributes
User Location: For localized recommendations, consider only businesses in or near the user's city or neighborhood
Business Data: Restaurant location, cuisine type, user ratings, and reviews
Interaction Data: Past searches, clicks, and visits
Data Processing
Data Cleaning: Handle missing data and outliers
Data Integration: Combine data from different sources into a unified format
Data Augmentation: Use techniques like synthetic data generation to handle cold start problems
4. feature
sparse
dense
User Features
Demographics: Age, location
Preferences: Favorite cuisines, price range
Behavior: Past searches, clicks, and visits
POI Features
Location: Latitude, longitude, and proximity to the user
Attributes: Cuisine, price range, ratings, reviews
Popularity: Number of visits, ratings, and reviews
Context Features
Time of Day: Recommendations may vary based on the time of day
Device: Recommendations may differ based on the device used (e.g., mobile vs. desktop)
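Two of the features above can be sketched in a few lines: proximity to the user as great-circle (haversine) distance, and time of day encoded cyclically so that 23:00 and 00:00 end up close together. The function names and the choice of a sine/cosine encoding are assumptions made for illustration, not something the notes prescribe.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hour_cyclic(hour: float) -> tuple:
    """Encode hour of day (0-24) as a point on the unit circle: (sin, cos)."""
    angle = 2 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)


def context_features(user_lat, user_lon, poi_lat, poi_lon, hour):
    """Build a small dense feature dict for one (user, POI, request) triple."""
    hour_sin, hour_cos = hour_cyclic(hour)
    return {
        "distance_km": haversine_km(user_lat, user_lon, poi_lat, poi_lon),
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
    }
```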
5. model
retrieval
The candidate set depends on the filters applied
Collaborative Filtering: Recommend POIs based on similar users' preferences
Content-Based Filtering: Recommend POIs similar to those the user has interacted with in the past
Graph-Based Models: Use graph algorithms (e.g., Node2Vec, GraphSAGE) to capture spatial relationships between POIs
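A minimal sketch of filter-then-score retrieval, assuming a coarse latitude/longitude grid as the location filter and content-based cosine similarity as the scorer. The grid size `CELL_DEG`, the POI dict layout (`lat`, `lon`, `vec`), and all function names are hypothetical; a production system would more likely use a geohash or similar spatial index plus approximate nearest-neighbour search.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size: about 1.1 km in latitude; longitude cells narrow away from the equator


def cell_of(lat, lon):
    """Map a coordinate to its integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def build_index(pois):
    """Bucket POIs by grid cell so nearby candidates can be looked up directly."""
    index = defaultdict(list)
    for poi in pois:
        index[cell_of(poi["lat"], poi["lon"])].append(poi)
    return index


def nearby(index, lat, lon):
    """Location filter: POIs in the user's cell and its 8 neighbouring cells."""
    cx, cy = cell_of(lat, lon)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(index.get((cx + dx, cy + dy), []))
    return found


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def retrieve(index, lat, lon, user_vec, k=10):
    """Content-based retrieval: filter by location, then rank by similarity to the user profile."""
    candidates = nearby(index, lat, lon)
    candidates.sort(key=lambda p: cosine(user_vec, p["vec"]), reverse=True)
    return candidates[:k]
```

Filtering first keeps the scored set small, which helps with the latency constraint; the same structure works if the scorer is replaced by collaborative-filtering or graph embeddings.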
ranking
rerank
6. evaluation
offline
NDCG
MAP
precision, recall, and AUC-ROC
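The ranking metrics above can be sketched as follows. Assumptions made here: NDCG uses linear gain with a log2 discount, the ideal ordering is computed from the same list of relevance labels, and average precision is normalized by the number of relevant items that appear in the ranked list. MAP is then the mean of average precision over queries.

```python
import math


def dcg_at_k(rels, k):
    """Discounted cumulative gain over the top-k relevance labels (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))


def ndcg_at_k(rels, k):
    """DCG divided by the DCG of the ideal (descending) ordering; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


def average_precision(rels):
    """Mean of precision@i over the positions i that hold a relevant item (binary labels)."""
    hits, total = 0, 0.0
    for i, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def mean_average_precision(rels_per_query):
    """MAP: average precision averaged over queries."""
    if not rels_per_query:
        return 0.0
    return sum(average_precision(r) for r in rels_per_query) / len(rels_per_query)
```

Here `rels` is the list of relevance labels in the order the model ranked the POIs, for example `[1, 0, 1]` when the first and third results were clicked.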
online: A/B testing, holdout, canary
7. deploy & serving
Batch Serving: Periodically update restaurant recommendations.
Online Serving: Real-time requests for user queries.
8. monitor & maintenance
9. Optimization and Q&A
Cold-start items
A two-tower model can assign a default embedding instead of a random initialization
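One way to realize a default embedding is sketched below. The notes do not say how the default is chosen; this sketch assumes the mean embedding of known items in the same category, falling back to the global mean. The class name, the category lookup, and the fallback order are all hypothetical.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


class ItemEmbeddingTable:
    """Item-tower lookup that serves a default vector for unseen (cold-start) items."""

    def __init__(self, embeddings, categories):
        # embeddings: item_id -> trained vector; categories: item_id -> category label
        self.embeddings = embeddings
        by_category = defaultdict(list)
        for item_id, vec in embeddings.items():
            by_category[categories[item_id]].append(vec)
        # Assumed defaults: per-category mean, then the global mean as a last resort.
        self.category_default = {c: mean_vector(v) for c, v in by_category.items()}
        self.global_default = mean_vector(list(embeddings.values()))

    def lookup(self, item_id, category=None):
        """Return the trained vector if the item is known, otherwise a default."""
        if item_id in self.embeddings:
            return self.embeddings[item_id]
        if category in self.category_default:
            return self.category_default[category]
        return self.global_default
```

Compared with a random vector, a default like this places a new restaurant near similar known restaurants in the embedding space, so it can be retrieved sensibly before it has any interaction data.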