Recommendation System Architecture & Plan
This document consolidates all design conclusions into a single, step-by-step plan for building a high-performance, personalized recommendation system.
Phase 1: Data & Feature Engineering
1. Required Data Sources:
- User Profiles: Static data including user ID, demographics, and stated preferences.
- Product Catalogs: All attributes for items from both restaurants and supermarkets.
- Interaction Logs: High-volume event streams including clicks, views, add-to-carts, purchases, and impressions.
2. Feature Engineering Pipeline:
- Text Features: Processed by `all-MiniLM-L6-v2` for semantic embeddings.
- Image Features: Processed by `EfficientNetB0` for visual embeddings.
- Categorical Features: Mapped to trainable embedding vectors. (An extraction sketch for all three feature types follows.)
3. Negative Sampling Strategy:
A Hybrid Negative Sampling approach will be used:
- Primary (Corrected In-Batch Negatives): Uses other items in a training batch as negatives, with a correction factor to reduce popularity bias (sketched after this list).
- Augmentation (Hard Negatives): Supplements with true hard negatives from impression logs (items shown but ignored).
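One common way to implement the correction is the logQ adjustment from sampling-bias-corrected retrieval; a minimal TensorFlow sketch, where the item sampling probabilities are assumed to come from the training stream's item frequencies:

```python
import tensorflow as tf

def corrected_in_batch_loss(user_emb, item_emb, item_log_q):
    """In-batch sampled softmax with a logQ correction for popularity bias.

    user_emb:   [B, D] user-tower embeddings
    item_emb:   [B, D] product-tower embeddings (row i is user i's positive)
    item_log_q: [B]    log of each item's sampling probability
    """
    logits = tf.matmul(user_emb, item_emb, transpose_b=True)   # [B, B] similarities
    logits -= item_log_q[tf.newaxis, :]                        # demote popular items
    labels = tf.range(tf.shape(logits)[0])                     # positives on diagonal
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    )
```

Hard negatives mined from impression logs can be appended as extra columns of `item_emb`, each with its own logQ term.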
Phase 2: Model Architecture & Training
1. Core Architecture: Two-Tower Model
- User Tower: Generates a 128-dim embedding for a user's intent and taste. Passes concatenated inputs through two `Dense` layers (1024 -> 512).
- Product/Business Tower: A single, unified tower for all product types to enable cross-domain learning. Passes concatenated inputs through two `Dense` layers (1024 -> 512). (A Keras sketch of the shared tower shape follows.)
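A Keras sketch of the tower shape. The ReLU activations, L2-normalized outputs, final 128-dim projection after the two `Dense` layers, and input widths are assumptions:

```python
import tensorflow as tf

def build_tower(input_dim: int, name: str) -> tf.keras.Model:
    # Concatenated features -> Dense 1024 -> Dense 512 -> 128-dim embedding.
    inputs = tf.keras.Input(shape=(input_dim,), name=f"{name}_features")
    x = tf.keras.layers.Dense(1024, activation="relu")(inputs)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    emb = tf.keras.layers.Dense(128, name=f"{name}_embedding")(x)
    # L2-normalize so dot products behave like cosine similarity at retrieval time.
    emb = tf.keras.layers.UnitNormalization()(emb)
    return tf.keras.Model(inputs, emb, name=f"{name}_tower")

user_tower = build_tower(input_dim=1536, name="user")        # width is a placeholder
product_tower = build_tower(input_dim=1920, name="product")  # shared across domains
```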
2. Training Protocol:
- Loss Function: The definitive choice is Softmax Cross-Entropy, applied over the sampled negatives from Phase 1, for its stability and efficiency.
- Cold-Start Handling: Feature Dropout will be used during training to force the model to learn content-based recommendations (sketched below).
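A sketch of Feature Dropout as a custom layer: during training it zeroes an entire feature group (e.g. the user-ID embedding) for a random subset of examples, so the towers cannot rely solely on memorized IDs. The drop rate and per-group gating are assumptions:

```python
import tensorflow as tf

class FeatureDropout(tf.keras.layers.Layer):
    """Zeroes a whole feature group for a random subset of examples during
    training, forcing the model to fall back on content features."""

    def __init__(self, rate: float = 0.2, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, feature_group, training=False):
        if not training:
            return feature_group
        batch = tf.shape(feature_group)[0]
        keep = tf.cast(
            tf.random.uniform([batch, 1]) >= self.rate, feature_group.dtype
        )
        return feature_group * keep  # drop the entire group, not single units
```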
Phase 3: Deployment & Serving Pipeline
1. Multi-Stage Serving Pipeline (a glue-code sketch follows this list):
- Stage 1: Retrieval: The User Tower computes an embedding in real time to query a ScaNN index, retrieving ~200 candidates. (Target Latency: ~70ms)
- Stage 2: Re-ranking: A lightweight XGBoost model re-orders the candidates using richer cross-features for precision. (Target Latency: ~25ms)
- Stage 3: Fairness & Diversity: Business logic boosts scores for new/less popular businesses and ensures diversity. (Target Latency: ~10ms)
- Stage 4: Real-Time Filtering: The list is filtered against a real-time database (e.g., Redis) to remove out-of-stock items. (Target Latency: ~10ms)
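Glue code for the four stages might look like the sketch below. The ScaNN searcher, XGBoost re-ranker, the `build_cross_features` / `fairness_boost` helpers, and the `stock:<id> -> "1"` Redis convention are all hypothetical placeholders:

```python
import numpy as np

def recommend(user_features, scann_searcher, reranker, redis_client, k=20):
    # Stage 1: Retrieval (~70ms) -- user embedding + ScaNN ANN lookup.
    user_emb = user_tower(user_features[np.newaxis, :]).numpy()
    neighbor_ids, _ = scann_searcher.search_batched(user_emb, final_num_neighbors=200)
    candidates = neighbor_ids[0].tolist()

    # Stage 2: Re-ranking (~25ms) -- XGBoost over richer cross-features.
    scores = reranker.predict(build_cross_features(user_features, candidates))

    # Stage 3: Fairness & diversity (~10ms) -- boost new/less popular businesses.
    scores = scores + fairness_boost(candidates)

    # Stage 4: Real-time filtering (~10ms) -- drop out-of-stock items via Redis.
    stock = redis_client.mget([f"stock:{item_id}" for item_id in candidates])
    ranked = sorted(zip(scores.tolist(), candidates, stock), reverse=True)
    return [item for _, item, in_stock in ranked if in_stock == b"1"][:k]
```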
2. Handling New Items:
Continuous Index Rebuilding via an automated, offline "hot-swap" pipeline makes new items discoverable within minutes: the full index is rebuilt offline (so ANN accuracy never degrades through incremental insertions) and then swapped into serving atomically, with zero downtime.
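A condensed sketch of the rebuild step using the open-source ScaNN builder API; the tree/quantization parameters and the single-reference swap into serving state are assumptions:

```python
import scann

def rebuild_and_swap(embeddings, item_ids, serving_state):
    # Build a fresh index over the full, updated catalog. Retraining the
    # partitions from scratch is what keeps recall high as items are added.
    searcher = (
        scann.scann_ops_pybind.builder(embeddings, 200, "dot_product")
        .tree(num_leaves=2000, num_leaves_to_search=100, training_sample_size=250_000)
        .score_ah(2, anisotropic_quantization_threshold=0.2)
        .reorder(250)
        .build()
    )
    # Atomic hot swap: replace the serving reference in one assignment so
    # in-flight requests keep using the old index until it is released.
    serving_state["searcher"], serving_state["ids"] = searcher, item_ids
```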
AWS Service Mapping
This section maps each component of the plan to the appropriate AWS service, creating a scalable and manageable cloud architecture.
Phase 1: Data & Feature Engineering
- Interaction Logs (Real-time): Amazon Kinesis Data Streams (ingestion sketched after this list)
- Raw Data Lake: Amazon S3
- Product Catalogs / User Profiles: Amazon DynamoDB or Amazon Aurora
- Batch ETL Processing: AWS Glue or Amazon EMR
- Feature Storage & Serving: Amazon SageMaker Feature Store
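Client-side ingestion into the stream can be a single `boto3` call; the stream name and event schema below are assumptions:

```python
import json
import time

import boto3

kinesis = boto3.client("kinesis")

def log_interaction(user_id: str, item_id: str, event_type: str) -> None:
    # event_type: one of click / view / add_to_cart / purchase / impression.
    kinesis.put_record(
        StreamName="interaction-events",
        Data=json.dumps({
            "user_id": user_id,
            "item_id": item_id,
            "event_type": event_type,
            "ts": time.time(),
        }),
        PartitionKey=user_id,  # keeps one user's events ordered within a shard
    )
```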
Phase 2: Model Architecture & Training
- Model Development: Amazon SageMaker Studio
- Model Training: Amazon SageMaker Training Jobs (launch sketched after this list)
- Hyperparameter Tuning: Amazon SageMaker Automatic Model Tuning
- Model Artifact Storage: Amazon S3
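Launching the two-tower training as a SageMaker Training Job could look like this sketch using the SageMaker Python SDK; the entry point, role ARN, instance type, S3 paths, and hyperparameters are placeholders:

```python
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train_two_tower.py",          # training script (placeholder)
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.g5.xlarge",
    framework_version="2.13",
    py_version="py310",
    hyperparameters={"epochs": 10, "batch_size": 1024, "feature_dropout": 0.2},
)
# Reads features from S3 (e.g. exported from the Feature Store) and writes
# model artifacts back to S3 when the job completes.
estimator.fit({"train": "s3://my-bucket/features/train/"})
```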
Phase 3: Deployment & Serving Pipeline
- Container Management: Amazon ECR (Elastic Container Registry)
- User Tower & Re-ranker Inference: Amazon SageMaker Endpoints
- Candidate Index (ANN): Amazon OpenSearch Service (with k-NN)
- Fairness, Diversity & Filtering Logic: AWS Lambda (handler sketched after this list)
- Real-time State (e.g., stock): Amazon ElastiCache for Redis
- Continuous Index Rebuilding Orchestration: AWS Step Functions + AWS CodePipeline
- Load Balancing: Application Load Balancer (ALB)
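To illustrate how the Lambda stages tie into the endpoints, a minimal handler sketch; the endpoint name and payload schema are assumptions:

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # Stage 1 input: ask the User Tower endpoint for the caller's embedding.
    response = runtime.invoke_endpoint(
        EndpointName="user-tower-endpoint",          # placeholder name
        ContentType="application/json",
        Body=json.dumps({"user_id": event["user_id"]}),
    )
    user_emb = json.loads(response["Body"].read())["embedding"]
    # Downstream (not shown): k-NN query against OpenSearch, re-ranker
    # endpoint, fairness boosts, and ElastiCache stock filtering, mirroring
    # the four-stage pipeline above.
    return {"embedding": user_emb}
```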
Data Lineage & Latency Graph
This graph shows the real-time data flow for a single recommendation request, including target latencies for each stage.