Recommendation System Architecture & Plan
This document consolidates all design conclusions into a single, step-by-step plan for building a high-performance, personalized recommendation system.
Phase 1: Data & Feature Engineering
1. Required Data Sources:
- User Profiles: Static data including user ID, demographics, and stated preferences.
- Product Catalogs: All attributes for items from both restaurants and supermarkets.
- Interaction Logs: High-volume event streams including clicks, views, add-to-carts, purchases, and impressions.
2. Feature Engineering Pipeline:
- Text Features: Processed by `all-MiniLM-L6-v2` for semantic embeddings.
- Image Features: Processed by `EfficientNetB0` for visual embeddings.
- Categorical Features: Mapped to trainable embedding vectors. (An extraction sketch for all three feature types follows.)
3. Negative Sampling Strategy:
A Hybrid Negative Sampling approach will be used:
- Primary (Corrected In-Batch Negatives): Uses other items in a training batch as negatives, with a correction factor to reduce popularity bias (sketched after this list).
- Augmentation (Hard Negatives): Supplements with true hard negatives from impression logs (items shown but ignored).
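One common way to implement the correction is the logQ adjustment from sampling-bias-corrected retrieval; a minimal TensorFlow sketch, where the item sampling probabilities are assumed to come from the training stream's item frequencies:

```python
import tensorflow as tf

def corrected_in_batch_loss(user_emb, item_emb, item_log_q):
    """In-batch sampled softmax with a logQ correction for popularity bias.

    user_emb:   [B, D] user-tower embeddings
    item_emb:   [B, D] product-tower embeddings (row i is user i's positive)
    item_log_q: [B]    log of each item's sampling probability
    """
    logits = tf.matmul(user_emb, item_emb, transpose_b=True)   # [B, B] similarities
    logits -= item_log_q[tf.newaxis, :]                        # demote popular items
    labels = tf.range(tf.shape(logits)[0])                     # positives on diagonal
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    )
```

Hard negatives mined from impression logs can be appended as extra columns of `item_emb`, each with its own logQ term.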
Phase 2: Model Architecture & Training
1. Core Architecture: Two-Tower Model
- User Tower: Generates a 128-dim embedding for a user's intent and taste. Passes concatenated inputs through two `Dense` layers (1024 -> 512).
- Product/Business Tower: A single, unified tower for all product types to enable cross-domain learning. Passes concatenated inputs through two `Dense` layers (1024 -> 512). (A Keras sketch of the shared tower shape follows.)
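A Keras sketch of the tower shape. The ReLU activations, L2-normalized outputs, final 128-dim projection after the two `Dense` layers, and input widths are assumptions:

```python
import tensorflow as tf

def build_tower(input_dim: int, name: str) -> tf.keras.Model:
    # Concatenated features -> Dense 1024 -> Dense 512 -> 128-dim embedding.
    inputs = tf.keras.Input(shape=(input_dim,), name=f"{name}_features")
    x = tf.keras.layers.Dense(1024, activation="relu")(inputs)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    emb = tf.keras.layers.Dense(128, name=f"{name}_embedding")(x)
    # L2-normalize so dot products behave like cosine similarity at retrieval time.
    emb = tf.keras.layers.UnitNormalization()(emb)
    return tf.keras.Model(inputs, emb, name=f"{name}_tower")

user_tower = build_tower(input_dim=1536, name="user")        # width is a placeholder
product_tower = build_tower(input_dim=1920, name="product")  # shared across domains
```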
2. Training Protocol:
- Loss Function: The definitive choice is Softmax Cross-Entropy, applied over the sampled negatives from Phase 1, for its stability and efficiency.
- Cold-Start Handling: Feature Dropout will be used during training to force the model to learn content-based recommendations (sketched below).
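A sketch of Feature Dropout as a custom layer: during training it zeroes an entire feature group (e.g. the user-ID embedding) for a random subset of examples, so the towers cannot rely solely on memorized IDs. The drop rate and per-group gating are assumptions:

```python
import tensorflow as tf

class FeatureDropout(tf.keras.layers.Layer):
    """Zeroes a whole feature group for a random subset of examples during
    training, forcing the model to fall back on content features."""

    def __init__(self, rate: float = 0.2, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, feature_group, training=False):
        if not training:
            return feature_group
        batch = tf.shape(feature_group)[0]
        keep = tf.cast(
            tf.random.uniform([batch, 1]) >= self.rate, feature_group.dtype
        )
        return feature_group * keep  # drop the entire group, not single units
```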
Phase 3: Deployment & Serving Pipeline
1. Multi-Stage Serving Pipeline (a glue-code sketch follows this list):
- Stage 1: Retrieval: The User Tower computes an embedding in real time to query a ScaNN index, retrieving ~200 candidates. (Target Latency: ~70ms)
- Stage 2: Re-ranking: A lightweight XGBoost model re-orders the candidates using richer cross-features for precision. (Target Latency: ~25ms)
- Stage 3: Fairness & Diversity: Business logic boosts scores for new/less popular businesses and ensures diversity. (Target Latency: ~10ms)
- Stage 4: Real-Time Filtering: The list is filtered against a real-time database (e.g., Redis) to remove out-of-stock items. (Target Latency: ~10ms)
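Glue code for the four stages might look like the sketch below. The ScaNN searcher, XGBoost re-ranker, the `build_cross_features` / `fairness_boost` helpers, and the `stock:<id> -> "1"` Redis convention are all hypothetical placeholders:

```python
import numpy as np

def recommend(user_features, scann_searcher, reranker, redis_client, k=20):
    # Stage 1: Retrieval (~70ms) -- user embedding + ScaNN ANN lookup.
    user_emb = user_tower(user_features[np.newaxis, :]).numpy()
    neighbor_ids, _ = scann_searcher.search_batched(user_emb, final_num_neighbors=200)
    candidates = neighbor_ids[0].tolist()

    # Stage 2: Re-ranking (~25ms) -- XGBoost over richer cross-features.
    scores = reranker.predict(build_cross_features(user_features, candidates))

    # Stage 3: Fairness & diversity (~10ms) -- boost new/less popular businesses.
    scores = scores + fairness_boost(candidates)

    # Stage 4: Real-time filtering (~10ms) -- drop out-of-stock items via Redis.
    stock = redis_client.mget([f"stock:{item_id}" for item_id in candidates])
    ranked = sorted(zip(scores.tolist(), candidates, stock), reverse=True)
    return [item for _, item, in_stock in ranked if in_stock == b"1"][:k]
```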
2. Handling New Items:
Continuous Index Rebuilding via an automated, offline "hot-swap" pipeline makes new items discoverable within minutes: the full index is rebuilt offline (so ANN accuracy never degrades through incremental insertions) and then swapped into serving atomically, with zero downtime.
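A condensed sketch of the rebuild step using the open-source ScaNN builder API; the tree/quantization parameters and the single-reference swap into serving state are assumptions:

```python
import scann

def rebuild_and_swap(embeddings, item_ids, serving_state):
    # Build a fresh index over the full, updated catalog. Retraining the
    # partitions from scratch is what keeps recall high as items are added.
    searcher = (
        scann.scann_ops_pybind.builder(embeddings, 200, "dot_product")
        .tree(num_leaves=2000, num_leaves_to_search=100, training_sample_size=250_000)
        .score_ah(2, anisotropic_quantization_threshold=0.2)
        .reorder(250)
        .build()
    )
    # Atomic hot swap: replace the serving reference in one assignment so
    # in-flight requests keep using the old index until it is released.
    serving_state["searcher"], serving_state["ids"] = searcher, item_ids
```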
AWS Service Mapping
This section maps each component of the plan to the appropriate AWS service, creating a scalable and manageable cloud architecture.
Phase 1: Data & Feature Engineering
- Interaction Logs (Real-time): Amazon Kinesis Data Streams (ingestion sketched after this list)
- Raw Data Lake: Amazon S3
- Product Catalogs / User Profiles: Amazon DynamoDB or Amazon Aurora
- Batch ETL Processing: AWS Glue or Amazon EMR
- Feature Storage & Serving: Amazon SageMaker Feature Store
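Client-side ingestion into the stream can be a single `boto3` call; the stream name and event schema below are assumptions:

```python
import json
import time

import boto3

kinesis = boto3.client("kinesis")

def log_interaction(user_id: str, item_id: str, event_type: str) -> None:
    # event_type: one of click / view / add_to_cart / purchase / impression.
    kinesis.put_record(
        StreamName="interaction-events",
        Data=json.dumps({
            "user_id": user_id,
            "item_id": item_id,
            "event_type": event_type,
            "ts": time.time(),
        }),
        PartitionKey=user_id,  # keeps one user's events ordered within a shard
    )
```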
Phase 2: Model Architecture & Training
- Model Development: Amazon SageMaker Studio
- Model Training: Amazon SageMaker Training Jobs (launch sketched after this list)
- Hyperparameter Tuning: Amazon SageMaker Automatic Model Tuning
- Model Artifact Storage: Amazon S3
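Launching the two-tower training as a SageMaker Training Job could look like this sketch using the SageMaker Python SDK; the entry point, role ARN, instance type, S3 paths, and hyperparameters are placeholders:

```python
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train_two_tower.py",          # training script (placeholder)
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.g5.xlarge",
    framework_version="2.13",
    py_version="py310",
    hyperparameters={"epochs": 10, "batch_size": 1024, "feature_dropout": 0.2},
)
# Reads features from S3 (e.g. exported from the Feature Store) and writes
# model artifacts back to S3 when the job completes.
estimator.fit({"train": "s3://my-bucket/features/train/"})
```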
Phase 3: Deployment & Serving Pipeline
- Container Management: Amazon ECR (Elastic Container Registry)
- User Tower & Re-ranker Inference: Amazon SageMaker Endpoints
- Candidate Index (ANN): Amazon OpenSearch Service (with k-NN)
- Fairness, Diversity & Filtering Logic: AWS Lambda (handler sketched after this list)
- Real-time State (e.g., stock): Amazon ElastiCache for Redis
- Continuous Index Rebuilding Orchestration: AWS Step Functions + AWS CodePipeline
- Load Balancing: Application Load Balancer (ALB)
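To illustrate how the Lambda stages tie into the endpoints, a minimal handler sketch; the endpoint name and payload schema are assumptions:

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # Stage 1 input: ask the User Tower endpoint for the caller's embedding.
    response = runtime.invoke_endpoint(
        EndpointName="user-tower-endpoint",          # placeholder name
        ContentType="application/json",
        Body=json.dumps({"user_id": event["user_id"]}),
    )
    user_emb = json.loads(response["Body"].read())["embedding"]
    # Downstream (not shown): k-NN query against OpenSearch, re-ranker
    # endpoint, fairness boosts, and ElastiCache stock filtering, mirroring
    # the four-stage pipeline above.
    return {"embedding": user_emb}
```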
Data Lineage & Latency Graph
This graph shows the real-time data flow for a single recommendation request, including target latencies for each stage.