SST-5 Sentiment Classes
SST-5 (Stanford Sentiment Treebank, 5-class setup) is a fine-grained sentiment task with five labels: very negative, negative, neutral, positive, and very positive. This project uses those same labels for movie review prediction.
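Since every component (model output, API response, frontend display) refers to the same five labels, it helps to keep them in one mapping. The index order below is a project convention (the common very-negative-to-very-positive ordering), not something fixed by the dataset itself:

```python
# SST-5 label mapping. The index order is an assumption of this project,
# following the usual very-negative -> very-positive convention.
SST5_LABELS = {
    0: "very negative",
    1: "negative",
    2: "neutral",
    3: "positive",
    4: "very positive",
}

def label_name(class_index: int) -> str:
    """Translate a model class index into a human-readable SST-5 label."""
    return SST5_LABELS[class_index]
```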
Why DistilBERT
DistilBERT is a compact transformer derived from BERT through knowledge distillation. It preserves strong language understanding while reducing model size and inference cost, making it practical for student projects and lightweight deployment.
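For SST-5, DistilBERT is fine-tuned with a 5-way classification head, so each review comes back as five raw logits. A minimal sketch of turning those logits into a predicted label (pure Python; the logit values and `LABELS` ordering here are illustrative assumptions):

```python
import math

# Assumed index order, matching the SST-5 label convention above.
LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Pick the highest-probability SST-5 label from five raw logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

For example, `predict_label([-1.2, 0.3, 0.1, 2.4, 0.9])` returns `"positive"` with its probability, since index 3 carries the largest logit.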
LIME and SHAP
LIME and SHAP are model-agnostic explanation approaches. LIME explains a single prediction by fitting a local surrogate model on perturbed variants of that example. SHAP estimates each feature's contribution using Shapley values from cooperative game theory. In this starter project, the API serves LIME explanation images and includes SHAP as a dependency left open for future extension.
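LIME's core move, perturbing the input around one example and watching how the black box's score shifts, can be illustrated without the library. The sketch below uses a toy keyword scorer as the black box and a leave-one-word-out perturbation; this is a deliberate simplification of LIME (no sampled neighborhood, no weighted surrogate fit), and all names are hypothetical:

```python
def toy_sentiment_score(text: str) -> float:
    """Stand-in black-box classifier: counts positive vs. negative keywords.
    The real system would call the fine-tuned DistilBERT model here."""
    positives = {"great", "fun", "moving"}
    negatives = {"dull", "boring"}
    words = text.lower().split()
    return sum(w in positives for w in words) - sum(w in negatives for w in words)

def leave_one_out_attribution(text: str, score_fn) -> dict:
    """Attribute a prediction to each word by removing it and measuring
    how the black-box score changes. This captures LIME's perturbation
    idea in miniature, without the surrogate-model fit."""
    words = text.split()
    base = score_fn(text)
    weights = {}
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        weights[w] = base - score_fn(perturbed)
    return weights
```

Running `leave_one_out_attribution("a great but boring film", toy_sentiment_score)` assigns a positive weight to "great", a negative weight to "boring", and zero to the neutral words, which is the shape of output a LIME explanation visualizes.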
System Workflow
- User searches for a movie title on the homepage.
- Backend calls TMDb to fetch movie metadata and audience reviews.
- Each review is scored by the fine-tuned DistilBERT SST-5 model.
- Sentiment counts and an overall summary are returned to the frontend.
- User can request a LIME explanation for any selected review.
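The scoring and summary steps above can be sketched end to end. `predict_sentiment` stands in for the fine-tuned DistilBERT model (here a trivial keyword rule so the sketch runs standalone), and the review list stands in for the TMDb response; both are hypothetical placeholders:

```python
from collections import Counter

def predict_sentiment(review: str) -> str:
    """Placeholder for the DistilBERT SST-5 model: a trivial
    keyword rule so this sketch is self-contained."""
    text = review.lower()
    if "love" in text or "great" in text:
        return "positive"
    if "hate" in text or "boring" in text:
        return "negative"
    return "neutral"

def summarize_reviews(reviews):
    """Score each review and return per-label counts plus a majority
    label, mirroring the workflow's summary step."""
    counts = Counter(predict_sentiment(r) for r in reviews)
    overall = counts.most_common(1)[0][0] if counts else "neutral"
    return {"counts": dict(counts), "overall": overall}
```

The frontend would render `counts` as the sentiment breakdown and `overall` as the headline summary for the searched movie.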