Publication Date: 2024/06/08
Abstract: This paper presents a sentiment analysis project on IMDb movie reviews that classifies each review as positive or negative based on its text. Using a dataset of 50,000 IMDb movie reviews sourced from Kaggle, the study treats the task as binary classification and applies preprocessing techniques such as TF-IDF vectorization. The dataset is split into training and testing sets, with models trained on the former and evaluated on the latter. Three machine learning algorithms (Logistic Regression, Random Forest, and Decision Tree) are implemented and compared using performance metrics including precision, recall, and F1-score. Results indicate that Logistic Regression outperforms the other two models on this classification task. The paper concludes by highlighting the project’s contributions and suggesting avenues for future research, emphasizing the potential benefits of expanding the set of sentiment classes and the dataset size.
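The abstract compares models by precision, recall, and F1-score. As a minimal sketch of how those three metrics are computed for a binary positive/negative task, the following uses only the Python standard library. The function name `precision_recall_f1` and the toy labels are illustrative assumptions, not taken from the paper, which does not state how its metrics were computed.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one class of a binary task.

    Hypothetical helper for illustration; labels equal to `positive`
    are treated as the positive-sentiment class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels (assumed, not real results): 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision is the share of reviews predicted positive that really are positive, recall is the share of truly positive reviews that were found, and F1 is their harmonic mean, so a model cannot score well on F1 by favoring one of the two at the expense of the other.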
DOI: https://doi.org/10.38124/ijisrt/IJISRT24MAY1625
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT24MAY1625.pdf
REFERENCES