Published on 30 March 2025

Version 2024

Sentiment Analysis of User Reviews in Online Stores Using Natural Language Processing and Machine Learning

Hasani, Amirhosein;Ebrahimi, Fatemeh

Description

Users' opinions and perspectives underlie many human activities and decision-making processes, and they play a crucial role in analyzing customer behavior and preferences. User reviews recorded on e-commerce websites have become a valuable source for identifying users’ needs and interests, and sentiment analysis is an effective way to extract these insights. In this study, sentence-level opinion mining techniques were employed to analyze user reviews collected from two popular Persian websites. Six models, XGBoost, Naive Bayes, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Support Vector Machine (SVM), were used to classify sentiments into positive and negative categories. On the Digikala dataset, the XGBoost, Logistic Regression, and Random Forest models achieved scores of 1 on all evaluation metrics: accuracy, precision, recall, and F1-score. The SVM model also performed very strongly, with an accuracy of 0.97 and an F1-score of 0.98. In contrast, the KNN and Naive Bayes models achieved accuracies of 0.85 and 0.76, respectively, indicating weaker performance than the other models. Similar results were obtained on the Fidiboo dataset, where the XGBoost, Logistic Regression, and Random Forest models again achieved scores of 1 across all metrics. These findings suggest that advanced models such as XGBoost and Random Forest, owing to their complex and flexible structures, are highly capable of identifying patterns in Persian data and classifying them accurately.
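The four evaluation metrics named in the description can all be derived from the confusion counts of a binary classifier. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function name `binary_metrics` and the sample labels are hypothetical and are not taken from the Digikala or Fidiboo data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: labels are comparable with ==, and `positive` marks the
    positive-sentiment class. Ratios with a zero denominator are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    # Confusion counts for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        # F1 is the harmonic mean of precision and recall, written in count form.
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical labels: 1 = positive review, 0 = negative review.
example_true = [1, 1, 1, 0, 0, 1, 0, 1]
example_pred = [1, 1, 0, 0, 1, 1, 0, 1]
print(binary_metrics(example_true, example_pred))
```

Because accuracy counts true negatives while F1 does not, the two can differ on the same predictions, which is how a model can report an accuracy of 0.97 alongside an F1-score of 0.98.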


Metrics

Dataset Index

0.3

FAIR Score

13%

Citations

0

Mentions

0


Publication Details

DOI

Publisher

Zenodo

Assigned Domain

Subfield

Artificial Intelligence

Field

Computer Science

Domain

Physical Sciences

Confidence Score

94%

Source

OpenAlex

Keywords

Sentiment Analysis, Persian Language Processing, Machine Learning Algorithms, E-commerce User Reviews

Normalization Factors

FT

13.46

CTw

1.00

MTw

1.00