Fault of Our Stars: Behavioral Drivers of Rating-Sentiment Incongruence
Abstract: When people share experiences online, they often express thoughts in two ways: a star rating and a written review. In sentiment analysis, ratings are widely used as convenient weak labels for textual sentiment, yet whether the two actually agree is rarely questioned. This study investigates sentiment-rating incongruence, where the sentiment expressed in review text differs from the sentiment implied by the assigned star rating, in Sri Lankan tourism attraction reviews. A dataset of 16,156 reviews from 2010 to 2023 is analyzed using a transformer-based sentiment pipeline that derives textual sentiment independently of assigned ratings. Incongruence occurs in 18.6% of reviews and falls into six directional patterns, with Conservative Rater and Obligatory 5-Star behaviors accounting for the majority of mismatches. Prevalence also varies across venue types, with museums showing the highest rates. Statistical tests, logistic regression, Random Forest, and SHAP analysis identify venue type, reviewer expertise, review length, and temporal factors as contributors to rating-text divergence. Overall, this study demonstrates that star ratings are not interchangeable with textual sentiment and should be validated before being treated as ground-truth labels in NLP.
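As a rough illustration of what "incongruence" means operationally, the sketch below compares the sentiment implied by a star rating with a sentiment label predicted from the text. It is not the authors' pipeline: the star-to-sentiment thresholds, the three-class label set, and the function names are all assumptions made for this example, and the text labels are supplied by hand rather than produced by a transformer model. Under a three-class scheme there are exactly six ordered mismatched pairs, which matches the number of directional patterns reported in the abstract, but whether the paper defines its patterns this way is also an assumption.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

LABELS = ("negative", "neutral", "positive")


def rating_to_sentiment(stars: int) -> str:
    """Map a 1-5 star rating to a sentiment class.

    Assumed thresholds (not taken from the paper):
    1-2 -> negative, 3 -> neutral, 4-5 -> positive.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"stars must be in 1..5, got {stars}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def incongruence_pattern(stars: int, text_sentiment: str) -> Optional[Tuple[str, str]]:
    """Return (rating_implied, text_predicted) if they differ, else None."""
    if text_sentiment not in LABELS:
        raise ValueError(f"unknown sentiment label: {text_sentiment!r}")
    implied = rating_to_sentiment(stars)
    return None if implied == text_sentiment else (implied, text_sentiment)


def summarize(reviews: Iterable[Tuple[int, str]]) -> Tuple[float, Counter]:
    """Compute the incongruence rate and a count of each directional pattern.

    `reviews` yields (stars, text_sentiment) pairs, where text_sentiment is
    assumed to come from a classifier that never saw the rating.
    """
    patterns: Counter = Counter()
    total = 0
    for stars, text_sentiment in reviews:
        total += 1
        pattern = incongruence_pattern(stars, text_sentiment)
        if pattern is not None:
            patterns[pattern] += 1
    rate = sum(patterns.values()) / total if total else 0.0
    return rate, patterns


if __name__ == "__main__":
    # Hand-made toy data, not drawn from the paper's dataset.
    toy_reviews = [(5, "positive"), (5, "negative"), (3, "neutral"), (1, "positive")]
    rate, patterns = summarize(toy_reviews)
    print(f"incongruence rate: {rate:.2f}")
    for (implied, predicted), count in patterns.items():
        print(f"rating implies {implied}, text reads {predicted}: {count}")
```

Keeping the pattern as an ordered pair preserves direction, so a 5-star review with negative text is counted separately from a 1-star review with positive text.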
| Comments: | 7 pages, 3 figures. Submitted to MerCon 2026 |
| Subjects: | Computation and Language (cs.CL) |
| Cite as: | arXiv:2606.25518 [cs.CL] |
| | (or arXiv:2606.25518v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2606.25518 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Abaiyan Ramanaish Mr.
[v1] Wed, 24 Jun 2026 07:51:56 UTC (426 KB)