Two years ago, Fengjun Li decided she wanted to get a haircut.
Because she was relatively new to Lawrence, she turned to Google user reviews to find the best salon, but what she found made her suspicious.
Li, an associate professor in the department of electrical engineering and computer science at the University, found that most of the user reviews were short, simple and nondescriptive, and that the vast majority were positive, usually with perfect scores. After checking user reviews of other products and services on a number of review sites, she noticed several occurrences of similarly worded and seemingly fake or biased reviews.
Ultimately, this led Li to start a research project with Hyunjin Seo, a journalism professor at the University, to help readers determine whether reviews are fake or real. Their goal is to create an open-source algorithm that detects whether a review is fake and whether the reviewer is trustworthy. It will be freely available so that other researchers, users and companies can incorporate it into their own work.
“Spam behavior is not that obvious,” Li said. “And because hundreds and thousands of reviews are there for one brand, it’s difficult for people to go over all of them. So I started to ask myself ‘can we use computer science techniques, can we use algorithms, to help people make sound decisions about the quality of reviews?’”
The two-year project began in early 2014 and received $206,305 in funding from the University Internal Revenue Code (IRC). Li and Seo said they hope to finish by the end of the two-year schedule in early 2016.
Existing research has already made progress on content analysis detectors, but Li believes there is room for advancement, specifically in detection rates. An algorithm developed in 2011 by researchers at Cornell University is used on their website, reviewskeptic.com, to detect fake reviews, but its detection rate is around 90 percent.
Li and Seo said they hope to improve detection rates by first refining existing methods of content analysis and then incorporating new information, such as the relationships between a reviewer and the audience.
“We took a look at Yelp.com and found that it has a friendship structure,” Li said. “It allows the reviewers to form social friendships and we wonder ‘can this additional structure help us?’ We want to incorporate the trust introduced by the friendship structure into our spam detection model.”
Additionally, Li and Seo hope to incorporate reputation into their algorithm. By looking at a reviewer's past reviews, the algorithm might be able to predict that reviewer's future opinions on similar items.
Today, online reviews are regularly created and used by millions of people. The popular website Yelp.com reported a monthly average of 138 million unique visitors this year. User reviews are commonly viewed as more trustworthy and honest than traditional paid reviews, but their popularity has led to all-too-common instances of fake user reviews, made to manipulate the perceived trustworthiness of the medium.
Fake reviews are created for a number of reasons, including spam, endorsement and damaging a competitor’s reputation. Aside from the legal and ethical issues that fake reviews raise, Seo and Li think they undermine the practice of user peer reviews as a source of more personal, endorsement-free opinions.
“The assumption is that these reviews are done by peers [and] customers who voluntarily share information,” Seo said. “It’s considered as less biased and more trustworthy compared with reviews done by companies. It’s an issue of trust. If this environment is polluted by manipulators then we can’t advance the social-media environment.”
The hope is that this research will reveal the methods used to create fake reviews, making them much easier to spot and anticipate. The detector is meant to benefit the users of a review site, the review site itself and businesses that may be negatively impacted by reviews posted without their knowledge.
“For [a business] it is in their interest to be able to say that ‘we have improved our algorithms for reviews and these [reviews] are more likely to be authentic’,” Seo said. “Policy makers in particular, the Fair Trade Commission and those who regulate online content, can utilize our study, and algorithms, and findings to develop their rules and guidelines.”
— Edited by Logan Schlossberg