Various online sentiment classification services are available today. Giants such as Google, Amazon, and Microsoft offer cloud solutions for natural language processing. But if you do not want to pay $1-2 per 1,000 API calls (the price increases when a review is longer than 1,000 characters), or you are ready to set up your own local classification service or program, you can use offline classifiers. This post discusses several .NET classifiers for the offline case.
The following topics are covered here:
What is sentiment classification (analysis)?
Sentiment classification is a powerful way to understand how customers feel about your products or services. It can also help with brand monitoring. It lets you classify the emotions and feedback that people post on social media, blogs, or articles. Several common taxonomies are used for sentiment classification:
- 5 classes
- from 1 to 5 stars
In all cases, negative sentiments are the most important for analysis. So the main questions a sentiment classifier should answer are “Is this text negative?” and “How negative is it?”.
Accuracy, the share of texts that are classified correctly, is a simple and obvious metric. Its well-known weakness is that it becomes misleading when classes are imbalanced. Our classes are balanced, however, so it can be used here. It is also widely used in the scientific literature.
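To make the point about imbalanced classes concrete, here is a minimal sketch. It is written in Python only for brevity and is not tied to any of the libraries in this post; the toy labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# A useless "classifier" that always answers "pos" (illustrative only).
always_pos = ["pos"] * 10

# Imbalanced toy data: 9 positive texts and 1 negative one.
imbalanced_gold = ["pos"] * 9 + ["neg"]
imbalanced_acc = accuracy(imbalanced_gold, always_pos)

# Balanced toy data: 5 positive and 5 negative texts.
balanced_gold = ["pos"] * 5 + ["neg"] * 5
balanced_acc = accuracy(balanced_gold, always_pos)

print(imbalanced_acc, balanced_acc)
```

On the imbalanced set the constant classifier looks good even though it never finds a negative text; on the balanced set its accuracy falls to chance level, which is why accuracy is a reasonable metric for balanced test data.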
The classifiers listed below can be compared only on binary classification tasks because one of them doesn’t have a “Neutral” class.
Offline sentiment classification (C#)
Multiple C# libraries can be found via Google, NuGet, etc., but they have to be checked manually because many of them are actually just SDKs for cloud APIs. The following packages were collected after manual verification of C# sentiment classification libraries:
All of them are offline sentiment classification libraries for C#. Let’s look at each of them in more detail.
GroupDocs.Classification
License: License File
Complexity of installation: Easy
Accuracy: 93.3 (Evaluation) / 96.3 (Licensed)
GroupDocs.Classification is a library with its own built-in engine for text and document classification. The models are part of the NuGet package, so installation is simple: you just need to install the package. The evaluation and licensed versions differ. With the evaluation version, you should split the text into 100-character chunks and then average the results. GroupDocs.Classification shows 93.3% accuracy in evaluation mode and 96.3% when the license is applied.
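The chunk-and-average workaround for evaluation mode can be sketched as follows. This is a Python illustration of the idea only, not the GroupDocs.Classification API: `score_chunk` stands for any function that returns a positive-sentiment score for a short text, and `toy_scorer` is a hypothetical placeholder used to make the example runnable.

```python
from typing import Callable, List

CHUNK_SIZE = 100  # evaluation-mode chunk length mentioned above


def split_into_chunks(text: str, size: int = CHUNK_SIZE) -> List[str]:
    """Cut the text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def averaged_score(text: str, score_chunk: Callable[[str], float]) -> float:
    """Score every chunk separately and return the mean of the scores."""
    chunks = split_into_chunks(text)
    if not chunks:
        raise ValueError("text must not be empty")
    return sum(score_chunk(chunk) for chunk in chunks) / len(chunks)


def toy_scorer(chunk: str) -> float:
    """Hypothetical stand-in for a real classifier call (assumption)."""
    return 1.0 if "good" in chunk else 0.0


review = "good" + "x" * 96 + "y" * 100  # exactly two 100-character chunks
print(averaged_score(review, toy_scorer))
```

Splitting by character count can cut a word in half at a chunk boundary. Splitting on sentence or word boundaries below the limit is a possible refinement, but it is not something the original experiment is known to have done.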
VaderSharp
Complexity of installation: Easy
VaderSharp is a popular solution for C# sentiment classification (analysis). It is very fast: throughput is measured in messages per second on a regular home PC, let alone a server machine. Installation is also simple (just install the NuGet package). However, there is an important disadvantage: its accuracy is not as impressive as its speed. It achieves 78% accuracy on the test dataset.
Stanford NLP
License: GPL V2
Complexity of installation: Hard
Stanford NLP is a C# library based on the corresponding Java library for natural language processing. Installation has some difficulties: you need to install the Java version of Stanford NLP and, if necessary, copy the models to the program’s current directory. There are also compatibility issues with .NET Core 3.0. Accuracy and processing time are unstable for this library and seem to depend on text length. For the short texts of the SST-2 dataset, it achieves 80.2% accuracy with a processing time of seconds per example. For the longer texts of the Cross-Domain dataset, accuracy decreases to 70% and processing time increases to 1 minute or more per text.
SentimentAnalyzer
Complexity of installation: Medium
This library is based on ML.NET, and problems related to ML.NET may come up during installation. It is an unsafe library that requires the configuration (x86 / x64) to be set explicitly, and you may need to install some dependencies as well. SentimentAnalyzer returns a positive or negative class and a corresponding score, which is why a binary classification task was used for this comparison. The best result is achieved with a carefully selected threshold: accuracy reached 79% after the threshold value was optimized.
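Threshold optimization of this kind can be sketched as a simple search over candidate cut-offs. The snippet is a Python illustration with made-up scores, not the code used in the experiment, and it assumes that a higher score means "more positive".

```python
def accuracy_at(scores, labels, threshold):
    """Accuracy when every score >= threshold is predicted as positive."""
    predictions = [s >= threshold for s in scores]
    hits = sum(p == label for p, label in zip(predictions, labels))
    return hits / len(labels)


def best_threshold(scores, labels):
    """Try each observed score as a cut-off and keep the most accurate one."""
    candidates = sorted(set(scores))
    # One extra candidate above the maximum covers "predict all negative".
    candidates.append(max(scores) + 1.0)
    best = max(candidates, key=lambda t: accuracy_at(scores, labels, t))
    return best, accuracy_at(scores, labels, best)


# Made-up scores; True marks a text whose gold label is positive.
scores = [0.1, 0.2, 0.3, 0.45, 0.9]
labels = [False, False, True, True, True]

default_acc = accuracy_at(scores, labels, 0.5)
tuned = best_threshold(scores, labels)
print(default_acc, tuned)
```

A threshold tuned on the same texts that are used for reporting tends to give an optimistic figure, so a separate validation split is the usual safeguard.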
Wikiled
License: Apache License 2.0
Complexity of installation: Simple
Another C# sentiment classification library is Wikiled. It has to be trained before it can produce adequate results, so its accuracy could not be measured in this comparison.
We used a cross-domain dataset for testing. GroupDocs.Classification has not been trained on it, and the other libraries most likely did not use it for training either. We will e-mail the results to any interested party; please create an issue in the corresponding GitHub repo.
We also tested the classifiers on the SST-2 dataset: Stanford Sentiment Treebank (Socher et al. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proc. EMNLP).
Table 1 shows the accuracy (%) of the tested classifiers.
| Classifier | SST-2 | Cross-Domain |
|---|---|---|
| GroupDocs.Classification | 93.3 (licensed: 94.7) | 93.3 (licensed: 96.3) |
Most of the classifiers show different results on the two datasets. The likely cause is that SST-2 texts are shorter than Cross-Domain texts. Stanford NLP also hung on some Cross-Domain texts longer than 500 characters, which is why its Cross-Domain accuracy is marked with “~”.
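One way to check the text-length explanation is to break accuracy down by length bucket. The sketch below is a Python illustration with made-up data; the 500-character edge echoes the limit mentioned above, while the 100-character edge is an arbitrary assumption.

```python
from collections import defaultdict


def accuracy_by_length(texts, gold, predicted, edges=(100, 500)):
    """Group examples into length buckets and compute accuracy per bucket."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for text, g, p in zip(texts, gold, predicted):
        # First edge that the text fits under, else the overflow bucket.
        bucket = next(
            (f"<= {edge}" for edge in edges if len(text) <= edge),
            f"> {edges[-1]}",
        )
        totals[bucket] += 1
        hits[bucket] += int(g == p)
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Made-up examples: only the text lengths matter here.
texts = ["a" * 50, "b" * 80, "c" * 300, "d" * 700]
gold = ["pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "neg"]

per_bucket = accuracy_by_length(texts, gold, predicted)
print(per_bucket)
```

If a classifier's accuracy drops steadily from the short buckets to the long ones on real data, that supports the length explanation; a flat profile would point to a domain difference instead.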
Analysis of common mistakes
The following misclassifications are typical:
“As are its star, its attitude and its obliviousness.” – Hard to understand what the person means.
“A well acted and well intentioned snoozer.” or “Gee, a second assassin shot Kennedy?” – Sarcasm or irony.
“Moot point” – The text is too short.
Such problems occur in all sentiment classification (analysis) products. Sarcasm, for instance, is subjective, so it is hard to train a sarcasm model and hard to classify sarcastic texts correctly. But let’s hope that these weaknesses will be eliminated in the near future.
While all of the libraries above are suitable for sentiment classification in C#, GroupDocs.Classification is the most accurate, which makes it the best choice for finding negative or positive sentiment in a large body of mentions and/or reviews. Finally, note that negative/neutral/positive classification is also an important case; it will be considered in the future.
The code that was used for this blog post is available here: