August 2023
Providing a trusted foundation for sentiment analysis via verified dictionaries
Sentiments are at the core of human decision-making and understanding, influencing social dynamics, transforming the political landscape and even driving economic fluctuations. Naturally, sentiment analysis has become an increasingly critical technique across social science domains, ranging from policy-making to business analytics. While deep learning models often achieve higher accuracy than simpler lexicon models in sentiment analysis tasks, their inherently opaque nature poses challenges for applications in high-stakes domains like government policymaking or mental health diagnosis, where transparent and interpretable decision-making is crucial. Dictionary-based sentiment analysis therefore remains important, particularly in computational social science fields where interpretability is paramount, and improving it remains vital. The approach relies on expert-curated lexicons, yet effectively applying these lexicons in rule-based systems faces several challenges.
Scattered Landscape
Lexicons are scattered across many sources, such as GitHub repositories, appendices of publications, supplementary materials, and author or institutional websites. This fragmented distribution poses a significant challenge for researchers who seek to leverage sentiment analysis effectively. Furthermore, the lexicons are distributed in diverse file formats, so researchers must go through the tedious process of exporting and importing data into a format compatible with their workflow.
Championing Quality
Many lexicons, including peer-reviewed ones, contain duplicate entries with conflicting labels. A substantial portion of the existing dictionaries in sentibank (60%) required the removal of duplicates, function words, and entries lacking substantive sentiment content.
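To illustrate what "duplicates accompanied by conflicting labels" means in practice, here is a minimal sketch of how such entries might be detected. It is not sentibank's actual cleaning procedure: the function name `find_conflicting_duplicates` and the toy entries are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_conflicting_duplicates(entries):
    """Return terms that appear more than once with different labels.

    `entries` is an iterable of (term, label) pairs. This helper is a
    hypothetical illustration, not part of sentibank.
    """
    labels_by_term = defaultdict(set)
    for term, label in entries:
        # Normalise case and surrounding whitespace so "Good " and "good"
        # are treated as the same term.
        labels_by_term[term.strip().lower()].add(label)
    # Keep only terms that were assigned more than one distinct label.
    return {
        term: sorted(labels)
        for term, labels in labels_by_term.items()
        if len(labels) > 1
    }


# Toy entries invented for this example, not taken from any real lexicon.
toy_entries = [
    ("good", "positive"),
    ("bad", "negative"),
    ("Good ", "positive"),  # duplicate with the same label: harmless
    ("sick", "negative"),
    ("sick", "positive"),   # duplicate with a conflicting label
]

conflicts = find_conflicting_duplicates(toy_entries)
print(conflicts)
```

Terms flagged this way still need a human decision about which label to keep, which is why curation cannot be fully automated.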
An encyclopaedic hub of sentiment dictionaries
With sentibank, researchers can access a vast array of sentiment lexicons from diverse domains and genres. This streamlined approach not only reduces the hassle of locating and validating individual lexicons but also expands the use cases of these invaluable legacy resources. sentibank's centralised hub lets users combine and compare lexicons, facilitating more comprehensive sentiment analysis. Moreover, it opens up opportunities for data integration across domains and genres, enriching the depth and accuracy of research findings.
Import and access our curated collection of sentiment dictionaries in 3 frictionless steps.
# 1. Installation (run in your shell)

pip install sentibank


# 2. Import modules

from sentibank import archive


# 3. Access dictionaries

load = archive.load()
vader = load.dict("VADER_v2014")
For further information, such as loading dictionaries based on the predefined lexicon identifiers, visit our documentation.
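Once a dictionary is loaded, a rule-based system typically looks up each token and aggregates the matched scores. The sketch below shows that idea with a toy lexicon. It assumes a plain mapping from word to numeric score, which may not match the structure sentibank actually returns, and the `score_text` helper and the scores are hypothetical.

```python
import re


def score_text(text, lexicon):
    """Sum the lexicon scores of every token found in `text`.

    Returns the total score and the list of matched (token, score) pairs,
    so each decision can be traced back to specific words.
    """
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    matched = [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]
    total = sum(score for _, score in matched)
    return total, matched


# Toy lexicon with made-up scores; a real run would use a loaded dictionary
# instead, assuming it behaves like a word-to-score mapping.
toy_lexicon = {"great": 2.0, "good": 1.0, "terrible": -1.5}

total, matched = score_text(
    "The food was great but the service was terrible.", toy_lexicon
)
print(total, matched)
```

Returning the matched pairs alongside the total is what makes the approach interpretable: every score can be explained by the words that produced it.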
Nick Oh
A dictionary for product reviews, comprising words curated for informal language
A comprehensive dictionary that assigns graded sentiment scores to WordNet synsets