Visual Analysis of Albanian Tweets
2015 - 2018

Elda Pere & Elias Saravia

The purpose of this project is to demonstrate the use of open-source technologies as a tool for globalization in developing countries. Using the publicly available Twitter API, the team queried tweets in Albanian, which have rarely been analyzed because of limited demand and technical limitations in natural language processing. This page presents aggregate insights about the tweets and looks more closely at specific moments in time. The analysis covers tweets in Albanian published between December 2015 and December 2018.
Potential users include people with academic, political, or economic interests. Researchers may want to use insights from this subset of the Twitterverse in their studies. Political figures may be interested in reactions to events such as policy proposals or emerging ideologies. Businesses may look for user reactions to products similar to the ones they are preparing to launch. Taken together, these uses, despite their small footprint, would support Albania’s integration into a global conversation and pave the way for growth.





In the section below, the keyword filter limits each chart to the tweets that contain the selected keyword. To choose a keyword, open the dropdown filter and either click an option or start typing; the dropdown will find the keyword if it appears in a tweet. Note that the Albanian alphabet also includes the non-Latin letters ë and ç. The line chart can also be filtered by year: drag the slider to the year in which the tweets were published. You may also hover over the points in either chart to find more information about a tweet.
An important note: for better usability, we have limited these charts to the last 10,000 tweets for each of the four years. Please contact Elda at epere@berkeley.edu for additional data points.
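The per-year cap and the keyword filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the charts: the field names `created_at` and `text`, the function names, and the sample tweets are all hypothetical, and the keyword match is a simple case-insensitive substring test.

```python
from collections import defaultdict
from datetime import datetime


def last_n_per_year(tweets, n=10000):
    """Keep the n most recent tweets for each calendar year."""
    by_year = defaultdict(list)
    for tweet in tweets:
        by_year[tweet["created_at"].year].append(tweet)
    kept = []
    for year in sorted(by_year):
        newest_first = sorted(
            by_year[year], key=lambda t: t["created_at"], reverse=True
        )
        kept.extend(newest_first[:n])
    return kept


def filter_by_keyword(tweets, keyword):
    """Case-insensitive substring match; casefold() also lowercases Ë and Ç."""
    needle = keyword.casefold()
    return [t for t in tweets if needle in t["text"].casefold()]


# Hypothetical sample records, invented for illustration only.
sample = [
    {"created_at": datetime(2016, 3, 1), "text": "Mirëmëngjes Shqipëri"},
    {"created_at": datetime(2016, 9, 12), "text": "Çfarë po ndodh sot?"},
    {"created_at": datetime(2017, 1, 5), "text": "Shqipëri, të dua"},
]

capped = last_n_per_year(sample, n=1)          # one tweet per year
matches = filter_by_keyword(capped, "shqipëri")
```

Applying the cap before the keyword filter, as in this sketch, means a keyword can only match tweets that survived the cap; whether the charts apply the two steps in that order is an assumption.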






In the section below, hover over each bar to see the number of times each source was used in Albanian tweets from 2015 to 2018. Insights that can be extracted include the number of Android vs. iPhone users across the years, as well as the popularity of third-party platforms such as WordPress. Note that, unlike the previous charts, this one aggregates the entire set of tweets from 12/2015 to 12/2018.
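A count like the one behind this bar chart could be computed roughly as follows. This is a minimal sketch, assuming each tweet record carries a `source` field in the style of the Twitter v1.1 API, where the client name is wrapped in an HTML anchor tag; the function names and sample records are hypothetical.

```python
import re
from collections import Counter


def clean_source(raw_source):
    """Strip HTML tags so only the client name remains."""
    return re.sub(r"<[^>]+>", "", raw_source).strip()


def count_sources(tweets):
    """Count how many tweets were posted from each client."""
    return Counter(clean_source(t["source"]) for t in tweets)


# Hypothetical sample records, invented for illustration only.
sample_sources = [
    {"source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"source": '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'},
    {"source": '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'},
]

source_counts = count_sources(sample_sources)
```

Calling `source_counts.most_common()` then gives the sources ordered from most to least used, which is the natural order for a bar chart.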







Thank you for your attention! We searched for tweets in Albanian by calling the Twitter API through the Postman API Platform. Since Twitter does not currently support the Albanian language, we used the paper “The 100 Most Common Words in Albanian” to filter the tweets. Due to the large volume of tweets published between 12/2015 and 12/2018, we loaded the data into Google Cloud Platform’s BigQuery for easier access. From there, we used Altair and Tableau to perform exploratory data analysis and build visualizations.
Future work includes a more in-depth analysis of the Albanian tweet text. We hope to create an Albanian lexicon for sentiment analysis and to apply language-agnostic language models to the tweet corpus. We're very excited to explore more of this newly acquired data set and will update this page as soon as we can!
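One way a common-word list can stand in for missing language support is to keep a tweet only when it contains enough words from that list. The sketch below illustrates the idea and is not the project's actual filter: the short word set is a hand-picked illustration rather than the list from the cited paper, and the function name and the `min_hits` threshold are assumptions.

```python
import re

# Illustrative subset of frequent Albanian words; NOT the list from the paper.
COMMON_ALBANIAN_WORDS = {"dhe", "është", "në", "për", "një", "të", "që"}

# Matches runs of letters only, including non-ASCII letters such as ë and ç.
WORD_PATTERN = re.compile(r"[^\W\d_]+")


def looks_albanian(text, min_hits=2):
    """Return True if the text contains at least min_hits distinct common words."""
    tokens = set(WORD_PATTERN.findall(text.casefold()))
    return len(tokens & COMMON_ALBANIAN_WORDS) >= min_hits


albanian_example = looks_albanian("Unë jam në shtëpi dhe po punoj")
english_example = looks_albanian("This is an English tweet")
```

Requiring two or more distinct hits reduces false positives from short words that also occur in other languages, at the cost of dropping very short Albanian tweets; the right threshold would need to be checked against real data.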



Elda Pere and Elias Saravia © 2021. All rights reserved. Powered by Bootstrap.

Theme based on the Creative Bootstrap Theme by Start Bootstrap.