Taking advantage of Twitter is no longer exclusive to marketing research, justice protests, or the promotion of products and services. Disciplines such as psychology, sociology and healthcare have tried to capitalize on the platform, and there may well be other uses of Twitter data that no one has explored yet.

Speaking of psychology, Twitter can be viewed as one giant real-time psychological database that keeps updating itself constantly. People tend to record their immediate feelings online, and the anonymity that social networks offer can be powerful, allowing many users to express themselves sincerely. There are arguments about whether our online personality is our real personality, and whether or not we are our truest selves behind a mask (i.e. the screen). The answer can be subjective, depending on how people use the tool at hand, what they intend to get out of it, and the context in which they are using it. If this debate tells us anything, it is two important points: 1) how powerful and moving social media tools are, and 2) how the rich volume of discussions, posts and tweets available is real gold.

When it comes to data and how someone can benefit from it, what makes Twitter win over Facebook is that Twitter data is free and can be extracted with some simple programming, while access to Facebook data is restricted and limited. We are not talking here about manual data gathering (typing keywords into the search bar), but about more sophisticated searches that involve elements such as time, vocabulary, geographical location, the user's gender and more. Algorithms and models can be built on the extracted data, and those models can help detect hidden patterns or even predict future actions. You can imagine the overwhelming possibilities that open up when such fields are combined.
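To make the idea of a search that combines time, vocabulary and location a little more concrete, here is a minimal Python sketch. It assumes the tweets have already been collected into simple records; the `Tweet` fields and the `filter_tweets` helper are hypothetical names made up for illustration, not part of any Twitter library.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    # Hypothetical record layout for an already-collected tweet.
    text: str
    created_at: datetime
    location: str  # free-text location as entered by the user


def filter_tweets(tweets, keywords, start, end, location=None):
    """Keep tweets posted in [start, end) that mention any keyword,
    optionally restricted to one location (case-insensitive match)."""
    wanted = [k.lower() for k in keywords]
    selected = []
    for tweet in tweets:
        if not (start <= tweet.created_at < end):
            continue
        if location is not None and tweet.location.lower() != location.lower():
            continue
        text = tweet.text.lower()
        if any(k in text for k in wanted):
            selected.append(tweet)
    return selected


sample = [
    Tweet("Feeling tired and sad today", datetime(2018, 1, 5, 23, 10), "Khartoum"),
    Tweet("Great coffee this morning", datetime(2018, 1, 6, 8, 0), "Khartoum"),
    Tweet("So sad about the match", datetime(2018, 3, 1, 20, 0), "Cairo"),
]
january_sad = filter_tweets(
    sample, ["sad"], datetime(2018, 1, 1), datetime(2018, 2, 1), "Khartoum"
)
```

Substring matching is a deliberate simplification here; a real study would likely tokenize the text and handle Arabic and English separately.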

Since we are speaking about psychology here, analyzing Twitter data in a psychological context can answer many questions, such as: how widespread is the depressive tendency in a certain country (or among users from that country)? Why is country X more schizophrenic than country Y (yes, any question is valid in analytics)? During which time of year do tweeps become more negative in their online language? What is the ratio of male to female users among the most pessimistic? Does a seasonal pattern exist?
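A question like "during which time of year do tweeps become more negative?" boils down to a simple aggregation once each tweet has been labelled. The sketch below assumes the labelling has already been done somehow (the `is_negative` flag is a made-up input), and it pools the same month across different years, which is a simplification.

```python
from collections import defaultdict
from datetime import date


def negative_share_by_month(labelled_tweets):
    """labelled_tweets: iterable of (date, is_negative) pairs.
    Returns {month number: share of tweets in that month labelled negative}."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for day, is_negative in labelled_tweets:
        totals[day.month] += 1
        if is_negative:
            negatives[day.month] += 1
    return {month: negatives[month] / totals[month] for month in sorted(totals)}


# Toy data, invented purely for illustration.
sample = [
    (date(2018, 1, 3), True),
    (date(2018, 1, 9), False),
    (date(2018, 7, 2), False),
    (date(2018, 7, 15), False),
    (date(2018, 12, 20), True),
]
shares = negative_share_by_month(sample)
```

Plotting these shares month by month would be one rough way to eyeball whether a seasonal pattern exists.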

In a paper published through a collaboration between Beijing University and the University of Technology Sydney, computer science researchers and psychologists worked together on tracking mental disorders in social networks. Their focus was depression, and they ran the experiment on a micro-blogging website that is the Chinese equivalent of Twitter. According to cognitive scientists, the features used to depict online users' behavior were: the number of retweets, the time of those retweets, how often users comment or retweet, the times users are most frequently online, the use of certain words and emojis, and the structure of the sentences. The hard part of this analytical task is building a vocabulary of positive and negative words for the algorithm to start from, and that is where psychologists and cognitive scientists step in.
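As a rough illustration of how behavioral features like these could be turned into numbers, here is a small Python sketch. It is not the method used in the paper: the word list, the emoji list, the `Post` record and the "night" window of 23:00 to 05:00 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mini-vocabulary; a real one would be built with psychologists.
NEGATIVE_WORDS = {"sad", "tired", "alone", "hopeless"}
# Crying face and disappointed face, written as escapes to stay encoding-safe.
NEGATIVE_EMOJIS = {"\U0001F622", "\U0001F61E"}


@dataclass
class Post:
    text: str
    created_at: datetime
    is_retweet: bool


def extract_features(posts):
    """Summarize one user's posts as a small feature dictionary."""
    n = len(posts)
    if n == 0:
        return {}
    retweets = sum(1 for p in posts if p.is_retweet)
    # Assumed definition of "night": from 23:00 up to (not including) 05:00.
    night = sum(1 for p in posts if p.created_at.hour >= 23 or p.created_at.hour < 5)
    negative_words = sum(
        1
        for p in posts
        for word in p.text.lower().split()
        if word.strip(".,!?") in NEGATIVE_WORDS
    )
    negative_emojis = sum(p.text.count(e) for p in posts for e in NEGATIVE_EMOJIS)
    return {
        "retweet_ratio": retweets / n,
        "night_ratio": night / n,
        "negative_words_per_post": negative_words / n,
        "negative_emojis_per_post": negative_emojis / n,
    }


posts = [
    Post("So tired and alone \U0001F622", datetime(2018, 1, 5, 2, 30), False),
    Post("Nice day out", datetime(2018, 1, 5, 14, 0), False),
    Post("Hopeless.", datetime(2018, 1, 6, 23, 45), True),
    Post("sharing this", datetime(2018, 1, 7, 10, 0), True),
]
features = extract_features(posts)
```

A feature dictionary like this, computed per user, is the kind of input a classifier could then learn from.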

The following figure shows which features or characteristics contribute the most to differentiating depressed from non-depressed platform users.

[Figure: features that contribute the most to differentiating depressed from non-depressed users]

One cool thing about analytics and Artificial Intelligence (AI) is their universality; they can be applied in diverse areas. 

A group of IT students from the University of Khartoum recently experimented with Sudanese Twitter for analytical purposes. The group of four ran a hashtag campaign asking users to express their opinions about telecom services in Sudan. People interacted and tweeted in both Arabic and English. The students used the interactions to build a vocabulary that took the Sudanese dialect (and sarcasm) into account, and designed an algorithm that can tell the difference between a positive opinion and a negative one. With the help of statistics and some visualization, it was possible to quantify how much and when Sudanese users liked certain services, and how and when they hated others. Other insights were also mined to better understand the issues that Sudanese face in their daily use of telecom services.
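The students' actual algorithm is not described here, so the following is only a minimal sketch of the general idea of a vocabulary-based opinion classifier. The tiny bilingual word lists are hypothetical stand-ins (the two Arabic entries mean "excellent" and "slow"), and sarcasm is not handled at all.

```python
from collections import Counter

# Hypothetical mini-lexicons; the real vocabulary covered Sudanese dialect.
POSITIVE = {"good", "fast", "excellent", "ممتاز"}
NEGATIVE = {"bad", "slow", "expensive", "بطيء"}


def classify(text):
    """Label a tweet by counting positive and negative lexicon hits."""
    tokens = [t.strip(".,!?#") for t in text.lower().split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example tweets, not taken from the campaign.
tweets = [
    "The network is fast and excellent",
    "Internet is slow and expensive!",
    "الانترنت بطيء",
    "Just renewed my bundle",
]
counts = Counter(classify(t) for t in tweets)
```

Counting the labels per day or per service would then give the "how much and when" figures that statistics and visualization can build on.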

[Figure from the Sudanese telecom services study]

Image Credit: Study coordinators, Roaa Mohamed, Muwada Ahmed, Mowada Omer & Noor Anwar 

Psychologists, or even Twitter users themselves, can help in the first stages of analysis, where the features or characteristics used to identify depression are listed for the algorithm to learn from. New features may include how much a user does ‘not’ tweet, or the length of their Twitter pauses, along with whatever other associations confirm that a period of not tweeting indicates depression or negative mental tendencies. Another question that arises here is whether we Sudanese use a different vocabulary for depression than other nationalities when tweeting about it. We can even probe into how intense “depression language” is among Sudanese online users, and which features can be used to quantify or indicate this.
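The "length of Twitter pauses" feature is easy to compute once a user's tweet timestamps are available. Here is a small sketch; the function names and the three-day threshold are assumptions for illustration, and whether a long pause actually signals anything about mood is exactly the open question raised above.

```python
from datetime import datetime, timedelta


def tweet_pauses(timestamps):
    """Return the gaps between consecutive tweets, in chronological order."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def longest_pause(timestamps):
    """Longest gap between two consecutive tweets (zero if fewer than two)."""
    pauses = tweet_pauses(timestamps)
    return max(pauses) if pauses else timedelta(0)


def count_long_pauses(timestamps, threshold):
    """Number of gaps that last at least `threshold`."""
    return sum(1 for pause in tweet_pauses(timestamps) if pause >= threshold)


# Invented timestamps for one user.
timeline = [
    datetime(2018, 1, 1, 10, 0),
    datetime(2018, 1, 1, 18, 0),
    datetime(2018, 1, 9, 18, 0),
    datetime(2018, 1, 10, 9, 0),
]
longest = longest_pause(timeline)
long_pauses = count_long_pauses(timeline, timedelta(days=3))
```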

Experimenting with Twitter data using analytical approaches can be fun and informative, and can reveal an extra dimension of our mental health problems. However, many technical issues and challenges exist along the whole process. For example, building the dictionary of the terms of interest can be cumbersome, and the meaning of a tweet is not always obvious. There are also the sanction issues: to extract and mine Twitter data, web developers need to create a Twitter account for programmatic experimentation, which must be associated with a phone number, and Sudanese were until recently unable to register with their numbers. Roaa Mohammed, one of the team members of the Sudanese telecom experiment, said: “Data extraction took time but it was easy and straightforward. What was really difficult is the cleaning that you have to do before building the algorithm/model.” She adds: “There were also a couple of limitations with the Twitter API, and working with Arabic is not as simple as English.”
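To give a feel for the cleaning step mentioned in the quote, here is a minimal sketch of what tweet cleaning might look like for mixed Arabic and English text. It is an assumption about typical steps (dropping links and mentions, stripping Arabic diacritics and the tatweel stretching character, unifying alef variants), not a description of what the team actually did.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Arabic short-vowel marks, from fathatan (U+064B) to sukun (U+0652).
ARABIC_DIACRITICS_RE = re.compile(r"[\u064B-\u0652]")
# Alef with madda, hamza above, or hamza below.
ALEF_VARIANTS_RE = re.compile(r"[\u0622\u0623\u0625]")
TATWEEL = "\u0640"  # the stretching character used for emphasis
BARE_ALEF = "\u0627"


def clean_tweet(text):
    """Normalize a raw tweet before lexicon lookup or model training."""
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = ARABIC_DIACRITICS_RE.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS_RE.sub(BARE_ALEF, text)
    return " ".join(text.lower().split())  # lowercase and collapse whitespace


cleaned = clean_tweet("@some_user The #Network is SLOW https://t.co/abc")
```

Dialect spellings, emojis and sarcasm would need their own handling, which is part of why working with Arabic tweets takes more effort than English ones.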

Some of these issues can be tackled when the results are useful and fascinating to researchers and the community. Hopefully, in the future we will see more “sentiment analysis” used to understand Sudanese digital communities, especially in psychology, where such analytical insights may pave the way to helping the majority of people with mental disorders, who often go undiagnosed and untreated.


Tagwa Warrag

Tagwa Warrag studied computer science and IT at Sudan university. She is interested in books, artificial intelligence, sometimes security, art (specifically cartoon drawing), and the German language.