
# HSE Student Analyses Social Network to Find Runaway Brother

It is a fairly common story for families – a runaway teenager leaves a note saying ‘I’m not coming back, and don’t try looking for me’ and turns off their cell phone. In a recent case, however, a sister was able to find her brother by using the knowledge she acquired as a student in HSE’s Applied Mathematics and Information Science programme. Her story shows what social networks can say about their users to someone who knows how to listen.

## Not a Trace

No one knew where the boy could have run off to; his parents were absolutely certain that the only people close to the teenager were his classmates. They nonetheless called everyone they knew, but no one had seen the boy. There was one other route the runaway teenager’s family could take – social networking, in this case the Russian site VKontakte. The boy’s sister, whom for ethical reasons we will call L., handled this aspect of the search.

‘He had too many friends to go to each one,’ she says. ‘As statistics show, each social network user has an average of around 150 mutual subscribers, while in real life a person’s friend group is five times smaller than their virtual friend group. We had to look for another way. In a class on combinatorics, we wrote a programme to process, analyse, and visualise a network of friends on VKontakte. I decided to see if I could get something useful from this analysis, since any piece of information was welcome.’

## Mathematical Magic

The graphic below shows the VKontakte network of L.’s brother at the time. Each of the denser coloured clusters shows individuals who are more familiar with other members of that cluster and less familiar with members of other clusters.

The mix of clusters at the bottom contains friends from school, acquaintances from different grades, and other individuals whom L. already knew. But the green cluster is made up of people whom L. did not know at all. This cluster can be considered what one might call ‘bad company,’ and this assumption was confirmed in L.’s research. Since school friends were contacted almost immediately after the search began, and since this proved unsuccessful in locating the boy, it was logical to assume that the runaway had likely gotten in contact with someone from the green cluster. It would not have been helpful to write to each member of the group one after another – L. needed to approach them already reasonably certain that she had found the right person. This was also important so that the element of surprise was not lost. If it had been, the members of this cluster might have kept quiet in order to avoid ‘turning in’ the runaway to his parents.

‘Having narrowed the scope of the work, I continued my research by identifying communities and calculating centrality metrics that show how influential a certain person is in terms of different types of interaction and the spread of information,’ L. explains.

Analysing the entire graph would not have helped much since the higher groups would contain people from the same grade as L.’s brother, and they had already been questioned. This is why L. carefully studied only members of the suspicious green cluster, and she used three different metrics to do this: degree centrality (the number of people a person knows), betweenness centrality (how frequently information flows through this person in a community), and closeness centrality (how quickly information spreads in the community if it reaches this person first). The results were as follows (in descending order of influence with each participant being assigned a letter):

| Degree centrality | Betweenness centrality | Closeness centrality |
| --- | --- | --- |
| A | A | A |
| B | E | E |
| C | F | B |
| D | G | C |
| E | B | D |
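The article does not show the programme L. used, so the following is only a minimal sketch of how the three metrics named above could be computed for a small friendship graph, using nothing but the Python standard library. The graph `toy_graph` and its member names are invented for illustration and are not taken from the real network. Betweenness is computed with Brandes' algorithm and left unnormalised, and closeness is taken as the number of reachable people divided by the sum of distances to them. Both are common conventions, assumed here rather than confirmed by the source.

```python
from collections import deque

# Hypothetical friendship graph: each person maps to the set of their friends.
toy_graph = {
    "p": {"q", "r", "s"},
    "q": {"p"},
    "r": {"p"},
    "s": {"p", "t"},
    "t": {"s"},
}

def bfs_distances(graph, source):
    """Shortest-path length in hops from source to every reachable person."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def degree_centrality(graph):
    """Number of direct friends each person has."""
    return {node: len(friends) for node, friends in graph.items()}

def closeness_centrality(graph):
    """Reachable people divided by total distance to them (0 if isolated)."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalised)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                              # nodes in BFS visiting order
        preds = {node: [] for node in graph}    # predecessors on shortest paths
        paths = {node: 0 for node in graph}     # number of shortest paths
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
                if dist[neighbour] == dist[node] + 1:
                    paths[neighbour] += paths[node]
                    preds[neighbour].append(node)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in preds[node]:
                delta[pred] += paths[pred] / paths[node] * (1 + delta[node])
            if node != source:
                score[node] += delta[node]
    # Every undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}

degree = degree_centrality(toy_graph)
closeness = closeness_centrality(toy_graph)
betweenness = betweenness_centrality(toy_graph)

# List people from most to least influential under each metric.
for name, metric in [("degree", degree), ("betweenness", betweenness),
                     ("closeness", closeness)]:
    print(name, sorted(metric, key=lambda person: (-metric[person], person)))
```

In this toy graph, "p" comes first on all three metrics, while "s" has a modest degree but sits on every shortest path to "t". That is the same kind of contrast the rankings in the table are used to expose.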

## What Does This All Mean?

First, L. uncovered something quite curious. Person A only ranked 16th in betweenness centrality, but led in all metrics when the green cluster is looked at separately. Person A is a girl, and it was later uncovered that she played a critical role in the story because she had pulled L.’s brother into this ‘bad company.’

Second, Person B ranked high in degree centrality, but low in betweenness. This likely means this person’s connections were not significant and that information did not flow through him. L. was able to conclude with a certain level of confidence that Person B would not play a key role in the search. This pattern is typical of a social network user who adds everyone as a friend, one after another. Person C and Person D were below Person B in all centrality metrics, so they could be skipped automatically.

Third, Person E ranked low in degree centrality, but rather highly in the other metrics. This means that Person E’s connections are important in the network, and vital information is likely to flow through this member of the community in particular. This probably does not signify this person is hiding the runaway, but the person likely knows about the runaway’s location at least.
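The comparisons made so far can be read directly off the ranking table. As a small illustration, the sketch below copies the three columns of the table into Python lists and reports each person's position under every metric. The helper name `rank_profile` is hypothetical. Because the table lists only five people per metric, `None` here means "not among the five shown", not "zero influence".

```python
# Rankings copied from the table in the article (most influential first).
rankings = {
    "degree": ["A", "B", "C", "D", "E"],
    "betweenness": ["A", "E", "F", "G", "B"],
    "closeness": ["A", "E", "B", "C", "D"],
}

def rank_profile(rankings):
    """Map each person to {metric: 1-based rank}, or None if not listed."""
    people = sorted({person for order in rankings.values() for person in order})
    return {
        person: {
            metric: order.index(person) + 1 if person in order else None
            for metric, order in rankings.items()
        }
        for person in people
    }

profile = rank_profile(rankings)
for person, ranks in profile.items():
    print(person, ranks)
```

Person B comes out second by degree but fifth by betweenness, whereas Person E is fifth by degree and second on the other two metrics, which matches the reading given in the text.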

Fourth, Person F and Person G appeared only in the betweenness centrality ranking. They linked up the ‘green’ group with the cluster containing schoolmates; in other words, it was possible they knew and were hiding something. L. assumed that Person F was a very involved classmate, but it turned out that this was just a young romantic who himself would like to run away but couldn’t.
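One simple way to spot people who link two clusters, once each person has been assigned to a cluster, is to look for members of one cluster who have at least one friend in the other. The sketch below assumes the cluster labels are already known. The function `bridge_members`, the graph, and the member names are all invented for illustration and do not reproduce L.’s data or method.

```python
# Hypothetical friendship graph and cluster assignment.
friends_of = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3", "s1"},
    "g3": {"g1", "g2"},
    "s1": {"g2", "s2"},
    "s2": {"s1"},
}
cluster_of = {
    "g1": "green", "g2": "green", "g3": "green",
    "s1": "school", "s2": "school",
}

def bridge_members(graph, cluster_of, home, other):
    """People in cluster `home` with at least one friend in cluster `other`."""
    return sorted(
        person
        for person, friends in graph.items()
        if cluster_of[person] == home
        and any(cluster_of[friend] == other for friend in friends)
    )

green_side = bridge_members(friends_of, cluster_of, "green", "school")
school_side = bridge_members(friends_of, cluster_of, "school", "green")
print(green_side, school_side)
```

Members found this way tend to score well on betweenness even with few friends, because paths between the two clusters have to pass through them.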

The other clusters had interesting features as well. While the red cluster contained friends from high school, the people in the purple cluster were not always at the same level. It turned out that the programme was completely accurate in determining who the ‘troubled teenagers’ were, the teenagers who drank hard alcohol and smoked. At the same time, the purple cluster did not overlap with the ‘bad kids’ in the green cluster. It is important to note that there was only one thing connecting the two – the person who supplied the alcohol to the underage students.

## Return

All of the conclusions L. made in her analysis were later confirmed when her brother returned. L.’s brother still thinks that his sister can do some sort of mathematical magic.

‘At this stage, I finally decided to get in contact with Persons A, E, and F,’ L. notes. ‘It was risky, but it was worth a shot. Person F was not of much use in the end, though she was openly hostile in our communication. Person A ignored me. Person E said several times that they did not know anything, but when pushed she confessed that my brother was safe and not very far. After this, it was not hard to indirectly convince him to get in contact first with me and then with our parents.’

By this time, L.’s parents had already taken a ‘black list’ of these three individuals’ names to the police to file a missing person report. It was assumed that if L.’s brother didn’t show up over the weekend, the search could start with these three people. But it did not reach this point because L.’s brother returned home.

This is how his map looked after everything was over:

The light blue dots are friends from the new school to which the boy was transferred. The school strives to provide a creative and patriotic education. These blue points do not connect to the rest. Members of the old green and purple clusters, which are no longer on this graphic, were deleted from the boy’s VKontakte soon after he returned.

Faculty from the School of Data Analysis and Artificial Intelligence have thanked L. for sharing her story with them. ‘It is a great pleasure for a teacher to hear that their students are not only successful, but are also applying their knowledge to real life, especially with something as important as saving the life of a child,’ comments Ilya Makarov, who is L.’s academic supervisor and the creator of the combinatorics course. ‘I am certain that this example will force many to think about how much personal information is actually stored on social networks and about how well-trained professionals can use this data to prevent similar situations from taking place in the future.’

The social network analysis methods described above are studied in several master’s programmes offered at the Higher School of Economics: Data Science, Applied Statistics with Network Analysis, and Data Journalism. First-year master’s students from any HSE programme can also take an introductory class in Social Network Analysis, which is a MAGOLEGO course. (MAGOLEGO is the name for a general set of elective subjects first-year master’s students must choose in their second semester.) In addition, undergraduate students can take the combinatorics course described above.
