Third Machine Learning Summer School Held in UK
On July 17-23, the Third Machine Learning summer school, organized by the Yandex School of Data Analysis, the Laboratory of Methods for Big Data Analysis at the National Research University Higher School of Economics, and Imperial College London, was held in Reading, UK. Sixty students, doctoral students and researchers from 18 countries and 47 universities took part in the event.
Three HSE students, who received travel grants from the Laboratory of Methods for Big Data Analysis (LAMBDA), also participated in the event. Any student of the Faculty of Computer Science had the opportunity to take part in the grant competition.
The summer school programme included over 60 hours of lectures and seminars, as well as two data analysis competitions on the search for dark matter. The school covered topics such as linear models, gradient boosting, hyperparameter optimization, deep learning, and convolutional and recurrent neural networks. The materials contained examples of using machine learning to solve specific practical tasks implemented by the Yandex School of Data Analysis and HSE in the LHCb experiment. LAMBDA staff members Andrey Ustyuzhanin, Maxim Borisyak, and Nikita Kazeev took part as lecturers and teachers at the school.
Yandex staff members Alexey Artemov and Alexander Panin were among the school speakers. The following leading physicists and specialists in machine learning from various universities and experiments took part as guest speakers: Dr. Noel Dawe (University of Melbourne, Australia, ATLAS experiment), Dr. Timothy Daniel Head (WildTree Technologies, Switzerland), Prof. Mike Williams (MIT, US, LHCb experiment).
‘Schools on machine learning in high energy physics have become a traditional event for the Computer Science Faculty. The first school was held in 2015 in St. Petersburg, last year’s school was held at Lund University in Sweden, and this year’s was hosted by the University of Reading in the UK. It is encouraging that students join staff members from LAMBDA and faculty lecturers in taking part in the school. These summer schools make a significant contribution to the further development of cooperation between the faculty and the European Organization for Nuclear Research (CERN),’ says Ivan Arzhantsev, Dean of the Faculty of Computer Science.
The competition on data analysis was organized on the Kaggle platform. The sample comprised a mixture of real background data and simulated electromagnetic showers from the OPERA experiment.
Solutions suggested by the participants could help the OPERA and SHiP experiments find new, effective approaches to identifying traces of dark matter interactions. Following the results of the competitions held as part of the school, the organisers decided to extend the final online round until the end of summer.
Yandex and Microsoft Azure resources were made available for self-study during seminars and competitions. The JupyterHub and everware projects provided access to these resources. School materials are available in the GitHub repository.
2nd-year student, programme in Applied Mathematics and Information Science
I heard about the school at an event on September 1, 2017 during which Petr Zhizhin, who took part in last year’s school, talked about it. At school I took part in physics olympiads and read about machine learning on the Internet. I was definitely interested in both topics. That is why I set myself the goal of participating in this school.
When I got the email about faculty travel grants, I immediately wrote a long motivation letter about why they should choose me. After the application was approved, I had to pass the interview stage. Despite feeling a bit nervous, I managed to present myself well and won a grant to cover participation.
What I really liked about the summer school was that we were able to go through all the major machine learning algorithms, from the simplest linear regression to neural networks, in such a short period of time. Now, if I want to deepen my knowledge of machine learning, I will have a good background for it.
Competitions on the Kaggle platform, which were open to all participants, were organized during the summer school. The main aim was to learn to detect neutrinos. I tried different methods, and XGBoost proved to be the best one. I was very surprised when I found out that the top solutions used it instead of neural networks. The most difficult part of the competition proved to be feature engineering, which the physicists handled better than the others.
Communication with researchers and students from around the world was a key part of the school, and I even exchanged contacts with some of them. Many participants are engaged in research in physics. It was interesting to learn about their new findings. It is worth noting that most of them use machine learning in their research.
I have good memories of the school, and would like to thank Yandex and the Faculty of Computer Science for the opportunity to take part in such an event.
1st-year student, MA programme in Data Science
For me, the MLHEP 2017 summer school in Reading (UK) was a truly inspiring event that I’ll remember for a long time. Where else could you immerse yourself in studying machine learning in high energy physics, attend lectures by leading researchers, communicate with fellow participants from all over the world, and try to solve a problem from an experiment at CERN as part of a Kaggle competition, all in one week?
The school was held on a charming university campus a 30-minute walk from the centre, and the atmosphere on campus was calm and conducive to study. The programme was intensive: classes started at 9 am and finished around dinner time. Lectures and seminars were delivered by staff members of the Yandex School of Data Analysis and LAMBDA. The lecturers covered a variety of topics, from the basics of machine learning (linear models for regression and classification problems, quality metrics, overfitting, etc.) to advanced topics in deep learning, recurrent neural networks (RNNs) and generative adversarial networks (GANs). During the second half of the day, guest lecturers from the US, China, Australia, and elsewhere in the UK spoke at the event.
The task that we were set to solve during the week sounded promising. At the international OPERA experiment, researchers are trying to detect neutrinos, the particles thought to form the dark matter in the Universe. Since the positive signal, taken from simulated data, made up less than 0.02% of the whole training set, it was very difficult to make the model capture anything other than noise.
I should mention that most of the student physicists (doctoral students and junior research fellows) could be divided into two groups. The first group had previously used machine learning algorithms in their work processing the accumulated data from particle detectors, but wanted to master more advanced techniques. The second group was not familiar with this area, and this was the first time they had been so closely involved with it. Everyone found it interesting, which shows that the staff from Yandex and the Faculty of Computer Science did an excellent job of organizing this school and others like it, and in particular of showing international colleagues how useful machine learning could be in their research.