‘Not once since I decided to pursue science have I ever been bored’

Sergey Samsonov

Earned a bachelor’s degree from the Moscow State University Faculty of Computational Mathematics and Cybernetics (MSU FCMC) and a joint master’s degree from HSE University and the Skolkovo Institute of Science and Technology. Currently a fourth-year doctoral student at the HSE Faculty of Mathematics. Researcher at the International Laboratory of Stochastic Algorithms and High-Dimensional Inference, part of the Faculty of Computer Science (FCS). Lecturer at the Big Data and Information Retrieval School of the FCS. Winner of the Ilya Segalovich scientific award from Yandex. Research interests include probability theory and mathematical statistics and their applications to machine learning.

Sergey Samsonov could have become a historian or worked in a hedge fund, but he devoted himself to mathematics. In this interview with the HSE Young Scientists project, he explained why he chose research in statistics and machine learning and how to generate a million images of cats.

How I became a scientist

At school, I thought for a long time about what I would do in the future. I suppose I most wanted to do what I found interesting. History was my first love at school, and I still have an interest in it. I thought about becoming a professional historian, but I think that I wasn’t serious enough for this. And in high school, I began to get good at physics and mathematics, largely thanks to my teachers. I think that my final choice between physics and mathematics was rather haphazard; it just happened that I did better in math at the academic olympiads in the 11th grade. At the same time, I didn’t want to do very abstract mathematics like they have at the Moscow State University Faculty of Mechanics and Mathematics. I wanted what I do to have some real-world applications. This desire led me to the MSU FCMC. And actually, my mom and dad graduated from the same faculty in different years, so we can say that I continued the family dynasty.

During my studies, I saw a lot of people who plunged headlong into programming, but this field was never attractive to me. At the same time, I still liked mathematics and did well in it. And doing what you’re good at is always nice.

Back at Moscow State University, I started working with Alexey Naumov, my current research advisor, and after my bachelor’s degree, he suggested that I apply for a joint master’s programme with HSE University and Skoltech. At that time, I had already taken a course on non-parametric statistics, which the scientific director of this academic programme, Vladimir Spokoiny, taught at the Independent University of Moscow. I also started attending a special seminar led by Vladimir Spokoiny at the Institute for Information Transmission Problems of the Russian Academy of Sciences. So the decision to leave Moscow State University came naturally.

After graduating from the master’s programme, I remained in doctoral school at HSE University, but in mathematics, not in computer science. I’m always between worlds. I proudly say to mathematicians that I am a mathematician, and to applied scientists I proudly declare that I am one of them. It is possible that neither group fully believes me. Of course, the Faculty of Mathematics sees me as engaged strictly in applied science while the Faculty of Computer Science considers me a theoretician. In general, at the HSE Faculty of Mathematics, scientists are engaged in probability theory where it intersects with more abstract areas of mathematics. I specialise in the intersection of probability theory with statistics and machine learning—that is, with more applied things.

The subject of my research

I primarily research methods for data generation, or what in English is called ‘sampling’. It has both theoretical and practical aspects.

The main practical application is the simulation of certain physical processes. I work on so-called MCMC algorithms, that is, Markov chain Monte Carlo. They were first developed by American applied mathematicians on the Manhattan Project for the approximate solution of integral equations. You can try to solve such an equation with a high degree of accuracy using standard numerical methods, but it will take a very long time, and as the dimension of the problem increases, the accuracy of your solution quickly decreases. Alternatively, an approximate solution can be obtained using stochastic methods. This is exactly what the Monte Carlo family of methods is used for.
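
To make the idea concrete, here is a minimal sketch of one of the simplest MCMC algorithms, random-walk Metropolis. It is an illustration only, not code from the interview: the target (a standard normal distribution), the function names and all parameter values are assumptions chosen for the example. The chain is used to approximate an integral, here the second moment of the target, whose exact value is 1.

```python
import math
import random


def log_density(x):
    """Unnormalised log-density of the target; a standard normal is assumed here."""
    return -0.5 * x * x


def random_walk_metropolis(log_p, n_steps, step_size=1.0, x0=0.0, seed=0):
    """Return a chain whose stationary distribution is proportional to exp(log_p)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Propose a small symmetric move around the current point.
        proposal = x + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(x)).
        log_ratio = log_p(proposal) - log_p(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        # A rejected proposal repeats the current point in the chain.
        samples.append(x)
    return samples


samples = random_walk_metropolis(log_density, n_steps=50_000)
kept = samples[5_000:]  # discard an initial burn-in segment
second_moment = sum(s * s for s in kept) / len(kept)
print(second_moment)  # a Monte Carlo estimate of E[x^2], which equals 1 for this target
```

The sampler only ever needs ratios of the density, so the normalising constant of the target never has to be computed. That is one reason methods of this kind remain usable when direct numerical integration becomes impractical.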

Now these methods are used for new, more impressive applications such as generating pictures, photographs and other complex objects. For example, for a movie you want to draw two armies of 10,000 people fighting each other, and you don’t want everyone to have the same face. You can try to generate random but reasonably plausible faces, and here again Monte Carlo algorithms, along with other generative modelling techniques, can help.

Or for some reason you want your neural network to learn how to generate cats. Let’s say you have several million black and white photos of cats and each is 1024x1024 pixels. You can match each cat with a point in a space of very high dimension. You now have a collection of points that correspond to photographs of realistic cats. If you are now presented with a new point in this space, the corresponding picture may be (in some sense) similar to a cat, or it may be a completely different object. You need to have both the ability to evaluate the plausibility of a new object (whether it looks like a cat) and a mechanism to suggest new photos of something your model thinks looks like a cat. In such a problem, Monte Carlo algorithms will be needed both for training the model and, possibly, for generating new objects with its help.

In general, in our field of research, it often happens that things that are elegant in theory do not always work well in practice. Conversely, there are many excellent practical algorithms for which a satisfactory theory has not yet been formulated. It might also be impossible to build such a theory because such algorithms can have too many engineering heuristics. But our area of stochastic algorithms is good at enabling you to work out both theoretical and practical things; and they often complement each other. Observations that one of the algorithms behaves significantly worse than the other on a certain class of practical problems can give rise to an interesting theoretical study. Conversely, it is often only in practice that one can see which problems arise in algorithms that, in theory, should have worked smoothly.

What I am proud of

Last year, Alexey Naumov, Eric Moulines, and a few other people from our lab presented an article at NeurIPS, a top international conference on machine learning. This article explains how several existing MCMC algorithms can be combined into a single algorithm that has better theoretical properties than its individual parts. We also have very elegant applications of new algorithms to improve the performance of generative adversarial networks. Namely, if you have a neural network that has already learned how to generate objects in some way, then with the help of certain approaches, you can slightly improve the quality of a model that has already been trained. This can be done without retraining, although it does require some additional computing.
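
The general idea of combining MCMC kernels can be sketched as follows. This is not the algorithm from the NeurIPS article, only a textbook-style illustration under assumed settings: a bimodal one-dimensional target, a local random-walk move that explores one mode well but rarely crosses between modes, and a global independence proposal drawn from a wide Gaussian. All names and parameter values are hypothetical.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of an assumed bimodal target with modes at -4 and +4."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_global_proposal(x, scale):
    """Log-density, up to a constant, of the wide Gaussian used for global moves."""
    return -0.5 * (x / scale) ** 2


def mixed_kernel_chain(n_steps, p_global=0.1, local_step=0.5, global_scale=6.0, seed=0):
    """At each step apply a global kernel with probability p_global, else a local one."""
    rng = random.Random(seed)
    x = -4.0  # start inside the left mode
    samples = []
    for _ in range(n_steps):
        if rng.random() < p_global:
            # Global move: independence Metropolis-Hastings proposal.
            y = rng.gauss(0.0, global_scale)
            log_ratio = (log_target(y) - log_target(x)
                         + log_global_proposal(x, global_scale)
                         - log_global_proposal(y, global_scale))
        else:
            # Local move: symmetric random-walk proposal.
            y = x + rng.gauss(0.0, local_step)
            log_ratio = log_target(y) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y
        samples.append(x)
    return samples


samples = mixed_kernel_chain(n_steps=40_000)
fraction_right = sum(1 for s in samples if s > 0) / len(samples)
print(fraction_right)  # share of time spent in the right-hand mode
```

Each of the two moves leaves the target distribution invariant on its own, so choosing between them at random does too. In this toy setting the local move refines the position within a mode while the occasional global move lets the chain jump between modes, which the local move alone would almost never do.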

My dream

In science, I rather like the process of solving interesting problems. I can’t say there’s an unsolved problem that I would like to solve no matter how difficult the process, and that once I had solved it my life’s purpose would have been attained. I cannot point to something specific that I want to achieve or to a theorem that I want to prove. I only hope that my enthusiasm doesn’t dry up over time.

Not once since I decided to pursue science have I ever been bored. Science is interesting precisely because of its unpredictability and the sudden insights when you see a connection between two things that you had never even thought about before.

You tend to study whatever the researchers around you are currently doing; it is very rare to chart your own path.

I am not meant to spend weeks, months or years on the same problem in order to finally solve it. That’s not my temperament. But there are people who can do that.

If you repeat every day, ‘I will prove Fermat’s Last Theorem,’ then you will never prove it. Andrew Wiles proved it not because he wanted it more than everyone who had come before him (although the factors of determination and perseverance should not be discounted either). Rather, he succeeded because the time had come when it became possible to prove it, when a sufficient mathematical apparatus had been developed.

Why I began working with statistics

In my second year of undergraduate studies, I read a monograph by A. Kolmogorov and A. Prokhorov about probability theory and mathematical statistics. I liked it very much. It was written for a general readership. At that time I was already thinking that I would at least try to stay in science. So why not choose this field? Then I took a course on probability theory at MSU FCMC and it didn’t spoil my impression. In general, from those areas that were available at the faculty, it was in the field of probability theory that there were quite a lot of active researchers, including the people who taught me. It just worked out well, and here I am.

My first big challenge

In science there is a moment of transition from academic tasks to serious ones. If you were given a problem at a seminar, then most likely it can be solved in half an hour, maybe an hour. The person who gives it to you knows that it has a solution. But when you start working on real research, nobody knows anything. Can the problem be solved? Maybe not. Maybe new approaches are needed. Maybe it won’t be solved for the next 50 years. Or maybe someone has already done everything possible and you just need to read the literature.

In my third year at MSU FCMC (I worked with Vladimir Ushakov, who was my research advisor at the time), I took a problem from the field of analytical methods of probability theory associated with the properties of the characteristic functions of a certain class of distributions. The work on this problem provided a starting point for the term paper that I did at MSU, then for the bachelor’s thesis and the article that Ushakov and I published. But to be honest, I still don’t know whether the original hypothesis with which it all began is true.

Why I am not an expert on all machine learning

Mathematics and its applications have expanded enormously, as have almost all other sciences, and a lot of very talented researchers have done work in each field. The era of universal mathematicians ended, apparently, in the first half of the last century. I think that now there is not even one person who could say that he is an expert in all areas of probability theory because there is so much mixed up with it! I can say that I understand several related fields from probability and statistics. But modern statistics contains several dozen discernible areas, and I understand only some of them. However, if I have to, it is easier for me to gain an understanding of problems in those areas than in areas in which I do not work at all.

How my scientific interests are changing

There are areas that used to be at the forefront of science and research that reached a culmination of sorts and that are no longer active and no longer employ large numbers of researchers. This has happened repeatedly with machine learning algorithms. Many approaches that were popular in the 1990s and 2000s have been completely replaced by neural networks. The breakthrough in neural networks occurred in 2012 with the outstanding AlexNet architecture, which showed how easy it is to train such networks on a GPU. Suddenly, it turned out to be literally next-generation technology.

When a promising field appears, people usually focus on it. If something interesting happens in one research group and it has potential, then people from related groups who do similar things elsewhere in the world will immediately switch to this emerging area of interest. It attracts people. But there are always limiting factors. You might understand that a field is very popular, but if you have never done anything like it before, this particular field is probably not for you. There is a famous quote from the mathematician Stefan Banach: ‘A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies between analogies.’ In machine learning, being able to see the general ideas behind different algorithms also helps a lot.

What I’d be doing if I hadn't become a scientist

I would work at a hedge fund. I did a bit of this after my third year of undergraduate studies. I liked it. It involves the same applications of mathematics, but with more verve.

My parents

They have an overall understanding of what I do. Mom and Dad graduated from the MSU FCMC in different years, so this is a family tradition. It is difficult to change something in the foundations of probability theory and statistics; the basic course taught to second- and third-year students was the same for them. Mom then worked at NICEVT—the Research Center for Electronic Computer Technology, an enormously long building on Varshavskaya Road.

Why a scientist should also teach

There are times when you’ve exhausted the area you’re working in and you’re walking around wondering what to do next. To paraphrase Feynman: in a purely research institution, where there is no teaching and no contact with students, you may find yourself shut in your own world with nowhere from which to get a new idea. But if you have prepared a lecture or seminar and thought of something new, you might come up with a new direction.

What to do about burnout

If you’ve been working on an article for weeks and it’s still not where it should be, it might be best to set it aside. What’s the point of being so exhausted and accomplishing nothing? It’s better to try something else or go teach. You can even switch to something unrelated to science. I have already mentioned that I really love history, and this helps me a lot.

What I like to read and watch

I love Dostoevsky. For understanding human nature, I think I’ve never seen anything deeper. I recently re-read ‘Demons’. I first tried reading it as a school student but didn’t understand anything. But when I read it in graduate school, it made a completely different impression on me. Some politicians seem to have stepped straight out of this book into our world.

I also recently read Ilya Ehrenburg’s novel ‘The Extraordinary Adventures of Julio Jurenito and His Disciples’. It is a beautifully absurd thing that could only have been written at the beginning of the 20th century. It is a reflection on the theme of the First World War and the meaninglessness of the existing world order. It reads very easily. I would say that it is somewhat similar to my favourite book, ‘The Good Soldier Švejk’ by Hašek. Also, Ilya Ehrenburg has a fascinating series of reports devoted to the civil war in Spain.

I can’t say that I’m a big fan of cinema, but I can mention films directed by Roman Mikhailov. He worked in algebra, but he was always attracted to the cinema and so he started making films. His work is experimental and deeply philosophical. His plots are like sketches without a beginning or end. Among foreign film directors, I like to watch Kubrick’s films.

I love opera, especially Wagner. Of the Russian operas, perhaps my favourites are Rimsky-Korsakov’s ‘The Tsar’s Bride’ and Glinka’s ‘A Life for the Tsar’. I also love dramatic theatre. I recently went to the Mayakovsky Theatre to see Molière’s ‘The School for Wives’. And the wonderful ‘Uncle Vanya’ is playing at the Theatre of Nations.

My favourite place in Moscow

As a child, I would definitely have said it was the Museum of Paleontology. I was always nagging my mother to take me there at least once every couple of months. Now it is more difficult to decide, but if I had to name one place, I would say, like everyone else, Vorobyovy Gory. I can’t say that I am irresistibly drawn to it, but, in my opinion, this is a very atmospheric place. When you look at the Main Building of Moscow State University, at the adjacent buildings of the architectural ensemble, you wonder how people saw the future in the ’50s and how those dreams materialised.

Advice for aspiring scientists

It sounds very trite, but you need to believe in what you are doing, in your own strengths. Don’t imagine that the people around you are much smarter, that some hidden knowledge is available to them which a mere mortal can never master. There are always a million interesting tasks in which you can prove yourself. This is a matter of your perseverance, determination and willingness to take a broader look at things. Yes, not everyone will win the Fields Medal. But after all, that isn’t the whole point of life anyway.

My main advice is not to doubt your own abilities and to ask yourself: ‘Why am I doing this in the first place?’