Lectures by Professor Theodor Trafalis at the DeCAn lab
Professor Theodor Trafalis (University of Oklahoma, USA) gave two lectures: "Kernel Methods with Imbalanced Data and Applications to Weather Prediction" (May 23, 2016) and "Bayesian Kernel Methods for Classification and Online Learning Problems" (May 24, 2016).
Topic: Kernel Methods with Imbalanced Data and Applications to Weather Prediction
Abstract: The main objective of this talk is to present recent developments in applying kernel methods and Support Vector Machines (SVMs) to imbalanced data related to weather prediction. I will also discuss how kernel methods can be used to uncover physically meaningful, predictive patterns in weather radar data that give warning of severe weather before it occurs. Specific indices related to the analysis of severe weather data (for example, tornado data) using kernel methods will also be discussed. In addition, I will present a family of learning algorithms, motivated by Support Vector Machines, that can replace traditional methods for assimilating data and generating forecasts without requiring the assumptions made by the assimilation methods (Kalman filters), together with an application of kernel methods to processing the states of a Quasi-Geostrophic (QG) numerical model. Extensions of these techniques to other application areas will be investigated.
Topic: Bayesian Kernel Methods for Classification and Online Learning Problems
Abstract: Recent advances in data mining have integrated kernel functions with Bayesian probabilistic analysis of Gaussian distributions. These machine-learning approaches can combine prior information with new data to calculate probabilistic rather than deterministic values for unknown parameters. In this talk we discuss Bayesian kernel methods and analyze a specific Bayesian kernel model that uses a kernel function to calculate a posterior beta distribution that is conjugate to the prior beta distribution. We show that the proposed beta kernel model outperforms other strategies for handling imbalanced datasets, such as under-sampling, over-sampling, and the Synthetic Minority Over-Sampling Technique. In addition, if data arrive sequentially over time, the beta kernel model easily and quickly updates the probability distribution, and it is more accurate than an incremental Support Vector Machine (SVM) algorithm for online learning problems. Numerical testing of the beta kernel model on several benchmark datasets, including weather data, reveals that the model’s accuracy is comparable to that of the SVM, relevance vector machine, naive Bayes, and logistic regression, and that the model runs more quickly than all of these algorithms except logistic regression.
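To illustrate the general idea of a kernel-weighted beta posterior described in this abstract, here is a minimal Python sketch. It is not the model presented in the talk: the class name BetaKernelSketch, the choice of a Gaussian (RBF) kernel, the uniform Beta(1, 1) prior, and the update rule (adding kernel weights of positive examples to alpha and of negative examples to beta) are all assumptions made for illustration. The abstract does not give the exact formulation, and the sketch includes no special treatment of class imbalance.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class BetaKernelSketch:
    """Illustrative kernel-weighted beta-Bernoulli classifier (hypothetical)."""

    def __init__(self, prior_alpha=1.0, prior_beta=1.0, gamma=1.0):
        # Beta(prior_alpha, prior_beta) prior on P(y = 1 | x); assumed uniform.
        self.prior_alpha = prior_alpha
        self.prior_beta = prior_beta
        self.gamma = gamma
        self.points = []  # stored (features, label) pairs, label in {0, 1}

    def update(self, x, y):
        """Online step: store one newly arrived labelled observation."""
        self.points.append((tuple(x), int(y)))

    def posterior(self, x):
        """Return (alpha, beta) of the posterior beta distribution at x.

        Each stored point contributes a fractional "count" equal to its
        kernel similarity to x, so the posterior stays in the beta family.
        """
        alpha, beta = self.prior_alpha, self.prior_beta
        for xi, yi in self.points:
            weight = rbf_kernel(x, xi, self.gamma)
            if yi == 1:
                alpha += weight
            else:
                beta += weight
        return alpha, beta

    def predict_proba(self, x):
        """Posterior mean of P(y = 1 | x)."""
        alpha, beta = self.posterior(x)
        return alpha / (alpha + beta)


# Toy usage with made-up data: observations arrive one at a time.
model = BetaKernelSketch(gamma=2.0)
stream = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((1.0, 1.0), 1), ((0.2, 0.1), 0)]
for features, label in stream:
    model.update(features, label)
print(model.predict_proba((0.9, 1.0)))
print(model.predict_proba((0.0, 0.1)))
```

In this sketch an online update only appends the new observation, which is why sequential data are cheap to absorb; the cost is paid at prediction time, when the kernel is evaluated against every stored point.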