Theory and Algorithms in Data Science

This seminar provides a forum for researchers working in foundational areas of Data Science (and related fields). All talks are held at the Alan Turing Institute.

The organizers are Mihai Cucuringu, Armin Eftekhari, Vidit Nanda and Hemant Tyagi. We represent the Low-dimensional structure as well as the Topology and geometry research interest groups within the Turing Institute. If you'd like to speak in this seminar, please email Vidit at

Upcoming Talks

13 Apr 2018: Hao Chen, University of California, Davis


Speaker : Hao Chen, University of California, Davis
Title : Change-point Detection for Multivariate and Object Data

Abstract: After observing snapshots of a network, can we tell if there has been a change in dynamics? We developed a general nonparametric framework for change-point detection utilizing similarity among observations. This new approach, which relies on graph-based tests, can be applied to high-dimensional data, as well as data from non-Euclidean sample spaces, such as network data. In addition, this framework is combined with a new way of permutation, circular block permutation with a random starting point, to further deal with local dependence, which is common in such data sequences. Analytic approximations to the significance of this graph-based scan statistics are derived for both the independence and local dependence scenarios, facilitating their applications to real datasets. We illustrate the methods through the analysis of a phone-call network from the MIT Reality Mining.

Room and Time: Augusta, 14h-15h

27 Apr 2018: Guy Nason, University of Bristol


Speaker : Guy Nason, University of Bristol
Title : TBA

Abstract: TBA

Room and Time: Augusta, 14h-15h

30 Apr 2018: Matteo Barigozzi, London School of Economics


Speaker : Matteo Barigozzi, London School of Economics
Title : TBA

Abstract: TBA

Room and Time: Augusta, 14h-15h

25 Jun 2018: Stephen Wright, University of Wisconsin-Madison


Speaker : Stephen J Wright, University of Wisconsin-Madison
Title : TBA

Abstract: TBA

Room and Time: Enigma, 14h-15h

Past Talks

06 Apr 2018: François-Xavier Briol, University of Warwick


Speaker : François-Xavier Briol, University of Warwick
Title : Bayesian Quadrature for Multiple Related Integrals.

Abstract: Bayesian probabilistic numerical methods are a set of tools providing posterior distributions on the output of numerical methods. The use of these methods is usually motivated by the fact that they can represent our uncertainty due to incomplete/finite information about the continuous mathematical problem being approximated. In this work, we demonstrate that this paradigm can provide additional advantages, such as the possibility of transferring information between several numerical methods. This allows users to represent uncertainty in a more faithful manner and, as a by-product, provide increased numerical efficiency. We propose the first such numerical method by extending the well-known Bayesian quadrature algorithm to the case where we are interested in computing the integral of several related functions. We then prove convergence rates for the method in the well-specified and misspecified cases, and demonstrate its efficiency in the context of multi-fidelity models for complex engineering systems and a problem of global illumination in computer graphics.

Room and Time: Augusta, 14h-15h

27 Mar 2018: Pranay Seshadri, University of Cambridge


Speaker : Pranay Seshadri, University of Cambridge
Title : On Dimension Reduction Investigations in Turbomachinery

Abstract: Blades in modern jet engines are parameterized by 20-300 design variables. It is impossible to visualize, and more importantly, effectively explore such vast design spaces. Yet such an exploration may be required not solely for optimization and uncertainty quantification endeavors, but more importantly, for understanding the physics that underpins key designs characteristics. For instance, when carrying out an optimization, the designs that are often associated with high efficiency are likely to be different from those associated with high flow capacity and high pressure ratio. Likewise, designs that are insensitive to variations in operational boundary conditions are likely to be different from those that are sensitive under the very same conditions. So what is the physical rationale for different design characteristics and how do we identify them?

Our goal is to be able to identify key linear combinations of designs that influence certain outputs, especially when the total number of design variables is large. These salient concerns motivate an output-based—i.e., efficiency, flow capacity or pressure ratio—dimension reduction strategy. In this talk, a new algorithm for subspace-based dimension reduction is introduced. It combines manifold optimization with classical Gaussian process regression. The fundamental idea we exploit is that for many physics-driven problems, outputs can be expressed as a non-linear function of a few linear combinations of all the variables. There are parallels between this idea (known in the applied mathematics community under the moniker of `ridge functions'), neural networks, active subspaces and classical sufficient dimension reduction methods. We test our algorithm on a range of different data sets where the number of dimensions vary from 2 to 200. Our results demonstrate that if the problem does admit a ridge structure, our technique provides near optimal recovery. Finally, we describe how the computed dimension reducing subspace can be used for optimization, uncertainty quantification and more importantly for gaining physical insight into turbomachinery fluid mechanics.

Room and Time: Augusta, 15h-16h

27 Mar 2018: Wei Dai, Imperial College London


Speaker : Wei Dai, Imperial College London
Title : Blind Estimation of Origin-Destination Flows in Networks

Abstract: This work focuses on the inverse problem of estimating origin-destination (OD) flows from link flows. OD flow estimation is essential to many network analysis tasks, such as urban planning, as OD information specifies the travel demands in the network. Approaches in the literature for OD estimation typically require some prior information of the OD flows as well as the mapping from OD flows to link flows. In particular, a linear model is often used to map OD flows to link flows and the corresponding linear operator is commonly referred to as the traffic assignment. In practice, prior information of the traffic assignment can be inaccurate and incomplete. Furthermore, even with full information of the traffic assignment, the OD flow estimation problem is still ill-posed and therefore further prior information of OD flows is required.

The main contribution of this work is estimating OD flows with no or little prior information. Inspired by the terminology blind deconvolution in signal processing community, we refer to the OD estimation without knowledge of traffic assignment as blind estimation. Different from the literature where linear systems are built on OD flows, a new linear model is introduced of which the inquired flows are specified with only their origins but not their destinations. The new model preserves the OD information and at the same time reduces the dimension of the inquired flows in orders of magnitude. Due to this dimension reduction, the OD flow estimation problem becomes well-posed when the corresponding traffic assignment information is available. More importantly it is possible to estimate the OD flows and the traffic assignment simultaneously with no prior information of them. Detailed discussions are provided regarding the uniqueness of solutions for the simultaneous estimation. Simulations are presented for the demonstration purpose.

Room and Time: Augusta, 14h-15h

20 Mar 2018: Jacek Brodzki, University of Southampton


Speaker : Jacek Brodzki, University of Southampton
Title : Geometry, Topology, and the structure of data

Abstract: Modern data is astonishing in its variety, and is far removed from anything that could be held in spreadsheet or analysed using traditional methods. Through intense research effort, topology has emerged as a source of novel methodology to provide insight into the structure of very complex, high dimensional data. It provided us with tools like persistent homology, which are used to compute numerical topological characteristics of the data. More recently, these methods have been augmented by geometric insights, which are valuable in capturing the structure of and relationships between complex shapes.

In this talk, I will provide an overview of new techniques from topology and geometry and illustrate them on particular examples. We will discuss applications of persistent homology to the study of the set of CT scans of human lungs. The geometric part of the talk will concern classification of three-dimensional shapes through synchronisation.

Room and Time: Augusta, 13h-14h

06 Mar 2018: Coralia Cartis, University of Oxford


Speaker : Coralia Cartis, University of Oxford
Title : Complexity of Nonconvex Optimization: The Story So Far

Abstract: Establishing the global rate of convergence of standard algorithms or their global evaluation complexity for nonconvex smooth optimization problems is a natural but challenging aspect of algorithm analysis. In the last decade, substantial progress has been made and continues to be made, in this direction. We review some of the key developments, illustrating the crucial role played by cubic regularisation methods, a credible alternative to linesearch and trust-region.

We then focus on two recent results. Firstly, we consider a general/new class of adaptive regularization methods, that use first- or higher-order local Taylor models of the objective regularized by a(ny) power of the step size. We investigate the worst-case evaluation complexity of these algorithms, when the level of sufficient smoothness of the objective and its derivatives may be unknown or may even be absent, and we find that the methods accurately reflect in their complexity the (unknown) degree of smoothness of the objective/derivatives and satisfy increasingly better bounds with the order of the derivatives available. We then focus on the antipodal context, when no derivatives of the objective are available, and the local models in adaptive cubic regularisation methods are constructed for example, by sampling of function values. We show that the evaluation complexity of the ensuing methods do not change in the order of the accuracy from their deterministic counterparts, increasing only by a constant factor which depends on the probability of the sampled models being occasionally, sufficiently accurate.

Time permitting, we will also discuss the stochastic case when even function evaluations may be inaccurate. This work is joint with Nick Gould (Rutherford Appleton Laboratory, UK), Philippe Toint (University of Namur, Belgium) and Katya Scheinberg (Lehigh University, USA).

Room and Time: Augusta, 12h-13h

06 Mar 2018: Ruggerio Bellio, University of Udine


Speaker : Ruggero Bellio, University of Udine
Title : Modern Likelihood-Frequentist Inference

Abstract: This talk illustrates some results of our recent paper Modern Likelihood-Frequentist Inference, which is an exposition of the development of likelihood asymptotics obtained by many workers. The exposition is intended to be accessible to a wide audience. It does this by charting a narrow course through the developments over more than 30 years. This covers the likelihood ratio approximation to the distribution of the parameter estimator, approximate ancillary conditioning, transformation to a highly accurate approximation to the likelihood ratio test, and related topics.

The talk focuses in particular on the R package likelihoodAsy, that accompanies the paper. The package implements some tools for likelihood inference, including in particular the Skovgaard version of Barndorff-Nielsen's modified signed likelihood ratio test statistic r*. The main point of this is to approximate the sample space derivatives, that held up this theory for a decade until the Skovgaard breakthrough. The user needs only to provide R code for the likelihood function, for generating a sample from the model, and for the formulation of the interest parameter. The latter is allowed to be a rather general function of the model parameter, and this is the most important feature of the package.

This is a joint work with Donald A. Pierce, Oregon State University.
Pierce, D.A. and Bellio, R. (2017). Modern likelihood-frequentist inference. International Statistical Review, 85, 519-541.

Room and Time: Augusta, 11h-12h

22 Feb 2018: Yuji Nakatsukasa, University of Oxford


Speaker : Yuji Nakatsukasa, University of Oxford
Title : Global optimization via eigenvalues

Abstract: While non-convex optimization problems are generally regarded as difficult or computationally intractable, the Courant-Fischer theorem for symmetric eigenvalue problems suggests that each eigenpair corresponds to a stationary point of a non-convex optimization problem (in this case the Rayleigh quotient). More generally, matrix eigenvalue problems can be regarded as an important class of optimization problems that can be solved efficiently. This observation suggests conversely that perhaps a global solution for some non-convex optimization problems could be obtained by solving an eigenvalue problem. In this work we identify such optimization problems, and show that they lead to a variety of eigenvalue problems, including generalized, polynomial and multiparameter eigenvalue problems, all of which can be solved reliably and in polynomial time, and essentially non-iteratively. Specifically, we consider non-convex QCQP (quadratically constrained quadratic programming) QCQP is known to be NP-hard in general (when the number of constraints k is not fixed), and for k ≥ 2 fixed, it has been an open problem whether there are polynomial-time solvable until recently. We show that (i) for QCQP with one constraint, k = 1, it can be reduced to essentially a single generalized eigenvalue problem, (ii) for QCQP with two constraints, k = 2, the problem can be reduced to a two-parameter eigenvalue problem, and (iii) unconstrained optimization problems with cubic regularization can be reduced to quadratic eigenvalue problems.

Based on joint work with Satoru Adachi, Satoru Iwata, Shinsaku Sakaue and Akiko Takeda (University of Tokyo).

Room and Time: Augusta, 14h-15h

20 Feb 2018: Mu Niu, U of Plymouth + Pokman Cheung, Goldman Sachs

Speakers : Mu Niu, University of Plymouth and Pokman Cheung, Goldman Sachs
Title : Intrinsic Gaussian processes on complex constrained domains

Abstract: We propose a class of intrinsic Gaussian processes (in-GPs) for interpolation and regression on manifolds with a primary focus on complex constrained domains or irregular-shaped spaces arising as subsets or submanifolds of R2, R3 and beyond. For example, in-GPs can accommodate spatial domains arising as complex subsets of Euclidean space. in-GPs respect the potentially complex boundary or interior conditions as well as the intrinsic geometry of the spaces. The key novelty of the proposed approach is to utilise the relationship between heat kernels and the transition density of Brownian motion on manifolds for constructing and approximating valid and computation- ally feasible covariance kernels. This enables in-GPs to be practically applied in great generality, while existing approaches for smoothing on constrained domains are limited to simple special cases. The broad utilities of the in-GP approach is illustrated through simulation studies and data examples.

Room and Time: Augusta, 14h-15h

09 Jan 2018: Sebastian Stich, EPFL


Speaker : Sebastian Stich, École Polytechnique Fédérale de Lausanne
Title : Adaptive importance sampling for convex optimization

Abstract: In the last years, randomized optimization algorithms, and especially coordinate descent methods attracted more and more attention in the optimization community. These methods often replace the efficient -- but often relatively expensive -- iterations of the classical methods with much cheaper -- but less efficient -- updates. Consequently, such methods can be applied to problems of very big size. This application is sometimes also theoretically justified by very attractive worst-case efficiency guarantees.

In this talk, we will first provide a tutorial-like introduction to random decent methods and review their complexity. We will discuss variable metric or quasi-Newton type of methods [1] and accelerated schemes [2]. In the second part of the talk we further introduce the concept of (adaptive) importance sampling and present a scheme that computes the optimal sampling frequencies based on safe estimates of the gradient [3]. The proposed sampling is provably the best sampling with respect to the given bounds and always better than uniform sampling and fixed importance sampling. We conclude by discussing open problems in this field of research.

[1] Joint work with B. Gartner, C. Muller (Math. Prog. 2016)
[2] Joint work with Y. Nesterov (SIAM J. Opt. 2017)
[3] Joint work with A. Raj, M. Jaggi (NIPS 2017)

Room and Time: Augusta, 14h-15h


20 Dec 2017: Nicolas Boumal, Princeton University


Speaker : Nicolas Boumal, Princeton University
Title : Near optimal bounds for phase synchronization

Abstract: Phase synchronization consists in estimating n phases (unit-modulus complex numbers) given measurements of all relative phases. We assume the measurements are corrupted by i.i.d. Gaussian noise. The maximum likelihood estimator is the solution to a non-convex optimization problem, NP-hard in general. Yet, it has been observed some years ago that the maximum likelihood estimator can be computed in polynomial time through semidefinite relaxation, even in the face of tremendous amounts of noise.

In this talk, I will present proof techniques that allow to explain this observation. This is based on the generalized power method, replicas to reduce statistical dependence, and dual certificates.

Joint work with Yiqiao Zhong, Afonso Bandeira and Amit Singer.

Room and Time: Augusta, 11h-12h

20 Dec 2017: Rajen Shah, University of Cambridge


Speaker : Rajen Shah, University of Cambridge
Title : The xyz algorithm for fast interaction search in high-dimensional data

Abstract: When performing regression on a dataset with p variables, it is often of interest to go beyond using main effects and include interactions as products between individual variables. However, if the number of variables p is large, as is common in genomic datasets, the computational cost of searching through O(p^2) interactions can be prohibitive. In this talk I will introduce a new randomised algorithm called xyz that is able to discover interactions with high probability and under mild conditions has a runtime that is subquadratic in p. The underlying idea is to transform interaction search into a much simpler closest pair problem. We will see how strong interactions can be discovered in almost linear time, whilst finding weaker interactions requires O(p^u) operations for 1 < u < 2 depending on their strength. An application of xyz to a genome-wide association study shows how more than 10^11 interactions can be screened in minutes using a standard laptop. This is joint work with Gian Thanei and Nicolai Meinshausen (ETH Zurich).

Speaker Bio: Rajen Shah is a lecturer in Statistics at the Statistical Laboratory, University of Cambridge, having completed his PhD under the supervision of Richard Samworth (also at Cambridge) in 2014. His main research interests are in high-dimensional statistics and large-scale data analysis. Rajen is currently an Associate Editor for the Journal of the Royal Statistical Society Series B, and a Turing Fellow.

Room and Time: Augusta, 14h-15h

28 Nov 2017: Adilet Otemissov, University of Oxford


Speaker : Adilet Otemissov, University of Oxford
Title : Dimensionality reduction techniques for global optimization.

Abstract: Abstract: (Joint work with Coralia Cartis) The problem of finding the most extreme value of a function, also known as global optimization, is a challenging task. The difficulty is associated with the exponential increase in the computational time for a linear increase in the dimension. This is known as the ``curse of dimensionality''. In this talk, we demonstrate that such challenges can be overcome for functions with low effective dimensionality – functions which are constant along certain linear subspaces. Such functions can often be found in applications, for example, in hyper-parameter optimization for neural networks, heuristic algorithms for combinatorial optimization problems and complex engineering simulations.

We propose the use of random subspace embeddings within a(ny) global minimisation algorithm, extending the approach in Wang et al. (2013). We introduce two techniques (REGO and AREGO) that transform the high-dimensional optimization problem into a low-dimensional problem. REGO is formulated by adding constraints in the low-dimensional embedded space and AREGO by including constraints in the original space. We derive probabilistic bounds on the success of the dimension-reduced problems and present encouraging numerical results.

Speaker Bio: Adilet is a second year DPhil Student at the Alan Turing Institute and University of Oxford. He graduated from Nazarbayev University with a BSc in Mathematics and then pursued a master's degree at the University of Manchester. His research interests include high-dimensional non-convex and global optimization.

Room and Time: Augusta, 13h-14h

21 Nov 2017: Rodrigo Mendoza Smith, University of Oxford


Speaker : Rodrigo Mendoza Smith, University of Oxford
Title : Embarrassingly parallel reduction of boundary matrices in Persistent Homology

Abstract: Topological Data Analysis is a relatively new paradigm in data-science that models datasets as point-clouds sampled from a shape embedded in an Euclidean space. The inference problem is to estimate the essential topological features of the underlying shape from a point-cloud sampled from it. This is done through a technique called persistent homology which is a mathematical formalism based on algebraic topology. A necessary step in the persistent homology pipeline is the reduction a so-called boundary matrix, which is a process reminiscent to Gaussian elimination. In this talk, I present a number of structural dependencies in boundary matrices and use them to design a novel embarrassingly parallel algorithm for their reduction. Simulations on synthetic examples show that the computational burden can be conducted in a small fraction of the number of iterations needed by traditional methods. For example, numerical experiments show that for a boundary matrix with 10^4 columns, the reduction completed to within 1% in about ten iterations as opposed to nearly approximately eight thousand iterations for traditional methods.

Speaker Bio: Rodrigo Mendoza Smith is a DPhil candidate in Applied Mathematics working with Prof. Jared Tanner. He is affiliated with the Numerical Analysis Group in Oxford and with the Alan Turing Institute in London. He is working on the design of provably fast numerical algorithms for sparse and low-rank data models. His research interests lie in numerical computing, compressed-sensing, optimisation, machine-learning, network theory and computational algebraic topology. His motivation and long term goal is to contribute to the understanding of Artificial Intelligence and its interplay with geometry, statistics and computability

Room and Time: Augusta, 14h-15h

16 Nov 2017: Guillame Lecue, CREST + ENSAE


Speaker : Guillame Lecue, Center for Research in Economics and Statistics
Title : Learning from MOM’s principles

Abstract: (Joint work with Matthieu Lerasle) We obtain theoretical and practical performances for median of means estimators.

From a theoretical point of view, estimation and prediction error bounds achieved by the MOM estimators hold with exponentially large probability -- as in the gaussian framework with independent noise-- under only weak moments assumptions on the data and without assuming independence between the noise and the design. Moreover, MOM procedures are robust since a large part of the data may have nothing to do with the oracle we want to reconstruct. Our general risk bound is of order max(minimax rate of convergence in the i.i.d. setup, (number of outliers)/number of observations)). In particular, the number of outliers may be as large as (number of data)*(minimax rate) without affecting the statistical properties of the MOM estimator.

A regularization norm may also be used together with the MOM criterium. In that case, any norm can be used for regularization. When it has some sparsity inducing power we recover sparse rates of convergence and sparse oracle inequalities. For example, the minimax rate s log(d/s)/N of recovery of a s-sparse vector in R^d is achieved by a median-of-means version of the LASSO when the noise has q_0 moments for some q_0>2, the design matrix C_0\log(d) moments and the dataset is corrupted by s log(d/s) outliers. This result holds with exponentially large probability as if the noise and the design were i.i.d. Gaussian random variables.

On the practical side, MOM estimators (and their associated regularized versions) can easily be implemented. Actually, most gradient descent algorithms used to implement (non-robust) estimators like the LASSO can easily be transformed into a robust one by using a MOM approach.

Room and Time: Augusta, 15h-16h

14 Nov 2017: Piotr Fryzlewicz, London School of Economics


Speaker : Piotr Fryzlewicz, London School of Economics

Title: Multiscale thinking in data analysis, recursive algorithms, and data-adaptive change-point detection

Abstract: The talk starts on a general note: we first attempt to define a "multiscale" method / algorithm as a recursive program acting on a dataset in a suitable way. Wavelet transformations, unbalanced wavelet transformations and binary segmentation are all examples of multiscale methods in this sense. Using the example of binary segmentation, we illustrate the benefits of the recursive formulation of multiscale algorithms from the software implementation and theoretical tractability viewpoints.

We then turn more specific and study the canonical problem of a-posteriori detection of multiple change-points in the mean of a piecewise-constant signal observed with noise. Even in this simple set-up, many publicly available state-of-the-art methods struggle for certain classes of signals. In particular, this misperformance is observed in methods that work by minimising a "fit to the data plus a penalty" criterion, the reason being that it is challenging to think of a penalty that works well over a wide range of signal classes. To overcome this issue, we propose a new approach whereby methods "learn" from the data as they proceed, and, as a result, operate differently for different signal classes. As an example of this approach, we revisit our earlier change-point detection algorithm, Wild Binary Segmentation, and make it data-adaptive by equipping it with a recursive mechanism for deciding "on the fly" how many sub-samples of the input data to draw, and where to draw them. This is in contrast to the original Wild Binary Segmentation, which is not recursive. We show that this significantly improves the algorithm particularly for signals with frequent change-points.

Bio: Piotr Fryzlewicz is a professor of statistics in the Department of Statistics at the London School of Economics. He has previously worked at Winton Capital Management, the University of Bristol and Imperial College London. His research interests are in multiscale modelling and estimation, time series, change-point detection, high-dimensional statistical inference and statistical learning. He currently serves as Joint Editor of the Journal of the Royal Statistical Society Series B.

Room and Time: Augusta, 14h-15h

07 Nov 2017: Quentin Berthet, University of Cambridge


Speaker : Quentin Berthet, University of Cambridge
Title : Bandit Optimization with Upper-Confidence Frank-Wolfe

Abstract: We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve this problem, we introduce the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques for bandits and convex optimization. We show upper bounds on the optimization error of this algorithm over various classes of functions, and discuss the optimality of these results. This is joint work with Vianney Perchet (ENS Paris-Saclay and Criteo Research)

Speaker Bio: Quentin Berthet is a Turing Fellow and a Lecturer in the Statslab, in the DPMMS at the University of Cambridge, since 2015. He is a former student of the Ecole Polytechnique, received a Ph.D. from Princeton University in 2014, and was a CMI postdoctoral fellow at Caltech. His research interests are in Statistics, Optimization and Computer Science.

Room and Time: Augusta, 14h-15h

01 Nov 2017: Yiannis Koutis, New Jersey Institute of Technology


Speaker : Yiannis Koutis, New Jersey Institute of Technology
Title: A survey of provably fast linear system solvers

Abstract: The design of fast linear system solvers has been a very active research area for decades. Solvers based on various approaches, from multigrid methods to incomplete matrix factorization techniques have been quite successful in practical engineering problems, but lack fast convergence guarantees for broader classes of unstructured linear systems that are common in modern data science problems. Solvers that are provably fast for broad classes of unstructured linear systems are a fairly recent discovery. We will survey this recent progress, and discuss key enabling facts from the theory of concentration bounds for random sums of positive matrices. We will also discuss known obstacles that stand on the way to progress for general linear systems.

Bio: Yiannis Koutis is an associate professor in the Ying Wu College of Computing at the New Jersey Institute of Technology. He completed his Ph.D. in Compute Science at Carnegie Mellon University, and got his undergraduate Diploma in Computer Engineering and Informatics from the University of Patras in Greece. Before NJIT, he was an associate professor at the University of Puerto Rico, Rio Piedras. In the past he has also worked as a Systems Scientist at Carnegie Mellon University.

Room and Time: Augusta, 15h-16h

31 Oct 2017: Dino Sejdinovic, University of Oxford


Speaker : Dino Sejdinovic, University of Oxford
Title: Inference with Approximate Kernel Embeddings

Abstract: Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs. I will give an overview of this framework and present some of the applications of the approximate kernel embeddings to Bayesian computation. Further, I will discuss a recent modification of MMD which aims to encode invariance to additive symmetric noise and leads to learning on distributions robust to the distributional covariate shift, e.g. where measurement noise on the training data differs from that on the testing data. ArXiv:1703.07596

Bio: Dino Sejdinovic is an Associate Professor at the Department of Statistics, University of Oxford, a Fellow of Mansfield College, Oxford, and a Turing Fellow of the Alan Turing Institute. He previously held postdoctoral positions at the Gatsby Computational Neuroscience Unit, University College London and at the Department of Mathematics, University of Bristol. He received a PhD in Electrical and Electronic Engineering from the University of Bristol (2009). His research is at the interface between machine learning and statistical methodology, with a focus on nonparametric and kernel methods.

Room and Time: Augusta, 14h-15h

17 Oct 2017: Varun Kanade, University of Oxford


Speaker : Varun Kanade, University of Oxford
Title : From which world is your graph?

Abstract: Discovering statistical structure from links is a fundamental problem in the analysis of social networks. Choosing a misspecified model, or equivalently, an incorrect inference algorithm will result in an invalid analysis or even falsely uncover patterns that are in fact artifacts of the model. This work focuses on unifying two of the most widely used link-formation models: the stochastic blockmodel (SBM) and the small world (or latent space) model (SWM). Integrating techniques from kernel learning, spectral graph theory, and nonlinear dimensionality reduction, we develop the first statistically sound polynomial-time algorithm to discover latent patterns in sparse graphs for both models. When the network comes from an SBM, the algorithm outputs a block structure. When it is from an SWM, the algorithm outputs estimates of each node's latent position. We report experiments performed on a twitter dataset collected between October and November 2016. Based on joint work with Cheng Li, Zhenming Liu and Felix Wong.

Speaker Bio: Varun Kanade is a Research Lecturer in the Department of Computer Science, Oxford University. He has been a Simons Postdoctoral Fellow at the University of California, Berkeley and an FSMP postdoctoral fellow at ENS, Paris. He obtained his Ph.D. from Harvard University in 2012. His research interests are in machine learning and theoretical computer science and focuses on mathematical foundations of machine learning, in particular computational and statistical tradeoffs and improving our understanding of methods that have been successful in practice.

Room and Time: Augusta, 14h-15h

09 Oct 2017: Martin Lotz, University of Manchester


Speaker: Martin Lotz, University of Manchester
Title: Randomized dimensionality reduction in persistent homology

Abstract: We discuss a few ideas related to dimensionality reduction at the point-cloud level for the computation of persistent homology. In particular, it is shown that in certain cases, the ambient dimension of the problem can literally be replaced with an intrinsic notion of dimension related to the structure of the data, and that such a reduction can be computed efficiently. Some surprising connections to the theory of suprema of Gaussian processes are presented, and potential extensions discussed.

Speaker Bio: Martin Lotz is a Lecturer in Mathematics at the University of Manchester. Previous to his appointment at Manchester he was a post-doctoral researcher at the City University of Hong Kong, at the University of Oxford, and as a Research Fellow at the University of Edinburgh, supported by a Leverhulme Trust and Seggie Brown Fellowship. He studied Mathematics at the ETH (Swiss Federal Institute of Technology) in Zurich, and the University of Paderborn, where he finished his doctorate in Algebraic Complexity Theory. His interests span a wide range of mathematics and its applications. His current work explores connections between probability, optimization, and applications in signal processing, data analysis and inverse problems. Previous work has been in algebraic complexity theory, computational algebra and geometry, foundations of numerical analysis, and combinatorics.

Room and Time: Augusta, 14h-15h

03 Oct 2017: He Sun, University of Edinburgh

He Speaker: He Sun, University of Edinburgh
Title: Heat kernels in graphs: A journey from random walks to geometry, and back

Abstract: Heat kernels are one of the most fundamental concepts in physics and mathematics. In physics, the heat kernel is a fundamental solution of the heat equation and connects the Laplacian operator to the rate of heat dissipation. In spectral geometry, many fundamental techniques are based on heat kernels. In finite Markov chain theory, heat kernels correspond to continuous-time random walks and constitute one of the most powerful techniques in estimating the mixing time.

In this talk, we will briefly discuss this line of research and its relation to heat kernels in graphs. In particular, we will see how heat kernels are used to design the first nearly-linear time algorithm for finding clusters in real-world graphs. Some interesting open questions will be addressed in the end.

Speaker Bio: He Sun is a senior lecturer in Algorithms and Complexity in the School of Informatics, University of Edinburgh. Prior to this, he held positions at the Max Planck Institute for Informatics from 2010 to 2015, and the University of Bristol from 2015 to 2017. His main research interests range over the fields of spectral graph theory, applied probability and statistics, matrix analysis, and machine learning.

Room and Time: Augusta, 14h-15h

29 Sep 2017: Greg Ongie, University of Michigan

Greg Speaker: Greg Ongie, University of Michigan
Title: Learning non-linear models with missing data

Abstract: Linear models have had great success for inference problems with missing data. A key example is low-rank matrix completion, where the missing entries of a matrix are imputed by exploiting low-dimensional linear structure. However, many real-world datasets fail to have linear structure but still have low-dimensional non-linear structure that can be exploited. We introduce a systematic framework for learning low-dimensional non-linear models in problems in missing data. The approach has close connections to subspace clustering, manifold learning, and kernel methods in machine learning. As a special case, we focus on the problem of matrix completion where we assume data belongs to a low-dimensional algebraic variety, i.e., each data point is a solution to a system of polynomial equations, and study the sampling requirements of this model. We propose an efficient matrix completion algorithm that is able to recover synthetically generated data up to the predicted sampling complexity bounds and outperforms traditional low-rank matrix completion and subspace clustering approaches on real datasets in computer vision and genomics.

Room and Time: Enigma, 14h-15h

This talk is part of Greg Ongie's visit via the "Research in small groups" grant for Turing host: Armin Eftekhari.

19 Sep 2017: Richard Samworth, University of Cambridge


Speaker : Richard Samworth, University of Cambridge
Title : High dimensional change point estimation via sparse projection

Abstract: Change points are a very common feature of big data that arrive in the form of a data stream. We study high dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the co-ordinates. The challenge is to borrow strength across the co-ordinates to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the change points: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimization problem derived from the cumulative sum transformation of the time series. We then apply an existing univariate change point estimation algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated change points and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data-generating mechanisms. Software implementing the methodology is available in the R package 'InspectChangepoint'.

Speaker Bio: Richard Samworth obtained his PhD in Statistics from the University of Cambridge in 2004, and has remained in Cambridge since, becoming a professor in 2013 and the Professor of Statistical Science in 2017. His main research interests are in nonparametric and high-dimensional statistics, including problems in shape-constrained inference, classification, variable selection and changepoint estimation, among others.

Room and Time: Augusta, 14h-15h.