Theory and Algorithms in Data Science

This seminar provides a forum for researchers working in foundational areas of Data Science (and related fields). All talks are held at the Alan Turing Institute.

The organizers are Mihai Cucuringu and Vidit Nanda. We represent the Low-dimensional structure as well as the Topology and geometry research interest groups within the Turing Institute. If you'd like to speak in this seminar, please email Vidit at vnanda@turing.ac.uk

Upcoming Talks

29 Jan 2019: Xiaowen Dong, University of Oxford

Speaker : Xiaowen Dong, University of Oxford
Title : TBA

Abstract: TBA

Room and Time: TBA, 14h - 15h

05 Feb 2019: Clifford Lam, London School of Economics

Speaker : Clifford Lam, London School of Economics
Title : TBA

Abstract: TBA

Room and Time: TBA

07 Feb 2019: Xinghao Qiao, London School of Economics

Speaker : Xinghao Qiao, London School of Economics
Title : TBA

Abstract: TBA

Room and Time: TBA

14 Feb 2019: TBA

Speaker : TBA
Title : TBA

Abstract: TBA

Room and Time: TBA

19 Feb 2019: Tengyao Wang, University College London

Speaker : Tengyao Wang, University College London
Title : TBA

Abstract: TBA

Room and Time: TBA

21 Feb 2019: Weining Wang, City, University of London

Speaker : Weining Wang, City, University of London
Title : TBA

Abstract: TBA

Room and Time: Augusta, 14h - 15h

28 Feb 2019: Seok Young Hong, University of Nottingham

Speaker : Seok Young Hong, University of Nottingham
Title : TBA

Abstract: TBA

Room and Time: Enigma, TBA

07 Mar 2019: David Barber, University College London

Speaker : David Barber, University College London
Title : TBA

Abstract: TBA

Room and Time: Augusta, 14h - 15h

12 Mar 2019: Jack Jewson, University of Warwick

Speaker : Jack Jewson, University of Warwick
Title : TBA

Abstract: TBA

Room and Time: Augusta, 14h - 15h

2018

04 Dec 2018: Michael Bronstein, Imperial College London

Speaker : Michael Bronstein, Imperial College London
Title : Deep learning on graphs and manifolds: going beyond Euclidean data

Abstract: In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in various fields from computer vision to speech recognition. So far research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean structured data, such as graphs and manifolds. Such data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, high energy physics, recommendation systems, and social media analysis. The adoption of deep learning in these fields has been lagging behind until recently, primarily since the non-Euclidean nature of objects dealt with makes the very definition of basic operations used in deep networks rather elusive. In this talk, I will introduce the emerging field of geometric deep learning on graphs and manifolds, overview existing solutions and applications, and outline the key difficulties and future research directions.

Room and Time: Augusta, 14h-15h

29 Nov 2018: Nicolas Flammarion, University of California, Berkeley

Speaker : Nicolas Flammarion, University of California, Berkeley
Title : Gen-Oja: A Simple and Efficient Algorithm for Streaming Generalized Eigenvector Computation

Abstract: In this talk, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting. We propose a simple and efficient algorithm, Gen-Oja, for these problems. We prove the global convergence of our algorithm, borrowing ideas from the theory of fast-mixing Markov chains and two-time-scale stochastic approximation, showing that it achieves the optimal rate of convergence.

Room and Time: Augusta, 14h-15h

22 Nov 2018: Zhenming Liu, College of William and Mary

Speaker : Zhenming Liu, College of William and Mary
Title : Near-Neighbor Methods in Random Preference Completion

Abstract: We study a stylized, yet natural, learning-to-rank problem and point out the critical incorrectness of a widely used nearest neighbor algorithm. We consider a model with $n$ agents (users) $\{x_i\}_{i \in [n]}$ and $m$ alternatives (items) $\{y_l\}_{l \in [m]}$, each of which is associated with a latent feature vector. Agents rank items nondeterministically according to the Plackett-Luce model, where the higher the utility of an item to the agent, the more likely this item will be ranked high by the agent. Our goal is to identify near neighbors of an arbitrary agent in the latent space for prediction.

We first show that the Kendall-tau distance based kNN produces incorrect results in our model. Next, we propose a new anchor-based algorithm to find neighbors of an agent. A salient feature of our algorithm is that it leverages the rankings of many other agents (the so-called anchors'') to determine the closeness/similarities of two agents. We provide a rigorous analysis for one-dimensional latent space, and complement the theoretical results with experiments on synthetic and real datasets. The experiments confirm that the new algorithm is robust and practical.

Room and Time: Augusta, 14h-15h

15 Nov 2018: Tiago de Paula Peixoto, University of Bath

Speaker : Tiago de Paula Peixoto, University of Bath
Title : Reconstructing Networks with Unknown and Heterogeneous Errors

Abstract: The vast majority of network data sets contain errors and omissions, although this fact is rarely incorporated in traditional network analysis. Recently, an increasing effort has been made to fill this methodological gap by developing network-reconstruction approaches based on Bayesian inference. These approaches, however, rely on assumptions of uniform error rates and on direct estimations of the existence of each edge via repeated measurements, something that is currently unavailable for the majority of network data. Here, we develop a Bayesian reconstruction approach that lifts these limitations by allowing for not only heterogeneous errors, but also for single edge measurements without direct error estimates. Our approach works by coupling the inference approach with structured generative network models, which enable the correlations between edges to be used as reliable uncertainty estimates. Although our approach is general, we focus on the stochastic block model as the basic generative process, from which efficient nonparametric inference can be performed and yields a principled method to infer hierarchical community structure from noisy data. We demonstrate the efficacy of our approach with a variety of empirical and artificial networks.

Room and Time: Augusta, 14h-15h

18 Oct 2018: Cong Shen, University of Science and Technology of China

Speaker : Cong Shen, University of Science and Technology of China

Abstract: We will discuss a cost-aware cascading bandits mode that is motivated by many practical applications. This is a new variant of the multi-armed bandit model but incorporating the random cost of pulling arms and cascading feedback. In each step, the learning agent chooses an ordered list of items and examines them sequentially, until certain stopping condition is satisfied. Our objective is then to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items, as well as when to stop examination.

We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound (CC-UCB) algorithm, and show that the cumulative regret scales in O(log T). We also provide a lower bound for all α-consistent policies, which scales in Ω(log T) and matches our upper bound. The performance of the CC-UCB algorithm is evaluated with real-world datasets.

Joint work with R. Zhou (University of Science and Technology of China), C. Gan and J. Yang (Pennsylvania State University)

Room and Time: Lovelace, 13h-14:15h

27 Sep 2018: Michaël Fanuel, Université Catholique de Louvain

Speaker : Michaël Fanuel, Université Catholique de Louvain
Title : Positive semi-definite embedding, dimensionality reduction and out-of-sample extension.

Abstract: Important methods in machine learning, such as the diffusion maps, are designed in order to embed high dimensional data into a lower dimensional Euclidean space. We propose an analogous kernel-based method for dimensionality reduction which relies on the solution of a Semi-Definite Program (SDP). In addition, an out-of-sample formula allows to extend the embedding of a discrete dataset to an empirical function. The robustness of method with respect to the choice of hyper- parameters as well as connections with graph embeddings will be discussed. Algorithms for solving this SDP via a low-rank factorization will also be presented.

Room and Time: Augusta, 14h-15h

18 Sep 2018: Hamza Fawzi, University of Cambridge

Speaker : Hamza Fawzi, University of Cambridge
Title : Semidefinite approximations of the matrix logarithm

Abstract: The matrix logarithm, when applied to symmetric positive definite matrices, is known to be concave with respect to the positive semidefinite order. This operator concavity property leads to numerous concavity and convexity results for other matrix functions, many of which are of importance in quantum information theory, and matrix concentration inequalities. In this talk I will show that certain rational approximations of the matrix logarithm remarkably preserve this concavity property and moreover, are amenable to semidefinite programming. Such approximations allow us to use off-the-shelf semidefinite programming solvers for convex optimization problems involving the matrix logarithm. I will conclude by showing some applications to problems in quantum information theory. Joint work with James Saunderson (Monash University) and Pablo Parrilo (MIT).

Room and Time: Augusta, 14h-15h

03 Sep 2018: Ronny Bergmann, Technische Universität Chemnitz

Speaker : Ronny Bergmann, Technische Universität Chemnitz
Title : Variational Methods for Manifold Valued Image Processing

Abstract: In many real life scenarios measured data appears as values from a Riemannian manifold. For example in interferometric Synthetic Aperture Radar (InSAR) data is given as a phase, in electron backscattered diffraction (EBSD) as data items being from a quotient of the orientation group SO(3), and in diffusion tensor magnetic resonance imaging (DT-MRI) the measured data are symmetric positive definite matrices. These data items are often measured on a pixel grid like usual images and they also suffer from the same measurement errors like noisy or incompleteness.Hence there is a need to perform image processing tasks like denoising, inpainting or segmentation on these manifold-valued images. Recently the ROF-Model for TV regularization has been transferred to the setting of manifold-valued images to perform such image processing tasks.

In this talk we present methods and algorithms to compute the TV regularization for manifold-valued images efficiently using variational methods. We present generalizations to second order models, and to the case, where the data is not given on a pixel grid, for example when dealing with nonlocal methods.

Room and Time: Augusta, 14h-15h

29 Aug 2018: Pedro Mercado, Saarland University

Speaker : Pedro Mercado, Saarland University
Title : Matrix Means for Signed and Multilayer Graph Clustering

Abstract: In this talk we present an extension of spectral clustering for the case when different kinds of interactions are present. We study suitable matrix functions to merge information that comes from different kinds of interactions encoded in multilayer graphs, and their effect in cluster identification. We consider a one-parameter family of matrix functions, known as matrix power means, and show that different means identify clusters under different settings of the stochastic block model in expectation. For instance, we show that a limit case identifies clusters if at least one layer is informative and the remaining layers are potentially just noise.

Room and Time: Augusta, 11h-12h

28 Aug 2018: Laurent Jacques, Université Catholique de Louvain

Speaker : Laurent Jacques, Université Catholique de Louvain
Title : Time for dithering! Quantized random embeddings with RIP random matrices

Abstract: Quantized compressive sensing (QCS) deals with the problem of coding compressive measurements of low-complexity signals (e.g., sparse vectors in a given basis, low-rank matrices) with quantized, finite precision representations, i.e., a mandatory process involved in any practical sensing model. While the resolution of this quantization clearly impacts the quality of signal reconstruction, there even exist incompatible combinations of quantization functions and sensing matrices that proscribe arbitrarily low reconstruction error when the number of measurements increases.

In this introductory talk, we will see that a large class of random matrix constructions, i.e., known to respect the restricted isometry property (RIP) in the compressive sensing literature, can be made "compatible" with a simple scalar and uniform quantizer (e.g., a rescaled rounding operation). This compatibility is simply ensured by the addition of a uniform random vector, or random "dithering", to the compressive signal measurements before quantization.

In this context, we will first study how quantized, dithered random projections of "low-complexity" signals is actually an efficient dimensionality reduction technique that preserves the distances of low-complexity signals up to some controllable additive and multiplicative distortions. Second, the compatibility of RIP sensing matrices with the dithered quantization process will be demonstrated by the existence of (at least) one signal reconstruction method, the projected back projection (PBP), which achieves low reconstruction error, decaying when the number of measurements increases. Finally, by leveraging the quasi-isometry property reached by quantized, dithered random embeddings, we will show how basic signal classification (or clustering) can be realized from their QCS observations, i.e., without a reconstruction step. Here also the complexity, or intrinsic dimension, of the observed signals drives the final classification accuracy.

Room and Time: Augusta, 14h-15h

16 Aug 2018: Patrick Rebescheni, University of Oxford

Speaker : Patrick Rebeschini, University of Oxford
Title : Multi-Agent Learning: Implicit Regularization and Order-Optimal Gossip

Abstract: In distributed machine learning data is stored and processed in multiple locations by different agents. Each agent is represented by a node in a graph, and communication is allowed between neighbours. In the decentralised setting typical of peer-to-peer networks, there is no central authority that can aggregate information from all the nodes. A typical setting involves agents cooperating with their peers to learn models that can perform better on new, unseen data.

In this talk, we present the first results on the generalisation capabilities of distributed stochastic gradient descent methods. Using algorithmic stability, we derive upper bounds for the test error and provide a principled approach for implicit regularization, tuning the learning rate and the stopping time as a function of the graph topology. We also present a new Gossip protocol for the aggregation step in distributed methods that can yield order-optimal communication complexity. Based on non-reversible Markov chains, our protocol is local and does not require global routing, hence improving on existing methods. (Joint work with Dominic Richards)

Room and Time: Augusta, 14h-15h

03 Aug 2018: Rahul Jain, University of Southern California

Speaker : Rahul Jain, University of Southern California
Title : Reinforcement Learning for Unknown Autonomous Systems

Abstract: System Identification followed by control synthesis has long been the dominant paradigm for control engineers. And yet, many autonomous systems must learn and control at the same time. Adaptive Control Theory has indeed been motivated by this need. But it has focused on asymptotic stability while many contemporary applications demand finite time (non-asymptotic) performance optimality. Results in stochastic adaptive control theory are sparse with any such algorithms being impractical for real-world implementation. We propose Reinforcement Learning algorithms inspired by recent developments in Online (bandit) Learning. Two settings will be considered: Markov decision processes (MDPs) and Linear stochastic systems. We will introduce a posterior-sampling based regret-minimization learning algorithm that optimally trades off exploration v. exploitation and achieves order optimal regret. This is a practical algorithm that obviates the need for expensive computation and achieves non-asymptotic regret optimality. I will then talk about a general non-parametric stochastic system model on continuous state spaces. Designing universal control algorithms (that work for any problem) for such settings (even with known model) that are provably (approximately) optimal has long been a very challenging problem in both Stochastic Control and Reinforcement Learning. I will propose a simple algorithm that combines randomized function approximation in universal function approximation spaces with Empirical Q-Value Learning which is not only universal but also approximately optimal with high probability.

Room and Time: Augusta, 15h-16h

26 Jul 2018: Yves van Gennip, University of Nottingham

Speaker : Yves van Gennip, University of Nottingham
Title : A short overview of partial differential equations on graphs

Abstract: Recent years have seen a lot of interest in partial differential equations discretised on arbitrary network structures. Especially driven by applications in data analysis and image processing, models such as the graph-based Ginzburg-Landau functional and variations on it have been used for processes such as data clustering, image segmentation, and maximum-cut computations. In this talk I will give a brief overview of some of the theory and applications in this area.

Room and Time: Augusta, 14h-15h

19 Jul 2018: Yosi Keller, Bar-Ilan University

Speaker : Yosi Keller, Bar-Ilan University
Title : Deep Spectral Graph Matching

Abstract: In this work we present a Deep Learning based approach for visual correspondence estimation, by deriving a Deep spectral graph matching network. We formulate the state-of-the-art unsupervised Spectral Graph Matching (SGM) approach, as part of an end-to-end supervised deep learning network. Thus allowing to utilize backpropagation to learn optimal image features, as well as algorithm parameters. For that we present a transformation layer that converts the learned image feature, within a pair of images, to an affinity matrix used to solve the matching problem via a new metric loss function. The proposed scheme is shown to compare favorably with contemporary state-of-the-art matching schemes when applied to annotated data obtained from the PASCAL, ILSVRC, KITTI and CUB-2011 datasets. Joint work with Yoav Liberman.

Room and Time: Enigma, 11h-12h

17 Jul 2018: Stephane Chretien, National Physical Laboratory

Speaker : Stephane Chretien, National Physical Laboratory
Title : Column/Feature extraction: old and new results

Abstract: A problem of paramount importance in both pure (Restricted Invertibility problem) and applied mathematics (Feature extraction) is the one of selecting a set of columns from a given matrix, such that the resulting submatrix has its smallest singular value lying above a pre-specified level. Such problems can be addressed using many different mathematical techniques and have recently been a topic of renewed interest, due to their strong connections to machine learning, compressed sensing, high dimensional statistics, etc, but also, quite surprisingly, quantum mechanics via the solution to the celebrated Kadison-Singer conjecture. In this talk, we will present an overview of the different results available in the literature and propose a new elementary perturbation approach to derive faster algorithms for column/feature extraction.

Room and Time: Augusta, 13h-14h

12 Jul 2018: Paolo Barucca, University College London

Speaker : Paolo Barucca, University College London
Title : Dynamic network models for data mining and time series prediction

Abstract: The global financial system can be represented as a large complex network in which banks, hedge funds and other financial institutions are interconnected to each other through direct and indirect financial linkages. Recently, a lot of attention has been paid to the understanding of the mechanisms that can lead to a breakdown of this network. This can happen when the existing financial links turn from being a means of risk diversification to channels for the propagation of risk across financial institutions. In this presentation, I summarize recent developments in the modeling of financial systems. I focus in particular on dynamic network approaches, and I also report on recent findings on the empirical structure of interbank networks. The presentation provides a landscape of the newly arising interdisciplinary field lying at the intersection of several disciplines, such as network science, statistical physics, and finance.

Room and Time: Augusta, 1330h-1430h

03 Jul 2018: Roman Vershynin, University of California Irvine

Speaker : Roman Vershynin, University of California, Irvine
Title : From number theory to machine learning: hunting for smooth Boolean functions

Abstract: The most fundamental kind of functions studied in computer science are Boolean functions. Most Boolean functions oscillate a lot, which is analogous to the fact that "most" continuous functions on R are nowhere differentiable. A "smooth" Boolean function can be generated by taking the sign of some polynomial of low degree in n variables. Such functions are called polynomial threshold functions, and they are widely used in machine learning as classification devices. Surprisingly, we do not know how many low-degree polynomial threshold functions there exist! An approximate answer to this question has been known only for polynomials of degree 1, i.e. for linear functions. In a recent joint work with Pierre Baldi, we found a way to approximately count polynomial threshold functions of any fixed degree. This solves a problem of M. Saks that goes back to 1993 and much earlier. Our argument draws insights from analytical number theory, additive combinatorics, enumerative combinatorics, probability and discrete geometry. I will describe some of these connections, focusing particularly on a beautiful interplay of zeta and Mobius functions in number theory, hyperplane arrangements in enumerative combinatorics and random tensors in probability theory.

Room and Time: Augusta, 14h - 15h

25 Jun 2018: Stephen Wright, University of Wisconsin-Madison

Speaker : Stephen J Wright, University of Wisconsin-Madison
Title : Practical conditional gradient algorithms

Abstract: Interest in the conditional gradient algorithm, proposed by Frank and Wolfe in 1956, has revived in recent years because of its relevance to many data science paradigms. The basic algorithm for optimizing a smooth convex function over a closed convex compact set is appealing for its simplicity and its elementary convergence theory. However, the convergence is too slow for many applications. This talk describes enhancements of conditional gradient that improve its performance in several ways. A common feature of these enhancements is the maintenance of a “basis” of extreme points of the feasible set, being the solutions of the linear oracle from a subset of previous calls. We describe a “lazy” variant in which the linearized objective is minimized only over the convex hull of this basis on most iterations, a variant in which the objective is periodically reoptimized over the convex hull of this basis, and a sparsified variant in which this basis is sometimes reduced in size. (These techniques can be applied in tandem.) We show how convergence rates are affected by these techniques, and present computational results on a variety of applications. This talk describes joint work with Sebastian Pokutta, Gabor Braun, Dan Tu, Nikhil Rao, and Parikshit Shah.

Room and Time: Enigma, 14h-15h

25 Jun 2018: Will Perkins, University of Birmingham

Speaker : Will Perkins, University of Birmingham
Title : Information Theoretic Thresholds from the Cavity Method

Abstract: The stochastic block model is a benchmark model for the computational problem of graph clustering: partitioning a graph of vertices and edges into clusters of “similar” vertices. In the model we divide a set of vertices into q clusters then add edges at random with probabilities that depend on whether the two vertices joined belong to the same cluster or not. The computational problem is to (approximately) determine the original clusters given the graph structure. I will present a result that determines the threshold edge density at which the problem of detecting the clusters becomes information theoretically possible. The proof vindicates the intricate but non-rigorous “Cavity Method” from statistical physics. Based on joint work with A. Coja-Oghlan, F. Krzakala, L. Zdeborova.

Room and Time: Augusta, 11h-12h

23 May 2018: Paul Constantine, University of Colorado Boulder

Speaker : Paul Constantine, University of Colorado Boulder
Title : Tools and techniques for subspace-based parameter space dimension reduction

Abstract: Scientists and engineers use computer simulations to study relationships between a physical model's input parameters and its output predictions. However, thorough parameter studies---e.g., constructing response surfaces, optimizing, or averaging---are challenging, if not impossible, when the simulation is expensive and the model has several inputs. To enable parameter studies in these instances, the engineer may attempt to reduce the dimension of the model's input parameter space using techniques such as sensitivity analysis or variable screening to identify unimportant variables that can be fixed for model analysis. Generalizing classical coordinate-based reduction, there are several emerging subspace-based dimension reduction tools that identify important directions in the input parameter space with respect to a particular model output. I will motivate and provide an overview of subspace-based dimension reduction techniques and discuss strategies for exploiting such low-dimensional structures---including analysis and computation---to enable otherwise infeasible parameter studies. For more information, see activesubspaces.org.

Room and Time: Augusta, 14h-15h

21 May 2018: Volkhan Cevher, École Polytechnique Fédérale de Lausanne

Speaker : Volkhan Cevher, École Polytechnique Fédérale de Lausanne
Title : Trade-offs in resource constrained machine learning systems

Abstract: Massive data poses a fundamental challenge to learning algorithms, which is captured by the following computational dogma: the running time of an algorithm increases with the size of its input data. The available computational power, however, is growing slowly relative to data sizes. Hence, large-scale machine learning problems of interest require increasingly more time to solve.

Our research demonstrates that this dogma is false in general, and supports an emerging perspective in computation: data should be treated as a resource that can be traded off with other resources, such as running time. For data acquisition and communications, we have also shown related sampling, energy, and circuit area trade-offs.

This talk will summarize our work confronting these challenges by building on the new mathematical foundations on how we generate data via sampling, how we set up learning objectives that govern our fundamental goals, and how we optimize these goals to obtain solutions and to make optimal decisions. We then demonstrate task-specific, end-to-end trade-offs (e.g., samples, power, computation, storage, and statistical precision) in broad domains.

Room and Time: Augusta, 14h-15h

15 May 2018: Heng Guo, University of Edinburgh

Speaker : Heng Guo, University of Edinburgh
Title : A polynomial-time approximation algorithm for all-terminal network reliability

Abstract: All-terminal network reliability is the probability that, in a undirected graph, assuming each edge fails independently, the remaining graph is still connected. We will present a fully polynomial-time randomised approximation scheme (FPRAS) for this problem. Our main contribution is to confirm a conjecture by Gorodezky and Pak (Random Struct. Algorithms, 2014), that the expected running time of the "cluster-popping" algorithm in bi-directed graphs is bounded by a polynomial in the size of the input.

Joint work with Mark Jerrum (QMUL)

Room and Time: Augusta, 15h-16h

30 Apr 2018: Matteo Barigozzi, London School of Economics

Speaker : Matteo Barigozzi, London School of Economics
Title : Sequential testing for structural stability in approximate factor models

Abstract: We develop an on-line monitoring procedure to detect a change in a large approximate factor model. Our statistics are based on a well-known property of the (r + 1)-th eigenvalue of the sample covariance matrix of the data (having defined r as the number of common factors): whilst under the null the (r + 1)-th eigenvalue is bounded, under the alternative of a change (either in the loadings, or in the number of factors itself) it becomes spiked. Given that the sample eigenvalue cannot be estimated consistently under the null, we regularise the problem by randomising the test statistic in conjunction with sample conditioning, obtaining a sequence of i.i.d., asymptotically chi-square statistics which are then employed to build the monitoring scheme. Numerical evidence shows that our procedure works very well in finite samples, with a very small probability of false detections and tight detection times in presence of a genuine change-point.

Room and Time: Augusta, 14h-15h

27 Apr 2018: Guy Nason, University of Bristol

Speaker : Guy Nason, University of Bristol
Title : Network Time Series

Abstract: A network time series is a multivariate time series where the individual series are known to be linked by some underlying network structure. Sometimes this network is known a priori, and sometimes the network has to be inferred, often from the multivariate series itself. Network time series are becoming increasingly common, long, and collected over a large number of variables. We are particularly interested in network time series whose network structure changes over time.

We describe some recent developments in the modeling and analysis of network time series via network autoregressive integrated moving average (NARIMA) process models. NARIMA models provide a network extension to a familiar environment that can be used to extract valuable information and aid prediction. As with classical ARIMA models, trend can impair the estimation of NARIMA parameters. The scope for trend removal is somewhat wider with NARIMA models and we exhibit some possibilities.

We will illustrate the operation of NARIMA modeling on data sets arising from human and veterinary epidemiology.

This is joint work with Kathryn Leeming (Bristol), Marina Knight (York) and Matt Nunes (Lancaster).

Room and Time: Augusta, 14h-15h

13 Apr 2018: Hao Chen, University of California, Davis

Speaker : Hao Chen, University of California, Davis
Title : Change-point Detection for Multivariate and Object Data

Abstract: After observing snapshots of a network, can we tell if there has been a change in dynamics? We developed a general nonparametric framework for change-point detection utilizing similarity among observations. This new approach, which relies on graph-based tests, can be applied to high-dimensional data, as well as data from non-Euclidean sample spaces, such as network data. In addition, this framework is combined with a new way of permutation, circular block permutation with a random starting point, to further deal with local dependence, which is common in such data sequences. Analytic approximations to the significance of this graph-based scan statistics are derived for both the independence and local dependence scenarios, facilitating their applications to real datasets. We illustrate the methods through the analysis of a phone-call network from the MIT Reality Mining.

Room and Time: Augusta, 14h-15h

06 Apr 2018: François-Xavier Briol, University of Warwick

Speaker : François-Xavier Briol, University of Warwick
Title : Bayesian Quadrature for Multiple Related Integrals.

Abstract: Bayesian probabilistic numerical methods are a set of tools providing posterior distributions on the output of numerical methods. The use of these methods is usually motivated by the fact that they can represent our uncertainty due to incomplete/finite information about the continuous mathematical problem being approximated. In this work, we demonstrate that this paradigm can provide additional advantages, such as the possibility of transferring information between several numerical methods. This allows users to represent uncertainty in a more faithful manner and, as a by-product, provide increased numerical efficiency. We propose the first such numerical method by extending the well-known Bayesian quadrature algorithm to the case where we are interested in computing the integral of several related functions. We then prove convergence rates for the method in the well-specified and misspecified cases, and demonstrate its efficiency in the context of multi-fidelity models for complex engineering systems and a problem of global illumination in computer graphics.

Room and Time: Augusta, 14h-15h

27 Mar 2018: Pranay Seshadri, University of Cambridge

Speaker : Pranay Seshadri, University of Cambridge
Title : On Dimension Reduction Investigations in Turbomachinery

Abstract: Blades in modern jet engines are parameterized by 20-300 design variables. It is impossible to visualize, and more importantly, effectively explore such vast design spaces. Yet such an exploration may be required not solely for optimization and uncertainty quantification endeavors, but more importantly, for understanding the physics that underpins key designs characteristics. For instance, when carrying out an optimization, the designs that are often associated with high efficiency are likely to be different from those associated with high flow capacity and high pressure ratio. Likewise, designs that are insensitive to variations in operational boundary conditions are likely to be different from those that are sensitive under the very same conditions. So what is the physical rationale for different design characteristics and how do we identify them?

Our goal is to be able to identify key linear combinations of designs that influence certain outputs, especially when the total number of design variables is large. These salient concerns motivate an output-based—i.e., efficiency, flow capacity or pressure ratio—dimension reduction strategy. In this talk, a new algorithm for subspace-based dimension reduction is introduced. It combines manifold optimization with classical Gaussian process regression. The fundamental idea we exploit is that for many physics-driven problems, outputs can be expressed as a non-linear function of a few linear combinations of all the variables. There are parallels between this idea (known in the applied mathematics community under the moniker of ridge functions'), neural networks, active subspaces and classical sufficient dimension reduction methods. We test our algorithm on a range of different data sets where the number of dimensions vary from 2 to 200. Our results demonstrate that if the problem does admit a ridge structure, our technique provides near optimal recovery. Finally, we describe how the computed dimension reducing subspace can be used for optimization, uncertainty quantification and more importantly for gaining physical insight into turbomachinery fluid mechanics.

Room and Time: Augusta, 15h-16h

27 Mar 2018: Wei Dai, Imperial College London

Speaker : Wei Dai, Imperial College London
Title : Blind Estimation of Origin-Destination Flows in Networks

Abstract: This work focuses on the inverse problem of estimating origin-destination (OD) flows from link flows. OD flow estimation is essential to many network analysis tasks, such as urban planning, as OD information specifies the travel demands in the network. Approaches in the literature for OD estimation typically require some prior information of the OD flows as well as the mapping from OD flows to link flows. In particular, a linear model is often used to map OD flows to link flows and the corresponding linear operator is commonly referred to as the traffic assignment. In practice, prior information of the traffic assignment can be inaccurate and incomplete. Furthermore, even with full information of the traffic assignment, the OD flow estimation problem is still ill-posed and therefore further prior information of OD flows is required.

The main contribution of this work is estimating OD flows with no or little prior information. Inspired by the terminology blind deconvolution in signal processing community, we refer to the OD estimation without knowledge of traffic assignment as blind estimation. Different from the literature where linear systems are built on OD flows, a new linear model is introduced of which the inquired flows are specified with only their origins but not their destinations. The new model preserves the OD information and at the same time reduces the dimension of the inquired flows in orders of magnitude. Due to this dimension reduction, the OD flow estimation problem becomes well-posed when the corresponding traffic assignment information is available. More importantly it is possible to estimate the OD flows and the traffic assignment simultaneously with no prior information of them. Detailed discussions are provided regarding the uniqueness of solutions for the simultaneous estimation. Simulations are presented for the demonstration purpose.

Room and Time: Augusta, 14h-15h

20 Mar 2018: Jacek Brodzki, University of Southampton

Speaker : Jacek Brodzki, University of Southampton
Title : Geometry, Topology, and the structure of data

Abstract: Modern data is astonishing in its variety, and is far removed from anything that could be held in spreadsheet or analysed using traditional methods. Through intense research effort, topology has emerged as a source of novel methodology to provide insight into the structure of very complex, high dimensional data. It provided us with tools like persistent homology, which are used to compute numerical topological characteristics of the data. More recently, these methods have been augmented by geometric insights, which are valuable in capturing the structure of and relationships between complex shapes.

In this talk, I will provide an overview of new techniques from topology and geometry and illustrate them on particular examples. We will discuss applications of persistent homology to the study of the set of CT scans of human lungs. The geometric part of the talk will concern classification of three-dimensional shapes through synchronisation.

Room and Time: Augusta, 13h-14h

06 Mar 2018: Coralia Cartis, University of Oxford

Speaker : Coralia Cartis, University of Oxford
Title : Complexity of Nonconvex Optimization: The Story So Far

Abstract: Establishing the global rate of convergence of standard algorithms or their global evaluation complexity for nonconvex smooth optimization problems is a natural but challenging aspect of algorithm analysis. In the last decade, substantial progress has been made and continues to be made, in this direction. We review some of the key developments, illustrating the crucial role played by cubic regularisation methods, a credible alternative to linesearch and trust-region.

We then focus on two recent results. Firstly, we consider a general/new class of adaptive regularization methods, that use first- or higher-order local Taylor models of the objective regularized by a(ny) power of the step size. We investigate the worst-case evaluation complexity of these algorithms, when the level of sufficient smoothness of the objective and its derivatives may be unknown or may even be absent, and we find that the methods accurately reflect in their complexity the (unknown) degree of smoothness of the objective/derivatives and satisfy increasingly better bounds with the order of the derivatives available. We then focus on the antipodal context, when no derivatives of the objective are available, and the local models in adaptive cubic regularisation methods are constructed for example, by sampling of function values. We show that the evaluation complexity of the ensuing methods do not change in the order of the accuracy from their deterministic counterparts, increasing only by a constant factor which depends on the probability of the sampled models being occasionally, sufficiently accurate.

Time permitting, we will also discuss the stochastic case when even function evaluations may be inaccurate. This work is joint with Nick Gould (Rutherford Appleton Laboratory, UK), Philippe Toint (University of Namur, Belgium) and Katya Scheinberg (Lehigh University, USA).

Room and Time: Augusta, 12h-13h

06 Mar 2018: Ruggerio Bellio, University of Udine

Speaker : Ruggero Bellio, University of Udine
Title : Modern Likelihood-Frequentist Inference

Abstract: This talk illustrates some results of our recent paper Modern Likelihood-Frequentist Inference, which is an exposition of the development of likelihood asymptotics obtained by many workers. The exposition is intended to be accessible to a wide audience. It does this by charting a narrow course through the developments over more than 30 years. This covers the likelihood ratio approximation to the distribution of the parameter estimator, approximate ancillary conditioning, transformation to a highly accurate approximation to the likelihood ratio test, and related topics.

The talk focuses in particular on the R package likelihoodAsy, that accompanies the paper. The package implements some tools for likelihood inference, including in particular the Skovgaard version of Barndorff-Nielsen's modified signed likelihood ratio test statistic r*. The main point of this is to approximate the sample space derivatives, that held up this theory for a decade until the Skovgaard breakthrough. The user needs only to provide R code for the likelihood function, for generating a sample from the model, and for the formulation of the interest parameter. The latter is allowed to be a rather general function of the model parameter, and this is the most important feature of the package.

This is a joint work with Donald A. Pierce, Oregon State University.
Pierce, D.A. and Bellio, R. (2017). Modern likelihood-frequentist inference. International Statistical Review, 85, 519-541.

Room and Time: Augusta, 11h-12h

22 Feb 2018: Yuji Nakatsukasa, University of Oxford

Speaker : Yuji Nakatsukasa, University of Oxford
Title : Global optimization via eigenvalues

Abstract: While non-convex optimization problems are generally regarded as difficult or computationally intractable, the Courant-Fischer theorem for symmetric eigenvalue problems suggests that each eigenpair corresponds to a stationary point of a non-convex optimization problem (in this case the Rayleigh quotient). More generally, matrix eigenvalue problems can be regarded as an important class of optimization problems that can be solved efficiently. This observation suggests conversely that perhaps a global solution for some non-convex optimization problems could be obtained by solving an eigenvalue problem. In this work we identify such optimization problems, and show that they lead to a variety of eigenvalue problems, including generalized, polynomial and multiparameter eigenvalue problems, all of which can be solved reliably and in polynomial time, and essentially non-iteratively. Specifically, we consider non-convex QCQP (quadratically constrained quadratic programming) QCQP is known to be NP-hard in general (when the number of constraints k is not fixed), and for k ≥ 2 fixed, it has been an open problem whether there are polynomial-time solvable until recently. We show that (i) for QCQP with one constraint, k = 1, it can be reduced to essentially a single generalized eigenvalue problem, (ii) for QCQP with two constraints, k = 2, the problem can be reduced to a two-parameter eigenvalue problem, and (iii) unconstrained optimization problems with cubic regularization can be reduced to quadratic eigenvalue problems.

Based on joint work with Satoru Adachi, Satoru Iwata, Shinsaku Sakaue and Akiko Takeda (University of Tokyo).

Room and Time: Augusta, 14h-15h

20 Feb 2018: Mu Niu, U of Plymouth + Pokman Cheung, Goldman Sachs

Speakers : Mu Niu, University of Plymouth and Pokman Cheung, Goldman Sachs
Title : Intrinsic Gaussian processes on complex constrained domains

Abstract: We propose a class of intrinsic Gaussian processes (in-GPs) for interpolation and regression on manifolds with a primary focus on complex constrained domains or irregular-shaped spaces arising as subsets or submanifolds of R2, R3 and beyond. For example, in-GPs can accommodate spatial domains arising as complex subsets of Euclidean space. in-GPs respect the potentially complex boundary or interior conditions as well as the intrinsic geometry of the spaces. The key novelty of the proposed approach is to utilise the relationship between heat kernels and the transition density of Brownian motion on manifolds for constructing and approximating valid and computation- ally feasible covariance kernels. This enables in-GPs to be practically applied in great generality, while existing approaches for smoothing on constrained domains are limited to simple special cases. The broad utilities of the in-GP approach is illustrated through simulation studies and data examples.

Room and Time: Augusta, 14h-15h

09 Jan 2018: Sebastian Stich, EPFL

Speaker : Sebastian Stich, École Polytechnique Fédérale de Lausanne
Title : Adaptive importance sampling for convex optimization

Abstract: In the last years, randomized optimization algorithms, and especially coordinate descent methods attracted more and more attention in the optimization community. These methods often replace the efficient -- but often relatively expensive -- iterations of the classical methods with much cheaper -- but less efficient -- updates. Consequently, such methods can be applied to problems of very big size. This application is sometimes also theoretically justified by very attractive worst-case efficiency guarantees.

In this talk, we will first provide a tutorial-like introduction to random decent methods and review their complexity. We will discuss variable metric or quasi-Newton type of methods [1] and accelerated schemes [2]. In the second part of the talk we further introduce the concept of (adaptive) importance sampling and present a scheme that computes the optimal sampling frequencies based on safe estimates of the gradient [3]. The proposed sampling is provably the best sampling with respect to the given bounds and always better than uniform sampling and fixed importance sampling. We conclude by discussing open problems in this field of research.

[1] Joint work with B. Gartner, C. Muller (Math. Prog. 2016)
[2] Joint work with Y. Nesterov (SIAM J. Opt. 2017)
[3] Joint work with A. Raj, M. Jaggi (NIPS 2017)

Room and Time: Augusta, 14h-15h

(2017)

20 Dec 2017: Nicolas Boumal, Princeton University

Speaker : Nicolas Boumal, Princeton University
Title : Near optimal bounds for phase synchronization

Abstract: Phase synchronization consists in estimating n phases (unit-modulus complex numbers) given measurements of all relative phases. We assume the measurements are corrupted by i.i.d. Gaussian noise. The maximum likelihood estimator is the solution to a non-convex optimization problem, NP-hard in general. Yet, it has been observed some years ago that the maximum likelihood estimator can be computed in polynomial time through semidefinite relaxation, even in the face of tremendous amounts of noise.

In this talk, I will present proof techniques that allow to explain this observation. This is based on the generalized power method, replicas to reduce statistical dependence, and dual certificates.

Joint work with Yiqiao Zhong, Afonso Bandeira and Amit Singer.

Room and Time: Augusta, 11h-12h

20 Dec 2017: Rajen Shah, University of Cambridge

Speaker : Rajen Shah, University of Cambridge
Title : The xyz algorithm for fast interaction search in high-dimensional data

Abstract: When performing regression on a dataset with p variables, it is often of interest to go beyond using main effects and include interactions as products between individual variables. However, if the number of variables p is large, as is common in genomic datasets, the computational cost of searching through O(p^2) interactions can be prohibitive. In this talk I will introduce a new randomised algorithm called xyz that is able to discover interactions with high probability and under mild conditions has a runtime that is subquadratic in p. The underlying idea is to transform interaction search into a much simpler closest pair problem. We will see how strong interactions can be discovered in almost linear time, whilst finding weaker interactions requires O(p^u) operations for 1 < u < 2 depending on their strength. An application of xyz to a genome-wide association study shows how more than 10^11 interactions can be screened in minutes using a standard laptop. This is joint work with Gian Thanei and Nicolai Meinshausen (ETH Zurich).

Speaker Bio: Rajen Shah is a lecturer in Statistics at the Statistical Laboratory, University of Cambridge, having completed his PhD under the supervision of Richard Samworth (also at Cambridge) in 2014. His main research interests are in high-dimensional statistics and large-scale data analysis. Rajen is currently an Associate Editor for the Journal of the Royal Statistical Society Series B, and a Turing Fellow.

Room and Time: Augusta, 14h-15h

28 Nov 2017: Adilet Otemissov, University of Oxford

Speaker : Adilet Otemissov, University of Oxford
Title : Dimensionality reduction techniques for global optimization.

Abstract: Abstract: (Joint work with Coralia Cartis) The problem of finding the most extreme value of a function, also known as global optimization, is a challenging task. The difficulty is associated with the exponential increase in the computational time for a linear increase in the dimension. This is known as the `curse of dimensionality''. In this talk, we demonstrate that such challenges can be overcome for functions with low effective dimensionality – functions which are constant along certain linear subspaces. Such functions can often be found in applications, for example, in hyper-parameter optimization for neural networks, heuristic algorithms for combinatorial optimization problems and complex engineering simulations.

We propose the use of random subspace embeddings within a(ny) global minimisation algorithm, extending the approach in Wang et al. (2013). We introduce two techniques (REGO and AREGO) that transform the high-dimensional optimization problem into a low-dimensional problem. REGO is formulated by adding constraints in the low-dimensional embedded space and AREGO by including constraints in the original space. We derive probabilistic bounds on the success of the dimension-reduced problems and present encouraging numerical results.

Speaker Bio: Adilet is a second year DPhil Student at the Alan Turing Institute and University of Oxford. He graduated from Nazarbayev University with a BSc in Mathematics and then pursued a master's degree at the University of Manchester. His research interests include high-dimensional non-convex and global optimization.

Room and Time: Augusta, 13h-14h

21 Nov 2017: Rodrigo Mendoza Smith, University of Oxford

Speaker : Rodrigo Mendoza Smith, University of Oxford
Title : Embarrassingly parallel reduction of boundary matrices in Persistent Homology

Abstract: Topological Data Analysis is a relatively new paradigm in data-science that models datasets as point-clouds sampled from a shape embedded in an Euclidean space. The inference problem is to estimate the essential topological features of the underlying shape from a point-cloud sampled from it. This is done through a technique called persistent homology which is a mathematical formalism based on algebraic topology. A necessary step in the persistent homology pipeline is the reduction a so-called boundary matrix, which is a process reminiscent to Gaussian elimination. In this talk, I present a number of structural dependencies in boundary matrices and use them to design a novel embarrassingly parallel algorithm for their reduction. Simulations on synthetic examples show that the computational burden can be conducted in a small fraction of the number of iterations needed by traditional methods. For example, numerical experiments show that for a boundary matrix with 10^4 columns, the reduction completed to within 1% in about ten iterations as opposed to nearly approximately eight thousand iterations for traditional methods.

Speaker Bio: Rodrigo Mendoza Smith is a DPhil candidate in Applied Mathematics working with Prof. Jared Tanner. He is affiliated with the Numerical Analysis Group in Oxford and with the Alan Turing Institute in London. He is working on the design of provably fast numerical algorithms for sparse and low-rank data models. His research interests lie in numerical computing, compressed-sensing, optimisation, machine-learning, network theory and computational algebraic topology. His motivation and long term goal is to contribute to the understanding of Artificial Intelligence and its interplay with geometry, statistics and computability

Room and Time: Augusta, 14h-15h

16 Nov 2017: Guillame Lecue, CREST + ENSAE

Speaker : Guillame Lecue, Center for Research in Economics and Statistics
Title : Learning from MOM’s principles

Abstract: (Joint work with Matthieu Lerasle) We obtain theoretical and practical performances for median of means estimators.

From a theoretical point of view, estimation and prediction error bounds achieved by the MOM estimators hold with exponentially large probability -- as in the gaussian framework with independent noise-- under only weak moments assumptions on the data and without assuming independence between the noise and the design. Moreover, MOM procedures are robust since a large part of the data may have nothing to do with the oracle we want to reconstruct. Our general risk bound is of order max(minimax rate of convergence in the i.i.d. setup, (number of outliers)/number of observations)). In particular, the number of outliers may be as large as (number of data)*(minimax rate) without affecting the statistical properties of the MOM estimator.

A regularization norm may also be used together with the MOM criterium. In that case, any norm can be used for regularization. When it has some sparsity inducing power we recover sparse rates of convergence and sparse oracle inequalities. For example, the minimax rate s log(d/s)/N of recovery of a s-sparse vector in R^d is achieved by a median-of-means version of the LASSO when the noise has q_0 moments for some q_0>2, the design matrix C_0\log(d) moments and the dataset is corrupted by s log(d/s) outliers. This result holds with exponentially large probability as if the noise and the design were i.i.d. Gaussian random variables.

On the practical side, MOM estimators (and their associated regularized versions) can easily be implemented. Actually, most gradient descent algorithms used to implement (non-robust) estimators like the LASSO can easily be transformed into a robust one by using a MOM approach.

Room and Time: Augusta, 15h-16h

14 Nov 2017: Piotr Fryzlewicz, London School of Economics

Speaker : Piotr Fryzlewicz, London School of Economics

Title: Multiscale thinking in data analysis, recursive algorithms, and data-adaptive change-point detection

Abstract: The talk starts on a general note: we first attempt to define a "multiscale" method / algorithm as a recursive program acting on a dataset in a suitable way. Wavelet transformations, unbalanced wavelet transformations and binary segmentation are all examples of multiscale methods in this sense. Using the example of binary segmentation, we illustrate the benefits of the recursive formulation of multiscale algorithms from the software implementation and theoretical tractability viewpoints.

We then turn more specific and study the canonical problem of a-posteriori detection of multiple change-points in the mean of a piecewise-constant signal observed with noise. Even in this simple set-up, many publicly available state-of-the-art methods struggle for certain classes of signals. In particular, this misperformance is observed in methods that work by minimising a "fit to the data plus a penalty" criterion, the reason being that it is challenging to think of a penalty that works well over a wide range of signal classes. To overcome this issue, we propose a new approach whereby methods "learn" from the data as they proceed, and, as a result, operate differently for different signal classes. As an example of this approach, we revisit our earlier change-point detection algorithm, Wild Binary Segmentation, and make it data-adaptive by equipping it with a recursive mechanism for deciding "on the fly" how many sub-samples of the input data to draw, and where to draw them. This is in contrast to the original Wild Binary Segmentation, which is not recursive. We show that this significantly improves the algorithm particularly for signals with frequent change-points.

Bio: Piotr Fryzlewicz is a professor of statistics in the Department of Statistics at the London School of Economics. He has previously worked at Winton Capital Management, the University of Bristol and Imperial College London. His research interests are in multiscale modelling and estimation, time series, change-point detection, high-dimensional statistical inference and statistical learning. He currently serves as Joint Editor of the Journal of the Royal Statistical Society Series B.

Room and Time: Augusta, 14h-15h

07 Nov 2017: Quentin Berthet, University of Cambridge

Speaker : Quentin Berthet, University of Cambridge
Title : Bandit Optimization with Upper-Confidence Frank-Wolfe

Abstract: We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback. In this problem, the objective is to minimize a global loss function of all the actions, not necessarily a cumulative loss. This framework allows us to study a very general class of problems, with applications in statistics, machine learning, and other fields. To solve this problem, we introduce the Upper-Confidence Frank-Wolfe algorithm, inspired by techniques for bandits and convex optimization. We show upper bounds on the optimization error of this algorithm over various classes of functions, and discuss the optimality of these results. This is joint work with Vianney Perchet (ENS Paris-Saclay and Criteo Research)

Speaker Bio: Quentin Berthet is a Turing Fellow and a Lecturer in the Statslab, in the DPMMS at the University of Cambridge, since 2015. He is a former student of the Ecole Polytechnique, received a Ph.D. from Princeton University in 2014, and was a CMI postdoctoral fellow at Caltech. His research interests are in Statistics, Optimization and Computer Science.

Room and Time: Augusta, 14h-15h

01 Nov 2017: Yiannis Koutis, New Jersey Institute of Technology

Speaker : Yiannis Koutis, New Jersey Institute of Technology
Title: A survey of provably fast linear system solvers

Abstract: The design of fast linear system solvers has been a very active research area for decades. Solvers based on various approaches, from multigrid methods to incomplete matrix factorization techniques have been quite successful in practical engineering problems, but lack fast convergence guarantees for broader classes of unstructured linear systems that are common in modern data science problems. Solvers that are provably fast for broad classes of unstructured linear systems are a fairly recent discovery. We will survey this recent progress, and discuss key enabling facts from the theory of concentration bounds for random sums of positive matrices. We will also discuss known obstacles that stand on the way to progress for general linear systems.

Bio: Yiannis Koutis is an associate professor in the Ying Wu College of Computing at the New Jersey Institute of Technology. He completed his Ph.D. in Compute Science at Carnegie Mellon University, and got his undergraduate Diploma in Computer Engineering and Informatics from the University of Patras in Greece. Before NJIT, he was an associate professor at the University of Puerto Rico, Rio Piedras. In the past he has also worked as a Systems Scientist at Carnegie Mellon University.

Room and Time: Augusta, 15h-16h

31 Oct 2017: Dino Sejdinovic, University of Oxford

Speaker : Dino Sejdinovic, University of Oxford
Title: Inference with Approximate Kernel Embeddings

Abstract: Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs. I will give an overview of this framework and present some of the applications of the approximate kernel embeddings to Bayesian computation. Further, I will discuss a recent modification of MMD which aims to encode invariance to additive symmetric noise and leads to learning on distributions robust to the distributional covariate shift, e.g. where measurement noise on the training data differs from that on the testing data. ArXiv:1703.07596

Bio: Dino Sejdinovic is an Associate Professor at the Department of Statistics, University of Oxford, a Fellow of Mansfield College, Oxford, and a Turing Fellow of the Alan Turing Institute. He previously held postdoctoral positions at the Gatsby Computational Neuroscience Unit, University College London and at the Department of Mathematics, University of Bristol. He received a PhD in Electrical and Electronic Engineering from the University of Bristol (2009). His research is at the interface between machine learning and statistical methodology, with a focus on nonparametric and kernel methods.

Room and Time: Augusta, 14h-15h

17 Oct 2017: Varun Kanade, University of Oxford

Speaker : Varun Kanade, University of Oxford
Title : From which world is your graph?

Abstract: Discovering statistical structure from links is a fundamental problem in the analysis of social networks. Choosing a misspecified model, or equivalently, an incorrect inference algorithm will result in an invalid analysis or even falsely uncover patterns that are in fact artifacts of the model. This work focuses on unifying two of the most widely used link-formation models: the stochastic blockmodel (SBM) and the small world (or latent space) model (SWM). Integrating techniques from kernel learning, spectral graph theory, and nonlinear dimensionality reduction, we develop the first statistically sound polynomial-time algorithm to discover latent patterns in sparse graphs for both models. When the network comes from an SBM, the algorithm outputs a block structure. When it is from an SWM, the algorithm outputs estimates of each node's latent position. We report experiments performed on a twitter dataset collected between October and November 2016. Based on joint work with Cheng Li, Zhenming Liu and Felix Wong.

Speaker Bio: Varun Kanade is a Research Lecturer in the Department of Computer Science, Oxford University. He has been a Simons Postdoctoral Fellow at the University of California, Berkeley and an FSMP postdoctoral fellow at ENS, Paris. He obtained his Ph.D. from Harvard University in 2012. His research interests are in machine learning and theoretical computer science and focuses on mathematical foundations of machine learning, in particular computational and statistical tradeoffs and improving our understanding of methods that have been successful in practice.

Room and Time: Augusta, 14h-15h

09 Oct 2017: Martin Lotz, University of Manchester

Speaker: Martin Lotz, University of Manchester
Title: Randomized dimensionality reduction in persistent homology

Abstract: We discuss a few ideas related to dimensionality reduction at the point-cloud level for the computation of persistent homology. In particular, it is shown that in certain cases, the ambient dimension of the problem can literally be replaced with an intrinsic notion of dimension related to the structure of the data, and that such a reduction can be computed efficiently. Some surprising connections to the theory of suprema of Gaussian processes are presented, and potential extensions discussed.

Speaker Bio: Martin Lotz is a Lecturer in Mathematics at the University of Manchester. Previous to his appointment at Manchester he was a post-doctoral researcher at the City University of Hong Kong, at the University of Oxford, and as a Research Fellow at the University of Edinburgh, supported by a Leverhulme Trust and Seggie Brown Fellowship. He studied Mathematics at the ETH (Swiss Federal Institute of Technology) in Zurich, and the University of Paderborn, where he finished his doctorate in Algebraic Complexity Theory. His interests span a wide range of mathematics and its applications. His current work explores connections between probability, optimization, and applications in signal processing, data analysis and inverse problems. Previous work has been in algebraic complexity theory, computational algebra and geometry, foundations of numerical analysis, and combinatorics.

Room and Time: Augusta, 14h-15h

03 Oct 2017: He Sun, University of Edinburgh

Speaker: He Sun, University of Edinburgh
Title: Heat kernels in graphs: A journey from random walks to geometry, and back

Abstract: Heat kernels are one of the most fundamental concepts in physics and mathematics. In physics, the heat kernel is a fundamental solution of the heat equation and connects the Laplacian operator to the rate of heat dissipation. In spectral geometry, many fundamental techniques are based on heat kernels. In finite Markov chain theory, heat kernels correspond to continuous-time random walks and constitute one of the most powerful techniques in estimating the mixing time.

In this talk, we will briefly discuss this line of research and its relation to heat kernels in graphs. In particular, we will see how heat kernels are used to design the first nearly-linear time algorithm for finding clusters in real-world graphs. Some interesting open questions will be addressed in the end.

Speaker Bio: He Sun is a senior lecturer in Algorithms and Complexity in the School of Informatics, University of Edinburgh. Prior to this, he held positions at the Max Planck Institute for Informatics from 2010 to 2015, and the University of Bristol from 2015 to 2017. His main research interests range over the fields of spectral graph theory, applied probability and statistics, matrix analysis, and machine learning.

Room and Time: Augusta, 14h-15h

29 Sep 2017: Greg Ongie, University of Michigan

Speaker: Greg Ongie, University of Michigan
Title: Learning non-linear models with missing data

Abstract: Linear models have had great success for inference problems with missing data. A key example is low-rank matrix completion, where the missing entries of a matrix are imputed by exploiting low-dimensional linear structure. However, many real-world datasets fail to have linear structure but still have low-dimensional non-linear structure that can be exploited. We introduce a systematic framework for learning low-dimensional non-linear models in problems in missing data. The approach has close connections to subspace clustering, manifold learning, and kernel methods in machine learning. As a special case, we focus on the problem of matrix completion where we assume data belongs to a low-dimensional algebraic variety, i.e., each data point is a solution to a system of polynomial equations, and study the sampling requirements of this model. We propose an efficient matrix completion algorithm that is able to recover synthetically generated data up to the predicted sampling complexity bounds and outperforms traditional low-rank matrix completion and subspace clustering approaches on real datasets in computer vision and genomics.

Room and Time: Enigma, 14h-15h

This talk is part of Greg Ongie's visit via the "Research in small groups" grant for Turing host: Armin Eftekhari.

19 Sep 2017: Richard Samworth, University of Cambridge

Speaker : Richard Samworth, University of Cambridge
Title : High dimensional change point estimation via sparse projection

Abstract: Change points are a very common feature of big data that arrive in the form of a data stream. We study high dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the co-ordinates. The challenge is to borrow strength across the co-ordinates to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called 'inspect' for estimation of the change points: first, we argue that a good projection direction can be obtained as the leading left singular vector of the matrix that solves a convex optimization problem derived from the cumulative sum transformation of the time series. We then apply an existing univariate change point estimation algorithm to the projected series. Our theory provides strong guarantees on both the number of estimated change points and the rates of convergence of their locations, and our numerical studies validate its highly competitive empirical performance for a wide range of data-generating mechanisms. Software implementing the methodology is available in the R package 'InspectChangepoint'.

Speaker Bio: Richard Samworth obtained his PhD in Statistics from the University of Cambridge in 2004, and has remained in Cambridge since, becoming a professor in 2013 and the Professor of Statistical Science in 2017. His main research interests are in nonparametric and high-dimensional statistics, including problems in shape-constrained inference, classification, variable selection and changepoint estimation, among others.

Room and Time: Augusta, 14h-15h.