Monday, 05 August 2019

Gersende Fort
Title: Monte Carlo methods and Optimization: Intertwinings (Lecture 1)
Abstract:

New challenges in Data Science necessitate the combination of optimization techniques and Monte Carlo methods. Incorporating optimization tools in Monte Carlo methods is a known technique to improve their efficiency in order to learn on the fly a better value of some design parameters. Incorporating Monte Carlo techniques in Optimization algorithms allows the introduction of (stochastic) numerical approximations of intractable quantities. In both cases, this yields algorithms which are perturbations of simpler ones whose asymptotic behavior is (usually) well known. When introducing these intertwinings, it is fundamental to be sure that it will not destroy the convergence: convergence of the sampler to the target distribution, or convergence of the optimization method to the target set of solutions.

In the lectures, two points of view will be successively considered:

  • sufficient conditions for the adaptation mecanism of Monte Carlo algorithms, in order to identify the asymptotic distribution of the sampled points.

  • sufficient conditions on the stochastic perturbations, in order to ensure the correct convergence of the perturbed optimization algorithm. We will essentially consider the case of Stochastic Approximation (SA) algorithms (possibly also, Majorize-Minimization methods).

As a preliminary of these two parts, examples from Computational Statistics will be introduced.

In both parts, we will consider the Markovian case: Monte Carlo methods based on Markov chains (MCMC); Stochastic Optimization fed with MCMC samples, which implies, as a unusual setting for SA, a biased stochastic approximation of intractable quantities.

Devavrat Shah
Title: What Do We Know About Matrix Estimation? (Lecture 1)
Abstract:

The task of estimating matrix based on its noisy, partial observation has emerged as the canonical challenge across a variety of fields spanning Inference, Machine Learning and Statistics over the past decade or so. Popularized examples abound, including Recommendation systems, Asymptotic Graph Theory (e.g. Graphons), Network Science (e.g. Community Detection), Social Data Processing (e.g. Ranking and Crowd Sourcing), Causal Inference (e.g. Synthetic Control), Panel Data Analysis, Bio-informatics (e.g. DNA sequencing) and more.

The purpose of this tutorial is to provide a comprehensive survey of various algorithmic and analytic approaches developed over the past decade. The goal is to ground these developments in the context of a “universal” model through connections with the theory of exchangeability (i.e. De Finetti (1937), Aldous and Hoover (1980s)). A particular attention will be paid to statistical and computational trade-off that arise in this class of problems. Open questions pertaining to conjectured fundamental limits and mysterious empirical algorithmic successes will be discussed.

  - Matrix Estimation: Model, Statement and Examples
  - Overview of Algorithms
  - Singular Value Thresholding

A. Ganesh
Title: Rumours, consensus and epidemics on networks (Lecture 1)
Abstract:

The course will discuss a variety of stochastic processes over networks, which are both of interest in their own right and also play a role in many applications in computer science, as well as the social and biological sciences.

Lecture 1: Rumours and first passage percolation

Consider the complete graph on n vertices, where there is an edge between every pair of vertices. We associate with each edge a random weight, or length, drawn from an exponential distribution, independently across edges. We begin by studying the lengths of shortest paths between two vertices, its expectation, and fluctuations. We go on to study some related combinatorial optimisation problems.

Lecture 2: Voter model and variants

Consider a set of n agents connected by an arbitrary network. Each agent has an opinion, taking values in a finite set. Agents update their opinions by contacting other agents at random times and adopting their opinions. We study the time to consensus, and the probability of consensus on each possible value, in this model. We go on to discuss some variants of this model.

Lecture 3: Epidemics

We study the contact process or SIS epidemic on a connected network. Here, each node can be infected or susceptible. Susceptible nodes are infected at some constant rate by each of their infected neighbours, while infected nodes spontaneously recover after a random time and become immediately susceptible. We describe the survival time of the epidemic in large networks, and a sharp transition that it exhibits in many random graph models.

Gersende Fort
Title: Monte Carlo methods and Optimization: Intertwinings (Lecture 2)
Abstract:

New challenges in Data Science necessitate the combination of optimization techniques and Monte Carlo methods. Incorporating optimization tools in Monte Carlo methods is a known technique to improve their efficiency in order to learn on the fly a better value of some design parameters. Incorporating Monte Carlo techniques in Optimization algorithms allows the introduction of (stochastic) numerical approximations of intractable quantities. In both cases, this yields algorithms which are perturbations of simpler ones whose asymptotic behavior is (usually) well known. When introducing these intertwinings, it is fundamental to be sure that it will not destroy the convergence: convergence of the sampler to the target distribution, or convergence of the optimization method to the target set of solutions.

In the lectures, two points of view will be successively considered:

  • sufficient conditions for the adaptation mecanism of Monte Carlo algorithms, in order to identify the asymptotic distribution of the sampled points.

  • sufficient conditions on the stochastic perturbations, in order to ensure the correct convergence of the perturbed optimization algorithm. We will essentially consider the case of Stochastic Approximation (SA) algorithms (possibly also, Majorize-Minimization methods).

As a preliminary of these two parts, examples from Computational Statistics will be introduced.

In both parts, we will consider the Markovian case: Monte Carlo methods based on Markov chains (MCMC); Stochastic Optimization fed with MCMC samples, which implies, as a unusual setting for SA, a biased stochastic approximation of intractable quantities.

Vivek Borkar
Title: A representation theorem for risk-sensitive reward
Abstract:

The talk will describe a controlled version of the Collatz-Wielandt formula for principal eigenvalue of a nonnegative matrix, for the optimal risk-sensitive reward for discrete time risk-sensitive reward problems. This in turn leads to a Donsker-Varadhan type variational formula for the same. Extensions to continuous time risk-sensitive problems and applications to control with chance constraints and linear/dynamic programming formulations for degenerate finite state/action risk-sensitive reward problems are described.

Parts of these results are joint work with Venkat Anantharam, Ari Arapostathis, K. Suresh Kumar and Jerzy Flar.

Tuesday, 06 August 2019

Gersende Fort
Title: Monte Carlo methods and Optimization: Intertwinings (Lecture 3)
Abstract:

New challenges in Data Science necessitate the combination of optimization techniques and Monte Carlo methods. Incorporating optimization tools in Monte Carlo methods is a known technique to improve their efficiency in order to learn on the fly a better value of some design parameters. Incorporating Monte Carlo techniques in Optimization algorithms allows the introduction of (stochastic) numerical approximations of intractable quantities. In both cases, this yields algorithms which are perturbations of simpler ones whose asymptotic behavior is (usually) well known. When introducing these intertwinings, it is fundamental to be sure that it will not destroy the convergence: convergence of the sampler to the target distribution, or convergence of the optimization method to the target set of solutions.

In the lectures, two points of view will be successively considered:

  • sufficient conditions for the adaptation mecanism of Monte Carlo algorithms, in order to identify the asymptotic distribution of the sampled points.

  • sufficient conditions on the stochastic perturbations, in order to ensure the correct convergence of the perturbed optimization algorithm. We will essentially consider the case of Stochastic Approximation (SA) algorithms (possibly also, Majorize-Minimization methods).

As a preliminary of these two parts, examples from Computational Statistics will be introduced.

In both parts, we will consider the Markovian case: Monte Carlo methods based on Markov chains (MCMC); Stochastic Optimization fed with MCMC samples, which implies, as a unusual setting for SA, a biased stochastic approximation of intractable quantities.

Devavrat Shah
Title: What Do We Know About Matrix Estimation? (Lecture 2)
Abstract:

The task of estimating matrix based on its noisy, partial observation has emerged as the canonical challenge across a variety of fields spanning Inference, Machine Learning and Statistics over the past decade or so. Popularized examples abound, including Recommendation systems, Asymptotic Graph Theory (e.g. Graphons), Network Science (e.g. Community Detection), Social Data Processing (e.g. Ranking and Crowd Sourcing), Causal Inference (e.g. Synthetic Control), Panel Data Analysis, Bio-informatics (e.g. DNA sequencing) and more.

The purpose of this tutorial is to provide a comprehensive survey of various algorithmic and analytic approaches developed over the past decade. The goal is to ground these developments in the context of a “universal” model through connections with the theory of exchangeability (i.e. De Finetti (1937), Aldous and Hoover (1980s)). A particular attention will be paid to statistical and computational trade-off that arise in this class of problems. Open questions pertaining to conjectured fundamental limits and mysterious empirical algorithmic successes will be discussed.

- Convex Relaxation
- Non Parametric Nearest Neighbor
- Connection to Collaborative Filtering

A. Ganesh
Title: Rumours, consensus and epidemics on networks (Lecture 2)
Abstract:

The course will discuss a variety of stochastic processes over networks, which are both of interest in their own right and also play a role in many applications in computer science, as well as the social and biological sciences.

Lecture 1: Rumours and first passage percolation

Consider the complete graph on n vertices, where there is an edge between every pair of vertices. We associate with each edge a random weight, or length, drawn from an exponential distribution, independently across edges. We begin by studying the lengths of shortest paths between two vertices, its expectation, and fluctuations. We go on to study some related combinatorial optimisation problems.

Lecture 2: Voter model and variants

Consider a set of n agents connected by an arbitrary network. Each agent has an opinion, taking values in a finite set. Agents update their opinions by contacting other agents at random times and adopting their opinions. We study the time to consensus, and the probability of consensus on each possible value, in this model. We go on to discuss some variants of this model.

Lecture 3: Epidemics

We study the contact process or SIS epidemic on a connected network. Here, each node can be infected or susceptible. Susceptible nodes are infected at some constant rate by each of their infected neighbours, while infected nodes spontaneously recover after a random time and become immediately susceptible. We describe the survival time of the epidemic in large networks, and a sharp transition that it exhibits in many random graph models.

Devavrat Shah
Title: What Do We Know About Matrix Estimation? (Lecture 3)
Abstract:

The task of estimating matrix based on its noisy, partial observation has emerged as the canonical challenge across a variety of fields spanning Inference, Machine Learning and Statistics over the past decade or so. Popularized examples abound, including Recommendation systems, Asymptotic Graph Theory (e.g. Graphons), Network Science (e.g. Community Detection), Social Data Processing (e.g. Ranking and Crowd Sourcing), Causal Inference (e.g. Synthetic Control), Panel Data Analysis, Bio-informatics (e.g. DNA sequencing) and more.

The purpose of this tutorial is to provide a comprehensive survey of various algorithmic and analytic approaches developed over the past decade. The goal is to ground these developments in the context of a “universal” model through connections with the theory of exchangeability (i.e. De Finetti (1937), Aldous and Hoover (1980s)). A particular attention will be paid to statistical and computational trade-off that arise in this class of problems. Open questions pertaining to conjectured fundamental limits and mysterious empirical algorithmic successes will be discussed.

- Non-convex Approach: Matrix Estimation as a Case-Study
- Beyond Matrix Estimation: Tensors, PCR, Time Series Imputation

Gersende Fort
Title: Monte Carlo methods and Optimization : Intertwinings (Lecture 4)
Abstract:

Continuing from the short course lectures on the topic, this talk will present recent works on Stochastic Approximation combined with adaptive MCMC methods, and motivated by Computational Machine Learning problems. More precisely, we will consider the convergence of Proximal-Gradient based methods (also called “forward-backward” methods) for the optimization of a composite objective function i.e. a function defined as the sum a smooth part and of a non-smooth part. The convergence will be established, and we will discuss many implementation issues such as: choice of the design parameters, benefit of averaging techniques, benefit of Nesterov accelerations schemes, etc.

Wednesday, 07 August 2019

Fabio Martinelli
Title: Interacting particle systems with kinetic constraints: theory and examples (Lecture 1)
Abstract:

I shall gently introduce and review a class of reversible spin systems evolving with constrained Glauber dynamics known as "kinetically constrained models" (KCM). After being introduced in the physics community in order to better understand in simple models some of the basic features of the so-called "glassy dynamics", they became the object of mathematical research starting with the first work by D. Aldous and P. Diaconis in 2001. KCM also appear naturally in some random walk models for upper triangular matrices and some of the techniques developed to study their time evolution have been used in other contexts (e.g. dynamics of random surfaces). In order to make the three lectures as self-contained as possible, during the first lecture I shall quickly review the basic notions of mixing time and relaxation time for continuous time, reversible Markov chains.

Prasad Tetali
Title: Modified Logarithmic Sobolev Inequalities: Theory, Examples and Consequences (lecture 1)
Abstract:

As a tool to understand the relative entropy decay and approach to stationarity of finite Markov chains, the so-called Modified Logarithmic Sobolev Inequality (MLSI) has been in the literature for at least a couple of decades. The study of various versions of the MLSI and relation to other functional inequalities on discrete spaces was formally introduced (in 2003) by Bobkov and the speaker. While typically harder to establish than the spectral gap (Poincare), thanks to various researchers, recently there has been impressive progress on this topic in a few directions – concentration of measure, mixing time of the matroid basis exchange random walk, as well as connections to notions of discrete curvature.

The author will review the above in a series of 3 lectures in a reasonably self-contained manner and will mention several open problems.

A. Ganesh
Title: Rumours, consensus and epidemics on networks (Lecture 3)
Abstract:

The course will discuss a variety of stochastic processes over networks, which are both of interest in their own right and also play a role in many applications in computer science, as well as the social and biological sciences.

Lecture 1: Rumours and first passage percolation

Consider the complete graph on n vertices, where there is an edge between every pair of vertices. We associate with each edge a random weight, or length, drawn from an exponential distribution, independently across edges. We begin by studying the lengths of shortest paths between two vertices, its expectation, and fluctuations. We go on to study some related combinatorial optimisation problems.

Lecture 2: Voter model and variants

Consider a set of n agents connected by an arbitrary network. Each agent has an opinion, taking values in a finite set. Agents update their opinions by contacting other agents at random times and adopting their opinions. We study the time to consensus, and the probability of consensus on each possible value, in this model. We go on to discuss some variants of this model.

Lecture 3: Epidemics

We study the contact process or SIS epidemic on a connected network. Here, each node can be infected or susceptible. Susceptible nodes are infected at some constant rate by each of their infected neighbours, while infected nodes spontaneously recover after a random time and become immediately susceptible. We describe the survival time of the epidemic in large networks, and a sharp transition that it exhibits in many random graph models.

Prasad Tetali
Title: Modified Logarithmic Sobolev Inequalities: Theory, Examples and Consequences (lecture 2)
Abstract:

As a tool to understand the relative entropy decay and approach to stationarity of finite Markov chains, the so-called Modified Logarithmic Sobolev Inequality (MLSI) has been in the literature for at least a couple of decades. The study of various versions of the MLSI and relation to other functional inequalities on discrete spaces was formally introduced (in 2003) by Bobkov and the speaker. While typically harder to establish than the spectral gap (Poincare), thanks to various researchers, recently there has been impressive progress on this topic in a few directions – concentration of measure, mixing time of the matroid basis exchange random walk, as well as connections to notions of discrete curvature.

The author will review the above in a series of 3 lectures in a reasonably self-contained manner and will mention several open problems.

Devavrat Shah
Title: Multi-Dimensional Robust Synthetic Control: Exploring Counterfactuals and Predicting Cricket Scores
Abstract:

The “what ifs?” or ability to explore counterfactuals is central to the study of causal inference. Randomized control and A/B testing provides an approach to address this when counterfactuals can be experimented simultaneously. However, in a large number of scenarios such as policy evaluation, this is not feasible: we can’t have two Massachusetts, one having Gun Control and the other not at the same time, so that we can evaluate the impact of Gun Control on crime rate!

To address this challenge in a data-driven manner, the method of synthetic control was proposed by Abadie et al (2003) where synthetic or virtual counterfactuals are formed using historical data: impose Gun Control in Massachusetts to observe the crime rate under Gun Control and estimate the crime rate of Massachusetts without Gun Control as a combination of the crime rate of other states (without Gun Control). Despite tremendous success of this method, it suffers from two limitations: it is not robust to noisy, missing observations and it is not able to incorporate auxiliary information such as High School Dropout rate for estimating counterfactuals with respect to Crime Rate. In this talk, we will describe a method to address these limitations building on recent advances in Matrix and Tensor estimation, and present a novel application as a forecasting method for time series trajectories exemplified through the game of Cricket.

Thursday, 08 August 2019

Prasad Tetali
Title: Modified Logarithmic Sobolev Inequalities: Theory, Examples and Consequences (Lecture 3)
Abstract:

As a tool to understand the relative entropy decay and approach to stationarity of finite Markov chains, the so-called Modified Logarithmic Sobolev Inequality (MLSI) has been in the literature for at least a couple of decades. The study of various versions of the MLSI and relation to other functional inequalities on discrete spaces was formally introduced (in 2003) by Bobkov and the speaker. While typically harder to establish than the spectral gap (Poincare), thanks to various researchers, recently there has been impressive progress on this topic in a few directions – concentration of measure, mixing time of the matroid basis exchange random walk, as well as connections to notions of discrete curvature.

The author will review the above in a series of 3 lectures in a reasonably self-contained manner and will mention several open problems.

Fabio Martinelli
Title: Interacting particle systems with kinetic constraints: theory and examples (lecture 2)
Abstract:

I shall gently introduce and review a class of reversible spin systems evolving with constrained Glauber dynamics known as "kinetically constrained models" (KCM). After being introduced in the physics community in order to better understand in simple models some of the basic features of the so-called "glassy dynamics", they became the object of mathematical research starting with the first work by D. Aldous and P. Diaconis in 2001. KCM also appear naturally in some random walk models for upper triangular matrices and some of the techniques developed to study their time evolution have been used in other contexts (e.g. dynamics of random surfaces). In order to make the three lectures as self-contained as possible, during the first lecture I shall quickly review the basic notions of mixing time and relaxation time for continuous time, reversible Markov chains.

Eric Moulines
Title: Langevin Dynamics: an overview (Lecture 1)
Abstract:

Sampling distributions over high-dimensional state-space is a problem which has recently attracted a lot of research efforts; applications include Bayesian non-parametrics, Bayesian inverse problems and aggregation of estimators. All these problems boil down to sample a target distribution $\pi$ having a density w.r.t. the Lebesgue measure on $\mathbb{R}^d$, known up to a normalisation factor: $x \mapsto \frac{1}{Z}e^{-U(x)}$, where $U$ is continuously differentiable and smooth. In this paper, we study a sampling technique based on the Euler discretization of the Langevin stochastic differential equation. Contrary to the Metropolis Adjusted Langevin Algorithm (MALA), we do not apply a Metropolis-Hastings correction. We obtain for both constant and decreasing step sizes in the Euler discretization, non-asymptotic bounds for the convergence to stationarity in both total variation and Wasserstein distances. A particular attention is paid on the dependence on the dimension of the state space, to demonstrate the applicability of this method in the high dimensional setting. These bounds are based on recently obtained estimates of the convergence of the Langevin diffusion to stationarity using either transport, Poincaré or log-Sobolev inequalities. We also investigate the convergence of an appropriately weighted empirical measure and we report sharp bounds for the mean square error and exponential deviation inequality for Lipschitz functions.

We will also cover the stochastic version of the Langevin dynamics.

Fabio Martinelli
Title: Interacting particle systems with kinetic constraints: theory and examples (lecture 3)
Abstract:

I shall gently introduce and review a class of reversible spin systems evolving with constrained Glauber dynamics known as "kinetically constrained models" (KCM). After being introduced in the physics community in order to better understand in simple models some of the basic features of the so-called "glassy dynamics", they became the object of mathematical research starting with the first work by D. Aldous and P. Diaconis in 2001. KCM also appear naturally in some random walk models for upper triangular matrices and some of the techniques developed to study their time evolution have been used in other contexts (e.g. dynamics of random surfaces). In order to make the three lectures as self-contained as possible, during the first lecture I shall quickly review the basic notions of mixing time and relaxation time for continuous time, reversible Markov chains.

Manjunath Krishnapur
Title: Absolute continuity of limiting spectral distributions of Toeplitz and Hankel random matrices
Abstract:

There are many models of random symmetric square matrices for which the empirical distribution of eigenvalues converges to a non-random limiting distribution. Starting with Wigner, a powerful method to show this has been the method of moments, which works by first showing that the trace of any power of the matrix converges (after scaling appropriately) to a numbers, and secondly that these numbers form the moment sequence of a unique probability measure. However, it is in general a difficult problem to determine properties of a distribution, such as the absolute continuity or smoothness of density, from the moment sequence. In this talk we work with random Toeplitz and Hankel matrices and show the absolute continuity of the corresponding limiting distributions. The existence of the limiting distribution was already shown by Bryc, Dembo and Jiang (2003/06) and the absolute continuity was settled for the Toeplitz case by Sen and Virag (2011). The result is new for the Hankel matrix. Our methods also work for Toeplitz and Hankel matrices defined by certain other groups such as Zd. The key idea is to write the random matrix under consideration as a sum of two or more independent random circulant-like matrices.

All this is joint work with Anish Mallick.

Friday, 09 August 2019

Eric Moulines
Title: Langevin Dynamics: an overview (Lecture 2)
Abstract:

Sampling distributions over high-dimensional state-space is a problem which has recently attracted a lot of research efforts; applications include Bayesian non-parametrics, Bayesian inverse problems and aggregation of estimators. All these problems boil down to sample a target distribution $\pi$ having a density w.r.t. the Lebesgue measure on $\mathbb{R}^d$, known up to a normalisation factor: $x \mapsto \frac{1}{Z}e^{-U(x)}$, where $U$ is continuously differentiable and smooth. In this paper, we study a sampling technique based on the Euler discretization of the Langevin stochastic differential equation. Contrary to the Metropolis Adjusted Langevin Algorithm (MALA), we do not apply a Metropolis-Hastings correction. We obtain for both constant and decreasing step sizes in the Euler discretization, non-asymptotic bounds for the convergence to stationarity in both total variation and Wasserstein distances. A particular attention is paid on the dependence on the dimension of the state space, to demonstrate the applicability of this method in the high dimensional setting. These bounds are based on recently obtained estimates of the convergence of the Langevin diffusion to stationarity using either transport, Poincaré or log-Sobolev inequalities. We also investigate the convergence of an appropriately weighted empirical measure and we report sharp bounds for the mean square error and exponential deviation inequality for Lipschitz functions.

We will also cover the stochastic version of the Langevin dynamics.

Eric Moulines
Title: Langevin Dynamics: an overview (Lecture 3)
Abstract:

Sampling distributions over high-dimensional state-space is a problem which has recently attracted a lot of research efforts; applications include Bayesian non-parametrics, Bayesian inverse problems and aggregation of estimators. All these problems boil down to sample a target distribution $\pi$ having a density w.r.t. the Lebesgue measure on $\mathbb{R}^d$, known up to a normalisation factor: $x \mapsto \frac{1}{Z}e^{-U(x)}$, where $U$ is continuously differentiable and smooth. In this paper, we study a sampling technique based on the Euler discretization of the Langevin stochastic differential equation. Contrary to the Metropolis Adjusted Langevin Algorithm (MALA), we do not apply a Metropolis-Hastings correction. We obtain for both constant and decreasing step sizes in the Euler discretization, non-asymptotic bounds for the convergence to stationarity in both total variation and Wasserstein distances. A particular attention is paid on the dependence on the dimension of the state space, to demonstrate the applicability of this method in the high dimensional setting. These bounds are based on recently obtained estimates of the convergence of the Langevin diffusion to stationarity using either transport, Poincaré or log-Sobolev inequalities. We also investigate the convergence of an appropriately weighted empirical measure and we report sharp bounds for the mean square error and exponential deviation inequality for Lipschitz functions.

We will also cover the stochastic version of the Langevin dynamics.

P. R. Kumar
Title: Multi-armed Bandits Revisited
Abstract:

We revisit policies for multi-armed bandits based on the cost-biased maximum likelihood method developed earlier for adaptive control of Markov chains. We establish regret bounds both in an expected as well as in a sample path sense, and exhibit simulation results that suggest they have the best empirical regret performance compared to current alternatives for exponential family rewards distributions. We also provide comparative results vis-a-vis current candidates on computational times per decision.

[Joint work with Xi Liu, Ping-Chun Hsieh, and Anirban Bhattacharya].

Prasad Tetali
Title: Finding Cliques with Few Probes
Abstract:

We consider algorithms with unbounded computation time that probe the entries of the adjacency matrix of an n-vertex graph and output a clique. We show positive and negative results - what is doable versus unlikely - assuming the input graph is an Erdős-Rényi random graph with edge probability 1/21/2, under the assumption that the algorithm uses a constant number of rounds, with each round limited to a linear number (or more generally, o(n2)o(n2)) of probes. Several questions remain open.

This is joint work with Uri Feige, David Gamarnik, Joe Neeman, and Miklos Racz.

Saturday, 10 August 2019

Fabio Martinelli
Title: Universality for Kinetically Constrained Spin Models
Abstract:

Kinetically constrained models (KCM) are reversible interacting particle systems with continuous time Markov dynamics of Glauber type, which have been extensively used in the physics literature to model the liquid-glass transition, a major and longstanding open problem in condensed matter physics. They also represent a natural stochastic (and non-monotone) counterpart of the family of cellular automata known as 𝒰-bootstrap percolation thoroughly analyzed by P. Balister, B. Bollobás, H. Duminil-Copin, R. Morris, J. Przykucki, P. Smith and A. Uzzell. I shall present a series of universality results for the mean infection time of the origin for KCM which have been obtained in various collaborations with C. Toninelli, L. Marêché, I. Hartarski, and R. Morris.

Anirban Basak
Title: Spectral properties of random perturbations of Toeplitz matrices with finite symbols
Abstract:

Consider a finitely banded Toeplitz matrix $T_N$ and let $E_N$ be an error matrix with $N^{-1/2}\|E_N\|_{{\rm HS}} \to 0$, polynomially in its dimension $N$. In this talk, we will discuss spectral properties of $T_N+E_N$ such as the limit of the bulk of the spectrum as well as identification of the point process induced by the outliers. We will see that under a very mild assumption on the entries of $E_N$ the limit is a push forward of the uniform measure on the unit circle by the symbol of the Toeplitz matrix $T_N$. Furthermore, there are no outliers outside the spectrum of the limiting Toeplitz operator, whereas in its complement the limit of the point process induced by the outliers is given by the zero set of some random analytic function. These confirm the predictions based on the pseudo-spectrum of $T_N$.

The talk is based on joint works with Elliot Paquette and Ofer Zeitouni.

Harsha Honnappa
Title: Variational Bayes: An Overview and Risk-Sensitive Formulations
Abstract:

Variational Bayesian (VB) methods have emerged as a highly competitive, fast alternative to Markov chain Monte Carlo methods for computing Bayesian posteriors. The essential idea behind VB is to turn the posterior computation problem into one of optimization over an appropriate measure (sub-)space. This talk will first present an overview of the VB method and recent progress on understanding properties of VB posterior estimators, in particular asymptotic consistency. I will then pivot towards our recent work on using VB methods for data-driven decision-making problems contextualized by parameterized stochastic models, where tail risks are important. In particular, I will present our recent work on bounding optimality gaps for a risk-sensitive VB methodology and point out connections to recent work on fairness in machine learning methods.

This is joint work with Vinayak A. Rao (Purdue Statistics) and Prateek Jaiswal (Purdue Industrial Engineering.

Srikanth K Iyer
Title: Connecting Random Connection Models
Abstract:

We consider homogeneous (RCM) and an in-homogeneous (iRCM) random connection models in the connectivity regime. The vertex set is a homogeneous Poisson point process on the unit torus. A edge exists between any two pair of nodes with a probability specified by a connection function. In the iRCM the vertices are endowed with independently distributed weights and the connection function depends on these weights. The weights are assumed to have a heavy tailed distribution. We shall discuss scaling regimes under which the isolated nodes in the two models stabilize as the intensity becomes large. In this regime the number of isolated nodes converges in distribution to a Poisson random variable. We also discuss sufficient conditions for the graph to be connected with high probability.

Praneeth Netrapalli
Title: Convergence rates for SGD without replacement for smooth and strongly-convex functions
Abstract:

While most theoretical analyses of stochastic gradient descent, as applied to empirical risk minimization problems, consider sampling stochastic gradients with replacement, in practice, stochastic gradient descent without replacement (SGDo) is the method of choice as it is empirically observed to have much better rate of convergence (Bottou, 2009). However, SGDo's convergence properties are not well understood as sampling without replacement leads to coupling between iterates and gradients. By using method of exchangeable pairs to bound Wasserstein distance, we provide the first non-asymptotic results for SGDo when applied to general smooth, strongly-convex functions. In particular, we show that SGDo converges at a rate of $O(1/K^2)$ while SGD is known to converge at $O(1/K)$ rate, where $K$ denotes the number of passes over data and is required to be large enough. Existing results for SGDo in this setting require additional Hessian Lipschitz assumption (Gurbuzabalaban et al., 2015; HaoChen & Sra, 2018). For small $K$, we show SGDo can achieve same convergence rate as SGD for general smooth strongly-convex functions. Existing results in this setting require $K=1$ and hold only for generalized linear models (Shamir, 2016).

Joint work with Dheeraj Nagaraj (MIT) and Prateek Jain (MSR India).

Eric Moulines
Title: Mixing time of the Hamiltonian Monte Carlo
Abstract:

I will discuss in this talk Hamiltonian Monte Carlo (HMC), a Metropolis-Hastings algorithm designed to sample target probability density π on Rd. This method was first proposed by Duane et al. (1987) in computational physics. It has later been introduced in the statistics community in the early paper of Neal (1993) and quickly gained popularity; see for example Liu (2008, chap. 9), Neal (2011) and Girolami and Calderhead (2011). The most attractive feature of the HMC algorithm is to allow the possibility of generating proposals - obtained by integrating a system of Hamiltonian equations - that are far away from the current position but still having a high probability of being accepted. The HMC algorithm therefore offer promise for eliminating the random walk behavior of most classical Monte Carlo algorithms. The distance between the current state and the proposal is controlled by length of the time interval along which the Hamiltonian equations are integrated; see Liu (2008, chap. 9) and Sanz-Serna (2014).

I will discuss the irreducibility and geometric ergodicity of the Hamiltonian Monte Carlo (HMC) algorithm.

We consider cases where the number of steps of the symplectic integrator is either fixed or random. Under mild conditions on the potential F associated with target distribution π, we first show that the Markov kernel associated to the HMC algorithm is irreducible and recurrent.

Under more stringent conditions, we then establish that the Markov kernel is Harris recurrent. Finally, we provide verifiable conditions on F under which the HMC sampler is geometrically ergodic.

References:

Duane, S., A. D. Kennedy, B. J. Pendleton, and D. Roweth. 1987. “Hybrid Monte Carlo.” Physics Letters B 195 (2): 216–22. https://doi.org/10.1016/0370-2693(87)91197-X.

Girolami, M., and B. Calderhead. 2011. “Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (2): 123–214. https://doi.org/10.1111/j.1467-9868.2010.00765.x.

Liu, J. S. 2008. Monte Carlo Strategies in Scientific Computing. Springer.

Neal, R. M. 1993. “Probabilistic Inference Using Markov Chain Monte Carlo Methods.” CRG-TR-93-1. Department of Computer Science, University of Toronto.

———. 2011. “MCMC Using Hamiltonian Dynamics.” In Handbook of Markov Chain Monte Carlo, 113–62. Chapman & Hall/Crc Handb. Mod. Stat. Methods. Boca Raton, FL: CRC Press.

Sanz-Serna, J. M. 2014. “Markov Chain Monte Carlo and Numerical Differential Equations.” In Current Challenges in Stability Issues for Numerical Differential Equations: Cetraro, Italy 2011, Editors: Luca Dieci, Nicola Guglielmi, 39–88. Cham: Springer. https://doi.org/10.1007/978-3-319-01300-8_2.

Monday, 12 August 2019

Jose Blanchet
Title: Optimal Transport Methods and Applications to Statistics and Operations Research (Lecture 1)
Abstract:

In these series of lectures we will review recent results at the intersection of optimal transport, stochastic OR and statistics. After reviewing basic notions of optimal transport costs and Wasserstein distances, we will discuss distributionally robust performance analysis and optimization results. For example, we will show how the theory of diffusion approximations can be harnessed to provide model-robust sample path estimates for general stochastic systems. In addition, using the same mathematical principles, we will show that many machine learning algorithms such as sqrt-Lasso, regularized logistic regression, group Lasso, adaptive Lasso, support vector machines, among many others admit distributionally robust representations based on optimal transport costs. Additional connections to artificial intelligence via generative networks will also be discussed. Finally, we also discuss the fundamental principles of a novel statistical methodology which can be used to optimally choose the uncertainty size in distributionally robust formulations.

Krishanu Maulik
Title: Almost Sure and In Probability Convergence in Randomized Urn Model
Abstract:
We consider an urn model where the replacement matrices are random and may depend on past information. Such models are useful in several situations including clinical trials. Using stochastic approximation techniques, we study convergence of scaled color count and configuration vectors, when the replacement matrices have only first moment and the conditionally expected replacement matrix "converges to" an irreducible, but not necessarily balanced, matrix. We show the convergence to be in probability. Under some additional L \log L-type assumptions, we extend the convergence to be almost sure. We shall compare the assumptions and the stochastic approximation results in these two setups and indicate ideas behind the proof.
 
This is a joint work with Ujan Gangopadhyay from University of Southern California. Part of this research has been funded by an Unrestricted Research Grant from Microsoft Research, India.
Sreekar Vadlamani
Title: Adaptive schemes for MCMC in infinite dimensions
Abstract:

Latent Gaussian processes are widely applied in many fields like statistics, inverse problems, and machine learning. A popular method for inference is through the posterior distribution, which is typically carried out by Markov Chain Monte Carlo (MCMC) algorithms. However, the infinite dimensional framework creates certain technical hurdles which need to be addressed with care. Taking cue from recent developments in adaptive Metropolis adjusted MCMC algorithms, we propose a family of proposals to sample from "good" infinite dimensional measures. We discuss the relevant issues concerned with convergence of our proposed schemes, and also demonstrate their efficiency via standard computational examples.

Jose Blanchet
Title: Optimal Transport Methods and Applications to Statistics and Operations Research (Lecture 2)
Abstract:

In these series of lectures we will review recent results at the intersection of optimal transport, stochastic OR and statistics. After reviewing basic notions of optimal transport costs and Wasserstein distances, we will discuss distributionally robust performance analysis and optimization results. For example, we will show how the theory of diffusion approximations can be harnessed to provide model-robust sample path estimates for general stochastic systems. In addition, using the same mathematical principles, we will show that many machine learning algorithms such as sqrt-Lasso, regularized logistic regression, group Lasso, adaptive Lasso, support vector machines, among many others admit distributionally robust representations based on optimal transport costs. Additional connections to artificial intelligence via generative networks will also be discussed. Finally, we also discuss the fundamental principles of a novel statistical methodology which can be used to optimally choose the uncertainty size in distributionally robust formulations.

Krishna Athreya
Title: Regenerative Stochastic Processes
Abstract:

A stochastic process in discrete or continuous time parameter set is called regenerative if after some initial segment the process can be broken into iid components.Examples include Brownian Motion in one and two dimensions, Ornstein Uhlenbeck processes, recurrent irreducible Markov chains with discrete state space, Harris irreducible and recurrent Markov chains on a general state space, semi Markov chains seven by a recurrent irreducible Harris chains. The invariant measure need not be finite.

Some ergodic theorems will be discussed and applications will be given.

Tuesday, 13 August 2019

Jose Blanchet
Title: Optimal Transport Methods and Applications to Statistics and Operations Research (Lecture 3)
Abstract:

In these series of lectures we will review recent results at the intersection of optimal transport, stochastic OR and statistics. After reviewing basic notions of optimal transport costs and Wasserstein distances, we will discuss distributionally robust performance analysis and optimization results. For example, we will show how the theory of diffusion approximations can be harnessed to provide model-robust sample path estimates for general stochastic systems. In addition, using the same mathematical principles, we will show that many machine learning algorithms such as sqrt-Lasso, regularized logistic regression, group Lasso, adaptive Lasso, support vector machines, among many others admit distributionally robust representations based on optimal transport costs. Additional connections to artificial intelligence via generative networks will also be discussed. Finally, we also discuss the fundamental principles of a novel statistical methodology which can be used to optimally choose the uncertainty size in distributionally robust formulations.

Kavita Ramanan
Title: Analysis of Mean-Field Games (Lecture 1)
Abstract:

Nash equilibria for symmetric multi-player games are notoriously hard to compute, even for moderate values of the number of players. In this course, I will describe how, as the number of players goes to infinity, these equilibria can be approximated by solutions to what is known as a mean-field game, which can be viewed as a game for a continuum of players. I will first describe static games and related approximation theorems, and then discuss results for differential games.

Sriram Rajamani
Title: Program Synthesis meets Machine Learning
Abstract:

We give a tutorial overview of program synthesis, from its first formulation by Church in 1957, through its pragmatic evolution through sketching and programing-by-examples, and compare program synthesis with supervised machine learning. We then present our recent efforts in combining program synthesis and machine learning techniques to solve the problem of synthesizing extractors from heterogeneous data, as well as in solving NLP problems. Finally, we explore several opportunities at the intersection of program synthesis (and more broadly the PL community) and machine learning, such as pruning and ranking programs during synthesis, neural program synthesis and automatic differentiation.

Kavita Ramanan
Title: Analysis of Mean-Field Games (Lecture 2)
Abstract:

Nash equilibria for symmetric multi-player games are notoriously hard to compute, even for moderate values of the number of players. In this course, I will describe how, as the number of players goes to infinity, these equilibria can be approximated by solutions to what is known as a mean-field game, which can be viewed as a game for a continuum of players. I will first describe static games and related approximation theorems, and then discuss results for differential games.

Riddhipratim Basu
Title: Eigenvalue Rigidity in Random Matrices and Applications in Last Passage Percolation
Abstract:

Rigidity of eigenvalues about their classical locations is a topic of interest in random matrix theory that has seen much progress in the recent years both in the classical Gaussian case as well as under more general universal settings. We shall recall some of these results, and discuss two examples of how the rigidity results for certain classical random matrix ensembles lead to interesting consequences in exactly solvable models of last passage percolation via some remarkable bijections.

Wednesday, 14 August 2019

Peter W. Glynn
Title: Topics in Applied Proability (Lecture 1)
Abstract:

Talk 1: Sequential Stopping for Parallel Monte Carlo

The most natural way to run many Monte Carlo computations is to sample until a given error tolerance is achieved. The solution of this problem goes back to Chow and Robbins when the underlying variance can be easily estimated from the samples generated. But there are many Monte Carlo problems in which estimating variances is difficult. We discuss a new procedure for use in such settings in which the Bessel process turns out to play a key role. It turns out that it is especiallly well-suited for parallel computing applications.

The work discussed in this talk is joint with Jing Dong.

Talk 2: Non-stationary Markov Processes: Approximations and Numerical Methods

In many Markov modeling contexts, the system under consideration exhibits strong time-of-day effects, day-of-week effects, or seasonality effects. In fact, most real-world systems that are modeled as Markov processes exhibit such non-stationarities. Nevertheless, the great majority of the academic literature focuses on modeling and theory for Markov processes with stationary transition probabilities, which describe systems that have no such time-of-day effects. In this talk, we will briefly describe three different methodologies for handling non-stationary Markov models. The first approach involves analytical approximations, anchored on traditional stationary analysis, that are available when the transition probabilities are slowly changing over the time horizon in question. The second class of methods relates to the use of stable numerical solvers for the Kolmogorov differential equations that exploit the stochastic structure that is present in these models. Finally, the third methodology involves use of Monte Carlo simulation to study such non-stationary processes. In this setting, we will discuss the question of how to initialize such simulations, and the role of backwards coupling in this context.

This work is joint with Harsha Honnappa, Alex Infanger, Mohammad Mousavi, and Zeyu Zheng.

Talk 3: Experimentation with Temporal Interference: Adaptive Markov Chain Sampling and Estimation

For many experiments, the treatment is a new algorithm or policy (e.g., a matching policy in a ridesharing system, delivery system, or ad auction). These policies change the state of the system, and so classical A/B tests that randomize platform participants to treatment or control are infeasible. Instead, the typical design is a “switchback” experiment: each policy is alternately applied in successive time intervals, and then the performance of each policy is estimated by its sample average across intervals where it was used.

Despite their appeal, switchback designs suffer from temporal interference: the application of policy A in one interval affects the initial state of the system as seen by policy B in the next interval. Motivated by the need to mitigate temporal interference, in this talk we study optimal experimental design and estimation for comparison of two policies. Our key innovation is to model the problem as one of efficiently estimating the difference of the steady state reward between two different Markov chains on the same state space (the two policies).

While the dynamic programming optimality principle does not hold here, it turns out that the optimal policy can be computed as the solution to a convex optimization problem over the standard polyhedron of feasible state-action frequencies.

This is joint work with Ramesh Johari and Mohammad Rasouli.

Kavita Ramanan
Title: Analysis of Mean-Field Games (lecture 3)
Abstract:

Nash equilibria for symmetric multi-player games are notoriously hard to compute, even for moderate values of the number of players. In this course, I will describe how, as the number of players goes to infinity, these equilibria can be approximated by solutions to what is known as a mean-field game, which can be viewed as a game for a continuum of players. I will first describe static games and related approximation theorems, and then discuss results for differential games.

Remco van der Hofstad
Title: Small-worlds, complex networks and random graphs (Lecture 1)
Abstract:

Empirical findings have shown that many real-world networks share fascinating features. Indeed, many real-world networks are small worlds, in the sense that typical distances are much smaller than the size of the network. Further, many real-world networks are scale-free, in the sense that there is a high variability in the number of connections of the elements of the networks making them highly inhomogeneous.

Spurred by these empirical findings, many models have been proposed for such networks. In this lecture series, we discuss empirical findings of real-world networks, and describe some of the random graph models proposed for them. In particular, we will discuss the classical Erdös-Rényi random graph, and then move to the more relevant configuration model, generalized random graphs and preferential attachment models. We then discuss the small-world phenomenon in these random graph models and its link to 'six degrees of separation'. We highlight some of the ingredients used in the proofs, namely the tree-like nature of the random graphs under consideration and their connection to branching processes.

We close by discussing the more recent problems of information diffusion on random graphs, where the edges are equipped with general independent and identically distributed edge weights leading to first-passage percolation. Such first-passage percolation problems can also be used to describe epidemics on networks, and competition between different species exploring a network.

**Lecture 1**: Real-world networks and random graphs

This lecture series is based on joint work with, amongst others: Gerard Hooghiemstra, Shankar Bhamidi, Júlia Komjáthy, Piet Van Mieghem, Henri van den Esker, and Dmitri Znamenski.

Peter Glynn
Title: The Power of Sampling (Infosys-ICTS Turing Lecture)
Abstract:

Sampling-based methods arise in many statistical, computational, and engineering settings. In engineering settings, sampling can provide an easy means of constructing distributed algrorithms that scale well and avoid the need for centralized information-gathering. In computational environments, the use of sampling often leads to algorithms that have complexities that are relatively insensitive to dimensional effects, and that largely overcome the “curse of dimensionality”. In this talk, we will give an overview of these ideas and discuss some additional problem contexts within which sampling-based approaches are proving fruitful.

Thursday, 15 August 2019

Peter Glynn
Title: Topics in Applied Proability (Lecture 2)
Abstract:

Talk 1: Sequential Stopping for Parallel Monte Carlo

The most natural way to run many Monte Carlo computations is to sample until a given error tolerance is achieved. The solution of this problem goes back to Chow and Robbins when the underlying variance can be easily estimated from the samples generated. But there are many Monte Carlo problems in which estimating variances is difficult. We discuss a new procedure for use in such settings in which the Bessel process turns out to play a key role. It turns out that it is especiallly well-suited for parallel computing applications.

The work discussed in this talk is joint with Jing Dong.

Talk 2: Non-stationary Markov Processes: Approximations and Numerical Methods

In many Markov modeling contexts, the system under consideration exhibits strong time-of-day effects, day-of-week effects, or seasonality effects. In fact, most real-world systems that are modeled as Markov processes exhibit such non-stationarities. Nevertheless, the great majority of the academic literature focuses on modeling and theory for Markov processes with stationary transition probabilities, which describe systems that have no such time-of-day effects. In this talk, we will briefly describe three different methodologies for handling non-stationary Markov models. The first approach involves analytical approximations, anchored on traditional stationary analysis, that are available when the transition probabilities are slowly changing over the time horizon in question. The second class of methods relates to the use of stable numerical solvers for the Kolmogorov differential equations that exploit the stochastic structure that is present in these models. Finally, the third methodology involves use of Monte Carlo simulation to study such non-stationary processes. In this setting, we will discuss the question of how to initialize such simulations, and the role of backwards coupling in this context.

This work is joint with Harsha Honnappa, Alex Infanger, Mohammad Mousavi, and Zeyu Zheng.

Talk 3: Experimentation with Temporal Interference: Adaptive Markov Chain Sampling and Estimation

For many experiments, the treatment is a new algorithm or policy (e.g., a matching policy in a ridesharing system, delivery system, or ad auction). These policies change the state of the system, and so classical A/B tests that randomize platform participants to treatment or control are infeasible. Instead, the typical design is a “switchback” experiment: each policy is alternately applied in successive time intervals, and then the performance of each policy is estimated by its sample average across intervals where it was used.

Despite their appeal, switchback designs suffer from temporal interference: the application of policy A in one interval affects the initial state of the system as seen by policy B in the next interval. Motivated by the need to mitigate temporal interference, in this talk we study optimal experimental design and estimation for comparison of two policies. Our key innovation is to model the problem as one of efficiently estimating the difference of the steady state reward between two different Markov chains on the same state space (the two policies).

While the dynamic programming optimality principle does not hold here, it turns out that the optimal policy can be computed as the solution to a convex optimization problem over the standard polyhedron of feasible state-action frequencies.

This is joint work with Ramesh Johari and Mohammad Rasouli.

Remco van der Hofstad
Title: Small-worlds, complex networks and random graphs (Lecture 2)
Abstract:

Empirical findings have shown that many real-world networks share fascinating features. Indeed, many real-world networks are small worlds, in the sense that typical distances are much smaller than the size of the network. Further, many real-world networks are scale-free, in the sense that there is a high variability in the number of connections of the elements of the networks making them highly inhomogeneous.

Spurred by these empirical findings, many models have been proposed for such networks. In this lecture series, we discuss empirical findings of real-world networks, and describe some of the random graph models proposed for them. In particular, we will discuss the classical Erdös-Rényi random graph, and then move to the more relevant configuration model, generalized random graphs and preferential attachment models. We then discuss the small-world phenomenon in these random graph models and its link to 'six degrees of separation'. We highlight some of the ingredients used in the proofs, namely the tree-like nature of the random graphs under consideration and their connection to branching processes.

We close by discussing the more recent problems of information diffusion on random graphs, where the edges are equipped with general independent and identically distributed edge weights leading to first-passage percolation. Such first-passage percolation problems can also be used to describe epidemics on networks, and competition between different species exploring a network.

**Lecture 2**: Degree structure and connectivity in random graphs.

This lecture series is based on joint work with, amongst others: Gerard Hooghiemstra, Shankar Bhamidi, Júlia Komjáthy, Piet Van Mieghem, Henri van den Esker, and Dmitri Znamenski.

Anup Biswas
Title: Risk-sensitive control for diffusions with jumps
Abstract:

We would consider infinite horizon (ergodic) risk-sensitive control problems for a class of diffusions controlled through the drift, and driven by a jump Lévy process, and a nondegenerate Wiener process. Under suitable hypotheses, we fully characterize the value function by analyzing the associated integro-differential equations. In the process, we also study the Dirichlet eigenvalue problem for bounded smooth domains. If time permits, we would also discuss few related control problems.

Piyush Srivastava
Title: “Triangular” Dvoretzky matrices and online coding
Abstract:

A special case of the classical Dvoretzky theorem states that the space of $n$-dimensional real vectors equipped with the $\ell_1$ norm admits a large “Euclidean section”, i.e. a subspace of dimension $\Theta(n)$ on which a scaled $\ell_1$ norm essentially agrees with the Euclidean norm. In particular, such a subspace can be realized as the column space of a "tall" $n \times (n/k)$ random matrix $A$ with identically distributed independent Gaussian entries $(k > 1)$.

This special case of the Dvoretzky theorem has a natural interpretation in the setting of encoding real vectors for transmission across an adversarial noisy channel when the vector $x$ to be encoded is given in advance (the so-called “block coding” scenario), so that the encoding can be computed in an “offline” fashion. Motivated by the same problem in the setting when the encoding has to be “online”, i.e., has to be carried out as each entry of x becomes available, we give randomized constructions of triangular matrices with properties similar to Dvoretzky matrices. The guarantees provided by these constructions in the “online” scenario are close to, but still somewhat worse than, those provided by the Dvoretzky theorem in the “block” scenario, and the question of finding the optimal triangular version of the Dvoretzky theorem remains open.

Joint work with Leonard J. Schulman.

Kavita Ramanan
Title: Asymptotic Normality of r-to-p norms for non-negative random matrices
Abstract:

The study of the rr-to-pp operator norms for large random matrices is a problem of importance in many areas; for example, when r=p=2r=p=2, this reduces to the question of finding the largest singular value of the matrix, whereas when p is the Holder conjugate of rr, then this problem reduces to the famous Grothendiek ℓpℓp problem, which is of relevance in optimization. We analyze this norm for fairly general sequences of non-negative random matrices with dimension growing to infinity, and show that it satisfies a slightly non-standard central limit theorem. Along the way, we obtain sharp approximations for the maximizing vector, which may be of independent interest.

This is based on joint work with Debankur Mukherjee and Souvik Dhara.

A. Ganesh
Title: Broadcast on random graphs
Abstract:

Consider a set of nn nodes, each of whom has a single message or packet to transmit to all the other nodes. The nodes are connected via a random graph with constant edge probability pp. Time is split into rounds where, in each round, each node may broadcast a single packet to all its neighbours, simultaneously with all other nodes. The goal is to minimise the time until all nodes have received all packets.

We study two algorithms for this problem. First, we study a random relaying scheme and show that it the number of rounds it requires is bounded below and above by constant multiples of lognlog⁡n. Next, we consider a scheme that uses random network coding, and show that it can disseminate the message to all nodes in a constant number of rounds, with high probability.

Friday, 16 August 2019

Remco van der Hofstad
Title: Small-worlds, complex networks and random graphs (Lecture 3)
Abstract:

Empirical findings have shown that many real-world networks share fascinating features. Indeed, many real-world networks are small worlds, in the sense that typical distances are much smaller than the size of the network. Further, many real-world networks are scale-free, in the sense that there is a high variability in the number of connections of the elements of the networks making them highly inhomogeneous.

Spurred by these empirical findings, many models have been proposed for such networks. In this lecture series, we discuss empirical findings of real-world networks, and describe some of the random graph models proposed for them. In particular, we will discuss the classical Erdös-Rényi random graph, and then move to the more relevant configuration model, generalized random graphs and preferential attachment models. We then discuss the small-world phenomenon in these random graph models and its link to 'six degrees of separation'. We highlight some of the ingredients used in the proofs, namely the tree-like nature of the random graphs under consideration and their connection to branching processes.

We close by discussing the more recent problems of information diffusion on random graphs, where the edges are equipped with general independent and identically distributed edge weights leading to first-passage percolation. Such first-passage percolation problems can also be used to describe epidemics on networks, and competition between different species exploring a network.

**Lecture 3**: Small-world phenomenon in random graphs.

This lecture series is based on joint work with, amongst others: Gerard Hooghiemstra, Shankar Bhamidi, Júlia Komjáthy, Piet Van Mieghem, Henri van den Esker, and Dmitri Znamenski.

Peter W. Glynn
Title: Topics in Applied Proability (Lecture 3)
Abstract:

Talk 1: Sequential Stopping for Parallel Monte Carlo

The most natural way to run many Monte Carlo computations is to sample until a given error tolerance is achieved. The solution of this problem goes back to Chow and Robbins when the underlying variance can be easily estimated from the samples generated. But there are many Monte Carlo problems in which estimating variances is difficult. We discuss a new procedure for use in such settings in which the Bessel process turns out to play a key role. It turns out that it is especiallly well-suited for parallel computing applications.

The work discussed in this talk is joint with Jing Dong.

Talk 2: Non-stationary Markov Processes: Approximations and Numerical Methods

In many Markov modeling contexts, the system under consideration exhibits strong time-of-day effects, day-of-week effects, or seasonality effects. In fact, most real-world systems that are modeled as Markov processes exhibit such non-stationarities. Nevertheless, the great majority of the academic literature focuses on modeling and theory for Markov processes with stationary transition probabilities, which describe systems that have no such time-of-day effects. In this talk, we will briefly describe three different methodologies for handling non-stationary Markov models. The first approach involves analytical approximations, anchored on traditional stationary analysis, that are available when the transition probabilities are slowly changing over the time horizon in question. The second class of methods relates to the use of stable numerical solvers for the Kolmogorov differential equations that exploit the stochastic structure that is present in these models. Finally, the third methodology involves use of Monte Carlo simulation to study such non-stationary processes. In this setting, we will discuss the question of how to initialize such simulations, and the role of backwards coupling in this context.

This work is joint with Harsha Honnappa, Alex Infanger, Mohammad Mousavi, and Zeyu Zheng.

Talk 3: Experimentation with Temporal Interference: Adaptive Markov Chain Sampling and Estimation

For many experiments, the treatment is a new algorithm or policy (e.g., a matching policy in a ridesharing system, delivery system, or ad auction). These policies change the state of the system, and so classical A/B tests that randomize platform participants to treatment or control are infeasible. Instead, the typical design is a “switchback” experiment: each policy is alternately applied in successive time intervals, and then the performance of each policy is estimated by its sample average across intervals where it was used.

Despite their appeal, switchback designs suffer from temporal interference: the application of policy A in one interval affects the initial state of the system as seen by policy B in the next interval. Motivated by the need to mitigate temporal interference, in this talk we study optimal experimental design and estimation for comparison of two policies. Our key innovation is to model the problem as one of efficiently estimating the difference of the steady state reward between two different Markov chains on the same state space (the two policies).

While the dynamic programming optimality principle does not hold here, it turns out that the optimal policy can be computed as the solution to a convex optimization problem over the standard polyhedron of feasible state-action frequencies.

This is joint work with Ramesh Johari and Mohammad Rasouli.

Hariharan Narayanan
Title: Reconstruction of a Riemannian manifold from noisy intrinsic distances
Abstract:

We consider reconstruction of a manifold, or, invariant manifold learning, where a smooth Riemannian manifold $M$ is determined from intrinsic distances (that is, geodesic distances) of points in a discrete subset of $M$. In the studied problem the Riemannian manifold $(M,g)$ is considered as an abstract metric space with intrinsic distances, not as an embedded submanifold of an ambient Euclidean space. Let $\{X_1,X_2,\dots,X_N\}$ be a set of $N$ sample points sampled randomly from an unknown Riemannian $M$ manifold. We assume that we are given the numbers $D_{jk}=d_M(X_j,X_k)+\eta_{jk}$, where $j,k\in \{1,2,\dots,N\}$. Here, $d_M(X_j,X_k)$ are geodesic distances, $\eta_{jk}$ are independent, identically distributed random variables such that $\mathbb E e^{|\eta_{jk}|}$ is finite. We show that when $N$ is large enough, it is possible to construct an approximation of the Riemannian manifold $(M,g)$ with a large probability. This problem is a generalization of the geometric Whitney problem with random measurement errors. We consider also the case when the information on noisy distance $D_{jk}$ of points $X_j$ and $X_k$ is missing with some probability. In particular, we consider the case when we have no information on points that are far away.

This is joint work with Charles Fefferman, Sergei Ivanov and Matti Lassas.

arXiv

Rajat Hazra
Title: Largest eigenvalue of the in-homogeneous Erdős–Rényi random graph
Abstract:

We shall discuss about the limiting behaviour of the largest eigenvalue of the adjacency matrix of an in-homogeneous Erdős–Rényi random graph. We shall show that in certain cases after appropriate centering and scaling the largest eigenvalue converges in distribution to a Gaussian random variable. If time permits, we shall discuss the open problems regarding the spectrum of in-homogeneous Erdős–Rényi and configuration model.

The talk is based on joint work with Arijit Chakrabarty and Sukrit Chakraborty.

Jayakrishnan Nair
Title: Environment oblivious risk-aware bandit algorithms
Abstract:

Classical multi-armed bandit problems use the expected value of an arm as a metric to evaluate its goodness. However, the expected value is a risk-neutral metric. In many applications like finance, one is interested in balancing the expected return of an arm (or portfolio) with the risk associated with that return. In this paper, we consider the problem of selecting the arm that optimizes a linear combination of the expected reward and the associated Conditional Value at Risk (CVaR). We allow the reward distributions to be unbounded or even heavy-tailed. For this problem, we devise algorithms that are entirely environment oblivious, i.e., the algorithm is not aware of any information on the reward distributions, including bounds on the moments/tails, or the suboptimality gaps across arms.

Jose Blanchet
Title: Statistical and Computational Results involving Optimal Transport and Distributionally Robust Optimization
Abstract:

Distributionally robust optimization (DRO) based on optimal transport has been shown to enhance out-of-sample performance in statistical estimation (this has been done by means of connecting DRO with well-known estimators in machine learning. But DRO can be applied systematically and there is plenty of flexibility in choosing the uncertainty set to achieve desirable performance. This talk explores computational tools which enable the use of a wide range of uncertainty sets systematically and corresponding statistical tools to make inference out of decisions obtained out of data driven DRO formulations. Joint work with K. Murthy, N. Si, and F. Zhang.

Saturday, 17 August 2019

Siva Athreya
Title: Graphon dynamics from population genetics
Abstract:

In this talk we shall discuss a natural class of dynamics in dense networks arising from resampling in multi-type populations. We consider sequence of finite graphs whose dynamics are governed by the empirical type distribution of the whole population at that time and in the large-population-size limit this dynamics converges to a random process in the space of graphons.

This is joint work with Frank den Hollander and Adrian Rollin.

Subhamay Saha
Title: Strict monotonicity of principal eigenvalues of elliptic operators in Rd and risk-sensitive control
Abstract:

In this talk we will look at the eigenvalue problem on $\mathbb{R}^d$ for a class of second order, elliptic operators of the form $\mathcal{L}^f = a^{ij}\partial_{x_i}\partial_{x_j} + b^{i}\partial_{x_i} + f$, associated with non-degenerate diffusions. We show that strict monotonicity of the principal eigenvalue of the operator with respect to the potential function $f$ fully characterizes the ergodic properties of the associated ground state diffusion, and the unicity of the ground state, and we present a comprehensive study of the eigenvalue problem from this point of view. This allows us to extend or strengthen various results in the literature for a class of viscous Hamilton--Jacobi equations of ergodic type with smooth coefficients to equations with measurable drift and potential. We also apply these results to the study of the infinite horizon risk-sensitive control problem for diffusions, and establish existence of optimal Markov controls.

This is a joint work with A. Arapostathis and A. Biswas.

Parthanil Roy
Title: Continued fractions, the Chen-Stein method and extreme value theory
Abstract:

In this work, we deal with extreme value theory in the context of continued fractions using techniques from probability theory, ergodic theory and real analysis. We give an upper bound for the rate of convergence in the Doeblin-Iosifescu asymptotics for the exceedances of digits obtained from the regular continued fraction expansion of a number chosen randomly from (0,1) according to the Gauss measure. As a consequence, we significantly improve the best known upper bound on the rate of convergence of the maxima bettering an error term used in the proof of a conjecture of Paul Erdős. We observe that the asymptotics of order statistics and the extremal point process can also be investigated using our methods.

This talk is based on a joint work with Anish Ghosh (TIFR Mumbai) and Maxim Sølund Kirsebom (Univ of Hamburg).

Sandeep Juneja
Title: Partition identification using multi-armed bandits for general distributions
Abstract:

Given a vector of unknown probability distributions, or arms, each of which can be sampled independently, we consider the problem of identifying the partition to which this vector belongs from a finitely partitioned universe of such vector of distributions. We primarily consider partitions where the vector of means of arms lie either in a given set or its complement in a Euclidean space. The sets considered correspond to half spaces or where either the set or its complement is a polytope, or more generally, a convex set. In these settings, exploiting the problem geometry, we characterize the lower bounds on mean number of samples from each arm that any δ-correct algorithm (algorithm that restricts wrong identification probability to a pre-specified δ) must satisfy. Further, we propose δ-correct algorithms that for small δ can match these bounds. Traditionally multi-armed bandit literature focusses on restrictive distribution families such as Bernoulli distributions, sub-Gaussian distributions or those belonging to the single parameter exponential family. We discuss how our analysis extends to general distributions where an explicit upper bound on 1+ϵ moment is available.

Remco van der Hofstad
Title: Citation networks as a window to science: a case study
Abstract:

Citation patterns between papers form a window to how science works. In citation networks, the nodes are papers and the directed edges are formed by one paper citing the other. Citation networks give us a wealth of information about differences between subfields, how scientists collaborate, etc. These insights are useful to interpret citation statistics of papers and scientists beyond simply counting citations. How long does it take publications to be cited? How long does it take papers to be forgotten, and how much does this depend on the citation patterns of the papers early on? How many citations do papers get, and how do these develop in time? How variable are citation patterns when comparing papers within a field, and between fields? Can one quantify what a 'good' paper is based on citation network structure?

In this talk, we present empirical data of citation patterns obtained from the Web of Science database. We investigate aging of citation patterns, the presence of dynamical power laws in them, as well as how predictable citations behave after having observed them for a while. We then continue to describe a possible model for them based on preferential attachment models with aging and fitness, that, on a qualitative level, shows similar connectivity patterns as in citation networks. We also discuss PageRank, which is an algorithm that gives an order to vertices in a directed network, and show how local weak convergence can be used to analyze PageRank in large graph limits. We close with open problems.

We assume no prior knowledge in graph theory, probability or otherwise. This is joint work with Alessandro Garavaglia, Nelly Litvak, and Gerhard Woeginger.