Jointly organized by Harvard University, Massachusetts Institute of Technology, and Microsoft Research New England, the Charles River Lectures on Probability and Related Topics is a one-day event for the benefit of the greater Boston area mathematics community.
The 2017 lectures will take place 9:15am – 5:30pm on Monday, October 2 at Harvard University in the Harvard Science Center.
Please note that registration has closed.
In Harvard Science Center Hall C:
8:45 am – 9:15 am: Coffee/light breakfast
9:15 am – 10:15 am: Ofer Zeitouni
Title:
Abstract:
10:20 am – 11:20 am: Andrea Montanari
Title:
Abstract:
11:20 am – 11:45 am: Break
11:45 am – 12:45 pm: Paul Bourgade
Title:
Abstract:
1:00 pm – 2:30 pm: Lunch
In Harvard Science Center Hall E:
2:45 pm – 3:45 pm: Roman Vershynin
Title: Deviations of random matrices and applications
Abstract: Uniform laws of large numbers provide theoretical foundations for statistical learning theory. This lecture will focus on quantitative uniform laws of large numbers for random matrices. A range of illustrations will be given in high dimensional geometry and data science.
3:45 pm – 4:15 pm: Break
4:15 pm – 5:15 pm: Massimiliano Gubinelli
Title:
Abstract:
Confirmed Participants:
Alexei Borodin, Henry Cohn, Vadim Gorin, Elchanan Mossel, Philippe Rigollet, Scott Sheffield, and H.T. Yau
The Center of Mathematical Sciences and Applications will host a conference on From Algebraic Geometry to Vision and AI: A Symposium Celebrating the Mathematical Work of David Mumford. The event will be held in the Harvard Science Center, Hall D.
For a list of lodging options convenient to the Center, please visit our recommended lodgings page.
Confirmed speakers and panelists:
*This event is jointly supported by the CMSA, National Science Foundation, and the International Science Foundation of Cambridge.
The Big Data Conference features many speakers from the Harvard community as well as scholars from across the globe, with talks focusing on computer science, statistics, math and physics, and economics. This is the third conference on Big Data the Center will host as part of our annual events, and is co-organized by Richard Freeman, Scott Kominers, Jun Liu, Horng-Tzer Yau and Shing-Tung Yau.
For a list of lodging options convenient to the Center, please visit our recommended lodgings page.
Please note that lunch will not be provided during the conference, but a map of Harvard Square with a list of local restaurants can be found by clicking Map & Restaurants.
Confirmed Speakers:
Following the conference, there will be a two-day workshop from August 20–21. The workshop is organized by Scott Kominers, and will feature:
Conference Schedule
A PDF version of the schedule below can also be downloaded here.
Time  Speaker  Topic 
8:30 am – 9:00 am  Breakfast  
9:00 am – 9:40 am  Mohammad Akbarpour  Title: Information aggregation in overlapping generations and the emergence of experts
Abstract: We study a model of social learning with “overlapping generations”, where agents meet others and share data about an underlying state over time. We examine under what conditions the society will produce individuals with precise knowledge about the state of the world. There are two information sharing regimes in our model: Under the full information sharing technology, individuals exchange the information about their point estimates of an underlying state, as well as their sources (or the precision of their signals) and update their beliefs by taking a weighted average. Under the limited information sharing technology, agents only observe the information about the point estimates of those they meet, and update their beliefs by taking a weighted average, where weights can depend on the sequence of meetings, as well as the labels. Our main result shows that, unlike most social learning settings, such linear learning rules do not guide the society (or even a fraction of its members) to learn the truth, and having access to, and exploiting knowledge of, the precision of a source signal are essential for efficient social learning (joint with Amin Saberi & Ali Shameli). 
9:40 am – 10:20 am  Lucas Janson  Title: Model-Free Knockoffs for High-Dimensional Controlled Variable Selection
Abstract: Many contemporary large-scale applications involve building interpretable models linking a large set of potential covariates to a response in a nonlinear fashion, such as when the response is binary. Although this modeling problem has been extensively studied, it remains unclear how to effectively control the fraction of false discoveries even in high-dimensional logistic regression, not to mention general high-dimensional nonlinear models. To address such a practical problem, we propose a new framework of model-free knockoffs, which views from a different perspective the knockoff procedure (Barber and Candès, 2015) originally designed for controlling the false discovery rate in linear models. The key innovation of our method is to construct knockoff variables probabilistically instead of geometrically. This enables model-free knockoffs to deal with arbitrary (and unknown) conditional models and any dimension, including when the dimensionality p exceeds the sample size n, while the original knockoffs procedure is constrained to homoscedastic linear models with n greater than or equal to p. Our approach requires that the design matrix be random (independent and identically distributed rows) with a covariate distribution that is known, although we show our procedure to be robust to unknown/estimated distributions. As we require no knowledge/assumptions about the conditional distribution of the response, we effectively shift the burden of knowledge from the response to the covariates, in contrast to the canonical model-based approach, which assumes a parametric model for the response but very little about the covariates. To our knowledge, no other procedure solves the controlled variable selection problem in such generality, but in the restricted settings where competitors exist, we demonstrate the superior power of knockoffs through simulations. 
Finally, we apply our procedure to data from a case-control study of Crohn’s disease in the United Kingdom, making twice as many discoveries as the original analysis of the same data. 
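As an illustrative aside, the probabilistic knockoff construction has a simple closed form in the Gaussian case. The sketch below (our own illustration, with hypothetical function names and an equicorrelated choice of s, not the authors' software) samples knockoffs for covariates with known covariance Sigma:

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, s, rng):
    """Sample model-X knockoffs for rows of X drawn from N(0, Sigma).

    Conditionally on X, the knockoffs are Gaussian with mean
    X @ (I - diag(s) Sigma^{-1}).T and covariance
    2 diag(s) - diag(s) Sigma^{-1} diag(s), which makes each pair
    (X_j, X~_j) exchangeable in joint distribution."""
    p = Sigma.shape[0]
    S = np.diag(s)
    Sigma_inv = np.linalg.inv(Sigma)
    A = np.eye(p) - S @ Sigma_inv        # conditional mean map
    V = 2 * S - S @ Sigma_inv @ S        # conditional covariance
    L = np.linalg.cholesky(V)
    Z = rng.standard_normal(X.shape)
    return X @ A.T + Z @ L.T

rng = np.random.default_rng(0)
p, rho = 3, 0.3
Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=200000)
X_knock = gaussian_knockoffs(X, Sigma, np.full(p, 0.5), rng)
# Empirically, Cov(X~) ~ Sigma and Cov(X, X~) ~ Sigma - diag(s).
```

One would then fit, say, a lasso on the augmented design [X, X_knock] and compare original versus knockoff coefficient magnitudes to select variables with false discovery rate control.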
10:20 am – 10:50 am  Break  
10:50 am – 11:30 am  Noureddine El Karoui  Title: Random matrices and high-dimensional statistics: beyond covariance matrices
Abstract: Random matrices have played a central role in understanding very important statistical methods linked to covariance matrices (such as Principal Components Analysis, Canonical Correlation Analysis, etc.) for several decades. In this talk, I’ll show that one can adopt a random-matrix-inspired point of view to understand the performance of other widely used tools in statistics, such as M-estimators, and very common methods such as the bootstrap. I will focus on the high-dimensional case, which captures well the situation of “moderately” difficult statistical problems, arguably one of the most relevant in practice. In this setting, I will show that random matrix ideas help upend conventional theoretical thinking (for instance about maximum likelihood methods) and highlight very serious practical problems with resampling methods. 
11:30 am – 12:10 pm  Nikhil Naik  Title: Understanding Urban Change with Computer Vision and Street-level Imagery
Abstract: Which neighborhoods experience physical improvements? In this work, we introduce a computer vision method to measure changes in the physical appearances of neighborhoods from time-series street-level imagery. We connect changes in the physical appearance of five US cities with economic and demographic data and find three factors that predict neighborhood improvement. First, neighborhoods that are densely populated by college-educated adults are more likely to experience physical improvements. Second, neighborhoods with better initial appearances experience, on average, larger positive improvements. Third, neighborhood improvement correlates positively with physical proximity to the central business district and to other physically attractive neighborhoods. Together, our results illustrate the value of using computer vision methods and street-level imagery to understand the physical dynamics of cities. (Joint work with Edward L. Glaeser, Cesar A. Hidalgo, Scott Duke Kominers, and Ramesh Raskar.) 
12:10 pm – 12:25 pm  Data Science Lightning Talks  
12:25 pm – 1:30 pm  Lunch  
1:30 pm – 2:10 pm  Tracy Ke  Title: A new SVD approach to optimal topic estimation
Abstract: In probabilistic topic models, the quantity of interest—a low-rank matrix consisting of topic vectors—is hidden in the text corpus matrix, masked by noise, and Singular Value Decomposition (SVD) is a potentially useful tool for learning such a low-rank matrix. However, the connection between this low-rank matrix and the singular vectors of the text corpus matrix is usually complicated and hard to spell out, so using SVD for learning topic models faces challenges. We overcome the challenge by revealing a surprising insight: there is a low-dimensional simplex structure which can be viewed as a bridge between the low-rank matrix of interest and the SVD of the text corpus matrix, and which allows us to conveniently reconstruct the former using the latter. This insight motivates a new SVD-based approach to learning topic models. For asymptotic analysis, we show that under a popular topic model (Hofmann, 1999), the convergence rate of the l1-error of our method matches that of the minimax lower bound, up to a multi-logarithmic term. In showing these results, we have derived new element-wise bounds on the singular vectors and several large deviation bounds for weakly dependent multinomial data. Our results on the convergence rate and asymptotic minimaxity are new. We have applied our method to two data sets, Associated Press (AP) and Statistics Literature Abstract (SLA), with encouraging results. In particular, there is a clear simplex structure associated with the SVD of the data matrices, which largely validates our discovery. 
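To illustrate the basic premise on toy data (our own sketch, not the talk's full estimator): under a topic model the expected corpus matrix is low-rank, so a rank-K truncated SVD of the noisy corpus recovers it well:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, K = 200, 500, 3                        # vocabulary, documents, topics
A = rng.random((p, K)); A /= A.sum(axis=0)   # topic vectors (word distributions)
W = rng.dirichlet(np.ones(K), size=n).T      # per-document topic weights
M = A @ W                                    # low-rank "signal" matrix, rank <= K
X = M + 0.001 * rng.standard_normal((p, n))  # noisy observed corpus
U, sv, Vt = np.linalg.svd(X, full_matrices=False)
M_hat = (U[:, :K] * sv[:K]) @ Vt[:K]         # rank-K truncated SVD of the corpus
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

The talk's point is finer than this: the topic vectors themselves sit on a low-dimensional simplex visible in the singular vectors, which the SVD-based estimator exploits to reconstruct them.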
2:10 pm – 2:50 pm  Albert-László Barabási  Title: Taming Complexity: From Network Science to Controlling Networks
Abstract: The ultimate proof of our understanding of biological or technological systems is reflected in our ability to control them. While control theory offers mathematical tools to steer engineered and natural systems towards a desired state, we lack a framework to control complex self-organized systems. Here we explore the controllability of an arbitrary complex network, identifying the set of driver nodes whose time-dependent control can guide the system’s entire dynamics. We apply these tools to several real networks, unveiling how the network topology determines its controllability. Virtually all technological and biological networks must be able to control their internal processes. Given that, issues related to control deeply shape the topology and the vulnerability of real systems. Consequently, unveiling the control principles of real networks, the goal of our research, forces us to address a series of fundamental questions pertaining to our understanding of complex systems.
2:50 pm – 3:20 pm  Break  
3:20 pm – 4:00 pm  Marena Lin  Title: Optimizing climate variables for human impact studies
Abstract: Estimates of the relationship between climate variability and socioeconomic outcomes are often limited by the spatial resolution of the data. As studies aim to generalize the connection between climate and socioeconomic outcomes across countries, the best available socioeconomic data is at the national level (e.g. food production quantities, the incidence of warfare, averages of crime incidence, gender birth ratios). While these statistics may be trusted from government censuses, the appropriate metric for the corresponding climate or weather for a given year in a country is less obvious. For example, how do we estimate the temperatures in a country relevant to national food production and therefore food security? We demonstrate that high-resolution spatiotemporal satellite data for vegetation can be used to estimate the weather variables that may be most relevant to food security and related socioeconomic outcomes. In particular, satellite proxies for vegetation over the African continent reflect the seasonal movement of the Intertropical Convergence Zone, a band of intense convection and rainfall. We also show that agricultural sensitivity to climate variability differs significantly between countries. This work is an example of the ways in which in-situ and satellite-based observations are invaluable both to estimates of future climate variability and to continued monitoring of the earth-human system. We discuss the current state of these records and potential challenges to their continuity. 
4:00 pm – 4:40 pm  Alex Peysakhovich  Title: Building a cooperator
Abstract: A major goal of modern AI is to construct agents that can perform complex tasks. Much of this work deals with single agent decision problems. However, agents are rarely alone in the world. In this talk I will discuss how to combine ideas from deep reinforcement learning and game theory to construct artificial agents that can communicate, collaborate and cooperate in productive positive sum interactions. 
4:40 pm – 5:20 pm  Tze Leung Lai  Title: Gradient boosting: Its role in big data analytics, underlying mathematical theory, and recent refinements
Abstract: We begin with a review of the history of gradient boosting, dating back to the LMS algorithm of Widrow and Hoff in 1960 and culminating in Freund and Schapire’s AdaBoost and Friedman’s gradient boosting and stochastic gradient boosting algorithms in the period 1999–2002 that heralded the big data era. The role played by gradient boosting in big data analytics, particularly with respect to deep learning, is then discussed. We also present some recent work on the mathematical theory of gradient boosting, which has led to some refinements that greatly improve the convergence properties and prediction performance of the methodology. 
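The core recursion behind L2 gradient boosting can be sketched in a few lines: each round fits a weak learner, here a one-split regression stump, to the current residuals (the negative gradient of squared loss) and takes a shrunken step. This is an illustrative sketch, not Friedman's reference implementation:

```python
import numpy as np

def fit_stump(x, r):
    """Best single-split regression stump for squared loss (1-D feature)."""
    best = (np.inf, None, r.mean(), r.mean())
    for t in np.unique(x)[:-1]:
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1], best[2], best[3]

def gradient_boost(x, y, n_rounds=100, lr=0.1):
    """L2 boosting: repeatedly fit a stump to the residuals and step by lr."""
    pred = np.full_like(y, y.mean())
    model = [y.mean()]
    for _ in range(n_rounds):
        t, cl, cr = fit_stump(x, y - pred)          # residuals = -gradient
        pred = pred + lr * np.where(x <= t, cl, cr)  # shrunken update
        model.append((t, lr * cl, lr * cr))
    return pred, model
```

With a smooth 1-D target the training error decays geometrically in the number of rounds, which is the convergence behavior the refinements mentioned in the talk aim to sharpen.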
Time  Speaker  Topic 
8:30 am – 9:00 am  Breakfast  
9:00 am – 9:40 am  Natesh Pillai  Title: Accelerating MCMC algorithms for Computationally Intensive Models via Local Approximations
Abstract: We construct a new framework for accelerating Markov chain Monte Carlo in posterior sampling problems where standard methods are limited by the computational cost of the likelihood, or of numerical models embedded therein. Our approach introduces local approximations of these models into the Metropolis–Hastings kernel, borrowing ideas from deterministic approximation theory, optimization, and experimental design. Previous efforts at integrating approximate models into inference typically sacrifice either the sampler’s exactness or efficiency; our work seeks to address these limitations by exploiting useful convergence characteristics of local approximations. We prove the ergodicity of our approximate Markov chain, showing that it samples asymptotically from the exact posterior distribution of interest. We describe variations of the algorithm that employ either local polynomial approximations or local Gaussian process regressors. Our theoretical results reinforce the key observation underlying this article: when the likelihood has some local regularity, the number of model evaluations per Markov chain Monte Carlo (MCMC) step can be greatly reduced without biasing the Monte Carlo average. Numerical experiments demonstrate multiple order-of-magnitude reductions in the number of forward model evaluations used in representative ordinary differential equation (ODE) and partial differential equation (PDE) inference problems, with both synthetic and real data. 
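A deliberately crude sketch of the underlying idea, that local regularity of the likelihood lets one reuse evaluations: the toy sampler below caches expensive log-likelihood values and reuses the cached value for any nearby point. This is nearest-cached-value reuse rather than the paper's refined local polynomial/Gaussian-process approximations, it is exact only as the reuse radius shrinks to zero, and all names are ours:

```python
import numpy as np

calls = 0
def expensive_loglik(theta):
    """Stand-in for a costly forward model; here the target is N(0, 1)."""
    global calls
    calls += 1
    return -0.5 * theta ** 2

def cached_mh(n_steps, radius=0.05, step=0.8, seed=0):
    """Metropolis-Hastings that reuses a cached log-likelihood whenever a
    point within `radius` was already evaluated."""
    rng = np.random.default_rng(seed)
    cache = {}  # quantized location -> log-likelihood

    def loglik(theta):
        key = round(theta / radius)
        if key not in cache:
            cache[key] = expensive_loglik(theta)
        return cache[key]

    theta, samples = 0.0, []
    lp = loglik(theta)
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal()
        lp_prop = loglik(prop)
        if np.log(rng.random()) < lp_prop - lp:  # MH accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta)
    return np.array(samples)
```

Even this naive scheme makes far fewer model calls than MCMC steps while producing samples close to the target; the paper's contribution is a refinement strategy under which the reduction comes with asymptotic exactness.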
9:40 am – 10:20 am  Ravi Jagadeesan  Title: Designs for estimating the treatment effect in networks with interference
Abstract: In this paper we introduce new, easily implementable designs for drawing causal inference from randomized experiments on networks with interference. Inspired by the idea of matching in observational studies, we introduce the notion of considering a treatment assignment as a “quasi-coloring” on a graph. Our idea of a perfect quasi-coloring strives to match every treated unit on a given network with a distinct control unit that has an identical number of treated and control neighbors. For a wide range of interference functions encountered in applications, we show both by theory and simulations that the classical Neymanian estimator for the direct effect has desirable properties for our designs. This further extends to settings where homophily is present in addition to interference. 
10:20 am – 10:50 am  Break  
10:50 am – 11:30 am  Annie Liang  Title: The Theory is Predictive, but is it Complete? An Application to Human Generation of Randomness
Abstract: When we test a theory using data, it is common to focus on correctness: do the predictions of the theory match what we see in the data? But we also care about completeness: how much of the predictable variation in the data is captured by the theory? This question is difficult to answer, because in general we do not know how much “predictable variation” there is in the problem. In this paper, we consider approaches motivated by machine learning algorithms as a means of constructing a benchmark for the best attainable level of prediction. We illustrate our methods on the task of predicting human-generated random sequences. Relative to a theoretical machine learning algorithm benchmark, we find that existing behavioral models explain roughly 15 percent of the predictable variation in this problem. This fraction is robust across several variations on the problem. We also consider a version of this approach for analyzing field data from domains in which human perception and generation of randomness has been used as a conceptual framework; these include sequential decision-making and repeated zero-sum games. In these domains, our framework for testing the completeness of theories provides a way of assessing their effectiveness over different contexts; we find that despite some differences, the existing theories are fairly stable across our field domains in their performance relative to the benchmark. Overall, our results indicate that (i) there is a significant amount of structure in this problem that existing models have yet to capture and (ii) there are rich domains in which machine learning may provide a viable approach to testing completeness (joint with Jon Kleinberg and Sendhil Mullainathan). 
11:30 am – 12:10 pm  Zak Stone  Title: TensorFlow: Machine Learning for Everyone
Abstract: We’ve witnessed extraordinary breakthroughs in machine learning over the past several years. What kinds of things are possible now that weren’t possible before? How are opensource platforms like TensorFlow and hardware platforms like GPUs and Cloud TPUs accelerating machine learning progress? If these tools are new to you, how should you get started? In this session, you’ll hear about all of this and more from Zak Stone, the Product Manager for TensorFlow on the Google Brain team. 
12:10 pm – 1:30 pm  Lunch  
1:30 pm – 2:10 pm  Jann Spiess  Title: (Machine) Learning to Control in Experiments
Abstract: Machine learning focuses on high-quality prediction rather than on (unbiased) parameter estimation, limiting its direct use in typical program evaluation applications. Still, many estimation tasks have implicit prediction components. In this talk, I discuss accounting for controls in treatment effect estimation as a prediction problem. In a canonical linear regression framework with high-dimensional controls, I argue that OLS is dominated by a natural shrinkage estimator even for unbiased estimation when treatment is random; suggest a generalization that relaxes some parametric assumptions; and contrast my results with those for another implicit prediction problem, namely the first stage of an instrumental variables regression. 
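A small simulation in the spirit of the claim above (our own toy design, not the speaker's estimator verbatim): with randomized treatment and many weak controls, a ridge regression that penalizes only the control coefficients can beat plain OLS in mean squared error for the treatment effect:

```python
import numpy as np

def estimate_tau(y, T, X, lam):
    """Coefficient on (centered) treatment from a ridge regression that
    penalizes only the controls; lam = 0 recovers OLS."""
    Z = np.column_stack([T - T.mean(), X - X.mean(axis=0)])
    P = lam * np.eye(Z.shape[1])
    P[0, 0] = 0.0  # leave the treatment coefficient unpenalized
    coef = np.linalg.solve(Z.T @ Z + P, Z.T @ (y - y.mean()))
    return coef[0]

def simulate(reps=200, n=100, p=80, tau=1.0, lam=50.0, seed=4):
    rng = np.random.default_rng(seed)
    beta = np.full(p, 0.1)  # many small control effects
    err_ols, err_ridge = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        T = rng.integers(0, 2, n).astype(float)  # randomized treatment
        y = tau * T + X @ beta + rng.standard_normal(n)
        err_ols.append(estimate_tau(y, T, X, 0.0) - tau)
        err_ridge.append(estimate_tau(y, T, X, lam) - tau)
    return float(np.mean(np.square(err_ols))), float(np.mean(np.square(err_ridge)))
```

Because treatment is assigned independently of the controls, shrinking the control coefficients leaves the treatment estimate approximately unbiased across randomizations while cutting its variance, which is the intuition behind the dominance result.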
2:10 pm – 2:50 pm  Bradly Stadie  Title: Learning to Learn Quickly: One-Shot Imitation and Meta-Learning
Abstract: Many reinforcement learning algorithms are bottlenecked by data collection costs and the brittleness of their solutions when faced with novel scenarios. 
2:50 pm – 3:20 pm  Break  
3:20 pm – 4:00 pm  Hau-Tieng Wu  Title: When Medical Challenges Meet Modern Data Science
Abstract: Adaptive acquisition of correct features from massive datasets is at the core of modern data analysis. One particular interest in medicine is the extraction of hidden dynamics from a single observed time series composed of multiple oscillatory signals, which could be viewed as a single-channel blind source separation problem. The mathematical and statistical problems are made challenging by the structure of the signal, which consists of non-sinusoidal oscillations with time-varying amplitude/frequency, and by the heteroscedastic nature of the noise. In this talk, I will discuss recent progress in solving this kind of problem by combining cepstrum-based nonlinear time-frequency analysis and manifold learning techniques. A particular solution will be given along with its theoretical properties. I will also discuss the application of this method to several medical problems: (1) the extraction of a fetal ECG signal from a single-lead maternal abdominal ECG signal; (2) the simultaneous extraction of the instantaneous heart/respiratory rate from a PPG signal during exercise; (3) (optionally, depending on time) an application to atrial fibrillation signals. If time permits, the clinical trial results will be discussed. 
4:00 pm – 4:40 pm  Sifan Zhou  Title: Citing People Like Me: Homophily, Knowledge Spillovers, and Continuing a Career in Science
Abstract: Forward citation is widely used to measure the scientific merit of articles. This research studies millions of journal article citation records in the life sciences from MEDLINE and finds that authors of the same gender, the same ethnicity, sharing common collaborators, working in the same institution, or being geographically close are more likely (and quicker) to cite each other than predicted by their proportion among authors working on the same research topics. This phenomenon reveals how social and geographic distances influence the quantity and speed of knowledge spillovers. Given the importance of forward citations in the academic evaluation system, citation homophily potentially puts authors from minority groups at a disadvantage. I then show how it influences scientists’ chances to survive in academia and continue publishing. Based on joint work with Richard Freeman. 
To view photos and video interviews from the conference, please visit the CMSA blog.
Title: Graph Coloring: Local and Global
Abstract: Graph Coloring is arguably the most popular subject in Discrete Mathematics, and its combinatorial, algorithmic and computational aspects have been studied intensively. The most basic notion in the area, the chromatic number of a graph, is an inherently global property. This is demonstrated by the hardness of computation or approximation of this invariant as well as by the existence of graphs with arbitrarily high chromatic number and no short cycles. The investigation of these graphs had a profound impact on Graph Theory and Combinatorics. It combines combinatorial, probabilistic, algebraic and topological techniques with number theoretic tools. I will describe the rich history of the subject focusing on some recent results.
This workshop will focus on new developments in coding and information theory that sit at the intersection of combinatorics and complexity, and will bring together researchers from several communities — coding theory, information theory, combinatorics, and complexity theory — to exchange ideas and form collaborations to attack these problems.
Squarely in this intersection of combinatorics and complexity, locally testable/correctable codes and list-decodable codes both have deep connections to (and in some cases, direct motivation from) complexity theory and pseudorandomness, and recent progress in these areas has directly exploited and explored connections to combinatorics and graph theory. One goal of this workshop is to push ahead on these and other topics that are in the purview of the year-long program. Another goal is to highlight (a subset of) topics in coding and information theory which are especially ripe for collaboration between these communities. Examples of such topics include polar codes; new results on Reed–Muller codes and their thresholds; coding for distributed storage and for DNA memories; coding for deletions and synchronization errors; storage capacity of graphs; zero-error information theory; bounds on codes using semidefinite programming; tensorization in distributed source and channel coding; and applications of information-theoretic methods in probability and combinatorics. All these topics have attracted a great deal of recent interest in the coding and information theory communities, and have rich connections to combinatorics and complexity which could benefit from further exploration and collaboration.
Participation: The workshop is open to participation by all interested researchers, subject to capacity. Click here to register.
A list of lodging options convenient to the Center can also be found on our recommended lodgings page.
Confirmed participants include:
Co-organizers of this workshop include Venkat Guruswami, Alexander Barg, and Mary Wootters. More details about this event, including participants, will be updated soon.
Additive combinatorics is a mathematical area bordering on number theory, discrete mathematics, harmonic analysis and ergodic theory. It has achieved a number of successes in pure mathematics in the last two decades in quite diverse directions, such as:
Ideas and techniques from additive combinatorics have also had an impact in theoretical computer science, for example
The main focus of this workshop will be to bring together researchers involved in additive combinatorics, with a particular inclination towards the links with theoretical computer science. Thus it is expected that a major focus will be additive combinatorics on the boolean cube (Z/2Z)^n, which is the object where the exchange of ideas between pure additive combinatorics and theoretical computer science is most fruitful. Another major focus will be the study of pseudorandom phenomena in additive combinatorics, which has been an important contributor to modern methods of generating provably good randomness through deterministic methods. Other likely topics of discussion include the status of major open problems (the polynomial Freiman–Ruzsa conjecture, inverse theorems for the Gowers norms with bounds, explicit correlation bounds against low degree polynomials) as well as the impact of new methods such as the introduction of algebraic techniques by Croot–Pach–Lev and Ellenberg–Gijswijt.
Participation: The workshop is open to participation by all interested researchers, subject to capacity. Click here to register.
A list of lodging options convenient to the Center can also be found on our recommended lodgings page.
Confirmed participants include:
Co-organizers of this workshop include Ben Green, Swastik Kopparty, Ryan O’Donnell, and Tamar Ziegler.
Monday, October 2
Time  Speaker  Title/Abstract 
9:00–9:30am  Breakfast  
9:30–10:20am  Jacob Fox  Tower-type bounds for Roth’s theorem with popular differences
Abstract: A famous theorem of Roth states that for any $\alpha > 0$ and $n$ sufficiently large in terms of $\alpha$, any subset of $\{1, \dots, n\}$ with density $\alpha$ contains a 3-term arithmetic progression. Green developed an arithmetic regularity lemma and used it to prove that not only is there one arithmetic progression, but in fact there is some integer $d > 0$ for which the density of 3-term arithmetic progressions with common difference $d$ is at least roughly what is expected in a random set with density $\alpha$. That is, for every $\epsilon > 0$, there is some $n(\epsilon)$ such that for all $n > n(\epsilon)$ and any subset $A$ of $\{1, \dots, n\}$ with density $\alpha$, there is some integer $d > 0$ for which the number of 3-term arithmetic progressions in $A$ with common difference $d$ is at least $(\alpha^3-\epsilon)n$. We prove that $n(\epsilon)$ grows as an exponential tower of 2’s of height on the order of $\log(1/\epsilon)$. We show that the same is true in any abelian group of odd order $n$. These results are the first applications of regularity lemmas for which the tower-type bounds are shown to be necessary. The first part of the talk by Jacob Fox includes an overview and discusses the upper bound. The second part of the talk by Yufei Zhao focuses on the lower bound construction and proof. These results are all joint work with Huy Tuan Pham. 
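For a random set, the popular-difference phenomenon is easy to observe by brute force (our own sanity check, unrelated to the proof): each fixed difference $d$ already yields close to the expected $\alpha^3 n$ three-term progressions.

```python
import numpy as np

def ap3_count(A, n, d):
    """Number of 3-term APs (a, a+d, a+2d) with common difference d inside A."""
    s = set(A)
    return sum(1 for a in range(n - 2 * d)
               if a in s and a + d in s and a + 2 * d in s)

rng = np.random.default_rng(2)
n, alpha = 3000, 0.3
A = [x for x in range(n) if rng.random() < alpha]
# For each fixed d the count concentrates near alpha^3 * n, so some small d
# already achieves roughly the "random-like" number of progressions.
best = max(ap3_count(A, n, d) for d in range(1, 50))
```

The content of the theorem is that such a popular difference exists for every dense set, not just random ones, and the tower-type growth of $n(\epsilon)$ quantifies how large $n$ must be for this to kick in.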
10:20–11:00am  Coffee Break  
11:00–11:50am  Yufei Zhao  Tower-type bounds for Roth’s theorem with popular differences
Abstract: Continuation of first talk by Jacob Fox. The first part of the talk by Jacob Fox includes an overview and discusses the upper bound. The second part of the talk by Yufei Zhao focuses on the lower bound construction and proof. These results are all joint work with Huy Tuan Pham. 
12:00–1:30pm  Lunch  
1:30–2:20pm  Jop Briët  Locally decodable codes and arithmetic progressions in random settings
Abstract: This talk is about a common feature of special types of error-correcting codes, so-called locally decodable codes (LDCs), and two problems on arithmetic progressions in random settings: random differences in Szemerédi’s theorem and upper tails for arithmetic progressions in a random set in particular. It turns out that all three can be studied in terms of the Gaussian width of a set of vectors given by a collection of certain polynomials. Using a matrix version of the Khintchine inequality and a lemma that turns such polynomials into matrices, we give an alternative proof for the best-known lower bounds on LDCs and improved versions of prior results due to Frantzikinakis et al. and Bhattacharya et al. on arithmetic progressions in the aforementioned random settings. Joint work with Sivakanth Gopi. 
2:20–3:00pm  Coffee Break  
3:00–3:50pm  Fernando Shao  Large deviations for arithmetic progressions
Abstract: We determine the asymptotics of the log-probability that the number of k-term arithmetic progressions in a random subset of the integers exceeds its expectation by a constant factor. This is the arithmetic analog of subgraph counts in a random graph. I will highlight some open problems in additive combinatorics that we encountered in our work, namely concerning the “complexity” of the dual functions of AP-counts. 
4:00–6:00pm  Welcome Reception 
Tuesday, October 3
Time  Speaker  Title/Abstract 
9:00–9:30am  Breakfast  
9:30–10:20am  Emanuele Viola  Interleaved group products
Authors: Timothy Gowers and Emanuele Viola
Abstract: Let G be the special linear group SL(2,q). We show that if (a1,a2) and (b1,b2) are sampled uniformly from large subsets A and B of G^2 then their interleaved product a1 b1 a2 b2 is nearly uniform over G. This extends a result of Gowers (2008) which corresponds to the independent case where A and B are product sets. We obtain a number of other results. For example, we show that if X is a probability distribution on G^m such that any two coordinates are uniform in G^2, then a pointwise product of s independent copies of X is nearly uniform in G^m, where s depends on m only. Similar statements can be made for other groups as well. These results have applications in computer science, which is the area where they were first sought by Miles and Viola (2013). 
10:20–11:00am  Coffee Break  
11:00–11:50am  Vsevolod Lev  On Isoperimetric Stability
Abstract: We show that a nonempty subset of an abelian group with a small edge boundary must be large; in particular, if $A$ and $S$ are finite, nonempty subsets of an abelian group such that $S$ is independent, and the edge boundary of $A$ with respect to $S$ does not exceed $(1-c)|S||A|$ with a real $c\in(0,1]$, then $|A|\ge 4^{(1-1/d)c|S|}$, where $d$ is the smallest order of an element of $S$. Here the constant $4$ is best possible. As a corollary, we derive an upper bound for the size of the largest independent subset of the set of popular differences of a finite subset of an abelian group. For groups of exponent $2$ and $3$, our bound translates into a sharp estimate for the additive dimension of the popular difference set. We also prove, as an auxiliary result, the following estimate of possible independent interest: if $A\subseteq{\mathbb Z}^n$ is a finite, nonempty downset, then, denoting by $w(z)$ the number of nonzero components of the vector $z\in\mathbb{Z}^n$, we have $$ \frac1{|A|} \sum_{a\in A} w(a) \le \frac12\, \log_2 |A|. $$
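The auxiliary downset estimate is easy to test numerically in the cube {0,1}^n. This sketch (helper names are mine) closes a few random generators downward and checks that the average weight is at most half the binary logarithm of the size; the bound is tight for the full cube, where the average weight is n/2 and log_2 of the size is n.

```python
import itertools
import math
import random

def downward_closure(gens, n):
    """Close a set of 0/1 vectors under coordinatewise decrease: a downset in {0,1}^n."""
    A = set()
    for g in gens:
        ones = [i for i in range(n) if g[i]]
        for k in range(len(ones) + 1):
            for sub in itertools.combinations(ones, k):
                v = [0] * n
                for i in sub:
                    v[i] = 1
                A.add(tuple(v))
    return A

def avg_weight(A):
    """Average number of nonzero coordinates over the set A."""
    return sum(sum(a) for a in A) / len(A)

n = 8
random.seed(0)
gens = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(5)]
A = downward_closure(gens, n)
assert avg_weight(A) <= 0.5 * math.log2(len(A)) + 1e-9
```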
12:00 – 1:30pm  Lunch
1:30 – 2:20pm  Elena Grigorescu  NP-Hardness of Reed-Solomon Decoding and the Prouhet-Tarry-Escott Problem
Abstract: I will discuss the complexity of decoding Reed-Solomon codes, and some results establishing NP-hardness for asymptotically smaller decoding radii than the maximum likelihood decoding radius. These results follow from the study of a generalization of the classical Subset Sum problem to higher moments, which may be of independent interest. I will further discuss a connection with the Prouhet-Tarry-Escott problem studied in Number Theory, which turns out to capture a main barrier in extending our techniques to smaller radii. Joint work with Venkata Gandikota and Badih Ghazi.
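The Prouhet-Tarry-Escott problem asks for distinct integer multisets with equal power sums up to some degree. A classical small instance can be checked directly (the helper name is mine):

```python
def power_sums_equal(A, B, k):
    """Do the multisets A and B have equal j-th power sums for every j = 1..k?"""
    return all(sum(a ** j for a in A) == sum(b ** j for b in B)
               for j in range(1, k + 1))

# A classical size-3 solution: equal first and second power sums (12 and 62).
assert power_sums_equal([1, 5, 6], [2, 3, 7], 2)
# Cubes necessarily differ: distinct sets of size 3 cannot agree up to degree 3.
assert not power_sums_equal([1, 5, 6], [2, 3, 7], 3)
```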
2:20 – 3:00pm  Coffee Break
3:00 – 3:50pm  Sean Prendiville  Partition regularity of certain nonlinear Diophantine equations
Abstract: We survey some results in additive Ramsey theory which remain valid when variables are restricted to sparse sets of arithmetic interest, in particular the partition regularity of a class of nonlinear Diophantine equations in many variables. 
Wednesday, October 4
Time  Speaker  Title/Abstract 
9:00 – 9:30am  Breakfast
9:30 – 10:20am  Olof Sisask  Bounds on capsets via properties of spectra
Abstract: A capset in F_3^n is a subset A containing no three distinct elements x, y, z satisfying x+z=2y. Determining how large capsets can be has been a longstanding problem in additive combinatorics, particularly motivated by the corresponding question for subsets of {1,2,…,N}. While the problem in the former setting has seen spectacular progress recently through the polynomial method of Croot–Lev–Pach and Ellenberg–Gijswijt, such progress has not been forthcoming in the setting of the integers. Motivated by an attempt to make progress in this setting, we shall revisit the approach to bounding the sizes of capsets using Fourier analysis, and in particular the properties of large spectra. This will be a two-part talk, in which many of the ideas will be outlined in the first talk, modulo the proof of a structural result for sets with large additive energy. This structural result will be discussed in the second talk, by Thomas Bloom, together with ideas on how one might hope to achieve Behrend-style bounds using this method. Joint work with Thomas Bloom.
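The definition can be explored by brute force in the smallest interesting dimension. This sketch (function names are mine) checks the capset condition directly and finds the largest capset in F_3^2, which works out to 4 points:

```python
import itertools

def is_capset(A):
    """Is A a capset: no three distinct points x, y, z with x + z = 2y (mod 3)?"""
    for x, y, z in itertools.permutations(A, 3):
        if all((xi + zi - 2 * yi) % 3 == 0 for xi, yi, zi in zip(x, y, z)):
            return False
    return True

points = list(itertools.product(range(3), repeat=2))  # the 9 points of F_3^2
best = max(len(S)
           for r in range(len(points) + 1)
           for S in itertools.combinations(points, r)
           if is_capset(S))
print(best)
```

Note that over F_3 the condition x + z = 2y is the same as x + y + z = 0, so a capset is exactly a set with no three collinear points.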
10:20 – 11:00am  Coffee Break
11:00 – 11:50am  Thomas Bloom  Bounds on capsets via properties of spectra
This is a continuation of the previous talk by Olof Sisask. 
12:00 – 1:30pm  Lunch
1:30 – 2:20pm  Hamed Hatami  Polynomial method and graph bootstrap percolation
Abstract: We introduce a simple method for proving lower bounds for the size of the smallest percolating set in a certain graph bootstrap process. We apply this method to determine the sizes of the smallest percolating sets in multidimensional tori and multidimensional grids (in particular hypercubes). The former answers a question of Morrison and Noel, and the latter provides an alternative and simpler proof for one of their main results. This is based on joint work with Lianna Hambardzumyan and Yingjie Qian.
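For intuition, here is the classical r-neighbour bootstrap process on an n × n grid, a simpler cousin of the graph bootstrap process studied in the talk (an illustration of bootstrap percolation generally, not of the specific process the results concern; names are mine). For r = 2 the diagonal is a smallest percolating set:

```python
def percolates(seeds, n, r=2):
    """r-neighbour bootstrap percolation on the n x n grid: a healthy cell becomes
    infected once it has at least r infected neighbours. True iff everything infects."""
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                if (i, j) in infected:
                    continue
                nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
                if sum(p in infected for p in nbrs) >= r:
                    infected.add((i, j))
                    changed = True
    return len(infected) == n * n

n = 5
assert percolates({(i, i) for i in range(n)}, n)          # the diagonal percolates
assert not percolates({(i, i) for i in range(n - 1)}, n)  # one seed fewer gets stuck
```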
2:20 – 3:00pm  Coffee Break
3:00 – 3:50pm  Arnab Bhattacharyya  Algorithmic Polynomial Decomposition
Abstract: Fix a prime p. Given a positive integer k, a vector of positive integers D = (D_1, …, D_k) and a function G: F_p^k → F_p, we say a function P: F_p^n → F_p admits a (k, D, G)-decomposition if there exist polynomials P_1, …, P_k: F_p^n → F_p with each deg(P_i) <= D_i such that for all x in F_p^n, P(x) = G(P_1(x), …, P_k(x)). For instance, an n-variate polynomial of total degree d factors nontrivially exactly when it has a (2, (d-1, d-1), prod)-decomposition, where prod(a,b) = ab. We show that for any fixed k, D, G, and fixed bound d, we can decide whether a given polynomial P(x_1, …, x_n) of degree d admits a (k, D, G)-decomposition and, if so, find a witnessing decomposition, in poly(n) time. Our approach is based on higher-order Fourier analysis. We will also discuss improved analyses and algorithms for special classes of decompositions. Joint work with Pooya Hatami, Chetan Gupta and Madhur Tulsiani.
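The definition is concrete in the factoring special case mentioned above. A minimal pointwise check over F_5 (all names are mine): P(x) = x^2 − 1 has degree 2 and admits a (2, (1, 1), prod)-decomposition, witnessing its factorization into (x − 1)(x + 1).

```python
p = 5

def P(x):
    return (x * x - 1) % p   # total degree 2

def P1(x):
    return (x - 1) % p       # degree 1 <= D_1 = 1

def P2(x):
    return (x + 1) % p       # degree 1 <= D_2 = 1

def G(a, b):
    return (a * b) % p       # the combining function "prod"

# P(x) = G(P1(x), P2(x)) holds for every point of F_5.
assert all(P(x) == G(P1(x), P2(x)) for x in range(p))
```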
Thursday, October 5
Time  Speaker  Title/Abstract 
9:00 – 9:30am  Breakfast
9:30 – 10:20am  Madhur Tulsiani  Higher-order Fourier analysis and approximate decoding of Reed-Muller codes
Abstract: Decomposition theorems proved by Gowers and Wolf provide an appropriate notion of “Fourier transform” for higher-order Fourier analysis. I will discuss some questions and techniques that arise from trying to develop polynomial-time algorithms for computing these decompositions. I will discuss constructive proofs of these decompositions based on boosting, which reduce the problem of computing these decompositions to a certain kind of approximate decoding problem for codes. I will also discuss some earlier and recent works on this decoding problem. Based on joint works with Arnab Bhattacharyya, Eli Ben-Sasson, Pooya Hatami, Noga Ron-Zewi and Julia Wolf.
10:20 – 11:00am  Coffee Break
11:00 – 11:50am  Julia Wolf  Stable arithmetic regularity
Abstract: The arithmetic regularity lemma in the finite-field model, proved by Green in 2005, states that given a subset A of a finite-dimensional vector space over a prime field, there exists a subspace H of bounded codimension such that A is Fourier-uniform with respect to almost all cosets of H. It is known that in general, the growth of the codimension of H is required to be of tower type depending on the degree of uniformity, and that one must allow for a small number of non-uniform cosets. Our main result is that, under a natural model-theoretic assumption of stability, the tower-type bound and non-uniform cosets in the arithmetic regularity lemma are not necessary. Specifically, we prove an arithmetic regularity lemma for k-stable subsets in which the bound on the codimension of the subspace is a polynomial (depending on k) in the degree of uniformity, and in which there are no non-uniform cosets. This is joint work with Caroline Terry.
12:00 – 1:30pm  Lunch
1:30 – 2:20pm  Will Sawin  Constructions of Additive Matchings
Abstract: I will explain my work, with Robert Kleinberg and David Speyer, constructing large tricolored sum-free sets in vector spaces over finite fields, and how it shows that some additive combinatorics problems over finite fields are harder than corresponding problems over the integers.
2:20 – 3:00pm  Coffee Break
3:00 – 3:50pm  Mei-Chu Chang  Arithmetic progressions in multiplicative groups of finite fields
Abstract: Let G be a multiplicative subgroup of the prime field F_p of size |G| > p^{1-\kappa}, and let r be an arbitrarily fixed positive integer. Assuming \kappa = \kappa(r) > 0 and p large enough, it is shown that any proportional subset A of G contains nontrivial arithmetic progressions of length r.
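A quick empirical check in the smallest setting (names are mine): take G to be the index-2 subgroup of squares mod p, which is as large as a proper multiplicative subgroup gets, and verify that it already contains nontrivial 3-term progressions.

```python
def has_3ap(S, p):
    """Does S (mod p, p an odd prime) contain a nontrivial 3-AP x, z, y with x + z = 2y?
    For x != y the third term z = 2y - x is automatically distinct from both."""
    S = set(S)
    return any(x != y and (2 * y - x) % p in S for x in S for y in S)

p = 23
squares = {pow(g, 2, p) for g in range(1, p)}  # the multiplicative subgroup of index 2
assert has_3ap(squares, p)
```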
Friday, October 6
Time  Speaker  Title/Abstract 
9:00 – 9:30am  Breakfast
9:30 – 10:20am  Asaf Ferber  On a resilience version of the Littlewood-Offord problem
Abstract: In this talk we consider a resilience version of the classical Littlewood-Offord problem. That is, consider the sum X = a_1x_1 + … + a_nx_n, where the a_i's are nonzero reals and the x_i's are i.i.d. random variables with P(x_1 = 1) = P(x_1 = -1) = 1/2. Motivated by some problems from random matrices, we consider the question: how many of the x_i's can we typically allow an adversary to change without making X = 0? We solve this problem up to a constant factor and present a few interesting open problems. Joint with Afonso Bandeira (NYU) and Matthew Kwan (ETH Zurich).
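The classical (non-resilient) statement can be checked exactly for small n (names are mine): Erdős's form of the Littlewood-Offord bound says the number of sign patterns with X = 0 is at most C(n, ⌊n/2⌋), with equality when all the a_i are equal.

```python
from itertools import product
from math import comb

def count_zero_sums(a):
    """Number of sign patterns (x_i in {-1, +1}) with a_1 x_1 + ... + a_n x_n = 0."""
    return sum(1 for signs in product((-1, 1), repeat=len(a))
               if sum(ai * s for ai, s in zip(a, signs)) == 0)

n = 10
# All-equal coefficients attain the extremal count C(n, n/2) = 252 ...
assert count_zero_sums([1] * n) == comb(n, n // 2)
# ... and any other nonzero coefficients do no better.
assert count_zero_sums(list(range(1, n + 1))) <= comb(n, n // 2)
```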
10:20 – 11:00am  Coffee Break
11:00 – 11:50am  Kaave Hosseini  Protocols for XOR functions and Entropy decrement
Abstract: Let f: F_2^n → {0,1} be a function and suppose the matrix M defined by M(x,y) = f(x+y) is partitioned into k monochromatic rectangles. We show that F_2^n can be partitioned into affine subspaces of codimension polylog(k) such that f is constant on each subspace. In other words, up to polynomial factors, deterministic communication complexity and parity decision tree complexity are equivalent. This relies on a novel technique of entropy decrement combined with Sanders’ Bogolyubov-Ruzsa lemma. Joint work with Hamed Hatami and Shachar Lovett.
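A toy instance of the statement (not the general argument; names are mine): when f is a parity, f(x+y) = f(x) XOR f(y), so M is covered by k = 4 monochromatic rectangles indexed by (f(x), f(y)), and correspondingly F_2^n splits into two affine subspaces of codimension 1 on which f is constant.

```python
from itertools import product

n = 4

def f(x):
    """A parity function: the XOR of the first two coordinates."""
    return x[0] ^ x[1]

cube = list(product((0, 1), repeat=n))

# M(x, y) = f(x + y) is determined by the pair (f(x), f(y)): 4 rectangles suffice.
for x in cube:
    for y in cube:
        s = tuple(a ^ b for a, b in zip(x, y))
        assert f(s) == f(x) ^ f(y)

# The matching partition of F_2^n: two affine subspaces of codimension 1,
# each the level set of f, and f is constant on each by construction.
parts = {v: [x for x in cube if f(x) == v] for v in (0, 1)}
assert all(len(part) == len(cube) // 2 for part in parts.values())
```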
12:00 – 1:30pm  Lunch
1:30 – 2:20pm  Guy Kindler  From the Grassmann graph to Two-to-Two games
Abstract: In this work we show a relation between the structure of the so-called Grassmann graph over Z_2 and the Two-to-Two conjecture in computational complexity. Specifically, we present a structural conjecture concerning the Grassmann graph (together with an observation by Barak et al., one can view this as a conjecture about the structure of non-expanding sets in that graph) which turns out to imply the Two-to-Two conjecture. The latter conjecture is the lesser-known and weaker sibling of the Unique-Games conjecture [Khot02], which states that unique games (a.k.a. one-to-one games) are hard to approximate. Indeed, if the Grassmann-graph conjecture is true, it would also rule out some attempts to refute the Unique-Games conjecture, as these attempts provide potentially efficient algorithms to solve unique games that would actually also solve two-to-two games if they work at all. These new connections between the structural properties of the Grassmann graph and complexity-theoretic conjectures highlight the Grassmann graph as an interesting and worthy object of study. We may indicate some initial results towards analyzing its structure. This is joint work with Irit Dinur, Subhash Khot, Dror Minzer, and Muli Safra.
Learning as a Theory of Everything
Abstract: We start from the hypothesis that all the information that resides in living organisms was initially acquired either through learning by an individual or through evolution. Then any unified theory of evolution and learning should be able to characterize the capabilities that humans and other living organisms can possess or acquire. Characterizing these capabilities would tell us about the nature of humans, and would also inform us about feasible targets for automation. With this purpose we review some background in the mathematical theory of learning. We go on to explain how Darwinian evolution can be formulated as a form of learning. We observe that our current mathematical understanding of learning is incomplete in certain important directions, and conclude by indicating one direction in which further progress would likely enable broader phenomena of intelligence and cognition to be realized than is possible at present.