BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.16.3//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251117T090000
DTEND;TZID=America/New_York:20251119T170000
DTSTAMP:20260709T235012Z
CREATED:20250502T182846Z
LAST-MODIFIED:20251215T145740Z
UID:10003749-1763370000-1763571600@cmsa.fas.harvard.edu
SUMMARY:Conference on Geometry and Statistics
DESCRIPTION:Conference on Geometry and Statistics \nDates: November 17–19\, 2025 \nLocation: CMSA G10\, 20 Garden Street\, Cambridge MA & via Zoom \n  \nSpeakers \n\nCharles Fefferman\, Princeton University\nStephan Huckemann\, Georg-August Universität Göttingen\nSungkyu Jung\, Seoul National University\nKei Kobayashi\, Keio University\nClément Levrard\, Université de Rennes\nKer-Chau Li\, University of California\, Los Angeles\nRong Ma\, Harvard University\nSteve Marron\, University of North Carolina\nEzra Miller\, Duke University\nHans-Georg Müller\, University of California\, Davis\nWilderich Tuschmann\, Karlsruhe Institute of Technology\nMelanie Weber\, Harvard University\nAndrew Wood\, Australian National University\nHorng-Tzer Yau\, Harvard University\n\nOrganizer: Zhigang Yao\, National University of Singapore \n  \nYouTube Playlist \n  \nSCHEDULE \ndownload pdf \nMonday\, Nov. 17\, 2025 \n9:00–9:25 am\nMorning refreshments \n9:25–9:30 am\nIntroductions \n9:30–10:30 am\nSpeaker: Stephan Huckemann\, Georg-August Universität Göttingen\nTitle: The Probability of the Cut Locus of a Fréchet Mean\nAbstract: We show that the cut locus of a Fréchet mean of a random variable on a connected and complete Riemannian manifold has zero probability\, a result known previously in special cases (Le and Barden\, 2014) and conjectured in general. The proof is based on first order and second order considerations\, where the latter are based on a recent result by Générau (2020) on “Laplacians in the barrier sense”. This generalizes to Fréchet p-means for p > 2. The former also allow us to rule out stickiness on Riemannian manifolds and to generalize to 1 <= p < 2\, with a conjecture. We close by discussing and conjecturing extensions to noncomplete manifolds and more general metric spaces. This is joint work with Alexander Lytchak. \n\nGénérau\, F. (2020). Laplacian of the distance function on the cut locus on a Riemannian manifold. Nonlinearity 33(8)\, 3928.\nLe\, H. 
and D. Barden (2014). On the measure of the cut locus of a Fréchet mean. Bulletin of the London Mathematical Society 46(4)\, 698–708.\nLytchak\, A. and S. F. Huckemann (2025). Zero mass at the cut locus of a Fréchet mean on a Riemannian manifold. arXiv preprint arXiv:2508.00747.\n\n10:30–10:45 am\nbreak \n10:45 am–11:45 am\nSpeaker: Hans-Georg Müller\, University of California\, Davis\nTitle: Conformal Inference for Random Objects\nAbstract: The underlying probability measure of random objects\, i.e.\, metric-space-valued random variables\, can be probed by distance profiles. These are one-dimensional distributions of probability mass falling into balls of increasing radius. In a regression setting with Euclidean covariates X and responses Y that are random objects\, one can consider conditional Fréchet means that can be implemented with Fréchet regression and also conditional distance profiles\, conditioning on X. Conditional distance profiles can then be leveraged to obtain conditional average transport costs\, the expected cost for transporting a fixed conditional distance profile to a randomly selected conditional distance profile. The conditional average transport costs can then be utilized to obtain conditional conformity scores. In conjunction with the split conformal algorithm these scores lead to conditional prediction sets located in the object space with asymptotic conditional validity and attractive finite sample behavior. Based on joint work with Hang Zhou (UNC). \n11:45 am–1:15 pm\nLunch (Catered) \n1:15–2:15 pm\nSpeaker: Horng-Tzer Yau\, Harvard University\nTitle: Ramanujan property of random regular graphs and delocalization of random band matrices\nAbstract: In this lecture\, we review recent works on random matrices. The first result concerns the normalized adjacency matrix of a random $d$-regular graph on $N$ vertices with any fixed degree $d\geq 3$\, whose eigenvalues we denote by $\lambda_1=d/\sqrt{d-1}\geq \lambda_2\geq\lambda_3\cdots\geq \lambda_N$. 
We establish the edge universality for random $d$-regular graphs\, namely\, the distributions of $\lambda_2$ and $-\lambda_N$ converge to the Tracy-Widom$_1$ distribution associated with the Gaussian Orthogonal Ensemble. As a consequence\, for sufficiently large $N$\, approximately $69\%$ of $d$-regular graphs on $N$ vertices are Ramanujan\, meaning $\max\{\lambda_2\,|\lambda_N|\}\leq 2$. This resolves a conjecture by Sarnak and Miller-Novikoff-Sabelli.\nThe second result concerns $N \times N$ Hermitian $d$-dimensional random band matrices with band width $W$. In the bulk of the spectrum and in the large $N$ limit\, we prove that all $L^2$-normalized eigenvectors are delocalized in all dimensions under suitable conditions on $W$ and $N$. In addition\, we prove that the eigenvalue statistics are given by those of the Gaussian unitary ensemble. \n2:15–2:45 pm\nbreak with refreshments \n2:45–3:45 pm\nSpeaker: Clément Levrard\, Université de Rennes\nTitle: Optimal reach estimation\nAbstract: The reach of an embedded submanifold\, a notion that dates back to the famous work Curvature measures of H. Federer\, may be understood as a scale under which the submanifold is flat enough so that traditional Euclidean techniques in statistics locally apply\, up to some approximation. I will present several ways to estimate the reach from a sample (on the submanifold)\, some of them being optimal from the point of view of minimax estimation theory. Along the way\, intermediate estimation problems of local and global quantities will arise (curvature estimation\, weak feature size estimation\, distance estimation\, etc.)\, for which various phenomena can occur from a statistical point of view (different convergence rates\, inconsistency). This will be an opportunity to provide a selective overview of the state of the art on these issues. 
\n4:30–5:30 pm\nCMSA Colloquium\nSpeaker: Zhigang Yao (National University of Singapore)\nTitle: Interaction of Statistics and Geometry: A New Landscape for Data Science\nAbstract: Classical statistics views data as real numbers or vectors in Euclidean space\, but modern challenges increasingly involve data with intrinsic geometric structures. A central problem in this direction is manifold fitting\, with origins in H. Whitney’s work of the 1930s. The Geometric Whitney Problems ask: given a set\, when can we construct a smooth 𝑑-dimensional manifold that approximates it\, and how accurately can we estimate it?\nIn this talk\, I will discuss recent progress on manifold fitting and its role in bridging geometry and data science. While many existing methods rely on restrictive assumptions\, the manifold hypothesis\, that data often lie near non-Euclidean structures\, remains fundamental in modern statistical learning. I will highlight both theoretical insights and algorithmic challenges\, drawing on recent works with\, as well as ongoing research. \nYouTube video \n  \nTuesday\, Nov. 18\, 2025 \n9:00–9:30 am\nMorning refreshments \n9:30–10:30 am\nSpeaker: Charles Fefferman\, Princeton University (via Zoom)\nTitle: Extrinsic and intrinsic manifold learning\, old and new\nAbstract: The talk will include an exposition of the old paper “Testing the manifold hypothesis”\, joint work with S. Mitter and H. Narayanan\, on extrinsic manifold learning (the manifold to be learned is assumed to be embedded in a high-dimensional Euclidean space). The talk will also include a new result on intrinsic manifold learning (the manifold to be learned is not assumed to be embedded\, and the data consist of intrinsic distances corrupted by noise)\, provided the result is proven by the time of the conference. 
\n10:30–10:45 am\nbreak \n10:45 am–11:45 am\nSpeaker: Steve Marron\, University of North Carolina\nTitle: Data Integration Via Analysis of Manifolds (DIVAM)\nAbstract: A major challenge in the age of Big Data is the integration of disparate data types into a single data analysis. That was tackled by Data Integration Via Analysis of Subspaces (DIVAS) in the context of data blocks measured on a common set of experimental cases. Joint variation was defined in terms of modes of variation having identical scores across data blocks. DIVAS allowed mathematically rigorous formulation of individual variation within each data block in terms of individual modes. The goal of DIVAM is to intrinsically extend the DIVAS approach to data objects lying in manifolds\, such as shape data. \n11:45 am–1:15 pm\nLunch Break \n1:15–2:15 pm\nSpeaker: Ker-Chau Li\, University of California\, Los Angeles\nTitle: Investigation of Data clouds: From Galton’s Ellipses to Explainable AI (XAI)\, modeling or molding?\nAbstract: Francis Galton’s seminal 1886 visualization of regression toward the mean in trait inheritance is arguably the first and most influential example of geometric thinking applied to statistical modeling. The pioneering geometric insight driving Galton’s use of elliptical contours to discover the bivariate normal distribution laid down the foundation for classic multivariate analysis (e.g.\, PCA\, canonical correlation) and profoundly impacts modern methods like diffusion models.\nStatistical models\, particularly those based on parsimony\, are effective for characterizing data distribution and facilitating scientific rule induction. However\, the rise of unstructured big data (like images) has challenged these parsimonious approaches\, necessitating the use of deep learning models. These models\, containing billions of parameters\, sacrifice transparency to excel in prediction. 
Seeking solutions to this “black-box” dilemma is now the heart of Explainable AI (XAI).\nLeveraging the simplicity of elementary geometric concepts\, this talk will present a new path toward interpretable and parsimonious XAI. Unstructured big data is highly plastic. Our approach moves beyond the standard data modeling perspective—which answers what the data is—and introduces a novel data molding perspective. This shift is key to unlocking the full potential of data’s plasticity\, allowing us to effectively answer the crucial question: what the data can be used for.\nI will first discuss a connection between manifold learning and my earlier works\, helical confounding and liquid association. I will then turn to the data molding perspective and present two novel notions: mold-compliance and artificial-trait configurative-generation (ATCG). These notions guide our recent efforts in formulating novel algorithms for image data investigation\, addressing issues like prediction validity and within-class heterogeneity. Data molding entails a dramatically different feature space extraction\, which consequently shifts the subsequent investigation on the data clouds from out-of-distribution (OOD) to mold-violation\, and from UMAP clustering to ATCG-induced hierarchical clustering. \n2:15–2:45 pm\nbreak with refreshments \n2:45–3:45 pm\nSpeaker: Andrew Wood\, Australian National University\nTitle: Empirical likelihood methods for Fréchet means on open books\nAbstract: The open book is a simple example of a stratified space that captures some (but not all) of the properties of stratified spaces. Central limit theory for open books plus relevant background is given by Hotz et al. (2013\, Annals of Applied Probability). In this talk I will describe some basic inference procedures for Fréchet means in open books based on empirical likelihood (Owen\, book\, 2001). 
Empirical likelihood (EL) is a type of nonparametric likelihood that can be useful for many types of data\, including manifold-valued data and data from stratified spaces. An EL approach to basic inference for Fréchet means will be described. In particular\, it will be shown how the non-regularity in the geometry of open books can result in non-regular behaviour in Wilks’s theorem (i.e. the large sample likelihood ratio test). The talk will also discuss difficulties in extending the EL inference theory from open books to more general stratified spaces\, where the difference in dimension of adjacent strata can be 2 or more. For discussion of more general stratified spaces than open books\, see the orthant spaces discussed in Barden and Le (2018\, Proc of London Math Society) and the general stratified space setting considered by Mattingly et al. (2023\, arXiv). \n3:45–4:00 pm\nbreak \n4:00–5:00 pm\nSpeaker: Wilderich Tuschmann\, Karlsruhe Institute of Technology\nTitle: A Spectator’s Perspective on the Manifold Hypothesis\nAbstract: At its core\, the Manifold Hypothesis asserts that real-world\, high-dimensional data is not uniformly or randomly distributed throughout its high-dimensional “ambient” space\, but concentrated on or near a low-dimensional manifold (or a collection of manifolds) embedded within that high-dimensional ambient space.\nIn my talk\, I will discuss reasons and facts that speak for as well as against this hypothesis and also address geometric alternatives. \n  \nWednesday\, Nov. 19\, 2025 \n9:00–9:30 am\nMorning refreshments \n9:30–10:30 am\nSpeaker: Melanie Weber\, Harvard University\nTitle: Ricci Curvature\, Ricci Flow\, and the Geometry of Learning\nAbstract: Geometric structure in data plays a crucial role in machine learning. In this talk\, we study this observation through the lens of Ricci curvature and its associated Ricci flow. 
We start by reviewing a discrete notion of Ricci curvature introduced by Ollivier and the geometric flow that it induces. We further discuss the relationship between discrete Ricci curvature and its continuous counterpart via discrete-to-continuum consistency results\, which imply that discrete Ricci curvature can provably characterize the geometry of a data manifold based on a finite sample. This provides a theoretical foundation for several applications of discrete Ricci curvature in machine learning\, two of which we discuss in the remainder of this talk. First\, we analyze learned feature representations in deep neural networks and show that they transform during training in ways that closely resemble a discrete Ricci flow. Our analysis reveals that nonlinear activations shape class separability and suggests geometry-informed training principles such as early stopping and depth selection. Second\, we turn to deep learning on graphs\, where we address representational limitations of state of the art graph neural networks through curvature-based data augmentations. We show that augmenting input graphs with geometric information provably increases the representational power of such models and yields performance gains in practice. \n10:30–10:45 am\nbreak \n10:45 am–11:45 am\nSpeaker: Ezra Miller\, Duke University\nTitle: Extracting bar lengths from multiparameter persistent homology\nAbstract: Persistent homology in one parameter can be summarized using bar codes or persistence diagrams\, which are elementary gadgets with many features amenable to vectorization and hence statistical analysis. For example\, early work with Bendich\, Marron\, Pieloch\, and Skwerer showed how to extract meaningful statistics from the top 100 bar lengths in persistent homology summaries of brain arteries. The story for persistent homology with multiple parameters\, on the other hand\, is still developing. 
Although it has the potential to be much more flexible and informative\, multipersistence has structural issues that present fundamental mathematical challenges. There is no consensus on what might be meant by a “bar”\, let alone “the top 100 bar lengths”. This talk recalls the basics of single and multiparameter persistent homology and discusses some of the mathematical issues\, including obstacles and potential routes forward. \n11:45 am–1:15 pm\nLunch Break \n1:15–2:15 pm\nSpeaker: Kei Kobayashi\, Keio University\nTitle: Metric Transformations of Data Spaces: Curvature Control and Related Developments\nAbstract: We present our proposed method of increasing the accuracy of data analysis by means of two transformations of the metric of the data space. The first transformation is based on the curve length defined by the integral of the power of the density function\, which can be computed approximately using an empirical graph; the second transformation can be interpreted as the extrinsic distance when the data space is embedded in a metric cone. The advantage of both distance transformations is that the hyperparameters allow the curvature to be monotonically transformed in a specific sense. Some statistical applications of these transformations and theoretical justifications are presented. Detailed analyses of the geodesics obtained by this method for several simple probability distributions will also be presented. The main part of this work is based on joint works with Henry P. Wynn. \n2:15–2:45 pm\nbreak with refreshments \n2:45–3:45 pm\nSpeaker: Sungkyu Jung\, Seoul National University\nTitle: Generalized Fréchet means with random minimizing domains and its strong consistency\nAbstract: In this talk\, I will discuss a novel extension of Fréchet means\, referred to as generalized Fréchet means\, as a comprehensive framework for describing the characteristics of random elements. 
The generalized Fréchet mean is defined as the minimizer of a cost function\, and the framework encompasses various extensions of Fréchet means that have appeared in the literature. The most distinctive feature of the proposed framework is that it allows the domain of minimization for the empirical generalized Fréchet means to be random and different from that of its population counterpart. This flexibility broadens the applicability of the Fréchet mean framework to various statistical scenarios\, including sequential dimension reduction for non-Euclidean data. We establish a strong consistency theorem for generalized Fréchet means. Applications such as verifying the consistency of principal geodesic analysis on the hypersphere\, compositional principal component analysis on the composition space\, and k-medoids clustering for data on a metric space will be discussed. \n3:45–4:00 pm\nbreak \n4:00–5:00 pm\nSpeaker: Rong Ma\, Harvard University\nTitle: Modern Nonlinear Embedding Methods Unpacked\nAbstract: Learning and representing low-dimensional structures from noisy\, high-dimensional data is a cornerstone of modern data science. Stochastic neighbor embedding algorithms\, a family of nonlinear dimensionality reduction and data visualization methods\, with t-SNE and UMAP as two leading examples\, have become very popular in recent years. Yet despite their wide applications\, these methods remain subject to points of debate\, including limited theoretical understanding\, ambiguous interpretations\, and sensitivity to tuning parameters. In this talk\, I will present our recent efforts to decipher and improve these nonlinear embedding approaches. 
Our key results include a rigorous theoretical framework that uncovers the intrinsic mechanisms\, large-sample limits\, and fundamental principles underlying these algorithms; a set of theory-informed practical guidelines for their principled use in trustworthy biological discovery; and a collection of new algorithms that address current limitations and improve performance in areas such as bias reduction and stability. Throughout the talk\, I will highlight how these advances not only deepen our theoretical understanding but also open new avenues for scientific discovery.
URL:https://cmsa.fas.harvard.edu/event/geostat_2025/
LOCATION:CMSA 20 Garden Street Cambridge\, Massachusetts 02138 United States
CATEGORIES:Conference
ATTACH;FMTTYPE=image/jpeg:https://cmsa.fas.harvard.edu/media/Geostat.3-scaled.jpg
END:VEVENT
END:VCALENDAR