BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:CMSA
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20240903T090000
DTEND;TZID=America/New_York:20241101T170000
DTSTAMP:20260418T232944Z
CREATED:20240105T033600Z
LAST-MODIFIED:20250305T175957Z
UID:10001112-1725354000-1730480400@cmsa.fas.harvard.edu
SUMMARY:Mathematics and Machine Learning Program
DESCRIPTION:Mathematics and Machine Learning Program \nDates: September 3–November 1\, 2024 \nLocation: Harvard CMSA\, 20 Garden Street\, Cambridge\, MA 02138 \nMachine learning and AI are increasingly important tools in all fields of research. Recent milestones in machine learning for mathematics include data-driven discovery of theorems in knot theory and representation theory\, the discovery and proof of new singular solutions of the Euler equations\, new counterexamples and lower bounds in graph theory\, and more. Rigorous numerical methods and interactive theorem proving are playing an important part in obtaining these results. Conversely\, much of the spectacular progress in AI has a surprising simplicity at its core. Surely there are remarkable mathematical structures behind this\, yet to be elucidated. \nThe program will begin and end with two week-long workshops\, and will feature focus weeks on number theory\, knot theory\, graph theory\, rigorous numerics in PDE\, and interactive theorem proving\, as well as a course on geometric aspects of deep learning.\n\n  \nSeptember 3–5\, 2024: Opening Workshop: AI for Mathematicians\, with Leon Bottou\, François Charton\, David McAllester\, Adam Wagner and Geordie Williamson. A series of six lectures covering logic and theorem proving\, AI methods\, theory of machine learning\, two lectures on case studies in math-AI\, and a lecture and discussion on open problems and the ethics of AI in science.\nOpening Workshop YouTube Playlist \n\nSeptember 6–7\, 2024: Big Data Conference \n  \nSeptember 9–13\, 2024: Applying Machine Learning to Math\, with François Charton and Geordie Williamson\nPublic Lecture September 12\, 2024: Geordie Williamson\, University of Sydney: Can AI help with hard mathematics? (YouTube link)\nThe focus of this week will be on practical examples and techniques for the mathematics researcher keen to explore or deepen their use of AI techniques. 
We will have talks showcasing easily stated problems\, on which machine learning techniques can be employed profitably. These provide excellent toy examples for generating intuition. We will also have expert talks on some of the technical subtleties which arise. There are several instances where the accepted heuristics emerging from the study of large language models (LLMs) and image recognition don’t appear to apply to mathematics problems\, and we will try to highlight these subtleties.\nApplying Machine Learning to Math YouTube Playlist \n  \nSeptember 16–20\, 2024: Number theory\, with Drew Sutherland\nThe focus of this week will be on the use of ML as a tool for finding and understanding statistical patterns in number-theoretic datasets\, using the recently discovered (and still largely unexplained) “murmurations” in the distribution of Frobenius traces in families of elliptic curves and other arithmetic L-functions as a motivating example.\nNumber Theory YouTube Playlist \n  \nSeptember 23–27\, 2024: Knot theory\, with Sergei Gukov\nKnot theory is a great source of labeled data that can be synthetically generated. Moreover\, many outstanding problems in knot theory and low-dimensional topology can be formulated as decision and classification tasks\, e.g. “Is the knot 123_45 slice?” or “Can two given Kirby diagrams be related by a sequence of Kirby moves?” During this focus week we will explore various ways in which AI can be applied to problems in knot theory and how\, based on these applications\, mathematical reasoning can advance the development of AI algorithms. Another goal will be to develop formal knot theory libraries (e.g. 
contributions to mathlib) and to apply AI models to formal proof systems\, in particular in the context of knot theory.\nKnot Theory YouTube Playlist \n  \nSeptember 30: Teaching and Machine Learning Panel Discussion\, 3:30–5:30 pm ET \n  \nSeptember 30–October 4\, 2024: Graph theory and combinatorics\, with Adam Wagner\nThis week\, we will consider how machine learning can help us solve problems in combinatorics and graph theory\, broadly interpreted\, in practice. The advantage of these fields is that they deal with finite objects that are simple to set up using computers\, and programs that work for one problem can often be adapted to work for several other related problems as well. Often\, the best constructions for a problem are easy to interpret\, making it simpler to judge how well a particular algorithm is performing. On the other hand\, there are many open conjectures that are simple to state\, for which the best-known constructions are counterintuitive\, making it perhaps more likely that machine learning methods can spot patterns that are difficult to understand otherwise.\nGraph Theory and Combinatorics YouTube Playlist \n  \nOctober 7–11\, 2024: More number theory\, with Drew Sutherland\nThe focus of this week will be on the use of AI as a tool to search for and/or construct interesting or extremal examples in number theory and arithmetic geometry\, using LLM-based genetic algorithms\, generative adversarial networks\, game-theoretic methods\, and heuristic tree pruning as alternatives to conventional local search strategies.\nMore Number Theory YouTube Playlist \n  \nOctober 14–18\, 2024: Interactive theorem proving\nThis week we will discuss the use of interactive theorem proving systems such as Lean\, Coq and Isabelle in mathematical research\, and AI systems which prove theorems and translate between informal and formal mathematics.\nInteractive Theorem Proving YouTube Playlist \n  \nOctober 21–25\, 2024: Numerical Partial Differential 
Equations (PDE)\, with Tristan Buckmaster and Javier Gomez-Serrano\nThe focus of this week will be on constructing solutions to partial differential equations and dynamical systems (finite and infinite dimensional) more broadly defined. We will discuss several toy problems and comment on issues like sampling strategies\, optimization algorithms\, ill-posedness\, or convergence. We will also outline strategies for further developing machine-learning findings and turning them into mathematical theorems via computer-assisted approaches.\nNumerical PDEs YouTube Playlist \n  \nOctober 28–Nov. 1\, 2024: Closing Workshop: The closing workshop will provide a forum for discussing the most current research in these areas\, including work in progress and recent results from program participants.\nMath and Machine Learning Closing Workshop YouTube Playlist \n  \nSeptember 3–Nov. 1: Graduate topics in deep learning theory (Boston College) taught by Eli Grigsby\, held at the CMSA Tuesdays and Thursdays 2:30–3:45 pm Eastern Time. Course website (link).\nGraduate Topics in Deep Learning YouTube Playlist \nCourse description: This is a course on geometric aspects of deep learning theory. Broadly speaking\, we’ll investigate the question: How might human-interpretable concepts be expressed in the geometry of their data encodings\, and how does this geometry interact with the computational units and higher-level algebraic structures in various parameterized function classes\, especially neural network classes? From Sep. 3–Nov. 1\, the course will be presented as part of the Math and Machine Learning program at the CMSA in Cambridge. During that portion\, we will focus on the current state of research on mechanistic interpretability of transformers\, the architecture underlying large language models like ChatGPT. \n\nPrerequisites: This course is targeted to graduate students and advanced undergraduates in mathematics and theoretical computer science. 
No prior background in machine learning or learning theory will be assumed\, but I will assume a degree of mathematical maturity (at the level of\, say\, the standard undergraduate math curriculum plus a first-year graduate geometry/topology sequence).\n\nProgram Organizers \n\nFrançois Charton (Meta AI)\nMichael R. Douglas (Harvard CMSA)\nMichael Freedman (Harvard CMSA)\nFabian Ruehle (Northeastern)\nGeordie Williamson (Univ. of Sydney)\n\n\nProgram Schedule \nMonday\n10:30 am–noon\nOpen Discussion\nRoom G10 \n12:00–1:30 pm\nGroup lunch\nCMSA Common Room \nTuesday\n2:30–3:45 pm\nTopics in deep learning theory\nRoom G10 \n4:00–5:00 pm\nOpen Discussion/Tea\nCMSA Common Room \nWednesday\n10:30 am–12:00 pm\nOpen Discussion\nRoom G10 \n2:00–3:00 pm\nNew Technologies in Mathematics Seminar\nRoom G10 \nThursday\n2:30–3:45 pm\nTopics in deep learning theory\nRoom G10 \nFriday\n10:30 am–12:00 pm\nOpen Discussion\nRoom G10 \n\nHarvard CMSA thanks Mistral AI for a generous donation of computing credit.
URL:https://cmsa.fas.harvard.edu/event/mml2024/
LOCATION:CMSA Room G10\, CMSA\, 20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:Event,Programs
ATTACH;FMTTYPE=image/jpeg:https://cmsa.fas.harvard.edu/media/Machine-Learning-Program-poster-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20240906T090000
DTEND;TZID=America/New_York:20240907T170000
DTSTAMP:20260418T232944Z
CREATED:20240325T141950Z
LAST-MODIFIED:20250415T154033Z
UID:10003287-1725613200-1725728400@cmsa.fas.harvard.edu
SUMMARY:Big Data Conference 2024
DESCRIPTION:YouTube Playlist \nOn September 6–7\, 2024\, the CMSA hosted the tenth annual Conference on Big Data. The Big Data Conference features speakers from the Harvard community as well as scholars from across the globe\, with talks focusing on computer science\, statistics\, math and physics\, and economics. \nLocation: Harvard University CMSA\, 20 Garden Street\, Cambridge & via Zoom \n  \nSpeakers: \n\nTianxi Cai\, Harvard Chan School\nRaj Chetty\, Harvard\nBianca Dumitrascu\, Columbia\nBoris Hanin\, Princeton\nPeter Hull\, Brown\nJamie Morgenstern\, U Washington\nKavita Ramanan\, Brown\nNeil Thompson\, MIT\nMelanie Weber\, Harvard\nKun-Hsing Yu\, Harvard Medical School\n\nOrganizers: \n\nRediet Abebe\, Harvard Society of Fellows\nMorgane Austern\, Harvard University Statistics\nMichael R. Douglas\, Harvard CMSA\nYannai Gonczarowski\, Harvard University Economics and Computer Science\nSam Kou\, Harvard University Statistics\n\nSCHEDULE (downloadable pdf) \nFriday\, Sep. 6\, 2024 \n9:00 am: Breakfast \n9:30 am: Introductions \n9:45–10:45 am\nSpeaker: Peter Hull\, Brown University\nTitle: Measuring Discrimination in Multi-Phase Systems\, with an Application to Child Protection\nAbstract: Large racial disparities have been documented in many high-stakes settings—such as employment\, health care\, housing\, and criminal justice—raising concerns of discrimination by individual decision-makers. At the same time\, there is growing understanding that a focus on individual decisions can yield an incomplete view of discrimination; an extensive theoretical literature shows how discrimination can arise and compound across multiple decision-makers in interconnected systems. We develop new empirical tools for studying discrimination in such multi-phase systems and apply them to the setting of foster care placement by child protective services. 
Leveraging the quasi-random assignment of two sets of decision-makers—initial hotline call screeners and subsequent investigators—we study how unwarranted racial disparities arise and propagate through this system. Using a sample of over 200\,000 maltreatment allegations\, we find that calls involving Black children are 55% more likely to result in foster care placement than calls involving white children with the same potential for future maltreatment in the home. Call screeners account for up to 19% of this unwarranted disparity\, with the remainder due to investigators. Unwarranted disparity is concentrated in cases with potential for future maltreatment\, suggesting that white children may be harmed by “underplacement” in high-risk situations. \n10:45–11:00 am: Break \n11:00 am –12:00 pm\nSpeaker: Jamie Morgenstern\, U Washington\nTitle: What governs predictive disparity in modern machine learning applications?\nAbstract: The deployment of statistical models in impactful environments is far from new—simple correlations have been used to guide decisions throughout the sciences\, health care\, political campaigns\, and in pricing financial instruments and other products for decades. Many such models\, and the decisions they supported\, were known to have different degrees of predictive power for different demographic groups. These differences had numerous sources\, including: limited expressiveness of the statistical models; limited availability of data from marginalized populations; noisier measurements of both features and targets from certain populations; and features with less mutual information about the prediction target for some populations than others.\nModern decision systems which use machine learning are more ubiquitous than ever\, as are their differences in performance for different populations of people. 
In this talk\, I will discuss some similarities and differences in the sources of differing performance in contemporary ML systems including facial recognition systems and those incorporating generative AI. \n12:00–1:30 pm: Lunch Break \n1:30–2:30 pm\nSpeaker: Kavita Ramanan\, Brown University\nTitle: Understanding High-dimensional Stochastic Dynamics on Realistic Networks\nAbstract: Large collections of randomly evolving particles that interact locally with respect to an underlying network model a variety of phenomena ranging from magnetism\, the spread of diseases\, neural and neuronal networks\, opinion dynamics and load balancing on computer networks. Due to their high-dimensional nature\, these systems are typically intractable to analyze exactly. Classical work\, falling under the rubric of mean-field approximations\, has mostly focused on the case when this interaction graph is dense.  However\, most real-world networks are sparse and often random. We describe a new approach to develop principled approximations for dynamics on realistic networks that beats the curse of dimensionality\, and illustrate its efficacy on a class of epidemiological models. This is based on joint works with Michel Davydov\, Ankan Ganguly and Juniper Cocomello. \n2:30–2:45 pm: Break \n2:45–3:45 pm\nSpeaker: Raj Chetty\, Harvard University\nTitle: The Science of Economic Opportunity: New Insights from Big Data\nAbstract: How can we improve economic opportunities for children growing up in low-income families? This talk will present findings from a recent set of studies that use various sources of big data — ranging from anonymized tax records to social network data — to understand the science of economic opportunity. 
Among other topics\, the talk will discuss how and why children’s chances of climbing the income ladder vary across neighborhoods\, the drivers of racial disparities in economic mobility\, how highly selective colleges may amplify the persistence of privilege\, and the role of social capital as a driver of upward mobility. The talk will conclude by giving examples of how academic research using big data is informing policy decisions from the local to federal level to expand opportunities for all. \n3:45–4:00 pm: Break \n4:00–5:00 pm\nSpeaker: Neil Thompson\, MIT\nTitle: How Algorithmic Progress Is Driving Progress in Big Data and AI\nAbstract: Algorithm improvement is one of the purest forms of innovation: it allows the same computational task to be achieved with far fewer resources by proposing clever new ways to do that computation. In this talk\, I will discuss the work that my lab has done tracking and quantifying progress across decades of algorithm research and practice. As I will show\, this algorithmic progress has often outpaced hardware improvement as the most important driver of progress in Big Data and AI. \n  \nSaturday\, Sep. 7\, 2024 \n9:00 am: Breakfast \n9:30 am: Introductions \n9:45–10:45 am\nSpeaker: Tianxi Cai\, Harvard Chan School\nTitle: Crowdsourcing with Multi-institutional EHR to Improve Reliability of Real World Evidence – Opportunities and Challenges\nAbstract: The wide adoption of electronic health record (EHR) systems has made large clinical datasets available for discovery research. EHR data\, linked with biorepositories\, are a valuable new source for deriving real-world\, data-driven prediction models of disease risk and progression. Yet they also bring analytical difficulties\, especially when aiming to leverage multi-institutional EHR data. Synthesizing information across healthcare systems is challenging due to heterogeneity and privacy. 
Statistical challenges also arise due to high dimensionality in the feature space. In this talk\, I’ll discuss analytical approaches for mining EHR data to improve the reliability and generalizability of the real-world evidence generated from the analyses. These methods will be illustrated using EHR data from Mass General Brigham and the Veterans Health Administration. \n10:45–11:00 am: Break \n11:00 am–12:00 pm\nSpeaker: Bianca Dumitrascu\, Columbia Data Science Institute\nTitle: Statistical machine learning for learning representations of embryonic development\nAbstract: During embryonic development\, single cells read in local information from their environments and use this information to move\, divide and specialize. As a result\, the environments themselves change. However\, it remains unclear how gene expression programs interact with cell morphology and mechanical forces to orchestrate organogenesis in early embryos. Recent advances in single-cell techniques and in toto imaging open unique avenues for exploring this link between genomics and biophysics\, which dynamically maps cells to organisms.\nIn this talk\, I will describe statistical machine learning frameworks aimed at understanding how tissue-level mechanical and morphometric information impacts gene expression patterns in spatio-temporal contexts. We use these tools to understand boundary formation in the early development of mouse embryos and to align data from light sheet recordings of pre-gastrulation development. \n12:00–1:30 pm: Lunch Break \n1:30–2:30 pm\nSpeaker: Melanie Weber\, Harvard Mathematics\nTitle: Data and Model Geometry in Deep Learning\nAbstract: Data with geometric structure is ubiquitous in machine learning. Often such structure arises from fundamental symmetries in the domain\, such as permutation-invariance in graphs and sets\, and translation-invariance in images. In this talk we discuss implications of this structure for the design and complexity of neural networks. 
Equivariant architectures\, which encode symmetries as inductive bias\, have shown great success in applications with geometric data\, but can suffer from instabilities as their depth increases. We propose a new architecture based on unitary group convolutions\, which allows for deeper networks with less instability. In the second part of the talk we discuss the impact of data and model geometry on the learnability of neural networks. We discuss learnability in several geometric settings\, including equivariant neural networks\, as well as learnability with respect to the geometry of the input data manifold. \n2:30–2:45 pm: Break \n2:45–3:45 pm\nSpeaker: Boris Hanin\, Princeton University\nTitle: Scaling Limits of Neural Networks\nAbstract: Neural networks are often studied analytically through scaling limits: regimes in which taking some structural network parameters (e.g. depth\, width\, number of training datapoints\, and so on) to infinity results in simplified models of learning. I will motivate and discuss recent results using several such approaches. I will emphasize both new theoretical insights into how model\, training data\, and optimizer impact learning and their practical implications for hyperparameter transfer. \n3:45–4:00 pm: Break \n4:00–5:00 pm\nSpeaker: Kun-Hsing Yu\, Harvard Medical School\nTitle: Foundation Models for Real-Time Cancer Diagnosis\nAbstract: Artificial intelligence (AI) is transforming the landscape of medical research and practice. Recent advances in microscopic image digitization\, foundation models\, and scalable computing infrastructure have opened new avenues for AI-enhanced cancer diagnosis. In this talk\, I will highlight recent breakthroughs in multi-modal AI systems for cancer pathology evaluation\, discuss integrative biomedical informatics methods that link cell morphology with molecular profiles\, and outline critical challenges in developing robust medical AI systems. 
\n  \n\nInformation about the 2023 Big Data Conference can be found here.
URL:https://cmsa.fas.harvard.edu/event/bigdata_2024/
LOCATION:20 Garden Street\, Cambridge\, MA\, 02138\, United States
CATEGORIES:Big Data Conference,Conference,Event
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/Big-Data-2024_8.5x11-1.png
END:VEVENT
END:VCALENDAR