BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CMSA - ECPv6.16.3//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:CMSA
X-ORIGINAL-URL:https://cmsa.fas.harvard.edu
X-WR-CALDESC:Events for CMSA
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20210314T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20211107T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20220826T090000
DTEND;TZID=America/New_York:20220826T130000
DTSTAMP:20260616T145100Z
CREATED:20230705T044827Z
LAST-MODIFIED:20250328T145239Z
UID:10000058-1661504400-1661518800@cmsa.fas.harvard.edu
SUMMARY:Big Data Conference 2022
DESCRIPTION:On August 26\, 2022\, the CMSA hosted our eighth annual Conference on Big Data. The Big Data Conference features speakers from the Harvard community as well as scholars from across the globe\, with talks focusing on computer science\, statistics\, math and physics\, and economics.\n\nThe 2022 Big Data Conference took place virtually on Zoom.\n\nOrganizers:\n\nScott Duke Kominers\, MBA Class of 1960 Associate Professor\, Harvard Business\nHorng-Tzer Yau\, Professor of Mathematics\, Harvard University\nSergiy Verstyuk\, CMSA\, Harvard University\n\nSpeakers:\n\nXiaohong Chen\, Yale\nMiles Cranmer\, Princeton\nJessica Jeffers\, University of Chicago\nDan Roberts\, MIT\n\nSchedule\n\n9:00 am\nConference Organizers\nIntroduction and Welcome\n\n9:10 am – 9:55 am\nXiaohong Chen\nTitle: On ANN optimal estimation and inference for policy functionals of nonparametric conditional moment restrictions\nAbstract: Many causal/policy parameters of interest are expectation functionals of unknown infinite-dimensional structural functions identified via conditional moment restrictions. Artificial Neural Networks (ANNs) can be viewed as nonlinear sieves that can approximate complex functions of high-dimensional covariates more effectively than linear sieves. In this talk we present ANN optimal estimation and inference on policy functionals\, such as average elasticities or value functions\, of unknown structural functions of endogenous covariates. We provide ANN efficient estimation and optimal t-based confidence intervals for regular policy functionals such as average derivatives in nonparametric instrumental variables regressions. We also present ANN quasi-likelihood-ratio-based inference for possibly irregular policy functionals of general nonparametric conditional moment restrictions (such as quantile instrumental variables models or Bellman equations) for time series data. We conduct intensive Monte Carlo studies to investigate computational issues with ANN-based optimal estimation and inference in economic structural models with endogeneity. For economic data sets that do not have very high signal-to-noise ratios\, there are currently gaps between the theoretical advantages of ANN approximation and inferential performance in finite samples.\nSome of the results are applied to efficient estimation and optimal inference for average price elasticity in consumer demand and BLP-type demand.\nThe talk is based on two co-authored papers:\n(1) Efficient Estimation of Average Derivatives in NPIV Models: Simulation Comparisons of Neural Network Estimators\n(Authors: Jiafeng Chen\, Xiaohong Chen and Elie Tamer)\nhttps://arxiv.org/abs/2110.06763\n(2) Neural network Inference on Nonparametric conditional moment restrictions with weakly dependent data\n(Authors: Xiaohong Chen\, Yuan Liao and Weichen Wang).\n\n10:00 am – 10:45 am\nJessica Jeffers\nTitle: Labor Reactions to Credit Deterioration: Evidence from LinkedIn Activity\nAbstract: We analyze worker reactions to their firms’ credit deterioration. Using weekly networking activity on LinkedIn\, we show workers initiate more connections immediately following a negative credit event\, even at firms far from bankruptcy. Our results suggest that workers are driven by concerns about both unemployment and future prospects at their firm. Heightened networking activity is associated with contemporaneous and future departures\, especially at financially healthy firms. Other negative events like missed earnings and equity downgrades do not trigger similar reactions. Overall\, our results indicate that the build-up of connections triggered by credit deterioration represents a source of fragility for firms.\n\n10:50 am – 11:35 am\nMiles Cranmer\nTitle: Interpretable Machine Learning for Physics\nAbstract: Would Kepler have discovered his laws if machine learning had been around in 1609? Or would he have been satisfied with the accuracy of some black box regression model\, leaving Newton without the inspiration to discover the law of gravitation? In this talk I will explore the compatibility of industry-oriented machine learning algorithms with discovery in the natural sciences. I will describe recent approaches developed with collaborators for addressing this\, based on a strategy of “translating” neural networks into symbolic models via evolutionary algorithms. I will discuss the inner workings of the open-source symbolic regression library PySR (github.com/MilesCranmer/PySR)\, which forms a central part of this interpretable learning toolkit. Finally\, I will present examples of how these methods have been used in the past two years in scientific discovery\, and outline some current efforts.\n\n11:40 am – 12:25 pm\nDan Roberts\nTitle: A Statistical Model of Neural Scaling Laws\nAbstract: Large language models with a huge number of parameters\, trained on a near internet-sized number of tokens\, have been empirically shown to obey “neural scaling laws”\, under which their performance behaves predictably as a power law in either parameters or dataset size until bottlenecked by the other resource. To understand this better\, we first identify the necessary properties allowing such scaling laws to arise and then propose a statistical model\, a joint generative data model and random feature model\, that captures this neural scaling phenomenology. By solving this model using tools from random matrix theory\, we gain insight into (i) the statistical structure of datasets and tasks that lead to scaling laws\, (ii) how nonlinear feature maps\, i.e.\, the role played by the deep neural network\, enable scaling laws when trained on these datasets\, and (iii) how such scaling laws can break down\, and what their behavior is when they do. A key feature is the manner in which the power laws that occur in the statistics of natural datasets are translated into power law scalings of the test loss\, and how the finite extent of such power laws leads to both bottlenecks and breakdowns.\n\n12:30 pm\nConference Organizers\nClosing Remarks
URL:https://cmsa.fas.harvard.edu/event/big-data-conference-2022/
LOCATION:Virtual
CATEGORIES:Big Data Conference,Conference,Event
ATTACH;FMTTYPE=image/png:https://cmsa.fas.harvard.edu/media/Big-Data-2022_web.png
END:VEVENT
END:VCALENDAR