Upcoming Statistics Colloquia

April 19, 2019: Adric Peccarelli (Northern Illinois University)

2 - 3 p.m. (DU 268): Information Theory and Data Compression

Abstract: This presentation will cover aspects of information theory with regard to data compression. First, information theory will be summarized, including the measure of Shannon entropy. This will allow for the definition of the various forms of Rényi's information dimension, along with special results for both discrete and continuous random variables. From this, the relation between information dimension and the fundamental limits of lossless and lossy data compression will be described, along with how these results are applied in different data settings.

April 26, 2019: Christopher Wikle, PhD (University of Missouri)

2 - 3 p.m. (DU 268): Data in Motion - How Sports Analytics Are Changing the Game

Abstract: Statistics in sports extends well beyond the “stat sheet.” Recently, team sports like soccer, basketball, football, hockey, and baseball have seen a dramatic increase in data collection and monitoring. Exciting new data include GPS tracking on ball movement and player locations, as well as biophysical data of athletes that measure player fitness and fatigue. Together, these data can be used to inform individual and team strategy and to drive tactical decisions and player development. In this talk, I will introduce some of the different types of data that are being collected across different sports. Then, I will discuss how tools developed in animal movement, precision medicine, and artificial intelligence can be adapted to analyze sports data. I will conclude by describing opportunities for research and development in the area of quantitative sports analytics.

4 - 5 p.m. (DU 268): A Parsimonious “Deep” Approach for Efficient Implementation of Multiscale Spatio-Temporal Statistical Models Applied to Long-Lead Forecasting

Abstract: Spatio-temporal data are ubiquitous in engineering and the sciences, and their study is important for understanding and predicting a wide variety of processes. One of the chief difficulties in modeling spatial processes that change with time is the complexity of the dependence structures that must describe how such a process varies, and the presence of high-dimensional complex datasets and large prediction domains. It is particularly challenging to specify parameterizations for nonlinear dynamical spatio-temporal models that are simultaneously useful scientifically and efficient computationally. One potential parsimonious solution to this problem is a method from the dynamical systems and engineering literature referred to as an echo state network (ESN). ESN models use so-called reservoir computing to efficiently compute recurrent neural network (RNN) forecasts. Moreover, so-called "deep" models have recently been shown to be successful at predicting high-dimensional complex nonlinear processes, particularly those with multiple spatial and temporal scales of variability (such as we often find in spatio-temporal geophysical data). Here we introduce a deep ensemble ESN (D-EESN) model and use it within a hierarchical Bayesian framework that naturally accommodates non-Gaussian data types and multiple levels of uncertainty. The methodology is first applied to a data set simulated from a novel non-Gaussian multiscale Lorenz-96 dynamical system simulation model and then to a long-lead United States (U.S.) soil moisture forecasting application.
