## Upcoming Events for the Statistics Seminar

## Tue Oct 23, 2018

## ACMS Statistics Seminar: Julie Bessac

### 3:30 PM - 4:30 PM

154 Hurley Hall

Julie Bessac

Argonne National Lab


**1. Stochastic Simulation of Predictive Space-Time Scenarios of Wind Speed Using Observations and Physical Model Outputs**

**and**

**2. Stochastic Parameterization of Subgrid-Scale Velocity Enhancement of Sea Surface Fluxes**

We propose a statistical space-time model for predicting atmospheric wind speed based on deterministic numerical weather predictions (NWP) and historical measurements. We consider a Gaussian multivariate space-time framework that combines multiple sources of past physical model outputs and measurements along with NWP model predictions in order to produce a probabilistic wind speed forecast within the prediction window. The process is expressed hierarchically in order to facilitate the specification of cross-variances between the two datasets. We illustrate this strategy with wind speed forecasts over several months in 2012 for a region near the Great Lakes in the United States.…
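To make the idea concrete, here is a minimal sketch of the core mechanism the abstract describes: treating a deterministic NWP forecast as a mean function and modeling the observation residuals with a Gaussian process to obtain a probabilistic forecast. Everything here (the toy data, the squared-exponential kernel, all parameter values) is an invented illustration, not the speaker's actual multivariate hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all data here are illustrative, not from the talk):
# "NWP" output is a deterministic forecast; observations deviate from it.
t = np.linspace(0, 10, 40)                # forecast times
nwp = 8 + 2 * np.sin(t)                   # hypothetical NWP wind-speed forecast
obs = nwp + rng.normal(0, 0.5, t.size) + 0.8 * np.cos(t)  # synthetic measurements

# Gaussian process on the NWP residuals with a squared-exponential kernel.
def sq_exp(a, b, ell=1.0, sigma2=1.0):
    d = a[:, None] - b[None, :]
    return sigma2 * np.exp(-0.5 * (d / ell) ** 2)

noise = 0.25
K = sq_exp(t, t) + noise * np.eye(t.size)
t_new = np.linspace(0, 10, 81)            # prediction grid
Ks = sq_exp(t_new, t)

# Predictive mean of wind speed = NWP forecast + GP-smoothed residual.
resid_mean = Ks @ np.linalg.solve(K, obs - nwp)
forecast = 8 + 2 * np.sin(t_new) + resid_mean

# Predictive variance supplies the probabilistic part of the forecast.
Kss = sq_exp(t_new, t_new)
var = np.diag(Kss - Ks @ np.linalg.solve(K, Ks.T))
```

The actual model in the talk is multivariate and hierarchical, with cross-covariances between the physical-model and measurement datasets; this univariate sketch only shows the residual-GP correction step.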

### Posted In: Statistics Seminar

## Tue Nov 13, 2018

## ACMS Statistics Seminar: Mengyang Gu

### 3:30 PM - 4:30 PM

154 Hurley Hall

Mengyang Gu

Johns Hopkins University


**A Theoretical Framework Of The Scaled Gaussian Stochastic Process In Prediction and Calibration**

The Gaussian stochastic process (GaSP) is a useful technique for predicting nonlinear functional outcomes. The estimated mean function in a GaSP, however, can be far from the reality in terms of the L2 distance. This problem has been widely observed in calibrating imperfect mathematical models using experimental data, when the discrepancy function is modeled as a GaSP. In this work, we study the theoretical properties of the scaled Gaussian stochastic process (S-GaSP), a new stochastic process that addresses the identifiability problem of the mean function in the GaSP model. We establish an explicit connection between the GaSP and S-GaSP through an orthogonal series representation. We show that the predictive mean estimator in the S-GaSP calibration model converges to the reality at the same rate as the GaSP with a suitable choice of the regularization and scaling parameters. We also show that the calibrated mathematical model in the S-GaSP calibration converges to the one that minimizes the L2 loss between the reality and the mathematical model with the same regularization and scaling parameters, whereas the GaSP model does not have this property. From the regularization perspective, the loss function from the S-GaSP calibration penalizes the native norm and the L2 norm of the discrepancy function simultaneously, whereas the one from the GaSP calibration penalizes only the native norm of the discrepancy function. The predictive error from the S-GaSP matches the theoretical bound well. Simulated data and real data from calibrating the geophysical model of the Kilauea Volcano will be presented to assess the performance of the studied approaches. Both the GaSP and S-GaSP calibration models are implemented in the “RobustCalibration” R package on CRAN.…
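The identifiability issue the abstract targets can be sketched on a toy calibration problem. The code below (an invented example, not the Kilauea model and not the RobustCalibration package) computes the L2-optimal calibration parameter — the target that S-GaSP calibration is shown to recover — and a profile residual under a GP-modeled discrepancy, which is nearly flat in the parameter because the flexible discrepancy can absorb part of the model term.

```python
import numpy as np

# Illustrative calibration problem: the "reality" is an unknown function
# observed with noise, the mathematical model is f(x, theta) = theta * x,
# and we seek the theta that minimizes the L2 loss to reality.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
reality = 1.5 * x + 0.3 * np.sin(4 * x)   # truth = model term + discrepancy
y = reality + rng.normal(0, 0.05, x.size)

# L2 calibration: theta minimizing sum((y - theta*x)^2), i.e. ordinary
# least squares over the design points.
theta_l2 = float(x @ y / (x @ x))

# A GaSP-style calibration instead models the discrepancy d(x) = y - theta*x
# as a mean-zero GP; the smoothed residual norm below is a crude profile
# criterion showing how weakly theta is pinned down when the discrepancy
# is this flexible (the identifiability problem the talk addresses).
def sq_exp(a, b, ell=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

K = sq_exp(x, x) + 1e-4 * np.eye(x.size)  # kernel matrix with jitter

def gp_resid_norm(theta):
    d = y - theta * x                     # discrepancy at this theta
    smooth = sq_exp(x, x) @ np.linalg.solve(K, d)
    return float(np.linalg.norm(d - smooth))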

### Posted In: Statistics Seminar

## Tue Nov 20, 2018

## ACMS Statistics Seminar: Min Yang

### 3:30 PM - 4:30 PM

101A Crowley Hall

Min Yang

University of Illinois at Chicago


**On Data Reduction of Big Data**

The big data paradigm has drawn a significant amount of attention in recent years as the costs of acquiring and storing data have plummeted. Instead, the bottleneck has shifted to fast, in-depth analysis. This shift has created its own set of problems, the most obvious being that large datasets are often computationally expensive to process: proven statistical methods are no longer applicable to extraordinarily large data sets due to computational limitations. A critical step in big data analysis is data reduction. In this presentation, I will review some existing approaches to data reduction and introduce a new strategy called information-based optimal subdata selection (IBOSS). Under linear and nonlinear model setups, theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to other approaches in terms of parameter estimation and predictive performance. The tradeoff between accuracy and computation cost is also investigated. When models are mis-specified, the performance of different data reduction methods is compared through simulation studies. Some ongoing research work as well as some open questions will also be discussed.…
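A hedged sketch of the IBOSS idea for linear regression, following the D-optimality motivation usually given for this method rather than the speaker's own code: for each covariate, keep the rows with the most extreme (smallest and largest) values, then fit the model on that small subdata only. All sizes and data below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100_000, 5, 1000          # full sample size, covariates, subdata budget
X = rng.normal(size=(n, p))
beta = np.arange(1, p + 1, dtype=float)
y = X @ beta + rng.normal(size=n)   # synthetic linear-model responses

r = k // (2 * p)                    # rows kept per covariate extreme
idx = set()
for j in range(p):
    order = np.argsort(X[:, j])
    idx.update(order[:r])           # r smallest values of covariate j
    idx.update(order[-r:])          # r largest values of covariate j
idx = np.fromiter(idx, dtype=int)

# Fit OLS on the selected subdata (~k rows) instead of all n rows.
Xs, ys = X[idx], y[idx]
beta_hat, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
```

Because the selected rows have high leverage, the subdata estimator stays accurate at a fraction of the computational cost of the full-data fit; the talk's theoretical results make this precise for linear and nonlinear setups.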