# Statistics & Probability

## Faculty with this focus

- Mihye Ahn
- Yinghan Chen
- Colin Grudzien
- Paul Hurtado
- Tomasz Kozubowski
- Anna Panorska
- Raul Rojas
- Andrey Sarantsev
- Grant Schissler
- Deena Schmidt
- Ilya Zaliapin

## The Science of Data Collection & Analysis

Statistics is the study of the collection, organization, analysis, interpretation, and presentation of data. It covers every stage of this process, including planning data collection through the design of surveys and experiments.

### Mihye Ahn

My research focuses on developing novel methodology to solve statistical problems arising from neuroimaging data, including fMRI, sMRI, DTI, and EEG. Functional neuroimaging data are generally high-dimensional in both space and time. Analyzing these data draws on various statistical topics: time series analysis, dimension reduction, classification, variable selection, longitudinal data analysis, covariance estimation, etc. I am also interested in variable selection methods for repeatedly measured data. I have collaborated with many scientists in various fields, including veterinary science, psychiatry, radiology, neurology, immunology, and biomedical engineering.

### Yinghan Chen

My research interests include Monte Carlo methods, statistical computing, Bayesian analysis, latent class models, item response theory, and longitudinal analysis. Currently, I'm developing sampling algorithms to conduct statistical inference in educational assessments and networks.

### Colin Grudzien

In physical applications, dynamical models and observational data play dual roles in uncertainty quantification, prediction and learning, each representing sources of incomplete and inaccurate information. In data-rich problems, first-principle physical laws constrain the degrees of freedom of massive data sets, bringing our prior insight into complex processes to bear. Conversely, in data-sparse problems, dynamical models fill spatial and temporal gaps in observational networks. However, many physical systems exhibit chaos, and observations are thus required to update predictions where there is sensitivity to initial conditions and uncertainty in model parameters. Data assimilation broadly refers to the techniques used to combine the information from models and observations to produce an optimal estimate of a probability density or a test statistic. These techniques include methods from Bayesian inference, dynamical systems, numerical analysis and optimal control, among others. My research interests lie in this intersection, using dynamical and statistical tools to develop theory for, and study applications of, statistical learning algorithms in physical systems. My application interests include climate, geophysics and the electric grid.

### Paul Hurtado

I use techniques from the fields of dynamical systems, stochastic processes, probability and statistics to develop and analyze mathematical models of biological systems. I use those models to address questions that arise in population ecology, evolution, epidemiology (infectious diseases) and immunology. Recently, I've begun working with methods for fitting nonlinear dynamic models to time series data. Using these models as statistical models presents a number of challenges, as parameter estimators for these models are not guaranteed to be as statistically well-behaved as, for example, estimators for classical linear models. In addition to parameter estimation for dynamic models, I also use approximation methods that exploit the deeper connections between deterministic models and their stochastic counterparts, as these two modeling frameworks can both be useful in applications.

### Tomasz Kozubowski

My main research interests include the theory and applications of stable, geometric stable, and other heavy-tailed random variables and stochastic processes. A stable random variable X has the property of stability: the sum of n independent copies of X has the same type of distribution as X. More general notions of stability include cases where the number of variables n is itself a random variable and/or the variables are combined by operations other than addition. A heavy-tailed random variable is one that has a non-negligible probability of taking a value relatively far from the center of the distribution. I have worked on applications of stable and related distributions in actuarial science, economics, financial mathematics, as well as other areas. My other research interests include computational statistics, characterizations of probability distributions, and stochastic simulation.
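
For readers who want the stability property in symbols, one standard formulation is sketched below. The notation (X_1, ..., X_n for independent copies of X, and normalizing constants a_n > 0 and b_n) is introduced here for illustration only. "Same type" means equal in distribution up to a change of scale and location, and the normal and Cauchy cases are given as familiar examples.

```latex
% Requires amsmath. X_1, ..., X_n are independent copies of X;
% a_n > 0 and b_n are constants that may depend on n.
\begin{align*}
  X_1 + X_2 + \cdots + X_n &\overset{d}{=} a_n X + b_n
    && \text{(stability, for every } n \ge 1\text{)} \\
  a_n = \sqrt{n}, \quad b_n &= (n - \sqrt{n})\,\mu
    && \text{(example: } X \sim N(\mu, \sigma^2)\text{)} \\
  a_n = n, \quad b_n &= 0
    && \text{(example: standard Cauchy } X\text{)}
\end{align*}
```

The Cauchy case also illustrates heavy tails: averaging n independent copies gives back the same distribution, so extreme values do not average out.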

### Anna Panorska

My research interests include probability, statistics, stochastic modeling and interdisciplinary work. In particular, I study the limit theory for random and deterministic sums of random quantities and estimation for heavy-tailed distributions. My stochastic modeling and interdisciplinary work covers finance and insurance, hydrology and water resources, atmospheric science and climate, environmental science and biostatistics. Current research projects include statistical estimation for heavy-tailed hydrology data, climate and hydrological extremes in the US, and clean water issues in Nevada and California.

### Andrey Sarantsev

My research is in stochastic analysis, particularly stochastic differential equations and long-term stability, and in quantitative finance and actuarial science, where I use tools from stochastic analysis and econometrics. I am also interested in other applications of statistics and probability, particularly in biology and ecology.

### Grant Schissler

My research interests are driven by interdisciplinary problems, often in the biomedical domain. Recently, I've helped build statistical informatics tools that allow clinical researchers to interpret molecular data on the scale of individual patients, with the aim of enabling precision medicine. Common themes in these projects include large-scale hypothesis testing, high dimensionality, massively parallel computing, knowledgebase integration, multivariate statistics, Bayesian analysis, and clustering.

### Deena Schmidt

My research is driven by a desire to understand the roles of stochasticity, structure, and evolution in shaping the dynamics of biological systems. I develop and analyze mathematical models, combining methods from probability and statistics, dynamical systems, and random graph theory to shed light on biological issues while generating new mathematical questions. In particular, I study stochastic processes on networks with applications in neuroscience and stochastic models in genetics.

Currently, I am working on optimal reduction techniques for complex ion channel gating models, which can be represented as a stochastic (Markov) process on a graph. I am also looking at the relative contributions of network structure and node dynamics in determining the collective dynamics of a network, thinking specifically about neuronal networks involved in sleep-wake regulation. Lastly, I am broadly interested in mathematical and statistical applications in population and evolutionary genetics.

### Ilya Zaliapin

My research focuses on theoretical and applied statistical analysis of complex (non-linear) dynamical systems, with emphasis on spatio-temporal pattern formation and the development of extreme events. Specifically, I work on multiscale methods of time series analysis, heavy-tailed random processes, and spatial statistics. This choice is motivated by essential properties shared by observed complex systems: they tend to evolve on multiple spatio-temporal scales, and their observables exhibit an absence of characteristic size, long-range correlations in space-time, and a non-negligible probability of assuming extremely large values. The underlying methods of analysis include hierarchical aggregation and its inverse, branching processes.

Examples of observed systems relevant to my research include the Earth's lithosphere, which generates destructive earthquakes; its atmosphere, which produces El Niño events; and stock markets, which are subject to financial crashes. My current applications and ongoing collaborations are in Solid Earth geophysics (seismology, geodynamics), climate dynamics, computational finance, biology, and hydrology.