Spring 2017

Colloquia are scheduled from 2:30pm to 3:30pm on Thursdays unless otherwise noted. Colloquium speakers give 50-minute presentations on various mathematical and statistical topics.

If you would like to meet with a speaker, please contact math@unr.edu to schedule a meeting. To receive email announcements about future talks and events, please subscribe to our email list! Simply send an email to sympa@lists.unr.edu with a blank subject line and the main body text 'subscribe mathstat-announce EmailAddress FirstName LastName'.

We look forward to your participation in our upcoming colloquia!

Colloquium Schedule
Feb 2 Ania Panorska UNR Statistics and Data Science Graduate Program: a Statistician's Perspective
Abstract for Statistics and Data Science Graduate Program: a Statistician's Perspective

As we develop a graduate program in Statistics and Data Science (SDS) at UNR, it is useful to take a good look at the leading SDS programs in the US. Considering the structure of the best SDS programs, and their fit within the university curriculum, provides data about the national standards for such programs. The purpose of this talk is to present the vision, the underlying principles, and the core curriculum of leading SDS programs, including those at UC Berkeley, Stanford, the University of Washington, and UNC Chapel Hill. We will also present how these programs are integrated and placed within the universities they reside in. Further, we will discuss the interdisciplinary nature of the leading SDS programs, including the role they play in educating students from all disciplines present on campus. Finally, we will propose solutions and ideas for our program based on the lessons learnt from the programs we researched. The talk should leave plenty of time for a discussion of the ideas and for making plans for further development of a quality SDS program at UNR.

AB 635
Feb 9 Brandon Levin University of Chicago Congruences between modular forms
Abstract for Congruences between modular forms

The study of congruences between modular forms is a central question in modern number theory which goes back at least to the work of Ramanujan at the beginning of the 20th century. By way of examples, I will first introduce modular forms and their arithmetic counterparts, elliptic curves. In the proof of Fermat's Last Theorem, Wiles together with Taylor introduced a powerful method for constructing congruences between modular forms. I will discuss the relationship between Wiles' work and a 1987 conjecture of Serre about modular forms. Finally, I will describe recent progress on generalizations of these results to higher dimensions.

AB 635
Feb 17 (Fri) Alfred Grant Schissler University of Arizona Gene set analysis of correlated, paired-sample transcriptome data to enable precision medicine
Abstract for Gene set analysis of correlated, paired-sample transcriptome data to enable precision medicine

I will discuss the development of correlation-adjusted analytics of paired-sample transcriptome data. The major emphasis will be on interdisciplinary science, including innovations in single-subject transcriptome (i.e., gene expression data) methodology for precision medicine. Traditional statistical approaches are largely unavailable in this setting due to prohibitive sample size and lack of independent replication. This leads one to rely on informatic devices including knowledgebase integration (e.g., gene set annotations) and external data sources (gene expression warehouses). Common statistical themes include multivariate statistics (such as Mahalanobis distance and copulas) and large-scale significance testing. Briefly, I'll describe two projects that have led to the development of a clinically relevant effect size of gene set (pathway) differential expression, the N-of-1-pathways Mahalanobis distance, and a hypothesis testing procedure that accounts for non-trivial, inter-genetic correlation. Time permitting, I will demonstrate an R implementation of the statistics and visualizations developed on real patient data.

SEM 234
Feb 21 (Tues) Xiang (Shawn) Zhan Fred Hutchinson Cancer Research Center A Small-sample Kernel Independence Test with Application to Microbiome Data
Abstract for A Small-sample Kernel Independence Test with Application to Microbiome Data

The human microbiome, which refers to the full collection of genetic material of all microbes that live in and on the human body, plays an important role in human health and disease. To fully understand the role of the microbiome in human health and disease, researchers are increasingly interested in assessing the relationship between microbiome composition and host genomic data. The dimensionality of the data, as well as the complex relationships between microbiota and host genomic data, poses considerable challenges for analysis. In this talk, I will present a novel statistical method for testing the global association between microbiome community composition data and multiple outcomes of interest. A kernel-based RV (KRV) coefficient has been proposed, which extends the Pearson-type correlation coefficient to capture more complicated relationships. When kernels are appropriately chosen, our KRV coefficient can also measure the statistical independence between two random vectors. Moreover, to accommodate the relatively modest sample sizes in most current microbiome studies, we study the finite-sample distribution of the KRV coefficient and implement a special test design to improve its small-sample performance. The KRV test is demonstrated with simulation studies and a real data application.

CFA 153
Feb 24 (Fri) Sunyoung Shin University of Wisconsin-Madison Annotation Regression for Genome-Wide Association Studies
Abstract for Annotation Regression for Genome-Wide Association Studies

Although genome-wide association studies (GWAS) have been successful at identifying many disease-associated genetic variants, these studies are hampered by two obstacles. First, despite ever-increasing sample sizes, these studies are still underpowered for variants with weak effect sizes. Second, and more importantly, a large percentage of identified variants reside in non-coding regions, making them difficult to interpret. In this talk, I will propose a general regression framework that uses functional annotation data to address these challenges. The annotation regression framework for GWAS (ARoG) is based on a finite mixture of linear regression models in which GWAS association measures are viewed as responses and functional annotations as predictors. This mixture framework addresses the heterogeneity of effects of genetic variants by grouping them into clusters, and the high dimensionality of the functional annotations by enabling annotation selection within each cluster. The framework will be illustrated with computational experiments and analyses of schizophrenia data from the Psychiatric Genomics Consortium. I will also discuss an extension of ARoG to multiple phenotypes (multiARoG), which jointly borrows information across phenotypes.

SEM 234
Mar 2 Nilabja Guha Texas A&M Bayesian approaches in inverse problems and uncertainty quantification
Abstract for Bayesian approaches in inverse problems and uncertainty quantification

Predictions related to physical systems governed by complex mathematical models depend on underlying model parameters. For example, prediction of oil production is strongly influenced by subsurface properties, such as permeability, porosity and other spatial fields. These spatial fields may be highly heterogeneous and vary over a rich hierarchy of scales. Given the observations from the system (possibly contaminated with errors), inference on the underlying parameter and its uncertainty constitutes the uncertainty quantification of the inverse problem. The inverse problem may be ill-posed. Bayesian methodology provides a natural framework for such problems by imposing regularization through the prior distribution. Solution procedures use Markov Chain Monte Carlo (MCMC) or related methodology, where, for each of the proposed parameter values, we solve the underlying forward problem. The solution requires finite element or finite volume techniques. Because of the high computational cost of evaluating the forward models, it is important to develop fast, scalable, and efficient methodology without sacrificing accuracy.
We focus on various inverse problems and uncertainty quantification techniques. We develop an inverse problem characterization and uncertainty quantification approach for the heat equation under asymmetric, skewed error. We then consider the flow equation and pressure data, where estimation of the underlying high-dimensional permeability field is of main interest. Based on separable decomposition, we propose a novel MCMC method. Along with MCMC, we approximate the posterior by variational approximation. The convergence of the posterior solution and its approximation is also established.

AB 635
Mar 3 (Fri) Yinghan Chen University of Illinois at Urbana-Champaign Network Motif Detection and Q-matrix Estimation in Cognitive Diagnosis
Abstract for Network Motif Detection and Q-matrix Estimation in Cognitive Diagnosis

Network motifs are substructures that appear significantly more often in a given network than in random networks. Motif detection is crucial for discovering new characteristics in biological, developmental, and social networks. I will present a novel sequential importance sampling strategy to estimate subgraph frequencies and detect network motifs. The method is developed by sampling subgraphs sequentially node by node using a carefully chosen proposal distribution. The method generates subgraphs from a distribution close to uniform and performs better than competing methods. I will apply the method to real networks to demonstrate its performance.

Cognitive diagnosis models are partially ordered latent class models and are used to classify students into skill mastery profiles. The deterministic inputs, noisy "AND" gate (DINA) model is a popular psychometric model for cognitive diagnosis. Application of the DINA model requires content expert knowledge of a Q matrix, which maps each test item to its required attributes or skills. I will propose a Bayesian framework for estimating the DINA Q matrix. The proposed algorithms ensure that the estimated Q matrices always satisfy the identifiability constraints. I will present Monte Carlo simulations to support the accuracy of parameter recovery and apply our algorithms to Tatsuoka's fraction-subtraction dataset.
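
As a rough illustration of how a Q matrix enters the DINA model, the sketch below computes response probabilities for one student. It shows only the standard DINA response rule, not the Bayesian Q-matrix estimation proposed in the talk. The matrix Q, the skill profile alpha, and the slip and guess values are made-up assumptions, and the function names are hypothetical.

```python
def ideal_response(alpha, q_row):
    """Return 1 if the student has mastered every skill the item requires, else 0.

    alpha : list of 0/1 skill mastery indicators for one student
    q_row : list of 0/1 entries, the row of the Q matrix for one item
    """
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def p_correct(alpha, q_row, slip, guess):
    """DINA probability of a correct answer.

    A student with all required skills answers correctly unless they slip;
    a student missing any required skill answers correctly only by guessing.
    """
    return 1.0 - slip if ideal_response(alpha, q_row) == 1 else guess


# Hypothetical Q matrix: 3 items (rows) by 3 skills (columns).
Q = [
    [1, 0, 0],  # item 1 requires skill 1
    [1, 1, 0],  # item 2 requires skills 1 and 2
    [0, 1, 1],  # item 3 requires skills 2 and 3
]

alpha = [1, 1, 0]  # this student has mastered skills 1 and 2 only

# Illustrative slip and guess parameters, shared across items for simplicity.
probs = [p_correct(alpha, row, slip=0.1, guess=0.2) for row in Q]
print(probs)
```

The student can answer items 1 and 2 from mastered skills, so those probabilities are one minus the slip rate; item 3 needs an unmastered skill, so its probability is the guess rate.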

Mar 9 Jonathan Chávez-Casillas University of Calgary Price Dynamics in a Limit Order Book under time-dependent order flow
Abstract for Price Dynamics in a Limit Order Book under time-dependent order flow

In this talk we will introduce Limit Order Books and explain the intricate dynamics between the order flow and the price process. We will discuss some efforts to describe the price dynamics and how they have been generalized to capture some empirical properties observed in the data. We will then introduce a model that considers a time-dependent order flow and, under some assumptions, characterize the price process within this model. We will finish by describing how this model fits some particular data under the given assumptions.

Mar 30 Galkande Premarathna Texas Tech University Classification of protein binding ligands using their structural dispersion
Abstract for Classification of protein binding ligands using their structural dispersion

It is known that a protein's biological function is in some way related to its physical structure. Many researchers have studied this relationship both for the entire backbone structures of proteins and for their binding sites, which are where binding activity occurs. However, despite this research, it remains an open challenge to predict a protein's function from its structure. The main purpose of this research is to gain a better understanding of how structure relates to binding activity and to classify proteins according to function via structural information. We approach the problem using the dataset compiled by Kahraman et al. (2007) and the extended Kahraman dataset. For each binding site, we calculated covariance matrices of the site's coordinates, using the distance of each atom to the center of mass and the distances from each atom to the first, second, and third principal axes. Then, we performed classification on these matrices using a variety of techniques, including nearest neighbor. Finally, we compared the performance of this model-based technique with alignment-based techniques.

AB 635
Mar 31 Hansapani Rodrigo University of South Florida Bayesian Artificial Intelligence Neural Networks in Modeling Nonlinear Poisson Regression
Abstract for Bayesian Artificial Intelligence Neural Networks in Modeling Nonlinear Poisson Regression

Inspired by biological neural systems, artificial neural network (ANN) models are efficiently used for nonlinear modeling. It has been shown that the Bayesian treatment of the ANN provides better prediction accuracy in regression modeling, as it avoids the network overfitting associated with the maximum likelihood approach. Moreover, the Bayesian treatment can be used to identify the relative importance of predictor variables and to determine the effective model complexity using the limited amount of data at hand. By incorporating these Bayesian treatments, we have developed a novel nonlinear Poisson regression model using ANN, assuming that the log of the expected value of the count responses is nonlinearly related to the predictors. The prediction accuracy of our proposed Poisson regression model has been evaluated using a simulation study. We plan to obtain survival predictions for lung cancer patients by extending our ANN model to create a piecewise constant hazard model.

DMSC 103
Apr 20 S. Rao Jammalamadaka UC Santa Barbara Gaps between Observations - What can one learn from them?
Abstract for Gaps between Observations - What can one learn from them?

This talk will provide an overview of some of the main ideas in the theory of spacings, i.e. the gaps between successive observations. After reviewing some basic properties of spacings, their use in testing statistical hypotheses and in estimating parameters will be discussed. Two-sample tests based on "spacings-frequencies" and their relationships to locally most powerful rank tests will be explored, as will some possible extensions to observations in higher dimensions.
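
To make the idea of spacings concrete, here is a minimal sketch that computes the gaps between successive ordered observations on [0, 1] and uses them in one classical spacings-based statistic, Greenwood's sum of squared spacings. The choice of this statistic, the simulated samples, and the function names are assumptions made for illustration; the talk may use different statistics.

```python
import random


def spacings(sample):
    """Gaps between successive order statistics of a sample on [0, 1].

    The endpoints 0 and 1 are included, so n observations give n + 1 gaps
    that sum to 1.
    """
    points = [0.0] + sorted(sample) + [1.0]
    return [b - a for a, b in zip(points, points[1:])]


def greenwood_statistic(sample):
    """Sum of squared spacings.

    Evenly spread points give small values (the minimum is 1 / (n + 1));
    clustered points leave large gaps and give larger values.
    """
    return sum(d * d for d in spacings(sample))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data (illustrative): one uniform sample, one tightly clustered sample.
uniform_sample = [rng.random() for _ in range(50)]
clustered_sample = [0.5 + 0.01 * rng.random() for _ in range(50)]

g_uniform = greenwood_statistic(uniform_sample)
g_clustered = greenwood_statistic(clustered_sample)
print(g_uniform, g_clustered)
```

A large value of the statistic relative to what uniform samples produce is evidence against the hypothesis that the observations are uniformly distributed.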
About the speaker: After obtaining a Ph.D. at the Indian Statistical Institute, Kolkata in 1969, the speaker taught at Indiana University and the University of Wisconsin before settling down at the University of California, Santa Barbara in 1976, where he is now a Distinguished Professor. He is a Fellow of the American Statistical Association and the Institute of Mathematical Statistics, among others, and recently received an honorary doctorate from Sweden. More details at: http://www.pstat.ucsb.edu/faculty/jammalam/

AB 635
May 12 Beau Smith UNR (Thesis Defense) Symmetry-breaking Perturbations on the Global Attractor of the Kuramoto-Sivashinsky Equation
Abstract for Symmetry-breaking Perturbations on the Global Attractor of the Kuramoto-Sivashinsky Equation

We study symmetry-breaking of solutions on the global attractor of the Kuramoto-Sivashinsky equation. In our theory we prove that trajectories which result from small perturbations of a point on the global attractor stay close to the global attractor. In our numerics we exhibit a choice of parameters for the Kuramoto-Sivashinsky equation such that every 2π-periodic initial condition (which is neither zero nor periodic on some smaller domain) converges to a traveling wave solution and such that every 4π-periodic initial condition converges to a distinctly different fixed point. Our main result is to compute a non-recurrent trajectory on the attractor, connecting the traveling wave to the fixed point, given as the limit of smaller and smaller symmetry-breaking perturbations.

DMSC 102 (9:30am) Olson
May 23 Charles Amponsah UNR (Thesis Defense) Mixture Gamma Discrete Pareto Distributions
Abstract for Mixture Gamma Discrete Pareto Distributions

We study a four-parameter generalization of the bivariate exponential geometric (BEG) law (Kozubowski and Panorska, 2005) and the bivariate gamma geometric (BGG) law (Barreto-Souza, 2012). The new distribution is referred to as the mixture gamma discrete Pareto (MGDP) law. The bivariate random vector (X, N) follows the MGDP law if N has a discrete Pareto distribution of type II and X is the sum of N i.i.d. gamma random variables that are independent of N. Our results include conditional and marginal distributions, joint integral transforms, the Laplace transform, and the covariance matrix. We also study parameter estimation by maximum likelihood and carry out simulation studies using MGDP distributions.

AB 108 (4pm) Kozubowski