Topic: Statistical Analysis of Big, Complex and Inhomogeneous Data
Speaker: Professor Haonan Wang, Colorado State University
Hosts: Center for Statistical Research, School of Statistics, Office of Scientific Research
Haonan Wang received his Ph.D. in statistics from the University of North Carolina at Chapel Hill in 2003. He is currently a Professor of Statistics at Colorado State University. His research interests include object-oriented data analysis, functional dynamic modeling of neuron activities, and spatial statistics.
In this talk, we consider two types of data from neuroscience: neuromorphology data and neuron activity data. First, we focus on data extracted from brain neuron cells of rodents and model each neuron as a data object with topological and geometric properties characterizing its branching structure, connectedness and orientation. We define the notions of topological and geometric medians, as well as quantiles, based on newly developed curve representations. In addition, we take a novel approach to defining Pareto medians and quantiles through a multi-objective optimization problem. In particular, we study two objective functions, which measure the topological variation and the geometric variation, respectively. Analytical solutions are provided for the topological and geometric medians and quantiles; for Pareto medians and quantiles in general, a genetic algorithm is implemented. The proposed methods are demonstrated in a simulation study and applied to a real data set of pyramidal neurons from the hippocampus.
Next, we model neuron spiking activity through nonlinear dynamical systems. We adapt the Volterra series expansion of an analytic function to account for the point-process nature of multiple inputs and a single output (MISO) in a neural ensemble. Our model describes the transformed spiking probability for the output as the sum of kernel-weighted integrals of the inputs. The kernel functions need to be identified and estimated, and both local sparsity (kernel functions may be zero on part of their support) and global sparsity (some kernel functions may be identically zero) are of interest. The kernel functions are approximated by B-splines, and a penalized likelihood-based approach is proposed for estimation.
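As a rough illustration of the model just described, the first-order part of such a MISO model might be written as follows. The notation is assumed here and is not taken from the talk: $g$ is a link function, $J$ the number of inputs, $M$ the memory length, $x_j$ the $j$-th input spike train, $y$ the output, and $B_l$ the B-spline basis functions with coefficients $\beta_{jl}$. Higher-order Volterra terms are omitted.

```latex
g\bigl(P\{y(t) = 1 \mid x_1, \dots, x_J\}\bigr)
  = k_0 + \sum_{j=1}^{J} \int_0^{M} k_j(\tau)\, x_j(t - \tau)\, d\tau,
\qquad
k_j(\tau) \approx \sum_{l=1}^{L} \beta_{jl}\, B_l(\tau).
```

In this notation, global sparsity would correspond to $\beta_{j1} = \dots = \beta_{jL} = 0$ for some input $j$, and local sparsity to $k_j(\tau) = 0$ on a subinterval of $[0, M]$.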
Even for moderately complex brain functionality, the identification and estimation of this sparse functional dynamical model pose major computational challenges, which we address with big data techniques that can be implemented on a single, multi-core server. The performance of the proposed method is demonstrated using neural recordings from the hippocampus of a rat during open field tasks. This is joint work with Dr. Sienkiewicz, Professor Breidt and Professor Song.
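For the first part of the talk, the idea of a Pareto median can be illustrated with a minimal Python sketch. It rests on strong simplifying assumptions that are not from the talk: candidates are restricted to the observed sample, the search is brute force rather than a genetic algorithm, and the two distance functions are toy stand-ins for the topological and geometric variation measures. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Toy neuron summary: (number of branches, total length). Purely illustrative.
Neuron = Tuple[int, float]


def total_distance(candidate: Neuron, sample: Sequence[Neuron],
                   dist: Callable[[Neuron, Neuron], float]) -> float:
    """Sum of distances from one candidate to every object in the sample."""
    return sum(dist(candidate, other) for other in sample)


def pareto_medians(sample: Sequence[Neuron],
                   topo_dist: Callable[[Neuron, Neuron], float],
                   geom_dist: Callable[[Neuron, Neuron], float]) -> List[int]:
    """Indices of sample points not dominated in both objectives at once."""
    scores = [(total_distance(c, sample, topo_dist),
               total_distance(c, sample, geom_dist)) for c in sample]
    front = []
    for i, (ti, gi) in enumerate(scores):
        # j dominates i if it is no worse in both objectives and better in one.
        dominated = any(
            tj <= ti and gj <= gi and (tj < ti or gj < gi)
            for j, (tj, gj) in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Assumed stand-in distances: difference in branch count and in total length.
def topo(a: Neuron, b: Neuron) -> float:
    return abs(a[0] - b[0])


def geom(a: Neuron, b: Neuron) -> float:
    return abs(a[1] - b[1])


sample = [(1, 5.0), (2, 9.0), (3, 1.0), (9, 4.0)]
front = pareto_medians(sample, topo, geom)
print(front)  # [0, 2]
```

In this toy sample, the first and third objects are the Pareto-optimal candidates: neither can be improved in one objective without getting worse in the other.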