Seminar - Statistical Analysis in Large High-Dimensional Data with Network Structure

School of Mathematics and Statistics Research Seminar

Speaker: Dr Sandipan Roy
Time: Friday 2nd June 2017 at 01:00 PM - 02:00 PM
Location: Cotton Club, Cotton 350
Groups: "Mathematics" "Statistics and Operations Research"

Abstract

New technological advancements have allowed the collection of datasets of large volume and varying levels of complexity. Many of these datasets have an underlying network structure. Networks can capture dependence relationships among a group of entities, and hence analyzing these datasets unearths the underlying structural dependence among the individuals. Examples include gene regulatory networks, stock markets, protein-protein interactions within the cell, and online social networks.

We present two important aspects of large high-dimensional data with network structure. The first focuses on high-dimensional data with a network structure that evolves over time. Examples of such data sets include time course gene expression data and voting records of legislative bodies. The main task is to estimate the change-point as well as the network structures before and after it. The network structures are obtained by an l1-penalized optimization method, and we establish a finite sample estimation error bound for the change-point in the high-dimensional regime. The other aspect that we examine is parameter estimation in large heterogeneous data with network structure. Our primary goal is to develop efficient computational techniques based on random subsampling and parallelization to estimate the parameters. We show an application of our communication-based parallel algorithm in a stochastic blockmodel with covariates. The performance of the algorithm is evaluated on synthetic data sets and compared with competing methods for blockmodel parameter estimation. We also illustrate the model on data from a Facebook social network enhanced with node covariate information.
