Automated Author Profile
Guan, Yongtao
Current S-Index
Sum of Dataset Indices for all datasets
Average Dataset Index per Dataset
Average Dataset Index per dataset
Total Datasets
Total datasets for this author
Average FAIR Score
Average FAIR Score per dataset
Total Citations
Total citations to the author's datasets
Total Mentions
Total mentions of the author's datasets
S-Index Interpretation
The S-Index (Sharing Index) is a comprehensive metric that represents the cumulative impact of all your datasets. It is calculated as the sum of Dataset Index scores across all your claimed datasets.
What it means:
- A higher S-Index indicates greater overall impact of your datasets relative to typical datasets in their fields of research
- The S-Index grows as you add more datasets or as existing datasets gain more citations and mentions
- It provides a single number to track your research data impact over time
Current S-Index: 13.1 (sum of the Dataset Index scores of 14 datasets)
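The calculation described above can be illustrated with a minimal sketch. It assumes only what the page states: the S-Index is the sum of the Dataset Index scores of the claimed datasets. The `Dataset` class, the function names, and the three example scores are hypothetical and chosen for illustration; they are not taken from this profile, and the way each individual Dataset Index is derived is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical record for one claimed dataset."""
    title: str
    dataset_index: float  # per-dataset score; how it is derived is not shown here


def s_index(datasets):
    """S-Index: sum of Dataset Index scores across all claimed datasets."""
    return sum(d.dataset_index for d in datasets)


def average_dataset_index(datasets):
    """Arithmetic mean of the Dataset Index scores (0.0 for an empty list)."""
    return s_index(datasets) / len(datasets) if datasets else 0.0


# Illustrative scores only; these are NOT the values behind this profile.
claimed = [
    Dataset("dataset A", 1.5),
    Dataset("dataset B", 0.5),
    Dataset("dataset C", 1.0),
]

total = s_index(claimed)
average = average_dataset_index(claimed)
print(f"S-Index: {total:.1f} over {len(claimed)} datasets (average {average:.2f})")
```

Under the same assumption that the average is a plain arithmetic mean, an S-Index of 13.1 over 14 datasets would correspond to roughly 0.94 per dataset.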
S-Index Over Time
Cumulative Citations Over Time
Cumulative Mentions Over Time
Datasets
We model spatially and temporally indexed point process data as a multi-level log-Gaussian Cox process where the log intensity function depends on a partially linear single-index structure of spatio-temporal covariates and three latent functional random effects representing the spatial and temporal random effects as well as their interactions. We assume that the latent functional effects are Gaussian processes with Karhunen-Loève representations, and model the unknown link function of the single-index as well as the covariance functions of the latent functional effects as splines. We propose to estimate the partially linear coefficients and the single-index link function using a Poisson maximum likelihood method, and the covariance functions of the latent processes using maximum composite likelihood methods. We also propose approaches to predict the functional principal component scores. Under the multi-level dependence structure and allowing the spatio-temporal covariates to be non-stationary, the proposed estimators follow rather unconventional convergence rates which depend on both the number of locations and the number of repeated measures in time. We illustrate the proposed methods through simulation studies and a real-data application in modeling bike-sharing events.
Authors
- Huang, Kun
- Chen, Xian
- Guan, Yongtao
- Li, Yehua
In this paper, we consider multivariate functional time series with a two-way dependence structure: a serial dependence across time points and a graphical interaction among the multiple functions within each time point. We develop the notion of dynamic weak separability, a more general condition than those assumed in the literature, and use it to characterize the two-way structure in multivariate functional time series. Based on the proposed weak separability, we develop a unified framework for functional graphical models and dynamic principal component analysis, and further extend it to optimally reconstruct signals from contaminated functional data using graphical-level information. We investigate asymptotic properties of the resulting estimators and illustrate the effectiveness of our proposed approach through extensive simulations. We apply our method to hourly air pollution data that were collected from a monitoring network in China.
Authors
- Tan, Jianbin
- Liang, Decai
- Guan, Yongtao
- Huang, Hui
In this work, we study the event occurrences of individuals interacting in a network. To characterize the dynamic interactions among the individuals, we propose a group network Hawkes process (GNHP) model whose network structure is observed and fixed. In particular, we introduce a latent group structure among individuals to account for the heterogeneous user-specific characteristics. A maximum likelihood approach is proposed to simultaneously cluster individuals in the network and estimate model parameters. A fast EM algorithm is subsequently developed by using the branching representation of the proposed GNHP model. Theoretical properties of the resulting estimators of group memberships and model parameters are investigated under both settings when the number of latent groups G is over-specified or correctly specified. A data-driven criterion that can consistently identify the true G under mild conditions is derived. Extensive simulation studies and an application to a dataset collected from Sina Weibo are used to illustrate the effectiveness of the proposed methodology. Supplementary materials for this article are available online.
Authors
- Fang, Guanhua
- Xu, Ganggang
- Xu, Haochen
- Zhu, Xuening
- Guan, Yongtao
Mark-point dependence plays a critical role in research problems that can be fitted into the general framework of marked point processes. In this work, we focus on adjusting for mark-point dependence when estimating the mean and covariance functions of the mark process, given independent replicates of the marked point process. We assume that the mark process is a Gaussian process and the point process is a log-Gaussian Cox process, where the mark-point dependence is generated through the dependence between two latent Gaussian processes. Under this framework, naive local linear estimators ignoring the mark-point dependence can be severely biased. We show that this bias can be corrected using a local linear estimator of the cross-covariance function and establish uniform convergence rates of the bias-corrected estimators. Furthermore, we propose a test statistic based on local linear estimators for mark-point independence, which is shown to converge to an asymptotic normal distribution in a parametric √n-convergence rate. Model diagnostics tools are developed for key model assumptions and a robust functional permutation test is proposed for a more general class of mark-point processes. The effectiveness of the proposed methods is demonstrated using extensive simulations and applications to two real data examples. Supplementary materials for this article are available online.
Authors
- Xu, Ganggang
- Zhang, Jingfei
- Li, Yehua
- Guan, Yongtao
Specification of a parametric model for the intensity function is a fundamental task in statistics for spatial point processes. It is, therefore, crucial to be able to assess the appropriateness of a suggested model for a given point pattern data set. For this purpose, we develop a new class of semi-parametric goodness-of-fit tests for the specified parametric first-order intensity, without assuming a full data generating mechanism that is needed for the existing popular Monte-Carlo tests. The proposed tests crucially rely on accurate nonparametric estimation of the second-order properties of a point process. To address this we propose a new nonparametric pair correlation function (PCF) estimator for clustered spatial point processes under some mild shape constraints, which is shown to achieve uniform consistency. The proposed test statistics are computationally efficient owing to closed-form asymptotic distributions and achieve the nominal size even for testing composite hypotheses. In practice, the proposed estimation and testing procedures provide effective tools to improve parametric intensity function modeling, which is demonstrated through extensive simulation studies as well as a real data analysis of street crime activity in Washington DC.
Authors
- Xu, Ganggang
- Liang, Chen
- Waagepetersen, Rasmus
- Guan, Yongtao