2025
- Gao, T., Jin, J., Ke, Z. and Moryoussef, M. (2025).
A Comparison of DeepSeek and Other LLMs. (arXiv)
- Prashanth, U.S., Deng, A., O'Brien, K., S V, J., Khan, M.A., Borkar, J., Choquette-Choo, C.A., Fuehne, J.R., Biderman, S., Ke, Z.K., Lee, K. and Saphra, N. (2025).
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon. (arXiv)
13th International Conference on Learning Representations (ICLR 2025).
- Chen, D., Ke, Z. and Zhang, S. (2025).
VALISE: A Robust Vertex Hunting Algorithm. (pdf)
Statistica Sinica (in press).
- Jin, J., Ke, Z., Tang, J. and Wang, J. (2025).
Network Goodness-of-Fit for the Block-model Family. (arXiv)
Journal of the American Statistical Association (minor revision).
2024
- Jin, J., Ke, Z., Luo, S. and Ma, Y. (2024).
Optimal Network Pairwise Comparison. (website, arXiv)
Journal of the American Statistical Association (to appear).
- Cai, T., Ke, Z. and Turner, P. (2024).
Testing High-dimensional Multinomials with Applications to Text Analysis. (website, pdf, supplement)
Journal of the Royal Statistical Society Series B, 86(4), 922-942.
- Jiang, Y. and Ke, Z. (2024).
Discussion on "Root and community inference on the latent growth process of a network". (website, pdf)
Journal of the Royal Statistical Society Series B, 86(4), 878-880.
- Ke, Z. and Wang, J. (2024).
Entry-wise Eigenvector Analysis and Improved Rates for Topic Modeling on Short Documents. (website, cover story, arXiv, pdf)
Mathematics, 12(11), 1682.
- Ke, Z., Ji, P., Jin, J. and Li, W. (2024).
Recent Advances in Text Analysis. (website, arXiv, pdf)
Annual Review of Statistics and Its Application, 11, 347-372.
- Ke, Z. and Wang, J. (2024).
Optimal Network Membership Estimation under Severe Degree Heterogeneity. (arXiv, pdf, supplement)
Journal of the American Statistical Association (to appear).
- Jin, J., Ke, Z., Moryoussef, M., Tang, J. and Wang, J. (2024).
Improved Algorithm and Bounds for Successive Projection. (arXiv, pdf, website)
12th International Conference on Learning Representations (ICLR 2024).
- Ke, Z., Liu, J. and Ma, Y. (2024).
Power of Knockoff: The Impact of Ranking Algorithm, Augmented Design, and Symmetric Statistic. (website, pdf)
Journal of Machine Learning Research, 25(3), 1-67.
- Jin, J., Ke, Z. and Luo, S. (2024).
Mixed Membership Estimation for Social Networks. (website, arXiv, pdf, supplement, code)
(An old title: Estimating Network Memberships by Simplex Vertex Hunting.)
Journal of Econometrics, 239(2), 105369.
- Ke, Z. and Wang, M. (2024).
Using SVD for Topic Modeling. (website, arXiv, pdf, code)
Journal of the American Statistical Association, 119(545), 434-449.
2023
- Chen, D., Jin, J. and Ke, Z.T. (2023).
Subject Clustering by IF-PCA and Several Recent Methods. (website)
Frontiers in Genetics, 14, 1166404.
- Jiang, Y. and Ke, Z. (2023).
Semi-supervised Community Detection via Structural Similarity Metrics. (arXiv, pdf, website)
11th International Conference on Learning Representations (ICLR 2023).
- Jin, J., Ke, Z., Turner, P. and Zhang, A. (2023).
Phase Transition for Detecting a Small Community in a Large Network. (arXiv, pdf, website)
11th International Conference on Learning Representations (ICLR 2023).
- Cammarata, L. and Ke, Z. (2023).
Power Enhancement and Phase Transitions for Global Testing of the Mixed Membership Stochastic Block Model. (website, arXiv, pdf, supplement, code)
Bernoulli, 29(3), 1741-1763.
- Ke, Z. and Jin, J. (2023).
Special Invited Paper: The SCORE Normalization, Especially for Heterogeneous Network and Text Data. (website, arXiv, pdf)
Stat, 12(1), e545.
- Jin, J., Ke, Z., Luo, S. and Wang, M. (2023).
Optimal Estimation of the Number of Network Communities. (website, arXiv, pdf)
Journal of the American Statistical Association, 118(543), 2101-2116.
- Ke, Z., Ma, Y. and Lin, X. (2023).
Estimation of the Number of Spiked Eigenvalues in a Covariance Matrix by Bulk Eigenvalue Matching Analysis. (website, pdf, supplement, code)
Journal of the American Statistical Association, 118(541), 374-392.
2022
- Fan, J., Ke, Z., Liao, Y. and Neuhierl, A. (2022).
Structural Deep Learning in Conditional Asset Pricing. (SSRN, pdf)
- Ke, Z. and Wang, L. (2022).
A Comparison of Hamming Errors of Representative Variable Selection Methods. (website, arXiv, pdf)
10th International Conference on Learning Representations (ICLR 2022).
- Ji, P., Jin, J., Ke, Z. and Li, W. (2022).
Co-citation and Co-authorship Networks of Statisticians (with discussions). (website, pdf (with-supplement), pdf (with-discussions), data and code)
Journal of Business & Economic Statistics, 40(2), 469-485.
- Chen, D., Tashman, K., Palmer, D., Bloemendal, A., Neale, B., Roeder, K., Churchhouse, C. and Ke, Z. (2022).
A Data Harmonization Pipeline to Leverage External Controls and Boost Power in GWAS. (website, pdf, supplement)
Human Molecular Genetics, 31(3), 481-489.
- Jin, J., Ke, Z. and Luo, S. (2022).
Improvements on SCORE, Especially for Weak Signals. (website, pdf, code)
Sankhya A, 84(1), 127-162.
- Hu, Z., Ke, Z. and Liu, J. (2022).
Measurement Error Models: From Nonparametric Methods to Deep Neural Networks. (website, pdf, supplement, code)
Statistical Science, 37(4), 473-493.
- Huang, Y., Ke, Z. and Jin, J. (2022).
Allocation of COVID-19 Testing Budget on a Commute Network of Counties. (website, pdf, supplement, code)
Stat, 11(1), e441.
2018-2021
- Jin, J., Ke, Z. and Luo, S. (2021).
Optimal Adaptivity of Signed-Polygon Statistics for Network Testing. (website, pdf, supplement)
Annals of Statistics, 49(6), 3408-3433.
- Jin, J., Ke, Z. and Liang, J. (2021).
Sharp Impossibility Results for Hyper-graph Testing. (website, pdf, supplement)
Advances in Neural Information Processing Systems (NeurIPS 2021).
- Ke, Z., Shi, F. and Xia, D. (2019).
Community Detection for Hypergraph Networks via Regularized Tensor Power Iteration. (arXiv, pdf)
- Ke, Z., Kelly, B. and Xiu, D. (2019).
Predicting Returns with Text Data. (SSRN, pdf)
- Ke, Z., Xue, L. and Yang, F. (2019).
Diagonally Dominant Principal Component Analysis. (website, pdf, code)
Journal of Computational and Graphical Statistics, 29(3), 592-607.
- Duan, Y., Ke, Z. and Wang, M. (2019).
State Aggregation Learning from Markov Transition Data. (website, pdf, supplement)
33rd Conference on Neural Information Processing Systems (NeurIPS 2019).
- Jin, J., Ke, Z. and Luo, S. (2018).
Network Global Testing by Counting Graphlets. (website, pdf, supplement)
35th International Conference on Machine Learning (ICML 2018).
- Peng, R., Lee, H., Ke, Z. and Saunders, M. (2018).
Racial Disparities in Kidney Transplant Waitlist Appearance in Chicago: Is it Race or Place? (website, pdf)
Clinical Transplantation, 32(5), e13195.
- Ke, Z., Bose, K. and Fan, J. (2018).
Higher Moment Estimation for Elliptically-distributed Data: Is it Necessary to Use a Sledgehammer to Crack an Egg? (arXiv, pdf)
2017 and before
- Ke, Z. and Yang, F. (2017).
Covariate Assisted Variable Ranking. (arXiv, pdf)
- Jin, J. and Ke, Z. (2017).
A Sharp Lower Bound for Mixed-Membership Estimation. (arXiv, pdf)
- Jin, J., Ke, Z. and Wang, W. (2017).
Phase Transitions for High-dimensional Clustering and Related Problems. (website, pdf, supplement)
Annals of Statistics, 45(5), 2151-2189.
- Ke, Z. (2016).
Detecting Rare and Weak Spikes in Large Covariance Matrices. (arXiv, pdf)
- Ji, P., Jin, J. and Ke, Z. (2016).
Discussion on "Statistical Modeling of Citation Exchange between Statistics Journals". (website, pdf)
Journal of the Royal Statistical Society: Series A, 179(1), 52-52.
- Jin, J. and Ke, Z. (2016).
Rare and Weak Effects in Large-scale Inference: Methods and Phase Diagrams. (website, pdf)
Editor's Invited Review, Statistica Sinica, 26(1), 1-34.
- Fan, J., Ke, Z., Liu, H. and Xia, L. (2015).
QUADRO: A Supervised Dimension Reduction Method via Rayleigh Quotient Optimization. (website, pdf, supplement)
Annals of Statistics, 43(4), 1493-1534.
- Ke, Z., Fan, J. and Wu, Y. (2015).
Homogeneity Pursuit. (website, pdf, supplement)
Journal of the American Statistical Association, 110(509), 175-194.
- Fan, J. and Ke, Z. (2014).
Discussion: "A Significance Test for the Lasso". (website, pdf)
Annals of Statistics, 42(2), 483-492.
- Jin, J. and Ke, Z. (2014).
Discussion on "Multiscale Change Point Inference". (website, pdf)
Journal of the Royal Statistical Society: Series B, 76(3), 555-557.
- Ke, Z., Jin, J. and Fan, J. (2014).
Covariate Assisted Screening and Estimation. (website, pdf, supplement)
Annals of Statistics, 42(6), 2202-2242.